Digital Archive Sabbatical

This blog is for anyone interested in or experienced with digital archives and institutional repositories, especially in science and technology libraries.

Sunday, July 24, 2005

Fedora discussion

On Friday the DRC-Dev team discussed the underpinnings of Fedora, the platform selected for creating the Digital Resource Commons. The presentation method itself was interesting: it used Elluminate software, available through the Ohio Learning Network (OLN), to push information on a whiteboard to remote users and to allow discussion and text messaging among participants. The process was definitely successful.

We learned about Fedora's digital object model, which has four components: a DOI or handle as identifier; methods to disseminate or view the object; content; and system metadata. There are also four content types: managed, external (like a URL), redirects to other sites, and XML. The overall architecture consists of an interface (web service plus OAI provider), application logic in Java, and storage in a relational database management system (RDBMS).
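To help the four components and four content types stick, here is a minimal sketch of the object model as I understood it from the discussion. It is not Fedora's actual API or storage format: the class names, field names, the `hdl:0000/example-1` identifier, and the example URL are all made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ContentType(Enum):
    """The four content types named in the discussion."""
    MANAGED = "managed"      # content held inside the repository
    EXTERNAL = "external"    # content referenced elsewhere, e.g. by URL
    REDIRECT = "redirect"    # request is redirected to another site
    XML = "xml"              # XML kept with the object


@dataclass
class ContentItem:
    label: str
    content_type: ContentType
    location: str            # hypothetical: a path, a URL, or an XML string


@dataclass
class DigitalObject:
    identifier: str                                          # DOI or handle
    disseminators: List[str] = field(default_factory=list)   # ways to view the object
    content: List[ContentItem] = field(default_factory=list)
    system_metadata: Dict[str, str] = field(default_factory=dict)

    def content_of_type(self, content_type: ContentType) -> List[ContentItem]:
        """Return only the content items of one type."""
        return [item for item in self.content if item.content_type is content_type]


# A made-up object with one XML description and one externally hosted image.
photo = DigitalObject(
    identifier="hdl:0000/example-1",
    disseminators=["thumbnail", "full_view"],
    content=[
        ContentItem("Description", ContentType.XML, "<title>Example photo</title>"),
        ContentItem("Full-size image", ContentType.EXTERNAL, "http://example.org/photo.jpg"),
    ],
    system_metadata={"created": "2005-07-24"},
)

external = [item.label for item in photo.content_of_type(ContentType.EXTERNAL)]
print(external)
```

The point of the sketch is only that one object can bundle several pieces of content of different types under a single identifier, alongside its own system metadata.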

There are more acronyms and terms to learn and/or review:
digital object serialization defined by XML schema
extensible = can associate services with objects
extensible object model
DOI and handle
OAI-DC
SOAP-based versus web-based
web service
server container package

The University of Virginia is developing Fedora for its digital archive application. See http://www.lib.virginia.edu/digital/resndev/fedora.html .
Is this their archive using Fedora? http://www.lib.virginia.edu/digital/collections/image/

A May 2005 Users Group conference hosted 110 implementers with objects as diverse as streaming data from a temperature sensor.

Wednesday, July 20, 2005

Faculty presentation

Today I presented a brief description of my academic leave activities to the University Libraries Faculty. The information is summarized in a PowerPoint presentation.

Tuesday, July 19, 2005

USC Digital Archive

Today I reviewed the USC Digital Archive. It's been redesigned! I remember serving on the committee that started the redesign back in the fall. They have gone live with it. You can browse collections, and search across collections. You can even search within specified collections with the Advanced Search. You can scroll through images. You can see the metadata for the images. And of course you can get a larger view of the images. It's fun to see something I was working on come to fruition!

Friday, July 08, 2005

DRC-Dev homework

I can see that keeping up with the DRC-Dev group will require some review on my part. After the conference call today I made a list of all those feisty acronyms and terms with which I need to be more conversant. Things like

  • Metabuddy
  • VRA 4
  • DC (Dublin Core) and FGDC
  • CEN
  • Getty Crosswalk
  • Shibboleth
  • DMAP (descriptive metadata application profile)
  • Luna crosswalk - CWA
  • other crosswalks - CDWA XML, etc
  • Handles and handle servers
  • RDF (resource description framework)

I covered many of these way back in October, but they haven't "stuck" yet.... We are still debating multiple schemas and interfaces for users: how can the system be complex underneath yet simple to use? We are to look at the interfaces of other institutions. We should stress the needed search functionality and design to that need.

Friday, July 01, 2005

DRC-Dev Team

On April 22 I wrote that it looked like OhioLINK would be using Fedora to build the Digital Resource Commons rather than Documentum. That's indeed what's happening. Fedora is open source software developed at Cornell. (Interestingly, the Cornell librarians at ASEE reported they are not using Fedora at Cornell for their digital archive....) The University of Virginia is using Fedora as its platform, though, and has done much development work.

Fedora out of the box is very "raw," a basic structure upon which to build. It is supposedly very robust, allowing capabilities not provided by other open source software such as DSpace. For example, it allows not only cross-collection searching but, more importantly, the specification of object-to-object relationships. It enables mixing object types (videos, text, data sets, simulations, etc.) and varying uses (e-publishing, repositories, exhibitions, portfolios, etc.).
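To picture what object-to-object relationships buy us, here is a small sketch that stores relationships as (subject, relationship, target) statements and looks them up in both directions. This is only an illustration under my own assumptions, not how Fedora stores or queries relationships; the `RelationshipIndex` class, the `demo:` identifiers, and the relationship names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class RelationshipIndex:
    """Hypothetical in-memory store of object-to-object relationships."""

    def __init__(self) -> None:
        # subject identifier -> list of (relationship name, target identifier)
        self._by_subject: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def relate(self, subject: str, relationship: str, target: str) -> None:
        """Record that `subject` has `relationship` to `target`."""
        self._by_subject[subject].append((relationship, target))

    def related(self, subject: str, relationship: str) -> List[str]:
        """Targets that `subject` points at through `relationship`."""
        pairs = self._by_subject.get(subject, [])
        return [target for name, target in pairs if name == relationship]

    def pointing_at(self, target: str, relationship: str) -> List[str]:
        """Subjects that point at `target` through `relationship`."""
        return [
            subject
            for subject, pairs in self._by_subject.items()
            for name, other in pairs
            if name == relationship and other == target
        ]


index = RelationshipIndex()
# Mixed object types (a video, a paper, a data set, an exhibit), all made up.
index.relate("demo:video-1", "isDerivedFrom", "demo:dataset-7")
index.relate("demo:paper-3", "isDerivedFrom", "demo:dataset-7")
index.relate("demo:video-1", "isPartOf", "demo:exhibit-2")

sources = index.related("demo:video-1", "isDerivedFrom")
derived = index.pointing_at("demo:dataset-7", "isDerivedFrom")
print(sources)
print(derived)
```

With relationships recorded this way, a user looking at a data set could be led to the video and the paper built from it, even though the three objects are of different types and might live in different collections.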

Peter Murray is heading the OhioLINK DRC-Development Team meetings. The conference calls are interesting, with participants from across Ohio, plus developers at OhioLINK itself. As Peter stated, the change from using proprietary software such as Documentum to open source software such as Fedora signals a change in OhioLINK's approach to system development. OhioLINK members have become engaged in the development process via the DRC-Dev team. The project can be viewed at http://drc-dev.ohiolink.edu/wiki .

Today the DRC-Dev conference call struggled with the concept of object-specific application profiles and where the application profile for the metadata should reside: with the object? with a collection? Having different metadata sets for different types of objects means having different ingestion profiles for entering data into the system. This can get confusing, depending on who is doing the inputting. A trained team of "ingesters" is different from occasional inputting by faculty participating in a department repository. We want to describe various objects adequately while at the same time supporting global retrieval of all content.
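One way to picture the question is the sketch below. It assumes, purely for illustration, that every record shares a small core of fields (which is what would support global retrieval), that each collection adds its own fields, and that a profile attached to an individual object overrides the collection's. The team has not decided any of this; the field names, collection names, and functions are hypothetical.

```python
from typing import Dict, Optional, Set

# Assumed shared core: the fields every object needs for global retrieval.
CORE_FIELDS: Set[str] = {"title", "creator", "date"}

# Assumed collection-level profiles: extra fields each collection requires.
COLLECTION_PROFILES: Dict[str, Set[str]] = {
    "art-images": {"medium", "dimensions"},
    "datasets": {"instrument", "units"},
}


def required_fields(collection: str, object_profile: Optional[Set[str]] = None) -> Set[str]:
    """Core fields plus the object's own profile if it has one, else the collection's."""
    if object_profile is not None:
        extra = object_profile
    else:
        extra = COLLECTION_PROFILES.get(collection, set())
    return CORE_FIELDS | extra


def missing_fields(
    record: Dict[str, str],
    collection: str,
    object_profile: Optional[Set[str]] = None,
) -> Set[str]:
    """Required fields that the record leaves out or leaves blank."""
    present = {name for name, value in record.items() if value.strip()}
    return required_fields(collection, object_profile) - present


# A made-up record entered by an occasional contributor.
record = {
    "title": "Bridge load test",
    "creator": "J. Doe",
    "date": "2005-07-01",
    "instrument": "strain gauge",
}

missing_for_collection = missing_fields(record, "datasets")
missing_for_object = missing_fields(record, "datasets", object_profile={"instrument"})
print(sorted(missing_for_collection))
print(sorted(missing_for_object))
```

An ingestion form could run a check like this before accepting a record, so an occasional faculty contributor sees exactly which fields are still needed instead of having to know the whole schema.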

I will be participating on the development team, lending what insight I can relating to

  • supporting teaching and research needs in terms of function
  • contributing content
  • helping to track relevant technologies
  • sharing ideas on what the DRC should look like
  • planning to use the DRC in my work.

The DRC will be the platform where I hope to put the engineering repository. It will be a while before it's ready....