Digital Archive Sabbatical

This blog is for anyone interested in or experienced with digital archives and institutional repositories, especially in science and technology libraries.

Tuesday, November 30, 2004

Winter

Here’s how you get winter in Los Angeles. You go up. The Saturday before Thanksgiving it rained and a cold front came through. The Sunday morning temperature dropped a whopping 5 degrees to 45. This did not daunt us northerners; son Eric and I set out for rock climbing in the San Gabriel Mountains. But just 45 minutes up Route 2, after passing rocks and other debris in the road from wind and rain, we encountered signs requiring chains or snow tires on drive wheels. We were not so equipped, so we parked the car and instead hiked five miles up the road to Mt. Wilson Observatory.

We must have reached an elevation of 5,700 feet; the temperature had dropped to 30 and the wind was relentless. It was snowy and icy underfoot, but we proceeded with determination in anticipation of warmth, a bathroom, and a tour of the observatory. Alas, the observatory was closed due to the weather. But the view was spectacular. The cold front had cleared the air, revealing a breathtaking panorama of Los Angeles. We could see south to the ocean beyond San Pedro and Long Beach to the left of Palos Verdes, and the ocean beyond Santa Monica to the right of Palos Verdes. Even more remarkable, we could see the islands of Santa Catalina and maybe even San Clemente beyond Palos Verdes!

The day after Thanksgiving was warmer and the entire family drove to the Observatory, hoping for another panoramic view. But alas, the city lay under a white cover. Across the mountains far to the east, however, we could see the Cogswell Dam, where Eric and I bicycled two weeks before. We also got to see the 100” telescope which Hubble used in his pioneering work on the Big Bang and expansion of the universe.

Deep Dive

One day the week before Thanksgiving the user studies group watched a video of Deep Dive, an ABC production of Nightline with Ted Koppel. Deep Dive investigates the design process as illustrated by the company IDEO. The company was founded by Dave Kelley, a Stanford Mechanical Engineering professor, and creates 90 new products a year. They are experts on the design process. See their web site at www.ideo.com.

What intrigued the group at USC was the IDEO culture, which emphasizes creativity to the point of zaniness, with a few guiding principles such as one conversation at a time in group discussion, focus, and no critiques. IDEO would call their approach “focused chaos.” Employees are free to generate ideas and solutions within the confines of customer needs. One important concept is to fail often to succeed sooner.

The model of giving free rein to the creation of ideas and solutions is one that the user studies group hoped to apply to their work at USC, releasing their creative potential from the confines of a more structured process. It’s similar to the way The Incredibles was created, by investing in an excellent team and relying on their talent to produce the final product.

I was struck by the similarity between IDEO ground rules and techniques (such as posting sticky-notes with any and every idea and using team judges to sift through the ideas) and the strategic planning process recently used at UC’s University Libraries.

Friday, November 19, 2004

Helicopter

Helicopters warn that this cloistered house is an island in a Teeming City. Sometimes they are checking traffic on "the 10" which is close to the north. But when the search lights come on and pan the area over and over I know they are looking for something sinister. One evening as I walked home from the USC Tram stop in the dark, I noticed one circling over the house. Its beam almost hit me in the face. I rushed through the gate and felt relieved to close it behind me. Now (Friday night) I hear another chopper overhead. But no matter - I am safe in my Secret Garden.

Wednesday, November 17, 2004

SMART charts

Back to process and project management. Wayne was tasked with assessing all the archive projects (collections) in the offing and prioritizing them. There are about 30 of them. Today he showed me a technique called SMART Chart for developing his priorities. It comes from the business environment.

On the first sheet of an Excel workbook:

  • List in the left column relevant criteria or attributes, such as status of the metadata, status of the content, whether the collection was previously available in the legacy system, whether there will be copyright issues affecting publishing, and so forth.
  • Place the collections across the top row.
  • Rate the readiness of each collection based on the criteria using, in this case, a scale of 1 to 10, with 10 being the most ready.

On the second sheet:

  • Match the criteria against each other by listing them in the left column and also across the top row.
  • In each intersecting cell give a 0-4 rating, where 0 represents "much less important," 4 represents "much more important," and 2 is neutral - neither more nor less important.
  • In cells where a criterion is matched against itself, enter a 2.
  • For the remaining cells, enter an importance rating. For example: is it more important or less important that the metadata are ready versus the collection has already been available (and users are accustomed to having access).
  • If a 3 is entered for A versus B, then a 1 is entered for B versus A so that both ratings total 4.
  • Add the ratings for each row (for each criterion) to get a total importance rating.

On the third sheet:

  • List the criteria in the left column, as in Sheet 1.
  • List the collections on the top row, as in Sheet 1.
  • Again match each criterion against each collection, this time multiplying the total criterion rating from Sheet 2 by the collection value on Sheet 1. (A macro can be written to do this.)
  • The column totals show the number of cumulated points for each collection, with the highest totals receiving the highest priority.

Voila! Your decisions are made!
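For anyone who would rather see the three sheets as a few lines of code than as an Excel workbook, here is a minimal sketch in Python. It is only my reading of the steps above, not Wayne's actual workbook or macro. The criteria names, collection names and all of the numbers are invented for illustration.

```python
# Hypothetical criteria and collections; every number here is made up.
criteria = ["metadata ready", "content ready", "in legacy system"]

# Sheet 1: readiness of each collection per criterion, on a 1-10 scale.
readiness = {
    "Collection A": {"metadata ready": 8, "content ready": 5, "in legacy system": 10},
    "Collection B": {"metadata ready": 3, "content ready": 9, "in legacy system": 1},
}

# Sheet 2: pairwise importance on a 0-4 scale, entered once per pair.
# pairwise[(a, b)] = 3 means "a is somewhat more important than b".
pairwise = {
    ("metadata ready", "content ready"): 3,
    ("metadata ready", "in legacy system"): 1,
    ("content ready", "in legacy system"): 2,
}


def importance_totals(criteria, pairwise):
    """Sum each criterion's row on Sheet 2 to get its total importance."""
    totals = {}
    for row in criteria:
        total = 0
        for col in criteria:
            if row == col:
                total += 2  # a criterion matched against itself is neutral
            elif (row, col) in pairwise:
                total += pairwise[(row, col)]
            else:
                # The mirror cell: the two ratings for a pair total 4.
                total += 4 - pairwise[(col, row)]
        totals[row] = total
    return totals


def collection_scores(readiness, weights):
    """Sheet 3: multiply readiness by importance and total each column."""
    return {
        name: sum(weights[criterion] * rating for criterion, rating in ratings.items())
        for name, ratings in readiness.items()
    }


weights = importance_totals(criteria, pairwise)
scores = collection_scores(readiness, weights)

# Highest total first: this is the suggested priority order.
ranking = sorted(scores, key=scores.get, reverse=True)
print(weights)
print(scores)
print(ranking)
```

Entering each pair only once and deriving the mirror cell as 4 minus the rating keeps the two sides of Sheet 2 consistent automatically, which in a spreadsheet has to be done by hand.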


Imaging

Today I tried my hand at using some of the fancy equipment in the imaging studio at USC. Matt Gainer and Giao (pronounced Yiao) Luong are digitizing USGS topo maps of California from the early 1900's. They have a Sinar P 4x5 camera on a huge 9' camera stand with a BetterLite Super 6K2 Scanback device in it. The camera faces straight down to a 32"x46" black platform that is large enough to handle the maps, posters and other large formats. BetterLite has its own scanning software (ViewFinder) that controls the camera and Scanback, captures the image, renders a tiff file, and sends it to a Mac with Adobe Photoshop for post-processing and quality control (cropping, rotating, converting, etc.). KinoFlo daylight-balanced fluorescents (2 sets of 4Bank) shine on the black platform. A QP color target card with a white/grey/black color scale is scanned with the image to help assess color and tonal values.

Once the image is created, it is examined for focus around the edges, saved, and reviewed to make sure the color is OK. (Careful color evaluation is not necessary for the topos. Images created for the USC Fisher Gallery collection, on the other hand, will require critical color evaluation.) The images are stored on portable hard drives (Firewire) until they are ftp'd to the staging server. (Portable hard drives make it easy to move work from one person and/or station to another.)

Renditions are later created on the staging server according to the specifications of the collection. Typical derivative renditions of the images include thumbnails, 256x256 jpegs for a quick view, 1024x1024 jpegs, and MrSID compressed images where zooming is required, such as with these topo maps. Tiffs are stored on tape as the archival digital images but not used in the public interface. The imaging process to create content runs parallel to the creation of metadata describing the images. Metadata is sometimes pre-existing, sometimes created from scratch, or sometimes migrated from Excel spreadsheets, as with the Seaver collection. Eventually the metadata and content are brought together. (Part of my work for the last two weeks has been using the ingest system to create metadata for the LA Examiner photos and adding links to the photo images.)
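To make the derivative sizes concrete, here is a small sketch of the arithmetic for fitting a scan inside the 256x256 and 1024x1024 boxes mentioned above. It assumes, and this is my assumption rather than anything stated by the imaging staff, that those sizes are bounding boxes and that the map's proportions are preserved. The source pixel dimensions are invented, and the function name is hypothetical.

```python
def fit_within(width, height, box):
    """Scale (width, height) to fit inside a box x box square, keeping proportions.

    Images already small enough are returned unchanged (no upscaling).
    """
    if width <= box and height <= box:
        return width, height
    if width >= height:
        return box, max(1, round(height * box / width))
    return max(1, round(width * box / height)), box


# Hypothetical scan size in pixels; real scans will differ.
source = (6000, 8000)

for box in (256, 1024):
    print(box, fit_within(*source, box))
```

The actual renditions are produced on the staging server by whatever tools USC uses; this only shows the size calculation.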


Not in Cincinnati

Yesterday morning as I was reaching "home" from my early morning run to USC and workout at the Lyon Recreation Center, I spotted a couple of school boys on top of a wall surrounding the yard of a nearby house. Their friends on the sidewalk were catching oranges which the boys were stealing from a tree that spread above the wall. Do orange trees grow in Cincinnati and bear fruit in November? Would I have been running in shorts?

Last night I performed with the USC Concert Orchestra, a campus community orchestra made up of non-music majors, faculty, staff, and a few music majors for extra support. The fun piece on the concert was Dvorak's New World Symphony. Elizabeth Pitcairn, violin soloist, played Zigeunerweisen. I was surprised at the size and strength of the orchestra. There must have been 30 violins! UC tried to start such an orchestra back in the '70's but it flopped. I noticed recently a sign on the board in Baldwin Hall seeking players to try again. In a university the size of UC it should be possible, but the infrastructure needs to be there. USC's infrastructure features a conductor who teaches conducting and works well with the students, access to rehearsal rooms and concert halls complete with stands and other equipment, 1 semester credit for students who participate, access to music faculty who sometimes do sectionals, and a student manager (TA perhaps?). Sharon Lavey started 7 years ago with a handful of players. Now there must be at least 80 if not more.

Sunday, November 14, 2004

Did I see rain?

I believe there were some drops this week, even a few on Saturday morning, the day of USC Homecoming! But it all cleared up for game time. I was advised to stay clear of campus during homecoming, so I went in the opposite direction - on a bike ride in the San Gabriel mountains with son Eric. We drove east of Pasadena on the 210 to Azusa and then headed north on 39 to the "trail head." The trail was actually a paved service road along the San Gabriel river leading about 8 miles up to the Cogswell dam. I thought I was weak from running in pancake-flat Los Angeles but on the return trip I realized that we had climbed a significant amount. This web site gives an idea of where we were: http://www.nearfield.com/~dan/sports/bike/mountain/sgwf/ .

Friday night I heard the Calder Quartet in concert at Los Angeles Harbor College, as part of the South Bay Chamber Music series. There was a repeat performance today at Pacific Unitarian Church in Palos Verdes. What a view from the church! A panoramic vista of Los Angeles and the bay, looking all the way to the mountains in the north. I heard both performances - of Haydn, Mozart and Beethoven op. 130 including the Grosse Fuge. The fugue boggles the mind; it's impossible to decipher all the voices at once. The Quartet is sounding excellent - very polished, great ensemble, lots of energy, and much finesse.

I finished reading Life of Pi, another art work to boggle the mind. The symbolism and allegory stretch the imagination, just as Beethoven does. At the end you shake your head and say to yourself: did I get that?

The many faces of digital archives

This past week I continued to work on the Los Angeles Examiner photo archive. I noticed that in the early 1950's "Commies" were in the news, as were Korean War soldiers and families.

Happily this work was interspersed with meetings! They apologize about all the meetings, but it's nothing new for me, except that I can attend as an interested bystander, without responsibility. The CIS (Collection Information System) group is project based, charged with moving the existing (legacy) digital archive to the new platform based on Documentum. Their weekly meetings, conducted by the project manager Tim Stanton, serve to make sure each subgroup is on task and to handle any problems along the way.

The subgroups work on thesaurus management, metadata and migration of it to the new system, the web user interface and testing thereof for usability, the Contributor Module for ingesting new documents into the system by staff and/or faculty themselves, and production (modifying the system architecture and documenting it). These meetings give me a chance to hear input from these various components of the project.

I notice that process is almost as important as content. The focus on project management is intense, with deadlines strongly adhered to and reporting required frequently. This culture is fairly new I am told, replacing a former casual culture that did not always produce timely results. While the goal of project management is noble and probably necessary given the complexity and size of the project, I feel a certain tension between time demanded for process versus time needed to "do the work."

Another meeting was DIM, Digital Imaging and Management. These are the people I am working with right now. They meet to discuss status of imaging projects, metadata, migration of metadata to the new system, needs of new projects coming into the system, etc.

I had occasion to see one new project firsthand. Friday morning Wayne and I met with people at the Seaver Center of Western History Research at the Natural History Museum south of Exposition Boulevard. I had never been in the building. It's a great museum; I hope to get back there as a tourist. At the Seaver Center they are digitizing photos from various collections depicting western history. Then they will be added to the USC Digital Archive. USC already has received some of the images, but the topic Friday was the descriptive data and how it would be migrated from the Center's Excel spreadsheets to the Archive. Wayne had prepared a sample of records showing conversion from an Excel record to a record in Dublin Core (with all the descriptive tags) as it would appear in the USC Digital Archive public interface. He then reviewed the intermediate steps of mapping each Excel column to a Dublin Core element such as title, subject, date, etc. And finally he showed how the record would look in XML for import into the CIS Oracle system.
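The mapping step Wayne described can be sketched in a few lines of Python. This is only an illustration of the idea, not his actual conversion or the real CIS import format, which I have not seen. The spreadsheet column names, the sample row, the `record` wrapper element and the choice to split subjects on semicolons are all my own assumptions; the namespace is the standard Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical spreadsheet column names mapped to Dublin Core elements.
COLUMN_TO_DC = {
    "Photo Title": "title",
    "Subjects": "subject",
    "Date Taken": "date",
    "ID Number": "identifier",
}

# Assumption: only subjects hold several values in one cell, separated by ";".
REPEATABLE = {"subject"}


def row_to_record(row):
    """Turn one spreadsheet row (a dict) into a <record> of Dublin Core elements."""
    record = ET.Element("record")  # hypothetical wrapper element
    for column, element in COLUMN_TO_DC.items():
        value = row.get(column, "").strip()
        if not value:
            continue  # skip empty cells rather than emit empty tags
        parts = value.split(";") if element in REPEATABLE else [value]
        for part in parts:
            part = part.strip()
            if part:
                child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
                child.text = part
    return record


# Invented sample row, standing in for one line of the spreadsheet.
row = {
    "Photo Title": "Street scene",
    "Subjects": "Streets; Automobiles",
    "Date Taken": "1910",
    "ID Number": "X-001",
}

xml_text = ET.tostring(row_to_record(row), encoding="unicode")
print(xml_text)
```

Keeping the column-to-element table separate from the conversion code mirrors the intermediate mapping step: the decisions the Center makes about each column live in one place and can change without touching the rest.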

The Seaver Center staff were having to make decisions about number of subject headings, use of synonyms (include as subject headings? as part of descriptions? or just let future thesaurus modules create the synonyms?), whether size of glass plate negatives or photos was something users needed to search on or sort by, forms of names, etc. They had to decide whether to allow several objects (photo, negative, glass photonegative) to be on the same record, or whether each object needed its own record. There was also discussion about ID numbers and whether a user could recognize the appropriate ID or "call" number needed to enable staff to find the originals if necessary. They also had to discuss copyright notice and whether a statement with each image or one for the collection would be sufficient. There are many details involved in making sure the final records will look good, be searchable, and provide staff as well as users with sufficient information about each record.

So it was a full week of learning. I am still sifting through it all, trying to make sure I understand how everything fits together. It is confusing because they are dealing with two systems, the legacy BRS-based home-grown system and the new Documentum-based system. The initial Seaver collections will come in under the legacy system to move the project along, rather than wait for the new system to be ready. Migration to the new system will take place later.


Saturday, November 06, 2004

Sunny California

Have you ever heard this expression, sunny California? Now I understand it. Rain has been forecast all week, but try as forecasters might, it has not rained. In fact, there has not been a single cloud in the sky all week...until this morning.

I woke up and saw clouds! OMG (oh my gosh) as my son would say! But by the time I finished my laundry mid-morning, the clouds had blown east. I see only blue sky now.

Temps 50-70 daily. Students practice outside, eat outside, meet class outside. Bicycles everywhere, and parked outside university buildings like a Japanese train station. My housemates eat breakfast on the back deck enjoying their coffee and the early morning sun. My windows are open day and night, with no screens! I understand the lure of sunny California.

Creating metadata

I have a new assignment, to get some hands-on experience. I am working with Wayne Shoaf, who is in charge of metadata for the digital archive here. (Wayne, a horn player who graduated from Oberlin, was once in charge of the Schoenberg collection here. When the collection was moved to Vienna, he had the opportunity to try out the new location for 6 months, after which trial he chose to return to sunny CA.)

One collection that is being "ingested" into the archive is from the Los Angeles Examiner newspaper, photos from the 1950's. The photo negatives have already been scanned and given accession numbers using PixArc software. My job is to find the corresponding descriptive information written on the negative jackets (a student has already photocopied these and put them in a notebook) and enter the information into the database using an ingest form. The ingest form is homegrown, and will eventually be migrated to another based on Documentum. USC is having to do a lot of customization to get Documentum to function as they need it to.

This job is like cataloging, having to make sure the proper fields are present, formats are consistent, and that anticipated user terms will pull up desired records. While I am not coding in XML, I am creating content in metadata fields that have already been defined for this collection.

An interesting consequence of this assignment is learning about newspaper photo collections and Los Angeles history. You see the same phenomena in most cities - burglaries, accidents, fires, abandoned children, etc. But you also see lots of actors and actresses in the news, whether it be posing at a special event or getting a divorce.



Tuesday, November 02, 2004

Not like home

It's an unusual circumstance. The sun rises early here, on the eastern side of the time zone. So do I, as a result. I am out the door on my run before the traffic spews exhaust and back in plenty of time for shower, dressing, breakfast and a walk to campus.

The problem is, I am getting here early! The office does not open until 9. At 9 I am usually the first one here. Many librarians come late and work late - perhaps because of traffic. Right now I am killing time in the Leavey Library Information Commons waiting for the offices to open. The work day is 7.5 hours. At the end of the work day I leave at 5, take the USC Tram home, and have plenty of time for dinner and after-dinner activity. How can this be?

Just like home

On my 15-minute walk to USC every day, I pass by the Max Kade Institute on Hoover. Saturday I had a chance to peek inside, after trying my rusty German on the people standing outside. It has an inviting living room and dining room, which serve also as a library. Upstairs students live. Apparently Max Kade wanted to support students coming from abroad.

Then further toward the USC campus is Hebrew Union College! It took me a week to realize I was walking by it every day. I plan to visit soon. I know they have exhibits. It used to be that students could not complete their rabbinical studies here, but had to transfer to Cincinnati, New York, or Jerusalem. But now, according to an acquaintance in Cincinnati, students can complete all their studies here in LA.

Cin City Shootings

I am reminded that LA is the home of cinema every day. A man lying on the sidewalk downtown with fake blood on the cement, with a friend taking shots. A couple shooting with camera on tripod through a window pane into someone's house. Five trucks lined up on Exposition Blvd Sunday morning outside the Mudd Hall of Philosophy, dragging equipment in. And today a truck outside a women's shelter next to the compound of the Second Church of Christ Scientist. Not to mention random activities on campus. In fact the libraries even have a Student Filming Policy at http://www.usc.edu/isd/libraries/about/facilities_usage/student_filming_policy/ ! It's a way of life here.

Monday, November 01, 2004

Weekend in LA

I discovered Exposition Park on my Sunday morning run. I had seen the Natural History Museum building along Exposition Blvd, south of campus. Little did I know that it was the tip of an iceberg. There is a science school, California Science Center, the stadium where USC plays football, an arena, nature trail, rose garden, fields of grass for impromptu soccer and much more.

Sunday at 5 son Eric played a cello recital at the Colburn School of Performing Arts. It was an awesome program, each piece more challenging than the last. Fitting that on Halloween the program ended with Cassado's Dance of the Green Devil. I thought the performance full of fire and lyricism, in graceful juxtaposition. It was great to hear him, after more than a year.

Laundry and preparing for a reception with food after the recital ate the remainder of my weekend time.