Brought to you by the Massachusetts Historical Society

"I have nothing to do here, but to take the Air, enquire for News, talk Politicks and write Letters."

John Adams to Abigail Adams, 30 June 1774

Showing posts with label Schema. Show all posts

Monday, June 6, 2011

Saltonstall as "Adams Lite": Tastes Great, Less Filling

Mr. [John Quincy] Adams is now the rising sun, and of course finds many idolaters. You can hardly conceive the strange appearance he makes--so cold--so unbending rigid muscles amidst such smiles, such good humor, gaiety--and among his own guests too. It seems a miracle that he has ever been chosen President of the U. States. Is it an invincible proof of his eminent merit, or the result of a singular concurrence of fortunate circumstances? Mrs. A. is the very antipode--if you will allow the term to be applied to a lady....And I fear that she has a Courtier's hart [sic]--or like him is heartless.

(Letter from Leverett Saltonstall to his wife Mary, 24 Feb. 1825)

This little teaser is just my way of introducing the Saltonstall family papers slip file project, another grant-funded paper-to-digital conversion modeled on the Adams family catalog. We here at the Massachusetts Historical Society have come to call the Saltonstall slip file "Adams Lite" because it is both much smaller (only 3,000 slips) and much more straightforward than its unwieldy counterpart. Both catalogs consist of a series of paper slips describing individual documents: author, recipient, date, place, length, etc. But unlike Adams, this slip file describes papers in a single collection at the MHS: the Saltonstall family papers, Ms. N-2232. The entire data set fits in a single XML file, and all of the information has been entered, controlled, and verified by one person. And lastly, because publication of the Saltonstall papers was completed years ago, this database requires only one static interface--basically a searchable item-level collection guide. The Adams Papers Editorial Project, on the other hand, is ongoing, so that database requires two interfaces: one static, for public use; the other dynamic, to be edited by Adams Papers staff.
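For readers curious what a "searchable item-level collection guide" over a single XML file might look like, here is a minimal sketch in Python. The element names (`slips`, `slip`, `author`, `recipient`, `date`) and the `search_slips` helper are assumptions made up for illustration; the actual Saltonstall schema is not shown in this post. The one sample record is the letter quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element names are illustrative only and are
# not taken from the real Saltonstall schema.
SAMPLE = """
<slips>
  <slip>
    <author>Leverett Saltonstall</author>
    <recipient>Mary Saltonstall</recipient>
    <date>1825-02-24</date>
  </slip>
</slips>
""".strip()


def search_slips(xml_text, field, term):
    """Return every slip whose `field` contains `term`, ignoring case."""
    root = ET.fromstring(xml_text)
    hits = []
    for slip in root.iter("slip"):
        value = slip.findtext(field, default="")
        if term.lower() in value.lower():
            # Flatten the matching slip into a plain dict of tag -> text.
            hits.append({child.tag: child.text for child in slip})
    return hits


matches = search_slips(SAMPLE, "author", "saltonstall")
```

Because the whole data set lives in one file, a linear scan like this is plausible for roughly 3,000 slips; a much larger catalog would presumably want an index instead.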

The Saltonstall database fulfills a requirement of the original Adams slip file grant, awarded in the fall of 2008, which specified that the Adams project would serve as a prototype for similar projects. The Saltonstall conversion has been in the works since the beginning of 2010, and many MHS staff members have contributed to the project. Laura Lowell processed the Saltonstall family collection, and our digital team of Nancy Heywood, Laura Wulf, and Peter Steinberg have digitized, transcribed, and marked up many individual items for presentation on the web. I was responsible for building the database, using the Adams slip file as a model.

Mary Claffey's work on the Adams slip file laid most of the groundwork for me. Rather than reinvent the wheel, I literally copied and pasted her schema and revised her tags to suit the needs of the Saltonstall family papers, scaling it down by deleting unnecessary elements and adding or repurposing others. Our web developer, Bill Beck, is designing an attractive and user-friendly interface, also modeled on the Adams slip file, and Laura Lowell's collection guide to the Saltonstall family papers will link to the database. It was a lot of fun to work with so many other members of the department; everyone brought their own strengths to the project.

The Saltonstall family, like the Adams family, is chock full of prominent and interesting people, spanning several generations. Leverett, Sr. (quoted above) was a member of the Massachusetts House of Representatives, the Massachusetts Senate, and the U.S. House of Representatives, and he served as the first mayor of Salem, Mass. His wife, Mary Elizabeth (Sanders) Saltonstall, was the daughter of Thomas Sanders, a well-known Salem merchant. Leverett's great-grandson, also named Leverett, was the governor of Massachusetts during World War II and a U.S. senator for over 20 years.

Keep an eye on the MHS website for further information about both projects.

Wednesday, August 12, 2009

Phase 2 Timeline

Our project to digitize the Adams Papers Control File began in January 2009. We originally planned on spending a few short months on proofreading before moving into encoding. However, proofreading 109,348 slips, one by one, has taken a little longer than we anticipated. This phase of the work is vitally important, though, and we have continued doggedly pursuing our final reels. We have found important corrections and updates and have begun entering those changes into the XML files. The input of corrections has been folded into the first phase of encoding and so far is going smoothly.

The first seven months of the project were also devoted to schema development (see Master Encoding Guide), and this summer we secured the services of an excellent XSL consultant to write an XSL transformation to convert our abbreviated vendor schema into the full schema and populate much of the consistent data automatically. The XSLTs have been very helpful, and we hope to build on them to automatically generate other data as we work through the initial encoding.

Thus our schedule for 2009:
  • January-August: proofreading (project manager, proofreader, EAD coordinator)
  • March-June: schema development (project manager and web developer)
  • July-August: XSL development and contract work (project manager, web developer, and consultant)
  • August-December: encoding level 1 (project manager, encoder, EAD coordinator)
  • September-December: XSL development (project manager and web developer)

Still Proofreading... but Encoding Begins!

It has been a busy couple of months since the last post. While continuing with the proofreading, which has taken much longer than originally estimated, we have developed a full schema (also in RelaxNG) for encoding the data and have hired a consultant to develop some nifty XSL transformations to move the data from our short vendor schema to the full schema. Following this post I will begin uploading the master encoding guide that provides a detailed narrative of each element and examples of the mark-up.
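To give a rough sense of what moving data from a short vendor schema to a full schema involves, here is a small sketch. It is written in Python rather than XSL, so it is not the consultant's transformation, only an illustration of the same idea: rename abbreviated elements and fill in values that are the same for every record. The short tags (`s`, `a`, `r`, `d`), the full names, and the `repository` field are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from abbreviated vendor tags to full-schema names.
TAG_MAP = {"a": "author", "r": "recipient", "d": "date"}

# Hypothetical constant data that would be added to every record.
CONSTANT_FIELDS = {"repository": "Massachusetts Historical Society"}


def expand_record(short_xml):
    """Build a full-schema <slip> element from one abbreviated record."""
    short = ET.fromstring(short_xml)
    full = ET.Element("slip")
    for child in short:
        # Unknown tags pass through unchanged so that no data is dropped.
        name = TAG_MAP.get(child.tag, child.tag)
        ET.SubElement(full, name).text = child.text
    for name, value in CONSTANT_FIELDS.items():
        ET.SubElement(full, name).text = value
    return full


vendor_record = "<s><a>John Adams</a><r>Abigail Adams</r><d>1774-06-30</d></s>"
full_record = expand_record(vendor_record)
print(ET.tostring(full_record, encoding="unicode"))
```

Keeping the tag mapping in a plain dictionary means that additions to the full schema only require a new entry, not new logic.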

Friday, April 24, 2009

Phase 1 Continues

Since late February, we have been downloading our completed XML files from the vendor and preparing them for proofreading. During this phase of work, the focus has been to check the accuracy of the transcriptions against the good old paper slips. After some hemming and hawing, our committee decided that paper-to-paper proofreading was still the best method, so we have run the XML files through an XSL transformation, producing a fresh paper copy that closely matches the original paper files. The major difference, of course, is that our new paper copies can fit about six records to a page. So, for the past six weeks (and continuing until it's done) several of us have been spending the bulk of our time proofreading about 100,000 tiny slips of paper against about 20,000 larger pieces of paper. This stage, while tedious, is an important first step before the full encoding and data improvement.
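As a quick check on those page counts, the arithmetic can be sketched as follows. This assumes exactly six records on every page, which the post gives only as an approximation; the `pages_needed` helper is hypothetical. The 109,348 figure is the slip total reported in the August 2009 post.

```python
def pages_needed(slips, records_per_page):
    """Smallest whole number of pages that holds every slip."""
    # Ceiling division using integers only, so there is no float rounding.
    return -(-slips // records_per_page)


print(pages_needed(100_000, 6))  # 16667
print(pages_needed(109_348, 6))  # 18225
```

Both results fall somewhat under the rough figure of 20,000 pages, which works out to an average closer to five records per page.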

For the project manager, sitting down with hundreds of records every day has been enormously helpful in developing the full schema. After much discussion, investigation, and trial and error, the project committee decided to develop a home-grown schema. A full discussion of the evolution of the schema and its latest iteration will follow in the next post--stay tuned!