Despite the lack of posting, we have all been plugging away at Encoding Level 1 and have just completed the initial phase of mark-up on all 110,000 records. As it stands now, we have 42 XML files that have been input and proofread to match the massive paper file down the hall. In addition, we have control (through attributes) over all dates, codes, locations, length, and format for each record. The big pieces that remain for Encoding Level 2 are the control of names (for both authors and recipients), published references, and notes.
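To give a rough sense of what attribute-level control makes possible, here is a minimal sketch in Python. The element and attribute names (`record`, `date`, `code`, `location`, `length`, `format`) and all of the values are placeholders invented for illustration; they are not our actual schema or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names, attribute names, and values are
# placeholders for illustration, not the project's real encoding.
SAMPLE = """<records>
  <record date="1900-01-01" code="X1" location="Place X" length="2" format="letter">
    <author>Author A</author>
    <recipient>Recipient B</recipient>
  </record>
  <record date="1900-02-01" code="X2" location="Place Y" length="3" format="letter">
    <author>Recipient B</author>
    <recipient>Author A</recipient>
  </record>
</records>"""


def records_by_location(xml_text, location):
    """Return the attribute dicts of all records written at `location`."""
    root = ET.fromstring(xml_text)
    return [
        dict(rec.attrib)
        for rec in root.iter("record")
        if rec.get("location") == location
    ]


matches = records_by_location(SAMPLE, "Place X")
for attrs in matches:
    print(attrs["date"], attrs["code"], attrs["format"])
```

Because the controlled values live in attributes rather than in free text, a filter like this never has to parse the prose of a record to find its date or location.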
As we've gone through the encoding, we have also been developing supplemental databases that will enhance searchability in the final interface. These currently include locations (where letters were written) and accessioned documents (repositories other than the MHS that hold the original manuscripts, e.g. the Library of Congress). We are also building a supplemental database of all persons and short titles (published versions of documents).
Much of the work in the coming months will focus on Encoding Level 2 (with an emphasis on automating as much data entry as possible through XSLTs) and the building of the database infrastructure in eXist. As we iron out the kinks in building and managing these databases, I will post what we learn and produce. Stay tuned!
Tuesday, April 20, 2010