Oh no, today is the last day of the grant! In all likelihood this is the last blog post. Whatever will we do? I suppose I should carry on until the clock strikes 4:45.
In the Advanced Search, users have greater ability to find and narrow down results in the Online Adams Catalog.
The biggest feature in this search option, I think, is the keyword search, which covers nearly every aspect of every catalog entry. As Susan pointed out in May, you can search fun words like "Manure" (6 results) but also more serious words like treaty (922 results) or independence (206 results) or peace (717 results) or negotiate (234 results). Searching via Keywords will return results from the text of the author field, the recipient field, the title field, and the notes field.
The Holding Institution search will let you know where original documents are held, whether at the MHS or in other repositories. A pink slip means the document is held physically by the MHS, as does a white slip; a white slip means it is a letterbook copy. A yellow slip indicates that it is held outside of the MHS, either by a research library such as the Library of Congress, the British Library, or some other place, or privately.
The Special Categories search option will search for things coded as poems, essays, speeches, etc. This is newly renamed from its former title of Document Type. This is primarily non-correspondence, grouped by genre or subject.
For both Holding Institution and Special Categories, users are restricted to a dropdown of controlled terms/identifiers. You can use them in conjunction with any of the other search options, so if you want to search for instances where a title or first line of a JQA poem contains the word Moon, type Moon in the keywords, and then select Poems, Hymns, Prayers, &c. - JQA from Special Categories. Click search. Voilà: two results. Or, if you wanted to see if JQA ever wrote a poem titled "Life in the Big City" or "Another Day Another Quarter" you could too, but you will be disappointed because he didn't.
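To make the combination of a keyword with a Special Category concrete, here is a minimal sketch in Python. It is not the catalog's actual code: the `Record` class, the `search` function, and the sample records (including their ID numbers) are all invented for illustration. It assumes a keyword matches when it appears, ignoring case, in any of the four fields named above, and that a category must match a controlled term exactly.

```python
from dataclasses import dataclass

# Hypothetical record shape; the real catalog's data model is not shown in this blog.
@dataclass
class Record:
    record_id: str
    author: str
    recipient: str
    title: str
    notes: str
    category: str = ""

# The four fields the post says a keyword search covers.
KEYWORD_FIELDS = ("author", "recipient", "title", "notes")

def search(records, keyword=None, category=None):
    """Return records matching the keyword (in any keyword field) AND the category."""
    hits = []
    for rec in records:
        if keyword is not None:
            needle = keyword.lower()
            if not any(needle in getattr(rec, f).lower() for f in KEYWORD_FIELDS):
                continue
        if category is not None and rec.category != category:
            continue
        hits.append(rec)
    return hits

# Invented sample data, for illustration only.
JQA_POEMS = "Poems, Hymns, Prayers, &c. - JQA"
sample = [
    Record("990001", "JQA", "", "Lines to the Moon", "", JQA_POEMS),
    Record("990002", "JQA", "", "An untitled ode", "", JQA_POEMS),
    Record("990003", "JA", "TJ", "Letter", "mentions the moon", ""),
]

for rec in search(sample, keyword="Moon", category=JQA_POEMS):
    print(rec.record_id, rec.title)
```

Leaving `category` out returns both records that mention the moon; adding it narrows the list to the one poem, which is the same narrowing the dropdown gives you.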
Another option in the Advanced Search is "View a Record by Its ID Number". Each record has a six-digit ID. The first number possible is 010001. The last number right now is into the 43----'s. (There were 42 "reels" of microfilm scanned, and the 43rd reel is for new slips, but they will fit in chronologically in any given search.) Please note that in each case the numbers do not run up to --9999 before starting over. So there is no 097596, for example. Only in rare cases do the numbers go higher than the --3000's. The numbers associate back to the original slips, and to how they were scanned from the microfilm that was made. We're very fond of them. Searching by record number may be most useful if you've done a previous search and noted down the number. Of course you can always have fun with it and type in a random number and see what you get.
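For the curious, the shape of a record number can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that the first two digits are the reel (01 through 43 at the time of writing) and the last four are the slip's sequence on that reel, starting at 0001. The function name `parse_record_id` is made up. Note that a well-formed number is not necessarily a real record; 097596 passes this check even though no such slip exists.

```python
import re

# Six ASCII digits: two for the reel, four for the sequence on that reel (assumed layout).
RECORD_ID = re.compile(r"([0-9]{2})([0-9]{4})")
MAX_REEL = 43  # "right now", per the post; reel 43 holds new slips

def parse_record_id(text):
    """Split a record number into (reel, sequence), or raise ValueError."""
    match = RECORD_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a six-digit record number: {text!r}")
    reel, sequence = int(match.group(1)), int(match.group(2))
    if not 1 <= reel <= MAX_REEL:
        raise ValueError(f"reel {reel} is outside 1-{MAX_REEL}")
    if sequence < 1:
        raise ValueError("sequence numbers are assumed to start at 0001")
    return reel, sequence

print(parse_record_id("012746"))  # the 26 May 1777 letter mentioned in the Basic Search post
```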
Access to the slip file will be from the Adams home page on the MHS website.
Thanks to Mary and Sara and Susan and Brenda and Bill and Jim for making this project so much fun. We hope you enjoy the access to this data and that it enriches your archival experience and research!
Thursday, June 30, 2011
Tuesday, June 28, 2011
Online Adams Catalog: The Public Interface
The public interface created for the Adams Digital Control File project is called the Online Adams Catalog (OAC). It features three ways of getting at the data of the catalog: a Basic Search, an Advanced Search, and the Record Number Search. I'll try to discuss each in a clear way! This blog post will look at the Basic Search.
Using the catalog is pretty easy if you are familiar with using catalogs. We (will) provide some Popular Searches, which will do some of the work for you. We will feature a "Key to Initials" which will help identify Adams family members and the ways in which their names appear throughout the catalog. There will be an "About the catalog" page, too, which should help to define what it includes, what it does and does not do, etc.
In the Basic Search, users will have the option of searching by name (author, recipient, and either), by date, and by documents visible online. However, please note that as the interface is still being developed, some of this information may change. Apologies in advance if this is the case.
Names
For name searching, users will select whether they want to search by author, recipient, or either/or. Once you click the desired option, you will have the option of selecting from a "Quick List" of major players or of typing in a search box. Following library conventions, type in the last name, first name style. As you begin to type a name, a list of possibilities will display from which you will make your choice. What I mean is if you type in "Jeffer" and pause, you will see in the list all the names beginning with "Jeffer." As we are probably looking for "Jefferson, Thomas," we find easily that he is the third returned name.
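The type-and-pause behavior described above amounts to a prefix match against a list of "Last, First" names. Here is a rough Python sketch of the idea; it is not how the OAC actually does it, the `suggest` function is hypothetical, and every name in the sample list other than "Jefferson, Thomas" is a placeholder invented for the example.

```python
def suggest(names, typed, limit=10):
    """Return up to `limit` names that start with what the user has typed so far."""
    prefix = typed.strip().casefold()
    if not prefix:
        return []  # nothing typed yet, nothing to suggest
    matches = sorted(n for n in names if n.casefold().startswith(prefix))
    return matches[:limit]

# Placeholder names, except for Jefferson himself.
names = [
    "Adams, John",
    "Jefferson, Thomas",
    "Jeffers, Placeholder",
    "Jefferay, Example",
]

for name in suggest(names, "Jeffer"):
    print(name)
```

The comparison ignores case, so "jeffer" and "Jeffer" produce the same list.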
Dates
Searching by dates is perhaps a little easier to describe? You'll see two areas to search: a "from" and a "to". You don't have to search by dates, but it is useful if you want to really narrow down your results. There are some tricky aspects to date searching though!
In the catalog, many documents were written over the course of time. Examples would be a diary or letterbooks, which can contain documents that can cover years. But letters, themselves, were sometimes written over many days, if not weeks or longer. We call these span dates. In the date search area, you can eliminate results that span beyond a specific date range. An example might best illustrate this aspect:
I searched for instances where John Quincy Adams was either an author or recipient of a document in the Adams Papers within the date range of January 1775 to December 1815. This returned 11,266 results. By selecting the option to eliminate span dates, the results were lowered to 11,233. Not the best example, not the most reasonable or intelligent search criteria, but it does show that it removes some documents. A very brief look at the results found that by restricting the query to remove span dates, an essay composed between 11 February 1778 and March 1824 was removed.
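One way to picture the span-date option is as two different tests on a document's start and end dates. The Python sketch below is a guess at the logic, not the catalog's real query: it assumes that without the option a document matches if its dates overlap the search range at all, and that with the option the whole span must fall inside the range. The function name and the choice of 31 March 1824 as the essay's end date (the slip only says "March 1824") are assumptions.

```python
from datetime import date

def in_range(start, end, range_from, range_to, exclude_spans=False):
    """Decide whether a document dated start..end matches a from/to search."""
    end = end or start  # a single-date document has no separate end date
    if exclude_spans:
        # The whole span must sit inside the search range.
        return range_from <= start and end <= range_to
    # Otherwise any overlap with the search range counts.
    return start <= range_to and end >= range_from

search_from, search_to = date(1775, 1, 1), date(1815, 12, 31)

# The essay from the example: begun 11 February 1778, finished March 1824.
essay = (date(1778, 2, 11), date(1824, 3, 31))
print(in_range(*essay, search_from, search_to))
print(in_range(*essay, search_from, search_to, exclude_spans=True))
```

Under these assumptions the essay is returned by the plain search and dropped once span dates are eliminated, which matches what the test search showed.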
Documents Online
Selecting the third option in the Basic Search will return results for which an online version of the actual document is available. It should be stated now, VERY LOUD AND CLEAR, that not all the documents for which there is a catalog record are available online. Most of the catalog is not available online, in fact. But when something is online there are options to view it. The kinds of things you can expect to see online are transcriptions of printed volumes, scanned images of printed volumes, scanned images of manuscripts or microfilm, and transcriptions of scanned images of manuscripts and microfilm. I think that covers everything!
This search limiter is one way to go about finding them! For a test, I searched for documents written by John Adams where Thomas Jefferson was the recipient. I selected the option to only return results that were online. I got two results. (By removing the online option, I see that there are actually 379 documents.) The first is a letter dated 26 May 1777 (record number 012746). On the slip, the right-hand area will give you information about where the document can be found (in a printed source, in an online display, etc.) and then there is an option to view it.
The catalog will go live in early July. The next post will discuss the Advanced Search options, which as the name suggests include everything you can do in a Basic Search...and more!
Tuesday, June 21, 2011
Names Clean-up in the People Database
The most recent task assigned to me as we prepare to launch the Adams Papers digital catalog was to review instances where names we assigned as attributes (e.g. adams-john1735 for John Adams, a.k.a. JA) appeared in the XML code but not in the People database. This is part one of two, the second part being instances where people appear in the database but do not appear in the attributes in the XML.
This is another one of those combinations of human and computer errors, some of which were unavoidable (computer) and some of which happened because of how mundane some of the work was (human).
The attributes that needed seeing to were 1200 strong, and in printed form stretched to 110 pages. (Before you think we're total idiots, this represents approximately 5-6% of the database, which means we got 94-95% right. On a ten point grading scale that is a solid A! Sheepishly we realize mistakes happen, but anticipating negative criticism about the work we did, we wanted to try to spin it positively.)
We did two queries to produce the data. The first query produced the faulty attributes and the slip ID number in which the attribute was contained. For the results that came about in the second run, we included within double-quotes the name associated with the faulty attribute. It gave a little insight into what we could expect before searching the database and editing the code.
Some typical instances looked like this:
a. [extract]-monroe] 160017 - "to Sec. of State [James Monroe] [extract]" | 160043 "to Sec. of State [James Monroe] [extract]"
b. bourne-sylvanus? 072781 - ""to Mr. Sylvanus? Bourne"
c. henry-laurens 031911 - "to Henry Laurens"
d. nicolay-albery-h 341282 - "to Albert H Nicolay"
e. palfrey-john 340735 - "John G. Palfrey" | 341083 - "John G. Palfrey"
A number of consistent error patterns emerged quickly when cleaning up the data. In some instances the attribute was valid but had not been entered into the database. While frustrating, this was an easy fix and was most likely the product of human error. A second cause was a typographical error in the XML attribute that was not present in the attribute for that person in the People database. And the reverse, where there was a typo in the People database but the XML attribute was actually correct. Typographical errors include transpositions and mistyped letters (examples c and d). A third possibility was that in the transformation process which took place in Level 1, the attribute was inaccurately reviewed (examples a and b above). And from the above it should be evident that punctuation and other marks like [ ) ? . were not allowed in the attribute. Another kind of thing we saw was where attributes in the XML were not as complete as in the database, and the opposite (example e).
The fixes for the above examples should be relatively easy to make yourselves:
a. monroe-james (remove [], add james)
b. bourne-sylvanus (remove ?)
c. laurens-henry (flip first and last names)
d. nicolay-albert-h (typo in albert corrected)
e. palfrey-john-gorham (added middle name)
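A mechanical check can catch some of these problems before a human ever looks at the slip. The Python sketch below is an illustration, not the project's tooling. The pattern is inferred purely from the examples in this post (lowercase name parts joined by hyphens, with optional trailing digits as in adams-john1735 or cunningham-william0), so the real attribute rules may well differ. The function names are made up.

```python
import re

# Assumed shape: lowercase parts joined by hyphens, optional digits at the very end.
ATTRIBUTE = re.compile(r"[a-z]+(?:-[a-z]+)*[0-9]*")

def is_well_formed(attribute):
    """True if the attribute matches the assumed pattern."""
    return ATTRIBUTE.fullmatch(attribute) is not None

def illegal_characters(attribute):
    """List the distinct characters that the assumed pattern never allows."""
    return sorted(set(re.findall(r"[^a-z0-9-]", attribute)))

for attribute in ("[extract]-monroe]", "bourne-sylvanus?", "henry-laurens",
                  "nicolay-albery-h", "palfrey-john"):
    print(attribute, is_well_formed(attribute), illegal_characters(attribute))
```

Only examples a and b fail this kind of check. Examples c, d, and e are perfectly well-formed and simply wrong, which is why each one still had to be compared against the slip and the People database by hand.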
In addition to checking the slip, we also had to check the database for each name (unless it was a prominent figure). The back-end view for the Adams Papers editors (and us) allows for tabbed browsing of the digital control file. I have not really seen the public interface but imagine this might be a similar feature.
We did not keep count of the number of names added but there were quite a few. In the process we were able to clean up a number of bad attributes that were in the database and in some cases merge or separate people based on a close inspection of names, dates, etc. For example, in correcting some of the slips for William Cunningham Jr., I discovered that his attribute should be "cunningham-william0" whereas the majority of them were just "cunningham-william", which is the attribute for his father, William Sr. These have all been fixed now so that the users of the digital control file will get the respect they expect when searching for these people.
We will re-run the query in a few days to ensure that every instance was seen to accurately. Let's pretend this was the case, for if any were missed, I won't tell you about it!
The other side to this clean-up, whereby we'll run a query to determine in the database where attributes exist that do not appear in the XML, is more straightforward...those attributes will be deleted from the database.
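Both halves of the clean-up boil down to comparing two sets of attributes. As a toy illustration in Python (the real queries ran against the XML and the People database, and the function and variable names here are invented), it might look like this:

```python
def compare_attributes(xml_attributes, database_attributes):
    """Return (in the XML but not the database, in the database but not the XML)."""
    xml_set, db_set = set(xml_attributes), set(database_attributes)
    missing_from_database = sorted(xml_set - db_set)
    unused_in_xml = sorted(db_set - xml_set)
    return missing_from_database, unused_in_xml

# A tiny sample built from the faulty attributes listed earlier in this post.
xml_attributes = ["adams-john1735", "henry-laurens", "bourne-sylvanus?"]
database_attributes = ["adams-john1735", "laurens-henry", "bourne-sylvanus"]

missing, unused = compare_attributes(xml_attributes, database_attributes)
print("Part one, fix in the XML or add to the database:", missing)
print("Part two, candidates for deletion:", unused)
```

In this sample, laurens-henry and bourne-sylvanus show up as unused only because the XML still carries the faulty versions. That is a good reason to run part two after the part one fixes are in, so that only attributes that are truly unused get deleted.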
Also going on has been Beta testing of the public interface. Susan and I were invited to sit in on a meeting yesterday about how that testing went, what some of the feedback was, etc. So I'll post a bit about the public interface next time. The project comes to a close on 30 June 2011 so blogging might slow down - if it doesn't stop outright - after that date. I don't want to turn this into a Brokeback Mountain "I can't quit you" kind of moment, but the reality is that once the funds stop, so does the blog!
Labels:
About the project,
Database,
Encoding Level 1,
Names
Tuesday, June 14, 2011
Almost there...
The end of the Adams slip file conversion project is fast approaching, and the digital catalog will be launched shortly. For some of us here at the Massachusetts Historical Society, it's been all Adams all the time.
The MHS, of course, already offers a variety of online resources related to the Adams family, from digitized letters with transcriptions to digital editions of print volumes. The Adams Electronic Archive contains images and transcriptions of over 1,000 letters between John and Abigail Adams, as well as John's diaries and autobiography. The voluminous diaries of John Quincy Adams have also been digitized, and his line-a-day diaries are the subject of our very popular JQA Twitter project. The Adams Papers Digital Editions reproduce 32 of the print volumes published by the Adams Papers Editorial Project, complete with footnotes and intertextual links.
Navigating through all of these pages can be confusing, and we're hoping the digital catalog will help to mitigate that problem and function as a kind of "clearing house" for Adams researchers. Because the catalog contains item-level descriptive data for every known Adams family manuscript, it's the perfect vehicle for linking to individual digitized documents and online transcriptions, wherever they currently "live." Public researchers, as well as Adams editors, will be able to retrieve a specific record, click on a link, and--voila!--read the document itself.
So, to that end, I spent the last several days adding links to hundreds of individual records using web forms in the dynamic interface. Still to be tackled are links to all the diaries of John and John Quincy Adams. Of course, only a fraction of the items described in the catalog are available in digital format, so it will be important to make that clear. Web developer Bill Beck will design the public interface so that users can limit their search to only those items available online.
Monday, June 6, 2011
Saltonstall as "Adams Lite": Tastes Great, Less Filling
Mr. [John Quincy] Adams is now the rising sun, and of course finds many idolaters. You can hardly conceive the strange appearance he makes--so cold--so unbending rigid muscles amidst such smiles, such good humor, gaiety--and among his own guests too. It seems a miracle that he has ever been chosen President of the U. States. Is it an invincible proof of his eminent merit, or the result of a singular concurrence of fortunate circumstances? Mrs. A. is the very antipode--if you will allow the term to be applied to a lady....And I fear that she has a Courtier's hart [sic]--or like him is heartless.
(Letter from Leverett Saltonstall to his wife Mary, 24 Feb. 1825)
This little teaser is just my way of introducing the Saltonstall family papers slip file project, another grant-funded paper-to-digital conversion modeled on the Adams family catalog. We here at the Massachusetts Historical Society have come to call the Saltonstall slip file "Adams Lite" because it is both much smaller (only 3,000 slips) and much more straightforward than its unwieldy counterpart. Both catalogs consist of a series of paper slips describing individual documents: author, recipient, date, place, length, etc. But unlike Adams, this slip file describes papers in a single collection at the MHS: the Saltonstall family papers, Ms. N-2232. The entire data set fits in a single xml file, and all of the information has been entered, controlled, and verified by one person. And lastly, because publication of the Saltonstall papers was completed years ago, this database requires only one static interface--basically a searchable item-level collection guide. The Adams Papers Editorial Project, on the other hand, is ongoing, so that database requires two interfaces: one static, for public use; the other dynamic, to be edited by Adams Papers staff.
The Saltonstall database fulfills a requirement of the original Adams slip file grant, awarded in the fall of 2008, which specified that that project would serve as a prototype for similar projects. The Saltonstall conversion has been in the works since the beginning of 2010, and many MHS staff members have contributed to the project. Laura Lowell processed the Saltonstall family collection, and our digital team of Nancy Heywood, Laura Wulf, and Peter Steinberg have digitized, transcribed, and marked up many individual items for presentation on the web. I was responsible for building the database, using the Adams slip file as a model.
Mary Claffey's work on the Adams slip file laid most of the groundwork for me. Rather than reinvent the wheel, I literally copied and pasted her schema and revised her tags to suit the needs of the Saltonstall family papers, scaling it down by deleting unnecessary elements and adding or repurposing others. Our web developer, Bill Beck, is designing an attractive and user-friendly interface, also modeled on the Adams slip file, and Laura Lowell's collection guide to the Saltonstall family papers will link to the database. It was a lot of fun to work with so many other members of the department; everyone brought their own strengths to the project.
The Saltonstall family, like the Adams family, is chock full of prominent and interesting people, spanning several generations. Leverett, Sr. (quoted above) was a member of the Massachusetts House of Representatives, the Massachusetts Senate, and the U.S. House of Representatives, and he served as the first mayor of Salem, Mass. His wife, Mary Elizabeth (Sanders) Saltonstall, was the daughter of Thomas Sanders, a well-known Salem merchant. Leverett's great-grandson, also named Leverett, was the governor of Massachusetts during World War II and a U.S. senator for over 20 years.
Keep an eye on the MHS website for further information about both projects.