EadConversion User Guide, Version 1.7.0

Introduction

The Encoded Archival Description (EAD) is a standard for encoding archival finding aids in XML. The EadConversion application is a proof-of-concept tool for converting a document describing the container list of a collection into an EAD XML document. It handles the necessary relationship between folders and boxes, and computes all of the containers used by each logical level of the collection.

It accepts four input formats: two table-like text forms typical of container list descriptions that are cut and pasted out of a document, an intermediate spreadsheet form, and an EAD XML file. The last is useful for recomputing a container list.

Operation

EadConversion is a Java Web Start application you can run by clicking here. If you select "example.txt" from the Help menu, this small example will be converted and displayed in tabs, one for each stage of the conversion process. Any of them can be saved to your local disk. The File menu has an Open item to let you convert your own documents.

Java Web Start provides an environment for applications which allows them to access your files and resources only with your permission. Opening a file for conversion will bring up a dialog box requesting your permission, and the same thing happens on saving a file.

If clicking on the link above does not download and execute the application, you will need to install Sun's Java plugin on your computer. It is available for most systems (except for Macs prior to Mac OS X) from http://www.java.com.

Textual Input

This program reads two styles of textual input, each describing both the logical structure of a collection and the distribution of the collection in various containers, and generates an EAD XML document. Here is a short example of each format. The second format requires that the first line be the column headings as shown.

[This format has items in the collection using the columns

        {Folder  Dates   Title} 
    

with occasional Series and Box statements denoting the beginning of a Series or Subseries. The Dates and Title columns can be continued on the next line and the program is pretty good at reconstructing them. This format is useful if the original document was created without tables or styles.]

Series I My collection
Box 1 My Subseries
FF1   1940-1947  Title of One item
FF2   undated    Title of another item
Box 2 Another Subseries
FF1   1948       Title of the third item
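
The following is a minimal sketch of how lines in this first format could be recognized. It is not the program's actual parser: the regular expressions, the function name parse_folder_list, and the decision to treat each Box statement as the start of a subseries are all assumptions made for illustration, and continuation lines are not handled.

```python
import re

# Hypothetical line patterns (assumptions, not the program's real rules).
SERIES_RE = re.compile(r"^Series\s+(\S+)\s+(.*)$")
BOX_RE = re.compile(r"^Box\s+(\S+)\s+(.*)$")
ITEM_RE = re.compile(r"^(\S+)\s+(undated|\d{4}(?:-\d{4})?)\s+(.*)$")


def parse_folder_list(text):
    """Return one dict per non-empty line of a Folder/Dates/Title list."""
    rows = []
    box = None  # most recent Box statement seen
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = SERIES_RE.match(line)
        if m:
            rows.append({"kind": "series", "id": m.group(1), "title": m.group(2)})
            continue
        m = BOX_RE.match(line)
        if m:
            box = m.group(1)
            rows.append({"kind": "subseries", "box": box, "title": m.group(2)})
            continue
        m = ITEM_RE.match(line)
        if m:
            rows.append({"kind": "file", "box": box, "folder": m.group(1),
                         "dates": m.group(2), "title": m.group(3)})
        else:
            # Unrecognized lines are kept so they can be reviewed later.
            rows.append({"kind": "error", "line": line})
    return rows


SAMPLE = """\
Series I My collection
Box 1 My Subseries
FF1   1940-1947  Title of One item
FF2   undated    Title of another item
Box 2 Another Subseries
FF1   1948       Title of the third item
"""

rows = parse_folder_list(SAMPLE)
for row in rows:
    print(row)
```

Keeping unrecognized lines as error rows, rather than discarding them, mirrors the way the spreadsheet form flags input problems for later correction.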
    

[This format uses indentation of the title to denote logical structure, with columns

        {Title   Box     Folder}
    

It is usually made from a Word table with three columns. The third item would have been one row in the table with the title continued onto a second line.]

Title	Box	Folder
Series I My collection
A logical level. box/folder columns empty	
    Title of One item, 1940-1947           1    1
    An additional logical level		
        Title of another item, undated          2
Logical level 1 again		
    Title of the third item, 1948
        continued on another line          2    1
    Title of a fourth item, 1987-1991           2
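
In the example above, some rows give only a folder number. One plausible reading, and it is only an assumption here, is that such a row belongs to the most recently named box. The sketch below applies that reading; the function name resolve_containers and the (box, folder) tuple layout are hypothetical.

```python
def resolve_containers(rows):
    """Fill in missing box values from the most recent box.

    rows is a list of (box, folder) pairs, with None for an empty cell.
    Returns one string per row ("box:folder", "box", or "folder"),
    or None for a row that names no container at all.
    """
    current_box = None
    resolved = []
    for box, folder in rows:
        if box is not None:
            current_box = box
        if folder is not None:
            # Assumption: a folder with no box inherits the last box seen.
            parts = [p for p in (current_box, folder) if p is not None]
            resolved.append(":".join(parts))
        elif box is not None:
            resolved.append(box)
        else:
            resolved.append(None)  # a logical level with no container
    return resolved


# Box and Folder cells of the eight data rows in the example above.
example_rows = [
    (None, None),  # Series I My collection
    (None, None),  # A logical level
    ("1", "1"),    # Title of One item
    (None, None),  # An additional logical level
    (None, "2"),   # Title of another item
    (None, None),  # Logical level 1 again
    ("2", "1"),    # Title of the third item
    (None, "2"),   # Title of a fourth item
]
resolved = resolve_containers(example_rows)
print(resolved)
```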
    

It is important to understand that no XML markup is allowed in either of these input forms, since they are normally made by cutting and pasting from a textual document. Characters like < and & are mapped into XML entities.
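
The mapping of special characters can be illustrated with Python's standard library. This is only a sketch of the idea, not the application's own code, and the helper name to_xml_text is hypothetical.

```python
from xml.sax.saxutils import escape


def to_xml_text(raw):
    """Escape &, < and > so the text is safe as XML character data."""
    return escape(raw)


escaped = to_xml_text("Smith & Sons <draft>")
print(escaped)  # Smith &amp; Sons &lt;draft&gt;
```

Because every < is escaped, any tag typed into a textual input file would appear literally in the output rather than being treated as markup.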

This cut-and-paste operation, unfortunately, loses typographic attributes like italics. These must be added back into either the spreadsheet or the XML document, both of which permit XML markup and hence form a better base for editing than the original text file.

Spreadsheet Format

The spreadsheet format is used as an intermediate form in converting to XML. It is also a good place to make editing changes. It has the following column headings:

    Nesting Code    ContTyp  ContVal  Dates   Title   Descr
    

The Nesting column shows the hierarchy of logical elements. The numeric values can be thought of as how much to indent the component in an outline form. They need not be contiguous, just consistent. When reading a textual input file, problems understanding the input are shown with a * in this column.
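
To see why the Nesting values need only be consistent, consider this sketch, which turns an arbitrary sequence of numeric Nesting values into outline depths using a stack. The function name nesting_to_depths is hypothetical, and rows flagged with * are assumed to have been removed beforehand.

```python
def nesting_to_depths(nesting_values):
    """Map consistent but possibly non-contiguous Nesting values to depths."""
    stack = []   # Nesting values of the currently open ancestor components
    depths = []
    for value in nesting_values:
        # Close every open component that is at the same level or deeper.
        while stack and stack[-1] >= value:
            stack.pop()
        depths.append(len(stack))
        stack.append(value)
    return depths


depths = nesting_to_depths([1, 3, 7, 7, 3, 7, 1])
print(depths)  # [0, 1, 2, 2, 1, 2, 0]
```

Only the relative order of the values matters here, so 1, 3, 7 describes the same outline as 1, 2, 3.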

The Code column gives the level attribute of the component, for example series or subseries. If empty, it defaults to file. If the Nesting column describes an error, this column gives a cryptic code describing the attempt to understand the input line and the remainder of this row contains that line. Some of these errors, like redundant Series statements that occur at page breaks, can be ignored. Others need to be corrected. The correction can be done in the initial input or the spreadsheet can be saved and edited, then used as input.

The ContTyp column describes the hierarchy of container types used in housing this component. Its form is type:type:...:type where type is typically something like box, folder, page, reel, frame, oversize, or carton. Examples are box, folder, box:folder. Note that the form box-folder plays much the same role as box:folder, which is the default for the column.

The ContVal column holds the identifiers for the container hierarchy in the form id : id :...: id where there is one identifier for each type in the ContTyp entry. Examples might be 2 for box, ff3 for folder, and 2:ff3 for box:folder.
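
The pairing of ContTyp and ContVal entries can be sketched as follows. The EAD container element and its type attribute are part of the EAD standard, but the exact markup the application emits is not documented here, so treat the output format and the function name containers as assumptions.

```python
from xml.sax.saxutils import escape, quoteattr


def containers(cont_typ, cont_val):
    """Pair each container type with its identifier, one element per pair."""
    # An empty ContTyp falls back to box:folder, the documented default.
    types = [t.strip() for t in (cont_typ or "box:folder").split(":")]
    ids = [v.strip() for v in cont_val.split(":")]
    if len(types) != len(ids):
        raise ValueError("ContTyp %r and ContVal %r differ in length"
                         % (cont_typ, cont_val))
    return ["<container type=%s>%s</container>" % (quoteattr(t), escape(v))
            for t, v in zip(types, ids)]


elements = containers("box:folder", "2:ff3")
for element in elements:
    print(element)
```

Raising an error on a length mismatch is a design choice for the sketch; the application itself may report such rows differently.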

The Dates column gives one or more dates that will populate the <unitdate> element.

The Title column gives the component description. If there are multiple lines for the title, the first one should be in the Title column and the remainder should be in the Descr column. Each continued line should begin with the XML <lb/> element. All of the upper level components at the same level with the same first line of the title will be grouped as one component.

XML Input

If the input file is an XML file, the dsc element which does not have a type attribute of analyticover is used to generate a spreadsheet with the containers resolved to complete names. This spreadsheet is then converted back into an XML document, replacing the original dsc element.

What's the purpose? There are several:

Options

The Options menu has four options.

Changes to options do not take effect until the next time a document is converted (opened).

License and Contacts

You may use EadConversion freely. However, since this is still a proof-of-concept application, it can and will change without notice. If you would like a snapshot of a particular version, please contact me. Any comments, bug reports, observations, or wish list requests are welcome.


For more information, mail:
info@agileimage.com
Agile Image Movers
http://www.agileimage.com
2225 Mariposa Ave
Boulder, CO 80302