‘Tis the season. The Society of American Archivists’ conference has come to DC, and yesterday dozens (dozens!) of us descended on the National Archives and Records Administration for a pre-conference meeting about EAC-CPF. By the way, I left my gold and red metal water bottle in the auditorium, and I would love a heads-up if anyone found it.
Unsurprisingly, the proceedings provided the basis for a lot of Twitter chatter, and one of my favorite digital historians chimed in to ask for context:
Asking why this all matters is really, really smart. And, with respect to the TOTALLY RAD presenters at the workshop on Monday, I think that this step of standing back and explaining why this is important in the first place is the part that might have been missing. So, I’ll do my best to explain why this might be important to different audiences/practitioners, how implementations may change researchers’ experiences, and how I think this fits into the varieties of archival practice that the profession encounters. BTW, this post is a much better run-down of what was discussed than I’ll be providing.
Standard disclaimers apply.
- I am a Johnny-come-lately to EAC. I’ve poked around, read the announcement, and briefly entertained the idea of coding some records. I came to the workshop to listen and talk and think about why this is important and how it might be used.
- I am a pragmatist about this sort of thing, and I believe in using the proper tool for the job, but I also think that we’re all going to want compelling reasons if the adoption of a new standard requires extra time, labor, or thought. I think that those of us who want to change archival practice and want widespread participation in these practices owe the profession a really good elevator speech, a killer manual, and friendly answers when asked “naive” questions.
- I am not a data nerd. Strong opinions aside, I may not be explaining this with the elegance and precision that others may offer.
Wait, Mo. Wait. What exactly are we talking about here? Um, here’s some official language, written in nerd:
Encoded Archival Context – Corporate bodies, Persons, and Families (EAC-CPF) primarily addresses the description of individuals, families and corporate bodies that create, preserve, use and are responsible for and/or associated with records in a variety of ways… [C]urrently [EAC’s] primary purpose is to standardize the encoding of descriptions about agents to enable the sharing, discovery and display of this information in an electronic environment. It supports the linking of information about one agent to other agents to show/discover the relationships amongst record-creating entities, and the linking to descriptions of records and other contextual entities.
Let’s see if I can provide a gloss. When archivists describe records in our collections, we write about the records, but we also understand that the records don’t speak for themselves. We also have to contextualize how they came to us, who might be found in the records, and what the historical circumstances were around the records’ creation. So, bundled together in a finding aid, we have description (marked up in EAD) with a bit of context in the middle there. I’ve heard the argument that, strictly speaking, description is description (what a researcher can find in the archives) and context is context (information that isn’t necessarily discovered within the records, but is about the records’ circumstances), and that we shouldn’t be mixing the two. I’ll come back to this point (preview: I find it weak).
There’s a pretty good tradition among our cousins in libraries and museums for giving special attention to people. The Library of Congress maintains the NACO authority file, which is a big, fat list of people who have created published works (or something – the “why” bit on the NACO website makes me want to stab). Basically, it’s a way for us to all know if we’re going to talk about Samuel Clemens, or if we’re going to talk about Mark Twain. We certainly don’t want to do half and half, and have a researcher only encounter half of the available works when she wants to find everything written by that person. So it makes good sense to keep a list, to decide which form is preferred and also to get a sense of what other names we might encounter, and to know that someone has done a bit of research about when this person was born and when he died. There’s something similar for museums that the Getty maintains: ULAN, the Union List of Artist Names. This is the same idea, and requires a lot of research, because those artist mofos can be cagey.
And a lot of us use NACO (and/or ULAN), but NACO doesn’t have everyone and it’s frankly not worth our time to contribute to the authority file, and we might want to say more about the person than their name and dates.
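For the non-catalogers in the room, here is a toy sketch of what an authority file does for you. The data, the heading, and the function names are all made up for illustration; this is not how NACO or ULAN is actually stored or queried. The point is only that every variant of a name resolves to one preferred form.

```python
# Hypothetical mini "authority file": one preferred heading plus known variants.
# The entry is illustrative, not pulled from any real authority service.
AUTHORITY = {
    "Twain, Mark, 1835-1910": [
        "Mark Twain",
        "Samuel Clemens",
        "Samuel Langhorne Clemens",
    ],
}


def build_index(authority):
    """Map every known form of a name (case-insensitively) to its preferred form."""
    index = {}
    for preferred, variants in authority.items():
        for name in [preferred, *variants]:
            index[name.casefold()] = preferred
    return index


INDEX = build_index(AUTHORITY)


def preferred_form(name):
    """Return the preferred heading for a name, or None if we have no entry."""
    return INDEX.get(name.strip().casefold())
```

Asking for `preferred_form("samuel clemens")` and `preferred_form("Mark Twain")` lands on the same heading, which is the whole trick. A name that isn't in the list comes back as `None`, which is roughly the gap that archivists keep running into.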
So, EAC-CPF is a way for us to take information about people in our records, tell machines that these are indeed people that we’re talking about (as opposed to places or folder titles or whatever else is in a collections guide), and when we have a bunch of these records, get a sense of the larger universe of which people are out there in the archives. With search technologies layered on top, we have the data we need to ask better questions and get better results.
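To make "tell machines that these are indeed people" a bit more concrete, here is a minimal sketch. The record below is a stripped-down fragment that borrows EAC-CPF element names; it leaves out the control section and plenty else, so don't expect it to validate against the schema. The `summarize` helper is my own invention, using only Python's standard library.

```python
import xml.etree.ElementTree as ET

# Namespace used by EAC-CPF records.
NS = {"eac": "urn:isbn:1-931666-33-4"}

# A simplified, illustrative fragment; a real record carries much more.
RECORD = """<eac-cpf xmlns="urn:isbn:1-931666-33-4">
  <cpfDescription>
    <identity>
      <entityType>person</entityType>
      <nameEntry><part>Whitman, Walt, 1819-1892</part></nameEntry>
    </identity>
    <description>
      <existDates>
        <dateRange><fromDate>1819</fromDate><toDate>1892</toDate></dateRange>
      </existDates>
    </description>
  </cpfDescription>
</eac-cpf>"""


def summarize(xml_text):
    """Pull the entity type, name, and life dates out of one record."""
    root = ET.fromstring(xml_text)

    def text(path):
        node = root.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    return {
        "type": text(".//eac:entityType"),
        "name": text(".//eac:nameEntry/eac:part"),
        "born": text(".//eac:fromDate"),
        "died": text(".//eac:toDate"),
    }
```

Because the record says outright that this is a person with a name and dates, a program doesn't have to guess whether "Whitman" is a creator, a town, or a folder title.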
The cool thing about structured data is that it lets us compare apples to apples, oranges to oranges, and see right away when we have ended up with an orange apple. Basically, in the case of EAC, you might see a situation where I have the Walt Whitman papers, you have the Walt Whitman papers, and some podunk archives that no one ever heard of ALSO has a long-lost Walt Whitman letter. The podunk archives didn’t know this was a big deal; in fact, the letter was in a collection that didn’t have much to do with Whitman at all. The other archives didn’t know that Walt Whitman was in this collection or this archives, and it’s really only a researcher who would have thought to make a big deal of this.
The situation I just described happens when all of these EAC records sit in one place and can be searched or browsed in the context of one another. But the situation is trickier if you don’t have one home where all of these records sit, so that you can compare them to each other and sort through them (and I think that this is a huge reason why EAD wasn’t adopted as widely as it might have been). There’s a lot of inside baseball in the archival world about who should be hosting such a home (may I point out that the Europeans and Australians don’t seem to have a problem figuring this out?), and that’s where the imperative for these records to work with each other in a decentralized way comes in.
So a lot of the discussion at the workshop was of really cool projects where EAC records were brought together (btw, EAC records are being made en masse from NACO files and bits and bobs of EADs) to do exactly this — to make it possible to look at the Walt Whitman EAC entry, see all of the institutions that have Walt Whitman records, and compare how they’ve written his biographical notes.
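Here is a rough sketch of what that bringing-together amounts to, assuming each institution's record has already been boiled down to a name, a repository, and a collection title. The repositories and collections below are invented, and real projects have to do far messier name matching than lowercasing and tidying whitespace.

```python
from collections import defaultdict

# Invented sample data standing in for records harvested from several archives.
RECORDS = [
    {"name": "Whitman, Walt, 1819-1892", "repository": "Big University Library",
     "collection": "Walt Whitman papers"},
    {"name": "Whitman, Walt, 1819-1892", "repository": "Another Big Library",
     "collection": "Walt Whitman papers"},
    {"name": "whitman, walt,  1819-1892 ", "repository": "Podunk Historical Society",
     "collection": "Smith family letters"},
    {"name": "Twain, Mark, 1835-1910", "repository": "Big University Library",
     "collection": "Mark Twain papers"},
]


def normalize(name):
    """Crude matching key: lowercase and collapse stray whitespace."""
    return " ".join(name.casefold().split())


def holdings_by_person(records):
    """Group (repository, collection) pairs under each person's normalized name."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[normalize(rec["name"])].append((rec["repository"], rec["collection"]))
    return dict(grouped)
```

Looking up `"whitman, walt, 1819-1892"` in the result turns up all three repositories, including the podunk one whose letter was hiding in a family collection. That only works because the records sit together in one list, which is exactly the "one home" problem.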
Small side note here — no one at the workshop mentioned issues of intellectual property. I predict that amalgamation may reveal a few notable instances of processing archivists “borrowing” copyrighted material. It’s also been suggested that EAC records may be useful for re-purposing — for “dropping” someone else’s EAC record into a new finding aid. I wonder if the community will be willing to give away their intellectual labor.
In any case, we end up with a lot of duplicate legacy data (and situations in the future where it may be perfectly appropriate to add overlapping new data). And here I’d like to go back to the problem of description/context. Let’s remember how history is actually made — we go to our records, learn about people who lived, sift through variously reliable and unreliable accounts, and synthesize this data into history. I know that the biog/hist (contextual) notes that I write when I write my finding aids are influenced by the records that I just processed — they have to be, because if these records didn’t give insight into the people I’m describing, they wouldn’t be worth having. And even if my contextual notes are entirely divorced from these records, they’re based on some other historical trace that was synthesized by someone else, written in a secondary source, or popularly known. In this way, everything is description, and I don’t think that it makes sense to pretend that description and context are pure and separate.
So, back to Shane’s question of why this would matter to a historian. Well, it’s possible that EAC may give us the structure to present history to you differently. After all, for the most part, historians don’t write about records, they write about people. I can imagine that as a historian, I would much rather discover archival sources from a main entry about a person than from a record group. And I would also say that historians can help contribute to this conversation about how we can most transparently represent the people in our collections and the traces they’ve left behind.