In our ongoing series of questions asked by History students at UAA:
FAQs: what about digital materials?
In the modern social media era, a lot of things are exclusively digital. How do you foresee the preservation of such content in an official archival setting?
This is a great question because this is a challenge that archivists deal with all the time! By the way, we tend to refer to things created in digital format as “born digital,” as opposed to things that are digitized. The difference between the two can be substantial. Born digital files may have significant metadata embedded in the file properties, or connections to other digital materials (like social media interactions), and that important information can be complicated to capture, preserve, and make discoverable and accessible.
We are already getting an increasing amount of material in digital form, whether born digital or digitized, which presents some unique challenges. Digital files are usually less stable than physical records, and unlike with a piece of paper or a photograph, you can’t tell just by looking at a hard drive or floppy disk whether it has degraded or might be infected with a virus (mold on paper is usually easy to spot!). Another challenge is that best practice is to keep copies of digital material in multiple places, which usually includes cloud storage. If you have ever subscribed to a cloud storage service, you know they’re expensive, and archives generally have many more terabytes of data than individuals do.
Those in charge of funding archives often don’t understand those costs as well as they understand the costs of a physical storage space. Also, a building is a one-time cost (mostly!), while digital storage is an annual cost that keeps growing and requires a pretty significant amount of personnel time, too. We don’t have to peek in every box of papers quarterly to make sure nothing is coming apart, but it’s important to run checks on the digital materials multiple times a year to see if there’s been any degradation, so we can replace damaged files from one of the other copies as soon as possible. And while that work is largely automated and we don’t have to watch every minute of it, it’s still a process that currently takes several days to run, and we have to check in on it several times along the way.
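For readers curious what a check like that can look like, here is a minimal sketch in Python. It assumes the common checksum approach (often called a fixity check): record a fingerprint of every file once, then recompute it later and compare. This is an illustration only, not the software we actually run, and the function names (`sha256_of`, `build_manifest`, `find_damaged`) are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB pieces so very large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    damaged = []
    for relative_path, expected in manifest.items():
        candidate = root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            damaged.append(relative_path)
    return damaged
```

In this sketch, you would call `build_manifest` once when files are first stored and keep the result somewhere safe, then run `find_damaged` against each copy on a schedule. Anything it reports is a candidate for replacement from one of the other copies. Reading every byte of every file is also why a full pass over many terabytes can take days.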
It’s also a challenge because people generally don’t think of digital materials as being expensive to store, and they don’t prioritize deleting things that may not be important to keep. Also, many people don’t label or file their digital materials very well, which adds to the long-term preservation challenges. For example, take a look at the photos on your cell phone right now. How many of those have the people or places in them identified in some way? Do you go through them regularly and delete the ones that you (or somebody else) might not want to keep? If those were to be retained in an archives, how would somebody find out what was in them, or whether you had photos of a specific person or place? This happens in hard copy too, of course, but the ease of taking photos or sending texts or emails means that there’s a lot more content to be described or possibly to be deleted as non-permanent.
Specifically regarding social media and archiving of web content, some archives are already doing this. There is specialized software out there to capture content on the web. We have not ventured into web archiving for a couple of reasons. First, it would take time to learn the software and implement a system for capturing online content. Second, for things like social media, there are rights and privacy issues involved when you have a whole bunch of different people posting things, possibly without knowing that their posts are being collected and put in an archive. If someone came to us with a thumb drive containing all of their own tweets or blog posts that they had downloaded, it would be different. Likewise, if the university asked us to archive its old webpages (and provided funding for the necessary storage, software, training, and personnel), we would be more likely to do that than to harvest social media posts without the knowledge or permission of the people posting.
Again, thanks to the students for their great questions. We’ll keep working our way through them, and if you have any questions about what we do and how we do it, let us know and maybe you’ll see your question spotlighted here!