Digital Archives

Internet Archive

One of the most fascinating things about the Internet when I first started using it twenty years ago was the sense of having the world at your fingertips. I remember downloading Mosaic for the first time and discovering this thing they called a search engine, in this case an early version of Yahoo. It was organized by categories, so you could search and drill down from topic to topic.

Eventually Google supplanted Yahoo, and in 2004 it conceived the project of digitizing every known book. I’ve used Google Books to track down quotes to their sources (and sometimes discovered that there was no source for the quote attributed to a particular author). I’ve downloaded 19th-century sermon collections available for free. I understand that in recent years, Google has slowed down these efforts. Some would contend this reflects a shift in mission toward using search data in marketing, but it also reflects the fact that they’ve digitized over 20 million books! They’ve also been hampered by some lawsuits along the way.

Another outfit that has been digitally archiving books and an incredible array of other materials is the Internet Archive. The Internet Archive is a non-profit effort launched in 1996 in San Francisco whose collections include text, audio, moving pictures, software, and, significantly, archived web pages. I discovered, for example, that you can look at a collection of archived campaign webpages from 1996. One of the challenges of the internet is its ephemerality. Have you ever come across a weblink that no longer works or a page that no longer exists? The Internet Archive may be the place that still has a record of it. From their homepage you can enter an old URL into their Wayback Machine to see if it is in their archives.

One of the other standout features of the Internet Archive for the computer geek is its old software, from MS-DOS games to VisiCalc for the Apple II. They even have an emulator that allows you to play the games in your browser. Yes, you can play Oregon Trail again!

One writer described the Internet Archive as “a chaotic, beautiful mess.” Indeed, among the places you can go from their home page are a free audiobook collection, a Grateful Dead collection, the Biodiversity Heritage Library, the Iraq War Collection, the Portuguese Web Archive, and that collection of MS-DOS software!

The question, of course, is whether you can find what you are looking for. My sense is that Google’s search algorithms are better for getting you in the neighborhood of what you are looking for. But the Internet Archive is just a fun place to snoop around, and you can do it from your own home.

It occurs to me that one of the big questions around the future of libraries and archives is how both to preserve materials in physical form and to continue preserving digitized materials, including media that only ever existed in digital form. The question is sharpened by the weird paradox that digital materials often degrade far faster than the printed page. It makes me wonder if a journal of my daily doings will last longer than my social media presence on Facebook’s servers. Of course, who is going to want to study either?

At any rate, exploring all this reminded me that librarians and archivists today face very different challenges in preserving not only printed primary source materials but also the digital record of our society. It will be an interesting task to figure out what is important enough to save and what is just ephemera!

4 thoughts on “Digital Archives”

  1. Archiving digital materials is definitely a huge challenge. I would argue that what we think of as ephemera could be just as valuable to future historians as what we think of as “important” material. Also, I have read many printed primary sources on both Google Books and the Internet Archive for my dissertation. My dissertation research would have been much, much more challenging were it not for the amazing resources that they offer for free.

  2. Thanks for this column, Bob. War and the destruction of historical artifacts are additional reasons to archive books digitally. Do you know of the Hill Museum and Manuscript Library at St John’s University in Collegeville, Minnesota? Here is a link to their website where you can watch a video clip about their involvement in digital archiving of books. http://www.hmml.org/ St John’s is also the home of the St John’s Bible. http://www.saintjohnsbible.org/
