058: Solving Cultural Amnesia

In This Episode: We know our ancient cultural history because of stone tablets and paper scrolls. We know more recent history because it was printed in books. But with the Internet, where is our history? There are millions of web sites, but if the owner dies and stops paying the bills for their server, it’s shut down, the domain name expires, and all of its knowledge can instantly be lost forever. Someone is trying to do something about that.



Show Notes

  • To help support Uncommon Sense, see the Patron’s Page, or the form in the sidebar.
  • This is the first of two parts on rescuing human history. The episodes don’t really need to be consumed in order.
  • Kevin Savetz is one of the hosts of ANTIC, the Atari 8-bit Podcast, with hundreds of interviews with pioneers of early personal computing. You can also search or look through the Internet Archive’s Atari archive.
  • For an example of an eclectic collection of items donated to the Internet Archive, consider Ted Nelson’s Junk Mail Cartons — thousands of items that one guy collected and tossed into boxes in “the early days of computing.”
  • A photo of 9-track tape drives is below in the transcript, as is a screenshot of this web site from decades ago.

Transcript

Welcome to Uncommon Sense. I’m Randy Cassingham.

This is the first of a two-part series on another example of Uncommon Sense taken to the extreme to solve the problem of preserving knowledge — and thank goodness for that.

Digital information is much more fragile than books. Digital disks don’t last for hundreds of years like books can: they not only crash, but as technology advances we actually lose the ability to read them. If you found a floppy disk from your own early computer days, would you be able to read it to see what’s on it? Are you sure? You have a working computer with a working floppy drive? And if so, is it the right kind? There are 8-inch, 5-1/4-inch, and 3.5-inch floppies. Early drives were single-sided, then double-sided. Then “single-density” and later “double-density” — technology advances. And then there were formats: a CP/M machine from one manufacturer recorded its data differently than other CP/M machines, let alone IBM-PC clones.

Before that, there were large spools of tape. IBM put out giant drives to read and write data on 7-track tapes. Which didn’t last long, because starting in 1964 there were 9-track tapes. Those spools of tape were 10-1/2 inches in diameter to hold 1,200 to 2,400 feet of half-inch tape, and get this, capacity was measured not in terabytes or gigabytes, but rather in bytes per inch; 1,600 bytes per inch. A 2,400-foot tape could hold up to 42 megabytes of data.
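To see where a figure like that comes from, here is a rough back-of-the-envelope sketch in Python. The raw number is just tape length times recording density; real tapes held somewhat less because data was written in blocks separated by blank gaps. The 8,192-byte block size and the 0.6-inch gap used below are illustrative assumptions, not figures from this episode, and the function names are made up for the example.

```python
INCHES_PER_FOOT = 12


def raw_capacity_bytes(length_feet, bytes_per_inch):
    """Capacity if every inch of tape held data (no gaps at all)."""
    return length_feet * INCHES_PER_FOOT * bytes_per_inch


def usable_capacity_bytes(length_feet, bytes_per_inch,
                          block_bytes=8192, gap_inches=0.6):
    """Capacity when data is written in fixed-size blocks with a gap after each.

    block_bytes and gap_inches are assumed values for illustration only.
    """
    tape_inches = length_feet * INCHES_PER_FOOT
    inches_per_block = block_bytes / bytes_per_inch + gap_inches
    whole_blocks = int(tape_inches // inches_per_block)
    return whole_blocks * block_bytes


raw = raw_capacity_bytes(2400, 1600)
usable = usable_capacity_bytes(2400, 1600)
print(f"raw capacity:    {raw / 1e6:.1f} MB")
print(f"usable capacity: {usable / 1e6:.1f} MB")
```

Under those assumptions the raw figure is about 46 megabytes and the usable figure lands in the low 40s, the same ballpark as the 42 megabytes mentioned above.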

Why 9 tracks? There were 8 tracks to record the bits, since 8 bits make a byte, plus one extra “parity” — or error checking — bit. The IBM Model 2401 tape drive and its successors were used in production environments for a remarkable 30 years, though as time went by the bit-per-inch rating increased, bringing a tape’s capacity up to a maximum of 170 megabytes.
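The parity idea is simple enough to sketch in a few lines of Python. This sketch assumes odd parity, meaning the ninth bit is chosen so that each 9-bit frame contains an odd number of ones; the episode doesn’t say which convention the drives used, and the function names are hypothetical.

```python
def parity_bit(byte):
    """Return the ninth bit for an 8-bit value, assuming odd parity."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in 8 bits")
    ones = bin(byte).count("1")
    # If the data already has an odd number of ones, the parity bit is 0.
    return 0 if ones % 2 == 1 else 1


def frame_is_valid(byte, parity):
    """A 9-bit frame passes the check when its total count of ones is odd."""
    return (bin(byte).count("1") + parity) % 2 == 1


data = 0b01000001             # the letter "A": two one-bits
p = parity_bit(data)          # so the parity bit must be 1
damaged = data ^ 0b00000100   # simulate one bit flipped on the tape

print(frame_is_valid(data, p))     # the intact frame passes
print(frame_is_valid(damaged, p))  # the damaged frame fails
```

A single parity bit catches any one flipped bit in a frame, but two flipped bits cancel each other out and slip through, which is why it is error checking rather than error correction.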

Model 2401 tape drives used as the file system on an IBM System/360 mainframe computer. (Photo: Erik Pitti, uncropped, CC by 2.0)

But this isn’t just a technological issue: it’s a societal issue too.

We don’t seem to worry about having printed books anymore, since if we need to know something, we figure we can just search online for the answer. Yet that brings up a lot of problems. First, the lifespan of the average web page is about 100 days before it’s changed or removed. Would you be able to find, say, the cost of an IBM 9-track tape drive in 1964? Maybe, but only if someone took the trouble to archive that information, often by scanning printed material, which is typically done by hand, then making it searchable, and making it available online.

That person is Brewster Kahle, the founder (in 1996) of the Internet Archive. You may know of its Wayback Machine, which keeps copies of hundreds of billions of web pages, checking them every two months for changes, and keeping each version archived, though individual web sites can opt out of archiving if they wish.

The first capture of this web site’s home page by the Internet Archive, in late 1997. I “had to” get this from the Archive since even I didn’t keep copies of all of the different “looks” of the site over time.

But the Internet Archive is much more than web pages: it has digitized more than 20 million books, pamphlets, catalogs, and other texts, as well as 4-1/2 million audio recordings, 4 million videos (including 1.6 million TV news programs), 3 million images, and even 200,000 software programs. And they’re still at it, scanning about a thousand books a day, and accepting uploaded information.

With everything, they currently have about 60 petabytes of material archived. A petabyte is a thousand terabytes, and a terabyte is a thousand gigabytes.
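For a sense of scale, here is a small Python sketch of that arithmetic. It uses the decimal units described above and, purely as an illustration, asks how many 42-megabyte tapes (the 9-track figure from earlier in this episode) it would take to hold 60 petabytes. The function name is hypothetical.

```python
KILO = 1000
BYTES_PER_MEGABYTE = KILO ** 2
BYTES_PER_GIGABYTE = KILO ** 3
BYTES_PER_TERABYTE = KILO ** 4
BYTES_PER_PETABYTE = KILO ** 5


def tapes_needed(total_bytes, tape_bytes):
    """Whole tapes required to hold total_bytes, rounding up."""
    return -(-total_bytes // tape_bytes)


archive_bytes = 60 * BYTES_PER_PETABYTE
terabytes = archive_bytes // BYTES_PER_TERABYTE
gigabytes = archive_bytes // BYTES_PER_GIGABYTE
tapes = tapes_needed(archive_bytes, 42 * BYTES_PER_MEGABYTE)

print(f"{terabytes:,} terabytes")
print(f"{gigabytes:,} gigabytes")
print(f"{tapes:,} tapes at 42 MB each")
```

That works out to 60,000 terabytes, 60 million gigabytes, or well over a billion of those tapes.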

The Archive’s mission clearly lays out why all of this is important: “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars.”

Sounds like an expensive proposition, doesn’t it? It is: its budget is $10 million a year. But luckily Brewster Kahle isn’t the only one who thinks this is important, and the non-profit organization has received a lot of donations over the years, from large foundations to individuals. Probably its largest individual donor is Kahle himself: he’s a computer engineer and online entrepreneur who studied artificial intelligence at MIT.

After graduation he was the lead engineer at Thinking Machines in Cambridge, Massachusetts, for six years, where he was on the team that developed WAIS, or the Wide Area Information Server, an online search and document retrieval system that predated the World Wide Web. He and Bruce Gilliat then founded WAIS Inc. in 1992, which was sold to AOL in 1995 for $15 million. Brewster and Bruce weren’t done yet: they then founded a new company in April ’96 to scan the web and, with a browser toolbar, help people find things of interest as they surfed the web.

In April 1996, there was no such thing as Google. They were way, way ahead of the curve!

Kahle and Gilliat named their new company for the Great Library of Alexandria, established in Ptolemaic Egypt in around 275 B.C., which was probably the largest and most significant library of the ancient world, with as many as 400,000 volumes …which collection was destroyed by a huge fire in Alexandria in 48 B.C. But “Great Library of Alexandria” is too long, and they didn’t want their new company to be confused with the ancient library, so they named the company Alexa.

That name not only echoes the idea of creating the greatest possible collection of knowledge, but also provides a constant reminder that knowledge needs to be replicated and backed up in multiple places so it can’t be lost again. We can’t be dependent on giant corporations, which want to own all the information, or governments, which want to manipulate it, so having private, non-profit operations dedicated to making knowledge available to the public is hugely beneficial to humanity. That’s Uncommon Sense in action, and the Internet Archive is leading the way.

Alexa the company’s technology still powers the Archive, which was founded the same year, even though Kahle and Gilliat sold the company — to Amazon — for around a quarter billion dollars’ worth of Amazon stock in 1999. Amazon’s share prices never went above $100 in 1999. Today, Amazon stock is worth about $1850 a share. Gilliat stayed on as CEO, and Kahle turned his attention to the Internet Archive. I couldn’t find out how much Amazon stock Kahle still owns, but you can be pretty sure he is putting quite a lot of his own money into his vision — the Archive’s mission — for the benefit of all.

And now you also know where Amazon got the nickname for their little device that you can talk to, and get verbal answers to questions.

It’s not just money that gets donated to the Internet Archive: original books, documents, and other materials are often donated to be archived and made available. For instance, Trent University in Ontario, Canada, donated 250,000 books rather than toss or sell them. They were scanned and made available. The Boston Public Library had hundreds of thousands of 78 RPM records in its collection, including some of the only known recordings of early 20th century American music, and those have also been digitized and made available online.

But it’s not just big institutions. My buddy Kevin Savetz, a fellow online entrepreneur, fondly remembers his first computer, an Atari. Over the years he has collected a lot of Atari memorabilia, documents, software disks, sales material, and much, much, more. Kevin didn’t just dump it all on the Internet Archive, he actually paid for people to scan it in so the Archive didn’t have to, and then donated the resulting files!

But when you come down to it, Kahle said in 2009, “It’s not that expensive. For the cost of 60 miles of highway, we can have a 10 million-book digital library available to a generation that is growing up reading on-screen. Our job is to put the best works of humankind within reach of that generation. Through a simple Web search, a student researching the life of John F. Kennedy should be able to find books from many libraries, and many booksellers — and not be limited to one private library whose titles are available for a fee, controlled by a corporation that can dictate what we are allowed to read.”

So what happens if, like the Library of Alexandria, the Internet Archive itself catches on fire? After all, it’s located in San Francisco, which entire city pretty much burned to the ground after the great earthquake of 1906. Well, in fact they did have a fire in 2013, which caused about $600,000 in damage. Most of, but not all of, the material that burned had already been digitized and backed up. Kahle had already realized that there needed to be backups in other places, but he also came to the conclusion that there needed to be official mirror sites located in various other countries to back up all of the data. And by the way, one of those data centers is in Alexandria, Egypt. Of course, all of those other data centers are another hugely expensive proposition.

But it’s worth it. “Institutions like ours, built for the long-term, need to design for change,” Kahle said. “For us, it means keeping our cultural materials safe, private and perpetually accessible. It means preparing for a Web that may face greater restrictions. It means serving patrons in a world in which government surveillance is not going away; indeed it looks like it will increase. Throughout history, libraries have fought against terrible violations of privacy — where people have been rounded up simply for what they read. At the Internet Archive, we are fighting to protect our readers’ privacy in the digital world.”

Kahle doesn’t just have Uncommon Sense, he’s putting his personal fortune behind preserving the world’s cultural heritage, so that civilization has a “memory and mechanism to learn from its successes and failures.”

Uncommon Sense isn’t the same as being smart. It’s using whatever level of intelligence you have to do something about the problems in your part of the world, making the world a better place, rather than just taking, which any obliviot can do.

Next week I’ll tell the remarkable story of some people who realized that some irreproducible data was stored on 9-track tapes, and they had to rush to save it before the tapes degraded more than they already had, because there were no backups.

The Show Page for this episode is thisistrue.com/podcast58, which includes links, a photo of 9-track drives, and a place to comment. Like the Archive, I need your support to help keep the podcast going: contributions are why there are no commercials interrupting the episodes — and no ads on the web site. So there’s also a place on the Show Page to contribute, and thanks.

I’m Randy Cassingham …and I’ll talk at you later.

- - -

This page is an example of Randy Cassingham’s style of “Thought-Provoking Entertainment”. His This is True is an email newsletter that uses “weird news” as a vehicle to explore the human condition in an entertaining way.


3 Comments on “058: Solving Cultural Amnesia”

  1. And my Echo keeps waiting to finish a command every time you said “Alexa.” 🙂

    I had to do some extra gyrations to record this episode, since I have one in my office! -rc

  2. Another great story. I had the great fortune to work with Brewster at Thinking Machines starting in 1984. He is not only insanely bright but was always willing to help. We are all a bit older now but he is still a great guy, if you met him on the street, you would never guess he is a multi-millionaire.

  3. I used to read science fiction like this. Not too unlike “Foundation” to mention but one. I am genuinely pleased to see that my dreams as an 8 year-old have come to pass!
