Richard Pitt's Personal Site
Marketing and Sales with a Technical Bent

Bit Rot and Lost Information

One of the major problems I and others see in the large-scale use of DRM is the potential loss of public access to copyrighted materials once the copyright period ends. You might think that the loss to future generations of (for example) all the images of Mickey Mouse would not be all that significant, but you would be wrong. You must take the problem to its logical conclusion: every aspect of humanity's history for the period starting now potentially being unavailable to coming generations for some technical reason. The impact would be devastating.

This problem is not limited to DRM; it applies to proprietary data formats in general. I have a number of DAT (4mm digital audio tape) tapes that we used in the mid-90s to back up our ISP's computers and that we can't read now.



As man has progressed from prehistoric times to today, the existence of historical records has implicitly shaped our sense of history and advancement. Prehistoric times were by definition the period before man could record what went on, so that future generations could learn from the experiences and mistakes of earlier ones. Advancement went slowly because what little knowledge was generated was passed on verbally, or in ways that did not survive for very long.

Only with the development of the ability to record facts and thoughts (thought to be best noted by the cuneiform writings of scribes tracking goods in early Mesopotamia if my history teacher's work serves me correctly) did information survive easily from generation to generation.

Only with the invention of the printing press by Johannes Gutenberg around 1450 did the widespread dissemination of information and thought really take off. Prior to this, the cost of reproducing written works, either to forestall the degradation of an aging copy or to increase the potential readership by making more copies of the original, was so high that only very well funded individuals and organizations (the churches being the prime example) could afford such luxuries. The fact that few works other than those sanctioned (and therefore copied) by the churches of the times have survived should be an indication that this was not a good thing.

The subsequent invention of the phonograph by Thomas Edison in 1877 made it possible for the first time to record voice and music and play them back. Prior to this there was no way to preserve the intonations behind the words in a speech or the expertise in a performance of music. Only the raw notes, portrayed on paper via words or musical staff, could be left for the future. We are still finding works of "the masters" left in trunks and other out of the way places by patrons of the past.

But the recordings made by Edison's machines, and by the steadily improving devices of the 110 or so years from 1877 to about 1987, have had trouble passing through to following generations. Strangely enough, the problem is not the media the recordings were made on - it's the reproduction equipment that has fallen afoul of time. The original cylinders that commercial versions of Edison's equipment used were made of wax. They didn't work well after a few playings, but one that has never been played and that has been kept away from heat and excess humidity would be every bit as playable today as it was when recorded - if we had one of the players that still worked! To be sure, there are museums and history buffs who have preserved players, so we have not yet lost the ability to play these historic works. In fact, because the physical nature of the cylinders is so obvious, it is not inconceivable that a new mechanism could be crafted hundreds or even thousands of years in the future if necessary.

I have a cupboard full of 33RPM albums with music I love on them. My problem is that I no longer have a functional turntable, and finding one of the quality I want, both for the best reproduction and for the preservation of my valuable recordings, is getting harder and harder.

It's too bad that despite the advances of the digital age, such recreation of playing technologies might not be enough for future generations to resurrect the recordings of our age; and it all hinges on DRM.

Archeologists have decoded the hieroglyphics of Egypt and the cave drawings of France. They've deciphered even the writings of the scribes of Mesopotamia, the first known writings of any substance. But I can't read a perfectly good DAT tape made in the days of our Internet company, Wimsey.com (1986-1995)!

Recordings of Caruso, made by his singing into a recording horn while a stylus carved a wax 78RPM disk before his death in 1921, are available today. The question now is whether we will be able to listen to a recording of tomorrow's pop music stars sometime in the next century, or even know that they existed; and it all hinges on DRM.

What we are talking about is blandly called "bit rot" in the digital world. This describes the loss of data due to any one of a number of phenomena but is typified by the inability of today's generation of computer systems to read the product of yesterday's and the extension of this to anything digital, including recordings of audio and video.

There are all sorts of reasons why today's systems might not be able to read yesterday's data:

- incompatible media types - try finding something that will read a DECtape today (or a 3200BPI 1/2" tape recorded in EBCDIC)
- incompatible data formats
- impermanence of "permanent" media (if it isn't carved into rock or bronze, it won't last centuries)
- loss of some key knowledge of how the recording was originally made (e.g. the DRM key)
- some other technical reason (e.g. the DRM algorithm not allowing access after a particular date, even though the viewer might hold a valid key and have paid for the right to view, or even though the copyright has expired and the work is in the public domain in any case)

The previously mentioned inability to read, in the new century, a digital backup tape from the 1990s is only one example. In this case, the tape drive used a proprietary compression facility built into its hardware when recording the data. Without an exact duplicate of the drive, the tape is effectively useless. OK, the techies out there will tell you that if the tape is only compressed, decompressing it should not really be that difficult if the data on it is worth the effort. The point is that it is not obvious that the tape contains data at all, certainly not to future archeologists who might dig it up. For the purposes of this topic, the thing to note is that compression behaves much like a special form of encryption: without knowledge of the algorithm the stored bits are unintelligible. And encryption is what DRM is all about.
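A minimal sketch of that point, using Python's standard zlib module purely as a stand-in for the drive's proprietary compressor (the real algorithm is unknown, which is exactly the problem). The sample data is made up. It suggests how the readable words vanish from the stored bytes, yet come back intact for anyone who knows which algorithm was used:

```python
import zlib

# Made-up stand-in for a file on the backup tape.
original = b"Customer records and mail spools from the ISP. " * 20

# Assumption: zlib stands in for the drive's unknown hardware compressor.
compressed = zlib.compress(original)

# The readable words no longer appear in the stored bytes...
looks_like_text = b"Customer" in compressed

# ...but anyone who knows the algorithm recovers the data exactly.
recovered = zlib.decompress(compressed)

print(len(original), len(compressed), looks_like_text, recovered == original)
```

The difference from DRM is that a compression scheme has no secret key: once the algorithm is known, anyone can reverse it. With encryption, knowing the algorithm is still not enough.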

So what happens when all of the history of a period is recorded digitally in a form that is subject to bit rot of some sort, and the form cannot for some reason be "re-cast" in newer media or recovered with any reasonable effort from original media at some far point in the future? The answer is that we fall back to "pre-historic" times and lose our hard-won heritage of experience and expertise. We are left with only that which can be passed on from generation to generation by means of (fallible) human memory. Woe is us!

Without the technical details on the compression/encryption algorithm the drive used, it is not possible/practical to build a replacement. In fact, an alien landing on a barren Earth with no other information might deem the tape to be so much garbage without much thought. There is little chance that the kind of effort that cracked the hieroglyphics of Egypt would translate the tape. In addition, the fact is that the magnetic signature on the tape may no longer be there either. I have some other (DC600) tapes from the same period (mid 90s) that I still have drives for that work fine - but I cannot read the tapes anymore as the magnetic signal has deteriorated. If I re-use them, they work fine in most cases - but the old signal/data is gone.
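This kind of silent deterioration can at least be detected, if not repaired. A common approach is to record a checksum when the data is written and compare it later. The sketch below is only an illustration under stated assumptions: the archive contents are hypothetical, and SHA-256 from Python's standard hashlib module is one reasonable choice of checksum, not anything our tape drives did.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical archive contents, with a checksum recorded when first written.
stored = bytearray(b"Backup of the ISP's home directories, 1995")
recorded_checksum = fingerprint(bytes(stored))

# Simulate deterioration: a single bit flips somewhere on the medium.
stored[10] ^= 0x01

# Comparing against the recorded checksum reveals the damage.
still_intact = fingerprint(bytes(stored)) == recorded_checksum
print("data still intact:", still_intact)
```

A checksum only tells you that something has changed; getting the data back still requires a second good copy on some other medium.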

To paraphrase,
"Those who cannot read historical information are doomed to repeat history"

A bit of Historic Perspective

The National Library tried to impose its will upon us back in the days of Wimsey, requesting a copy of every page we hosted or created for our new World Wide Web sites, some of the first in Canada. At the time I believed that their request was mildly insane given the changeable nature of the typical web page even then.

I had to teach them that the web is not a printing press. If they were to put up web servers that significant numbers of viewers went to (and you know that government wouldn't put up small servers - only the best), then the budding industry would be skewed in favour of government publishing, in much the same way as would have happened in the past if only the "Queen's Printer" had been able to publish books.
The fact was (and is) that, unlike printed materials, each page view was counted and could carry completely different advertising and/or sponsorship. If the government offered the same pages but neither included the advertising (since it was "glued" on at viewing) nor passed back page counts (for those static pages that did include advertising, but for which only logged page hits "counted"), it would do irreparable harm to the new business model we were inventing.

Today I'm not so sure. I expect that their role may in fact become of critical importance in the face of DRM schemes which may deprive the country of its cultural and historic heritage due to inability to recover media protected at source from copying. What happens when the copyright period expires? What happens as the technologies necessary to decode the locked media fall into disuse?

As time goes on

I recently (October 2003) read an article on a new CD medium that not only allows for far larger amounts of storage (10,000 gigabytes) but can also survive "being dipped in molten iron". A translation of the article is at http://www.cdfreaks.com/news/8088 (the original is in Romanian). Sounds like just the thing for backup of my Terabyte home network.

richard


Thursday, September 02 2010 @ 11:26 PM PDT