Brewster’s great speech and congressional search bot blocks

I just finished watching* an amazing keynote by Brewster Kahle. Its focus was “universal access to all human knowledge,” and it presents a compelling case for the (attainable! affordable! possible!) goal of digitizing every book, every piece of music, every piece of film, and every byte of software ever created in human history. There’s not much more I can say besides that it’s awe-inspiring stuff, “We will put a man on the moon by the end of the decade” kind of stuff.

An interesting tidbit came up at the end, during the Q and A. Brewster mentioned a long-standing annoyance he has with a government site, one that still stands today. The server that holds the sum of America’s congressional record, thomas.loc.gov, suffers from a lackluster set of onsite search tools. The American people, and in actuality the world, are limited to those onsite search tools because of this: http://thomas.loc.gov/robots.txt

A government of the people, by the people, and for the people blocks any and all outside search engines from every single document that resides on the server. That block also covers archive.org’s Wayback Machine, which builds a rich historical archive of everything it sees. In an age where public documents are disappearing at an alarming rate, I find limiting the ability to find congressional documents through Google, and eliminating any archive by the Wayback Machine entirely, to be a real slap in the face for democracy. I remember hearing about the thomas.loc.gov robot block being put in place back in 1997 or so, when an overzealous bot brought the early Perl backend to its knees. Why it is still in place today is baffling to me. [Keynote via boingboing]
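For anyone wondering how one small text file can shut out every crawler, here is a rough sketch in Python using the standard library’s robots.txt parser. The file contents below are my assumption of what a blanket block looks like, not a copy of the real thomas.loc.gov file, and the crawler names and document URL are only illustrative.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: a blanket "keep everyone out" rule. The real file on
# thomas.loc.gov may be worded differently.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hypothetical document URL, used only for illustration.
EXAMPLE_URL = "http://thomas.loc.gov/some/document"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a well-behaved crawler may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative crawler names: a search engine bot and an archive bot.
    for agent in ("Googlebot", "ia_archiver"):
        verdict = "allowed" if is_allowed(ASSUMED_ROBOTS_TXT, agent, EXAMPLE_URL) else "blocked"
        print(agent, verdict)
```

Under that assumed rule, every polite crawler gets the same answer for every path on the server. An empty robots.txt, by contrast, permits everything.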

* It’s about an hour and a half long, and if you give it a try, I’d strongly suggest using a laptop with an S-video out cable to pipe it to your TV (a mini headphone-to-RCA audio cable helps if you want to hear it through your TV too). On a Mac with OS X, enable the monitors pull-down in your menu bar, plug the cables into your TV, hit “detect displays” to mirror or extend your desktop to your TV screen, then maximize the RealPlayer window to fill the screen.