Chosen Plaintext Indistinguishable from white noise

26 Dec 2009

Information overload

My friends look at me like I'm crazy when I tell them that I'm trying to find an RSS/Atom feed aggregator that won't choke when I try to subscribe to over 350 feeds. I wonder what they will think when they find out that I'm also trying to keep track of:

  • 466 Twitter feeds,
  • 188 Facebook friends,
  • 30 podcasts,
  • 10 IRC channels,
  • 6 Meetup groups,
  • 3 instant messaging networks,
  • 2 voicemail boxes, and
  • 1 email address that sorts 2000 messages per day through 231 procmail rules and a spam filter.

Does that sound like information overload?

Let's put it into perspective: The ATLAS detector (part of the Large Hadron Collider) generates 25 megabytes of output every time a packet of protons crosses it, which happens 40 million times per second. That's a whopping 1,000,000 GB every second!
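For anyone who wants to check that multiplication, here is a quick sketch in Python. It assumes decimal units (1 GB = 1,000 MB), and the function name is made up for illustration; it has nothing to do with any ATLAS software.

```python
MB_PER_GB = 1_000  # assumption: decimal units, 1 GB = 1,000 MB


def raw_rate_gb_per_s(mb_per_crossing: float, crossings_per_s: float) -> float:
    """Raw detector output in GB per second (hypothetical helper)."""
    return mb_per_crossing * crossings_per_s / MB_PER_GB


# Figures quoted above: 25 MB per crossing, 40 million crossings per second.
rate = raw_rate_gb_per_s(25, 40_000_000)
print(f"{rate:,.0f} GB/s")
```

In other words, roughly one petabyte every second.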

Today's technology can't store that much data for very long. Even assuming the availability of $100 1TB hard disk drives that could store the information fast enough (they can't) and that were 100% reliable (they aren't), ATLAS would fill 1,000 of these drives every second. Over the 86,400 seconds in a day, that comes to 86.4 million drives (and a budget of $8.64 billion) per day just for this initial data storage.

That is information overload!

The scientists and engineers working on ATLAS are well aware of these physical limitations, and they deal with them by using a clever multi-stage process that ultimately figures out what's important and discards the remaining 99.99997% of the data, resulting in a data rate of 'only' 320 MB per second (see page 5 of the linked PDF).
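The discarded percentage follows from the figures quoted above: the raw rate of 25 MB times 40 million crossings per second, and the 320 MB per second that survives. A rough check, with names invented for illustration:

```python
RAW_MB_PER_S = 25 * 40_000_000  # 25 MB per crossing, 40 million crossings per second
KEPT_MB_PER_S = 320             # rate left after the multi-stage selection


def discarded_percent(raw: float, kept: float) -> float:
    """Percentage of the raw data that is thrown away (hypothetical helper)."""
    return (1 - kept / raw) * 100


pct = discarded_percent(RAW_MB_PER_S, KEPT_MB_PER_S)
print(f"{pct:.5f}% discarded")
```

Put the other way around, only about 3 bytes in every 10 million are kept.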

Think about those numbers. That is the state of the art.

Yet, I struggle with a handful of RSS feeds and a couple of social networking sites! I've thought about just pruning my subscriptions and changing my email address every few years like everyone else does, but I'm too embarrassed to admit to myself that I am so far behind the state of the art.

Darnit, I'm a programmer! If the ATLAS scientists can do what they did, I should easily be able to put together something that can sift through a measly fraction of the data while throwing away much less of it, especially when I can let it work for seconds or even minutes instead of nanoseconds. It's a simple, routine automation task.

Right?

Filed under: Rants, Usability