Converting data exhaust into data value
I've written frequently about my thoughts on the Implicit Web. As more consumers spend more time online (and perform more of their activities online), they leave a trail of digital breadcrumbs exposing data about themselves and their interests. This "digital exhaust" is often massive -- amounting to terabytes of log files and other data. And while storage costs are coming way down, it is still typically too expensive for companies to analyze all their historical data. Instead, companies frequently resort to sampling or archiving. And those companies that do try to retain and analyze historical data typically find that their database queries take hours or days.
That's why I'm so excited that First Round Capital portfolio company Aster Data Systems launched today -- after three years in stealth mode. Google and Yahoo power their sites using databases distributed across many clusters of servers. Aster Data offers clustered databases for web analytics -- and today announced that they are already supporting MySpace (which is running a 100-node server cluster over hundreds of terabytes of data) and Aggregate Knowledge's Pique service (which is performing analysis on over 100 million users).
As more companies seek to transform their data exhaust into data value (hey wait a minute -- perhaps that's the Web 2.0 version of "clean tech" -- converting messy data into clean insight), I think they will need tools like Aster Data to help them discover deep insights from massive data sets. More information on Aster can be found on their website and blog.