I guess it was only a matter of time before some company found a way to glean polling information from blogs. CNN has a story about how automated programs search millions of blogs to determine how people feel about the issues of the day — in this case, rising gas prices.
But the most surprising part is how they automatically categorize people by age and gender. From the article:
Umbria uses speech patterns to determine the age and gender of blog posters and divides the blogger population into three age groups:
# Boomers — born between 1946-1964
# Generation X — born between 1965-1978
# Generation Y — born after 1979
How accurate could that be? How could they possibly determine these things … are men’s and women’s writing styles really so different? I haven’t noticed that… but then, as an old Boomer (according to that chart), there are probably a lot of things I miss…
So who is Umbria? Well, they have two slick websites, Umbria Speaks and Umbria Listens, the latter being the main corporate website. But neither one explains how Umbria categorizes each site (or how it sorted THIS site, assuming it came here).
Some further digging turned up this blog regarding a symposium on “Computational Approaches to Analysing Weblogs”. A very cursory reading gives the impression that they sort the blogs based on their LiveJournal or Blogger categories: people of this age and this gender tend to have blogs on that topic, and so on. Blogs like this one, unaffiliated with the major blog services (I run this one on my own space using open source software), probably aren’t ever seen.
So, independent, obscure bloggers aren’t being heard by the big guys. Hey, maybe that’s for the better. Aside from the weird guy who responded to my GMail post, this blog has been free of spam and advertisements.
I’ll leave you with a last quote from that blog on the symposium, about what happens when bloggers realize they are being sourced for market data:
People realise that info is actually public may lead to a backlash from consumers if the usage is abused. Definition of blogs is they want the world to see the information. Recognition that my opinion is worth something. How do I get to share in this. Interesting frontier to see how consumers will want a quid pro quo.
(Checked my logs; the only bots that crawl this site are Google, Yahoo! and MSN. And sometimes, RARELY, Technorati. Oh, cool. Google even found the calendar software I installed but never got around to using. And who is GigaBlast?)
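For anyone who wants to run the same check on their own site, here is a rough sketch of how the tally could be done in Python. It assumes the server writes the common "combined" access log format, where the User-Agent is the last quoted field on each line. The marker strings (`googlebot`, `slurp`, `msnbot`, `technorati`, `gigabot`) are my guesses at what each crawler puts in its User-Agent, and the function name `count_crawlers` is made up for this example, so adjust both to whatever your own logs actually show.

```python
import re
from collections import Counter

# Assumed User-Agent substrings for each crawler (lowercase).
# These are illustrative guesses, not an authoritative list.
CRAWLER_MARKERS = {
    "Google": "googlebot",
    "Yahoo!": "slurp",
    "MSN": "msnbot",
    "Technorati": "technorati",
    "GigaBlast": "gigabot",
}

# In the combined log format, the User-Agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_crawlers(lines):
    """Count log lines per known crawler, based on the User-Agent field."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # line does not end in a quoted field; skip it
        agent = match.group(1).lower()
        for name, marker in CRAWLER_MARKERS.items():
            if marker in agent:
                counts[name] += 1
                break  # count each line for at most one crawler
    return counts
```

To use it, open the access log and pass the file object straight in, for example `count_crawlers(open("access.log"))`, where `access.log` stands in for wherever your host keeps its logs. Anything that does not match a marker is simply ignored, so an unfamiliar bot will not show up until you add a marker for it.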