170 Billion Tweets: The Library Of Congress Twitter Collection

The US Library of Congress has a pretty impressive collection of public tweets – 170 billion of them.

The massive compilation is part of an ambitious project by the Library of Congress to archive all public tweets. The Library first announced the effort in April 2010 and has collected a steady stream of Twitter messages in the years since.

This week, the Library of Congress proudly announced in a public blog posting the incredible number of tweets it has collected. PC Magazine writes that in February 2011 the institution was receiving upwards of 140 million tweets each day for its archive. By October 2012 the number of tweets processed daily had jumped to roughly half a billion.

The collection kicked off three years ago with Twitter providing the Library with a 21-billion-strong archive of public tweets posted from 2006 through April 2010. Twitter has since given the Library of Congress access to the full stream of 150 billion public tweets posted since that time, according to CNET.

Gayle Osterberg, director of communications for the Library of Congress, wrote in this week’s blog posting:

“The Library’s first objectives were to acquire and preserve the 2006-10 archive; to establish a secure, sustainable process for receiving and preserving a daily, ongoing stream of tweets through the present day; and to create a structure for organizing the entire archive by date.”

How such a comprehensive digital archive will be used has not yet been fully determined. However, the Library’s collection of 170 billion tweets has piqued the interest of numerous researchers:

“Though the Library has been building and stabilizing the archive and has not yet offered researchers access, we have nevertheless received approximately 400 inquiries from researchers all over the world. Some broad topics of interest expressed by researchers run from patterns in the rise of citizen journalism and elected officials’ communications to tracking vaccination rates and predicting stock market activity.”

What do you think could be accomplished with the information contained in the 170 billion tweets collected by the Library of Congress?