Google Video

From Archiveteam
Revision as of 10:09, 23 April 2011 by Lightblb (talk | contribs) (→‎Press)

So this one time in Google Video...

On April 15, 2011, Google sent an e-mail to everyone who had uploaded video to the Google Video site, informing them that all user content would be deleted in roughly 30 days. They also announced that after 14 days (on April 29), the videos would no longer be available for viewing.

Archive Team whipped into action and inspired a cluster of archivists to attempt to download and preserve the whole of Google Video for suffering mankind. Over the course of a few short days the team and its technology evolved from a brute-force 'download everything alphabetically' approach to a sophisticated DOCID scraping operation, with keyword and related-video searches producing a list of some 2.5 to 2.8 million DOCIDs. These were then handed off to a distributed job management system, listerine, which assigned downloads to volunteers from around the world. In a couple of days, 18TB of verified video data had been downloaded and the team was on schedule to mirror the entire Google Video archive.
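The DOCID scraping described above (seed from keyword searches, then expand through related videos) can be sketched as a breadth-first crawl with de-duplication. This is only an illustration, not the team's actual scraper: `discover_docids`, the `search` and `related` callables, and the `limit` default are all hypothetical names standing in for whatever page-fetching code was really used.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Set


def discover_docids(
    keywords: Iterable[str],
    search: Callable[[str], Iterable[str]],   # hypothetical: keyword -> DOCIDs
    related: Callable[[str], Iterable[str]],  # hypothetical: DOCID -> related DOCIDs
    limit: int = 3_000_000,                   # arbitrary safety cap, an assumption
) -> Set[str]:
    """Collect unique DOCIDs: seed from keyword searches, then follow
    related-video links breadth-first until nothing new turns up."""
    seen: Set[str] = set()
    queue: Deque[str] = deque()

    def visit(candidates: Iterable[str]) -> None:
        # Record each DOCID once and queue it so its related videos get expanded.
        for docid in candidates:
            if len(seen) >= limit:
                return
            if docid not in seen:
                seen.add(docid)
                queue.append(docid)

    for keyword in keywords:
        visit(search(keyword))
    while queue and len(seen) < limit:
        visit(related(queue.popleft()))
    return seen


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the search and related-video pages.
    search_index = {"cats": ["a1", "b2"], "news": ["b2", "c3"]}
    related_index = {"a1": ["d4"], "b2": ["a1"], "c3": ["e5", "d4"]}
    found = discover_docids(
        search_index,
        lambda kw: search_index[kw],
        lambda docid: related_index.get(docid, []),
    )
    print(sorted(found))
```

Keeping the fetchers as injected callables means the crawl logic can be exercised without touching the network; the `seen` set is what stops the same video, reached by several searches, from being listed twice.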

One week in, Google announced that they were reversing course: Google Video would stay up indefinitely, and they would add migration tools to move Google Video content into users' YouTube accounts.


Press

April 16 - Boing Boing: Help Archive Team save Google Video content from the abyss
April 17 - Read Write Web: As Google Video Shuts Its Doors, Here's How to Save the Content
April 18 - Wired: Technology Archivists step in as Google Video shuts down for good
April 18 - Laughing Squid: Archive Team Is Trying To Download Google Video Before It Shuts Down
April 19 - 404 Tech Support: Google Video Is Shutting Down and One Team’s Effort To Save the Content
April 20 - Slashdot: Google Video Effort Goes Distributed
April 20 - Blog: Google Video Effort Goes Distributed
April 23 - Emu Console Exploit News: The Archive Team WON: Google is going to migrate Google Videos to Youtube! (I know, sensory overload. But they got the first scoop on the conclusion and they attribute it to us!)

A Brief History

Within days of the announcement, Jason Scott had thrown together a script, "googlegargle," to automatically download videos identified by scraping links. Then Alex Buie created "listerine," a centrally coordinated, distributed processing system akin to SETI@Home. The listerine client would ask the central server for a video identifier, download that video, then report it as finished. It was fire-and-forget for the scores of volunteer downloaders. With this new weapon, Archive Team was saving Google Video at the rate of 5 terabytes per day.
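The claim, download, report cycle described above can be sketched in a few lines. This is an in-process illustration only and not listerine's real protocol or API: the `Coordinator` class, its `claim` and `report_done` methods, `run_worker`, and the no-op download function are all hypothetical, with threads standing in for volunteers on separate machines.

```python
import threading
from typing import Callable, Dict, List, Optional


class Coordinator:
    """In-memory stand-in for the central job server (hypothetical API)."""

    def __init__(self, docids: List[str]) -> None:
        self._lock = threading.Lock()
        self._pending: List[str] = list(docids)
        self._claimed: Dict[str, str] = {}   # docid -> worker holding it
        self.finished: Dict[str, str] = {}   # docid -> worker that finished it

    def claim(self, worker: str) -> Optional[str]:
        """Hand out one unassigned DOCID, or None when the queue is empty."""
        with self._lock:
            if not self._pending:
                return None
            docid = self._pending.pop()
            self._claimed[docid] = worker
            return docid

    def report_done(self, worker: str, docid: str) -> None:
        """Mark a DOCID finished; only the worker that claimed it may report."""
        with self._lock:
            if self._claimed.get(docid) != worker:
                raise ValueError(f"{docid} is not claimed by {worker}")
            del self._claimed[docid]
            self.finished[docid] = worker


def run_worker(name: str, coordinator: Coordinator,
               download: Callable[[str], None]) -> int:
    """Fire-and-forget loop: claim, download, report, repeat until no work is left."""
    completed = 0
    while True:
        docid = coordinator.claim(name)
        if docid is None:
            return completed
        download(docid)
        coordinator.report_done(name, docid)
        completed += 1


if __name__ == "__main__":
    coordinator = Coordinator([f"docid-{i}" for i in range(10)])
    volunteers = [
        threading.Thread(target=run_worker,
                         args=(f"volunteer-{n}", coordinator, lambda docid: None))
        for n in range(3)
    ]
    for thread in volunteers:
        thread.start()
    for thread in volunteers:
        thread.join()
    print(len(coordinator.finished), "videos reported finished")
```

The lock ensures no two workers are handed the same DOCID. A production system would also need to reissue DOCIDs that were claimed but never reported (a volunteer going offline mid-download); that is left out of this sketch.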

[Image: GV dl rate.png]

By the time of Google's capitulation, our team had downloaded over 1 million videos totaling 18TB. Against a reported total of 2.5 to 2.8 million videos, roughly 36 to 40% of Google Video had already been preserved by video count.