Difference between revisions of "Google Video (Archive)"
Revision as of 11:43, 19 April 2011
Google Video | |
URL | http://video.google.com |
Status | Closing on 2011-04-29[1] |
Archiving status | In progress... |
Archiving type | Unknown |
IRC channel | #archiveteam-bs (on hackint) |
Google Video is a video sharing website which is shutting down.
If you want to save your own videos, see the announcement and tools below.
If you want to help archive Google Video, get some machines running and join us in IRC (EFNet #archiveteam / #googlegrape)
Joining the archival effort
The automatic scripts only work on FreeBSD, Linux, Solaris, Windows and maybe OS X. They also seem to work fine in Cygwin. Alternatively, you can run *nix in a virtual machine (given you have a fast enough machine).
To help scrape videos
First of all, please add your name/nickname to this list, along with the storage and bandwidth you have available.
On Linux Systems
- Download youtube-dl, or install it from your distribution.
- Make sure it's marked executable: chmod +x youtube-dl
- Download and install wget for your distribution
- Download googlegargle (Norc's updated, dupe-safe version of googlegargle is here.)
- Get aria2 from your distribution (or if you're on Mac OS X, MacPorts) or SourceForge
- note: check aria2 dependencies in the readme
- Pick a seed list from below, save it under the filename "list" and add your name to the list (you will need a wiki account)
- Change the first few lines of the googlegargle script to reflect your installation
- If you're using youtube-dl from your distro, run "which youtube-dl" or "sudo updatedb; locate youtube-dl" to find the location of the command. Change DLSCRIPT to this.
- For older aria2 versions, some options need to be removed (--max-connection-per-server=16 --min-split-size=1M)
- You might need to upgrade aria2 through your system package manager; however, even the most recent packaged version may not suffice.
- Change the ARIA variable in the script to the location of your aria2 executable, usually /usr/bin/aria2c on Ubuntu.
- To know where aria2 is located you can use either of these commands:
- "sudo updatedb; locate aria2"
- "which aria2" / "which aria2c"
- Invoke googlegargle
- Check your OS settings to ensure that your computer will not automatically suspend or sleep after long periods of inactivity.
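The two path lookups in the steps above can be scripted. The following is a minimal sketch, assuming a POSIX shell; `find_tool` is a hypothetical helper name, while `DLSCRIPT` and `ARIA` are the googlegargle variables mentioned above. It only prints suggested values, so you still have to edit the script yourself.

```shell
#!/bin/sh
# find_tool NAME: print the full path of NAME if it is on PATH,
# otherwise print nothing and return a non-zero status.
# (Hypothetical helper, not part of googlegargle.)
find_tool() {
    command -v "$1" 2>/dev/null
}

# Print suggested values for the two variables at the top of googlegargle.
for pair in DLSCRIPT:youtube-dl ARIA:aria2c; do
    var=${pair%%:*}    # text before the colon, e.g. DLSCRIPT
    tool=${pair#*:}    # text after the colon, e.g. youtube-dl
    if path=$(find_tool "$tool"); then
        echo "$var=$path"
    else
        echo "$var: $tool not found on PATH" >&2
    fi
done
```

If a tool is reported as not found, install it first (or use "locate" as described above) before editing the script.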
On Windows Systems
- Download the scraping script for Windows: http://www.pentium100.com/gg_windows.zip. You still need Python and aria2, which are downloaded separately; instructions are in the archive.
On Solaris Systems
The scripts are known to work on OpenIndiana r147. You'll have to install aria2c and youtube-dl from source, but other than that the googlegargle script should work without modifications.
Don't forget to join the IRC channel to coordinate who's getting what!
To help index videos (low bandwidth/storage)
On Linux Systems
Note: This will only work on machines with X running. To run it on a headless server, use Xvfb (virtual framebuffer). On Ubuntu/Debian: 'apt-get install xvfb', then use xvfb-run to start your main script. An X server will now be made available to any programs that need it.
- Get the tools needed to build phantomjs (a headless web browser) and run the script: Qt WebKit, git, and curl. On Debian or Ubuntu Maverick and up, install the packages build-essential, curl, git, libqtwebkit4, libqtwebkit-dev, and libqt4-dev by issuing the command:
sudo apt-get install build-essential curl git libqtwebkit4 libqtwebkit-dev libqt4-dev
On Ubuntu Lucid 10.04: Since Lucid ships Qt 4.6 rather than the required 4.7, you may need to add a PPA before installing the needed packages.
sudo add-apt-repository ppa:kubuntu-ppa/backports && sudo apt-get update
Additionally, in Lucid, the git package is named git-core, so:
sudo apt-get install git-core
or, on Fedora:
sudo yum install curl git qt-webkit qt-webkit-devel qt-devel
- Run the following command to get the phantomjs source code:
git clone https://github.com/ariya/phantomjs.git
- Enter the directory that was just created by using the following command:
cd phantomjs
- Build phantomjs by issuing the command:
qmake && make -j2
- Move the phantomjs binary somewhere in your path by issuing the command:
cd bin && sudo mv ./phantomjs /usr/local/bin
- Create a folder called gvscript and download the script to get the list of Google Video related pages to scrape: http://199.48.254.90/at/google_video_related.tar.gz
- Extract the above downloaded file (Right-click and Extract To.. or use tar -zxvf ./google_video_related.tar.gz)
- In a terminal, navigate to the folder where you extracted the google_video_related file (above) and run the following command to help scrape Google Video:
while : ; do ./related.sh ; done
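On a headless server, the Xvfb note and the loop above can be combined. This is a minimal sketch, assuming `xvfb-run` is installed and that an empty `DISPLAY` means no X server is available; `launcher_prefix` and `run_forever` are hypothetical helper names, not part of the downloaded script.

```shell
#!/bin/sh
# Print the command prefix needed to give related.sh an X server:
# "xvfb-run -a" when DISPLAY is unset or empty, nothing otherwise.
# (-a asks xvfb-run to pick a free display number.)
launcher_prefix() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "xvfb-run -a"
    fi
}

# Run ./related.sh in an endless loop, as in the command above.
run_forever() {
    prefix=$(launcher_prefix)
    while : ; do
        # $prefix is left unquoted on purpose so it splits into words
        $prefix ./related.sh
    done
}
```

Call `run_forever` from the folder where you extracted the archive; stop it with Ctrl-C.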
On Windows Systems
Grab the following archive which comes with full instructions: http://nstrom.chaosnet.org/google_video_related_win.zip
Once the script is running, simply leave it running. If you need assistance or the script has issues, head over to #ggtesting on EFnet (IRC). The script contacts the server to get a page, indexes the related video links on it, sends back the results, and repeats. It takes very little processing and bandwidth on your end (a couple of kb/sec, if that).
Cherry picking
The seed files do not currently include all videos, so you may want to save precious videos explicitly. To do that, add their IDs (the docid URL parameter of the Google Video page) to the "list" file in the same directory as the script, for example:
docid=1545969803753962248
docid=1598207563000425446
docid=-1679753730105404298
and start ./googlegargle
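Adding IDs by hand makes it easy to enter the same video twice. The following is a minimal sketch of a guard against that, assuming the "list" file holds one `docid=...` entry per line; `add_docid` is a hypothetical helper, not part of googlegargle.

```shell
#!/bin/sh
# add_docid ID [FILE]: append "docid=ID" to FILE (default: ./list)
# unless that exact line is already present.
add_docid() {
    id=$1
    file=${2:-list}
    touch "$file"
    # -x: whole-line match, -F: fixed string, -q: quiet, -e: pattern follows
    if ! grep -qxF -e "docid=$id" "$file"; then
        echo "docid=$id" >> "$file"
    fi
}
```

For example, `add_docid -1679753730105404298` adds that video to ./list once, no matter how often it is called.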
To request a cherrypick, add it to this list: http://piratepad.net/gvspecificrequests
If you download something from that list, add its docid to http://piratepad.net/TL7KDN8821 so that others won't download those videos a second time.
Custom keyword searches
Linux Bash Command
If you want to grab videos by your own custom keyword search term, you can use this command:
SEARCH='my+search+term';for i in `seq 0 10 990 `;do curl -A "AT, Bitches" "http://www.google.com/search?q=$SEARCH+site:video.google.com&hl=en&safe=off&tbm=vid&start=$i&sa=N"|grep -o "docid=[0-9-]*"|sort -u|tee -a seed_videos_$SEARCH;done
Change "my+search+term" to your search term, and remember to use a plus sign instead of spaces (or url encode the text for other special characters).
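The two text-handling pieces of that command can be pulled out and tried without contacting Google. This is a minimal sketch; `encode_term` and `extract_docids` are hypothetical helper names, and `encode_term` only handles spaces, so other special characters still need proper URL encoding.

```shell
#!/bin/sh
# Turn a free-text search term into the plus-separated form used in the
# query URL. Only spaces are handled here.
encode_term() {
    printf '%s' "$1" | tr ' ' '+'
}

# Read text on stdin and print the unique docid=... tokens found in it,
# using the same pattern as the one-liner above.
extract_docids() {
    grep -o 'docid=[0-9-]*' | sort -u
}
```

Note that the one-liner runs `sort -u` once per result page, so the same docid can still appear twice in the seed file if it shows up on different pages; running `sort -u` over the finished file removes those repeats.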
Linux Bash Script
An alternative search script which sorts and dedupes results and can restrict searches to long, medium and short videos is here.
Searches Undertaken
Since we want to minimize overlap, here are some search terms that are already being downloaded (and the user downloading each):
- Darkstar: "rare", "vintage", "commercial"
- NomDuClavier: "douglas adams", "richard dawkins", "charles darwin", "michio kaku", "brian cox", "vernor vinge", "carl sagan", "simon singh"
- oli: "australia history"
- dnova: "microelectronics"
- Lightblb: "documentary" (medium and long videos), "lecture" (medium and long videos), "atheism" (medium & long), "interview" (long)
- ttuttle: "astronomy"
- crackbab1: "ecology"
- tj__: "army"
Also check the specific-requests PiratePad under Cherry picking on this page.
Deduplication
To avoid downloading videos that have already been downloaded by others:
- check if you have SQLite installed ("which sqlite3")
- download the gv-dedup scripts
- initialize a fresh database with "./gv-list-create.sh"
- download all seed lists on this page (plus the cherry picks) and import them with "./gv-list-import.sh seed_file" (or "find seeds/* -exec ./gv-list-import.sh {} \;")
- invoke "./gv-list-dedup.sh seed_videos_foo > list" to filter already downloaded videos from your custom seed list
- also import your custom seed file with "./gv-list-import.sh list"
A pre-filled database (slightly outdated as of Apr 19, 00:00:00 CEST) is available.
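If SQLite is not available, the same filtering idea can be approximated with standard tools. This is a sketch of the concept only, not how the gv-dedup scripts work; `filter_new` is a hypothetical helper, and both files are assumed to hold one `docid=...` entry per line.

```shell
#!/bin/sh
# filter_new KNOWN CANDIDATES: print the lines of CANDIDATES that do not
# appear in KNOWN, with duplicates removed.
# -v: invert match, -x: whole-line match, -F: fixed strings, -f: patterns from file
filter_new() {
    sort -u "$2" | grep -vxF -f "$1"
}
```

For example, `filter_new all_seeds seed_videos_foo > list` would keep only the entries of seed_videos_foo that are not in all_seeds, where all_seeds is a file you have built by concatenating the seed lists.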
Seed List Downloads
- Original Lists: http://199.48.254.90/at/seeds/
Custom searches, suggestions...
- PLEASE add your custom searches and their details to this table!
- Word suggestions:
- Struck out (either already in the table or added to the BOINC client): conference, hack, wiki, linux, creative commons, part
- talk, interview (struck out), documentary (struck out), public domain, subtitles
Years
1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999
Countries
AFGHANISTAN, ÅLAND+ISLANDS, ALBANIA, ALGERIA, AMERICAN+SAMOA, ANDORRA, ANGOLA, ANGUILLA, ANTARCTICA, ANTIGUA+AND+BARBUDA, ARGENTINA, ARMENIA, ARUBA, AUSTRALIA, AUSTRIA, AZERBAIJAN, BAHAMAS, BAHRAIN, BANGLADESH, BARBADOS, BELARUS, BELGIUM, BELIZE, BENIN, BERMUDA, BHUTAN, BOLIVIA,+PLURINATIONAL+STATE+OF, BONAIRE,+SAINT+EUSTATIUS+AND+SABA, BOSNIA+AND+HERZEGOVINA, BOTSWANA, BOUVET+ISLAND, BRAZIL, BRITISH+INDIAN+OCEAN+TERRITORY, BRUNEI+DARUSSALAM, BULGARIA, BURKINA+FASO, BURUNDI, CAMBODIA, CAMEROON, CANADA, CAPE+VERDE, CAYMAN+ISLANDS, CENTRAL+AFRICAN+REPUBLIC, CHAD, CHILE, CHINA, CHRISTMAS+ISLAND, COCOS+(KEELING)+ISLANDS, COLOMBIA, COMOROS, CONGO, CONGO, COOK+ISLANDS, COSTA+RICA, CÔTE+D'IVOIRE, CROATIA, CUBA, CURAÇAO, CYPRUS, CZECH+REPUBLIC, DENMARK, DJIBOUTI, DOMINICA, DOMINICAN+REPUBLIC, ECUADOR, EGYPT, EL+SALVADOR, EQUATORIAL+GUINEA, ERITREA, ESTONIA, ETHIOPIA, FALKLAND+ISLANDS+(MALVINAS), FAROE+ISLANDS, FIJI, FINLAND, FRANCE, FRENCH+GUIANA, FRENCH+POLYNESIA, FRENCH+SOUTHERN+TERRITORIES, GABON, GAMBIA, GEORGIA, GERMANY, GHANA, GIBRALTAR, GREECE, GREENLAND, GRENADA, GUADELOUPE, GUAM, GUATEMALA, GUERNSEY, GUINEA, GUINEA-BISSAU, GUYANA, HAITI, HEARD+ISLAND+AND+MCDONALD+ISLANDS, HOLY+SEE+(VATICAN+CITY+STATE), HONDURAS, HONG+KONG, HUNGARY, ICELAND, INDIA, INDONESIA, IRAN, IRAQ, IRELAND, ISLE+OF+MAN, ISRAEL, ITALY, JAMAICA, JAPAN, JERSEY, JORDAN, KAZAKHSTAN, KENYA, KIRIBATI, KOREA, KUWAIT, KYRGYZSTAN, LAO, LATVIA, LEBANON, LESOTHO, LIBERIA, LIBYAN+ARAB+JAMAHIRIYA, LIECHTENSTEIN, LITHUANIA, LUXEMBOURG, MACAO, MACEDONIA, MADAGASCAR, MALAWI, MALAYSIA, MALDIVES, MALI, MALTA, MARSHALL+ISLANDS, MARTINIQUE, MAURITANIA, MAURITIUS, MAYOTTE, MEXICO, MICRONESIA,+FEDERATED+STATES+OF, MOLDOVA, MONACO, MONGOLIA, MONTENEGRO, MONTSERRAT, MOROCCO, MOZAMBIQUE, MYANMAR, NAMIBIA, NAURU, NEPAL, NETHERLANDS, NEW+CALEDONIA, NEW+ZEALAND, NICARAGUA, NIGER, NIGERIA, NIUE, NORFOLK+ISLAND, NORTHERN+MARIANA+ISLANDS, NORWAY, OMAN, PAKISTAN, PALAU, PALESTINIAN+TERRITORY,+OCCUPIED, PANAMA, 
PAPUA+NEW+GUINEA, PARAGUAY, PERU, PHILIPPINES, PITCAIRN, POLAND, PORTUGAL, PUERTO+RICO, QATAR, RÉUNION, ROMANIA, RUSSIAN+FEDERATION, RWANDA, SAINT+BARTHÉLEMY, SAINT+HELENA,+ASCENSION+AND+TRISTAN+DA+CUNHA, SAINT+KITTS+AND+NEVIS, SAINT+LUCIA, SAINT+MARTIN+(FRENCH+PART), SAINT+PIERRE+AND+MIQUELON, SAINT+VINCENT+AND+THE+GRENADINES, SAMOA, SAN+MARINO, SAO+TOME+AND+PRINCIPE, SAUDI+ARABIA, SENEGAL, SERBIA, SEYCHELLES, SIERRA+LEONE, SINGAPORE, SINT+MAARTEN+(DUTCH+PART), SLOVAKIA, SLOVENIA, SOLOMON+ISLANDS, SOMALIA, SOUTH+AFRICA, SOUTH+GEORGIA+AND+THE+SOUTH+SANDWICH+ISLANDS, SPAIN, SRI+LANKA, SUDAN, SURINAME, SVALBARD+AND+JAN+MAYEN, SWAZILAND, SWEDEN, SWITZERLAND, SYRIA, TAIWAN, TAJIKISTAN, TANZANIA, THAILAND, TIMOR-LESTE, TOGO, TOKELAU, TONGA, TRINIDAD+AND+TOBAGO, TUNISIA, TURKEY, TURKMENISTAN, TURKS+AND+CAICOS+ISLANDS, TUVALU, UGANDA, UKRAINE, UNITED+ARAB+EMIRATES, UNITED+KINGDOM, UNITED+STATES, UNITED+STATES+MINOR+OUTLYING+ISLANDS, URUGUAY, UZBEKISTAN, VANUATU, VENEZUELA, VIET+NAM, VIRGIN+ISLANDS,+BRITISH, VIRGIN+ISLANDS,+U.S., WALLIS+AND+FUTUNA, WESTERN+SAHARA, YEMEN, ZAMBIA, ZIMBABWE
Table
Legend
Uploaded to Archive.org | |
Done/Complete with no errors | |
Done/Complete with errors | |
In progress | |
Partially claimed and in progress | |
Not claimed | |
Unknown status (If you know please edit) |
Seed list | Videos (lines) | Downloader | Complete? (Size?) |
---|---|---|---|
seed_videos_ecology | 890 | crackbab1 | |
seed_videos_meme | 996 | yipdw | 483/999, 6.5 GB |
seed_videos_defcon | 822 | ndurner | done |
seed_videos_ml_documentary_dedupe | 1975 | Lightblb, Papyrus, NomDuClavier | 0 completed chunks of 4 (4 claimed) Lightblb: aa Papyrus: ab NomDuClavier: ac, ad |
seed_videos_ml_lecture_dedupe | 1898 | Lightblb, gribozavr, kn100 | 1 completed chunks of 4 (4 claimed) Lightblb: aa ab gribozavr: ad (complete, 28Gb) |
seed_videos_ml_atheism_dedupe | 698 | norc, Mqrius | 2 complete of 2 norc: ab done (16G), Mqrius: aa done (41GB). |
seed_videos_l_interview_dedupe | 986 | Pentium100, wgfreewill | Done (136GB) |
seed_videos_2_a | 25,761 | swebb | 8.6G (4/17/2011) |
seed_videos_2_k | 19,266 (24,242) | Lightblb, ARc[Clone, crackbab1, Pentium100, Mqrius, arketype, Darkstar | 49 chunks completed of 49 Lightblb: aa ab ac ad ae (Done: 68GB) |
seed_videos_2_l | 22,641 | ndurner, wgfreewill | Split 46 chunks of 500 videos each ndurner: aa done; wgfreewill - 8k videos of 22k, finished april 20th. |
seed_videos_2_m | 24,465 | Jade Falcon | Jade @ 2854/24465 ~77G and counting...(100 concurrent threads!) balrog running in reverse |
seed_videos_2_o | 25,049 | travelinlibrarian | Split 51 chunks of 500 videos each travelinlibrarian 263/1-500 |
seed_videos_2_p | 23,713 | oli, Xentac, db48x, otro, Mqrius, Pentium100 | 48 chunks; oli: paa to pah (complete, 90GB) |
seed_videos_2_q | 17,727 | DoubleJ | |
seed_videos_2_t | 25,301 | businux | Split 51 chunks of 500 videos each 961/25,301 3.79% 33GB |
seed_videos_2_u | 23,528 | barbich, negge | 30 chunks complete of 48 (all 48 claimed) barbich: finished 0 to 29 (100% done, 370G) |
seed_videos_2_w | 21,732 | nickmoorman | Split: 0 chunks completed of 34 (34 claimed) nickmoorman: aa ab ac ad ae af ag ah ai aj |
seed_videos_2_x | 19,733 | ksh | 100% / 78GB. Need to check for errors! |
seed_videos_2_y | 20,965 | negge | Done (216GB) |
seed_videos_2_z | 18,877 | flare | |
seed_videos_a | 1000 | Dr.Sweety | Currently in progress (9%), should be finished in 1-2 days (slow download fixed (no --max-connections), downloading with 1MBps now) |
seed_videos_a_related | This list contains errors | Dr.Sweety | Done, 44G total. ~1097 out of 1284 seem to be DocIDs, rest is text. Some of the DocIDs are broken (see below). What about the errors, will there be an updated list? |
seed_videos_b | 999 | bjwebb | 136/999 |
seed_videos_c | 981 | dnova | Done (40.2GB) |
seed_videos_d | 999 | nomduclav | complete |
seed_videos_e | 999 | nomduclav | complete |
seed_videos_f | 999 | DoubleJ | Done (25GB) |
seed_videos_g | 999 | dnova | Done (30.9GB) one bad id=7751522177274361392 |
seed_videos_h | 999 | ARc[Clone | Done |
seed_videos_i | 999 | DeCarabas | 915/999 |
seed_videos_j | 999 | joethehuman | Done (36.7 GB) |
seed_videos_k | 999 | aggroskater | 803/999 (22.9 GB so far) |
seed_videos_l | 999 | yipdw | Done (58 GB); six bad IDs: -1165561225258043258, 1922748009661857239, 300163955057959602, -7110898118644169273, -7942619273555709195, 8543705644990106023 |
seed_videos_m | 999 | TJ__ | Done (34.7GB) |
seed_videos_n | 999 | ndurner | Done (38 GB) |
seed_videos_o | 999 | com_lab, grelbar (list) | ~38GB (com_lab) already uploaded, should be done, check when i get home (grelbar) |
seed_videos_p | 999 | Pneu | |
seed_videos_q | 996 | nomduclavier | Done (~24Gb) |
seed_videos_r | 996 | Pentium | Done (26.5GB), two bad IDs (-6997682955012239023, -5475489738249304784) |
seed_videos_s | 999 | Pentium | Done (48.9GB), two bad IDs (2103424227166759427, -8954969329395485241) |
seed_videos_t | 999 | joethehuman | Done (56.8 GB) Will check for errors later |
seed_videos_u | 999 | perfinion, 0xDEADBEEF, norc | 0xDEADBEEF 516/1000 24GB. norc 500-1000 done, 24GB. Perfinion done, 44GB. |
seed_videos_v | 999 | masterme1 | 162/999 (~11GB) |
seed_videos_w | 1000 | com_lab | Done (~5.7GB) |
seed_videos_x | 1000 | Dark-Star | Done (~33GB) |
seed_videos_y | 1000 | beremat | 613/1000, (~37.76GB) |
seed_videos_z | 1000 | ksh | Done (27GB) |
"microelectronics", "circuit+design", "microprocessor", "chiptune", "electrical+engineering", "hardware+hacking", "unboxing", "demoscene" | 1267 | dnova | Done (33.9GB) |
"transistor", "tonawanda", "micron", "gallium", "nanometer", "femtosecond", "qubit", "integrated+circuit" | 343 | dnova | Done (7.1GB) |
"singularity" | 174 | db48x | completed, 12.57GB (list created at 8am UTC April 18th 2011) |
"Feynman" | 28 | db48x | completed, 2.20GB (list created at 9am UTC April 18th 2011) |
"police" | 998 | lutostag | (list created at 8am UTC April 18th 2011) |
"eliezer" | 150 (1000) | norc | completed, 6.8G (list created at 8am UTC April 18th 2011) |
"obama" | 1000 | ryan__ | 302/1000 as of 04-19-2011 00:51 EDT (still WIP) (list created at 8am UTC April 18th 2011) |
"cia" | 999 | ndurner | (list created at 8am UTC April 18th 2011) |
"charlie" | 1000 | ryan__ | 120/1000 as of 04-19-2011 00:51 EDT (still WIP) (list created at 8am UTC April 18th 2011) |
IDs from the metafilter thread | 28 | db48x | completed, 6.17GB (list created at 9am UTC April 18th 2011) |
IDs from the reddit thread | 106 | ndurner | (list created at 9am UTC April 18th 2011) |
"rare" | ~3100 | Darkstar | done (~70gb) |
"vintage" | |||
"commercial" | |||
[http://pastebin.com/ZkzNmwEW "douglas adams", "richard dawkins", "charles darwin"] | 513 | NomDuClavier | done (one de-duped list for the 3 terms) |
"australia history" "indigenous aboriginal australia" | 1659 | oli | complete |
"linux" | 1641 | xtat | Just started |
"Bugs Bunny" | 153 | stack,wgfreewill | Done, 2.7GB |
"rodney mullen" | 176 | com_lab | Done, 1.7GB |
"tech talks" | 946 | tahu | in progress, 215 videos, 22GB, 2011-04-19 10:10:03 UTC |
"rick astley" | 17 | db48x | completed, 272.8MB (grabbed 13:00 UTC April 18th 2011) |
"CERN" | 912 | vled | Done |
multiple: "michio kaku", "brian cox", "vernor vinge", "carl sagan", "simon singh" | 176 | nomduclavier | done |
"intel", "amd" | 1547 | leftfield | In progress -leftfield I'd love if someone could take this one. -dnova |
"foia" | 89 | com_lab | Done, 4.1GB |
"creative commons" | 968 | aikidork | ~15% grabbed, ~10% rsync (Tue Apr 19 00:17:53 UTC 2011) |
"TED" | 1000 | vled | w/ problems |
"programming" | 1546 | Xentac | In Progress |
"military", "army", "navy", "air force", "marine corps" | 3108 | tj__ & ksh | In Progress (5%) |
"fiddle", "banjo", "old time music" | 921 | RJL20 | In Progress (~ |
"silent+film" | 1000 | dericed | In Progress |
"industrial" | 1584 | Archive242 | In Progress |
(pretty much) every valid GV link on MetaFilter | 1675 | RJL20 | In Progress (~112730M / Errors: 28) |
http://hubpages.com/hub/The_Best_of_GoogleVideo | 122 | Lightblb | In Progress |
Total | >324,788 | Archive Team | >2.24 TB (Apr. 19, 11:37:13 UTC) |
Broken DocIDs
DocID | Title | list |
---|---|---|
-4313176927520589553 | Ferrari 320 km/h SelMcKenzie | seed_videos_h |
710915802292429594 | Triple H-Best Pedigree Ever | seed_videos_h |
919675995190477263 | 404s | seed_videos_h |
-7433458566080701467 | 404s | seed_videos_2_k |
7476314005948269525 | Tan Tay Du Ky 2 tap 1 phan 2 | seed_videos_2_k |
1310034078921227326 | Presentatie H. van Garderen | seed_videos_h |
-8196546459051063200 | Ethiopia - Ethiopian Talk Show - Dr. Kinfe M Kassaye | seed_videos_m |
6012309833489564165 | I'm gonna miss you forever | seed_videos_m |
1006201176909432045 | Nick "KNUCKLEHEAD" Thomas Learning to Ride A KX 65 | seed_videos_2_k_br |
9013618753646293166 | TooSexii | seed_videos_m |
4607644763702261746 | Most Haunted | seed_videos_m |
910327017359455024 | 404s | seed_videos_2_k_br |
-3505183273546479430 | Top 10 Dunkers in Slam Dunk Contest History by www.todonba.mx.kz | seed_videos_2_k_bu |
515155312540224448 | Prof. Stephen Berk - The Six Day War -- (Only downloads 106MB & manual seek fails) | seed_videos_m |
8233620694803027158 | Tien Kiem Ky Hiep 12a | seed_videos_2_k_bs |
-7026671761719496982 | KV Kortrijk - Virton: kans Vervaeke | seed_videos_2_k_bo |
4744936758707683681 | 404s | seed_videos_2_k_bo |
-4138015874145288917 | Irvine City Council Regular Meeting -- content too short (expected 880173643 bytes and served 871) | seed_videos_2_k_bo |
1751753922865083288 | Lou Dobbs - Bill Gates Testifies to Senate: Part 2 | seed_videos_h |
-1847242336625060764 | 404s | seed_videos_h |
-840074924615574683 | H.O.T. TV EPISODE 7 | seed_videos_h |
5450039563312738134 | | seed_videos_2_o |
2740779495236816438 | | seed_videos_2_o |
8240553330007645065 | 404 | "rick astley" |
2776148046666235174 | 404 | seed_videos_d |
4641809537228296381 | 404 | seed_videos_ |
-4718427583805445551 | 404 | seed_videos_e |
5588388288256218328 | 404 | seed_videos_d |
-1413491257698089214 | Redirects to http://www.khou.com/news/119535529.html | seed_videos_a_related |
1895753595163256038 | Redirects to http://tv.sky.com/martina-my-toughest-opponent | seed_videos_a_related |
-4941694769105315227 | Redirects to http://saratoga-north.ynn.com/content/headlines/524274/governor-visit-s-nation-s-capitol/ | seed_videos_a_related |
-7773409926173229653 | Redirects to http://www.zacks.com/commentary/15486/Value+Stock+Picks-August+24,+2010 | seed_videos_a_related |
7391058183663855490 | Redirects to http://www.ebaumsworld.com/video/watch/81158874/ | seed_videos_a_related |
-4381742157481868130 | Redirects to http://arcade.modemhelp.net/play-3613-Stealing_A_Van.html | seed_videos_a_related |
-1554641026467581780 | Redirects to http://s167.photobucket.com/albums/u158/browneydgurl1212/?action=view&current=meganstealinghashbrown.mp4 | seed_videos_a_related |
2353616771034791644 | Redirects to http://berkshires.ynn.com/content/headlines/523405/glens-falls-woman-accused-of-stealing-a-cat-from-pet-store/ | seed_videos_a_related |
-423230311474262633 | | seed_videos_2_k_at |
-1989250447613793254 | | seed_videos_2_k_at |
-1717591024529167847 | | seed_videos_2_k_au |
-1893715945421217990 | | seed_videos_2_k_aw |
98954701061936704 | | seed_videos_2_k_az |
-857514171338089705 | 871B instead of 9.9MB | seed_videos_2_k_az |
187959010149993716 | | seed_videos_2_k_az |
-3761310108351243571 | | seed_videos_2_k_az |
-5034671686367848138 | Umar Kalim breaks it all content too short | seed_videos_2_k_bh |
3687153060611498767 | Picnic Tables at CiCo content too short | seed_videos_2_k_bj |
1010610140821179600 | | seed_videos_2_k_bf |
1272139449455901373 | | seed_videos_2_k_bi |
2154847967655726343 | | seed_videos_2_k_bj |
2453599535490760149 | | seed_videos_2_k_bl |
2525371248363122880 | | seed_videos_2_k_bf |
-3761310108351243571 | | seed_videos_2_k_bh |
4549148983829940555 | 404s | seed_videos_2_k_bi |
7051814862620931463 | | seed_videos_2_k_bh |
-7353344548521134361 | | seed_videos_2_k_bl |
-817434969229495880 | | seed_videos_2_k_bh |
8335036545639007262 | | seed_videos_2_k_bh |
-8653635503491974486 | | seed_videos_2_k_bh |
-970580050717025709 | | seed_videos_2_k_bg |
-3891054104657374974 | | seed_videos_2_k_bb |
-5401734107040161313 | | seed_videos_2_k_bb |
-6540216432023094075 | | seed_videos_2_k_bb |
-1165561225258043258 | L'universo elegante parte 1 | seed_videos_l |
1922748009661857239 | 4/8 - L'histoire secrète du pétrole - Le temps des premiers craquements | seed_videos_l |
300163955057959602 | 6/8 - L'histoire secrète du pétrole - Le temps des magouilles | seed_videos_l |
-7110898118644169273 | Beppe Grillo e l'inceneritore | seed_videos_l |
-7942619273555709195 | Le monde selon Monsanto - Arte FR | seed_videos_l |
8543705644990106023 | José Bové à Aubagne le 7 Février. | seed_videos_l |
2781869234442161475 | 404 | seed_videos_2_k_ap |
3684594607388096414 | 404 | seed_videos_2_k_ap |
4857427355245773332 | 404 | seed_videos_2_wap |
4818927167565306511 | 404 | seed_videos_2_wap |
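Known-broken DocIDs like the ones listed above can be dropped from a seed file before a download run. The following is a minimal sketch, assuming the seed file holds one DocID per line; the file names and the two "good" IDs are placeholders invented for the example, while the two broken IDs are taken from the table.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Two broken DocIDs copied from the table above.
printf '%s\n' -4313176927520589553 710915802292429594 > "$workdir/broken_ids.txt"

# Hypothetical seed file: the two positive 9-digit IDs are made up.
printf '%s\n' 123456789 -4313176927520589553 987654321 710915802292429594 > "$workdir/seedfile"

# -F: treat each pattern as a fixed string, -x: match whole lines only,
# -f: read the patterns from a file, -v: keep the lines that do NOT match.
grep -v -x -F -f "$workdir/broken_ids.txt" "$workdir/seedfile" > "$workdir/seedfile.clean"
```

Reading the patterns from a file with -f also avoids trouble with negative DocIDs, which would otherwise look like command-line options.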
Tools
Youtube-DL
- http://rg3.github.com/youtube-dl/download.html
- python youtube-dl googlevideourl
DocID scripts
GoogleGargle
Aria2c (APT)
- apt-add-repository ppa:t-tujikawa/ppa
- apt-get update
- apt-get install aria2
Aria2c (RPM)
Fedora and CentOS have RPMs available.
- yum install aria2
Searcher
Bash script to search for terms on Google Video, includes dedupe and ability to restrict search by video length.
predict-download-size
Bash script to read a docid list and find out the total size of the listed videos. Requires youtube-dl, curl.
Troubleshooting
- /usr/bin/aria2c: unrecognized option '--max-connection-per-server=16'
- The aria2 version available in many Linux distributions is out of date and does not recognize this option.
- To fix this, remove the option from the line in the googlegargle script that starts with "ARIAOPTIONS=".
- User 'negge' on IRC reports that the following aria2c command line works on Debian Squeeze with an ext4 filesystem:
- --max-overall-download-limit=1024M --file-allocation=falloc --max-connection-per-server=4 --min-split-size=1M --log-level=notice --remote-time=true
- or for ext3 on Debian Squeeze,
- --max-overall-download-limit=1024M --file-allocation=prealloc --max-connection-per-server=4 --min-split-size=1M --log-level=notice --remote-time=true
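Rather than editing the script by hand, the option can be added only when the installed aria2c knows it. This is a rough sketch, not part of googlegargle: the helper name aria_conn_option is made up, and it assumes that `aria2c --help=#all` prints the full option list on your version.

```shell
# Hypothetical helper (not part of googlegargle): print the per-server
# connection option only when the supplied help text mentions it.
aria_conn_option() {
    if printf '%s\n' "$1" | grep -q -e '--max-connection-per-server'; then
        printf '%s\n' '--max-connection-per-server=4'
    fi
}

# Example wiring, assuming aria2c is on PATH:
#   ARIAOPTIONS="--log-level=notice $(aria_conn_option "$(aria2c --help=#all 2>/dev/null)")"
```

On an old aria2c the helper prints nothing, so the option is simply left out of ARIAOPTIONS.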
FAQ
- Is there any estimate on how many videos are on Google Video?
- Wikipedia said it has 2,500,000 videos; a semi-official Google blog mentioned 2.8 million.
- Is there anything about grabbing metadata for vids? like descriptions?
- Googlegrape does that: it saves the HTML of the video download page.
- What happens to the data after you claim a seed on the wiki and download it?
- We've got 140TB of space allocated to us on archive.org, and can get more
- Is there already some space where it can be uploaded to?
- Not yet, the effort is still young and things take time to organize.
- How can I split seed files if I want to download fewer videos or share the task with others?
- On *nix machines use: split --lines=500 [seedfile] [seedfile] to create a set of files each 500 lines in length in the form seedfileaa seedfileab ... etc.
- How can I check if there are duplicates in a seed file?
- On *nix machines use: sort [infile] | uniq -d to show all duplicates.
- How can I remove duplicates from a seed file before I start to use it?
- On *nix machines use: sort -u [infile] > [outfile] to produce a new seed file with duplicates removed. (Do not use uniq -u here: it drops every line that has a duplicate instead of keeping one copy.)
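The seed-file answers above can be tried end to end on a throwaway file. This is a small sketch, assuming one DocID per line; the IDs and file names are invented for the example. It uses `sort -u`, which keeps one copy of each line, whereas `uniq -u` would discard every line that has a duplicate.

```shell
# Scratch directory with a made-up seed file containing two duplicated IDs.
workdir=$(mktemp -d)
printf '%s\n' 111 222 111 333 222 444 > "$workdir/seedfile"

# Show each DocID that occurs more than once.
dupes=$(sort "$workdir/seedfile" | uniq -d)
printf 'duplicates:\n%s\n' "$dupes"

# Write a de-duplicated copy, keeping one instance of every DocID.
sort -u "$workdir/seedfile" > "$workdir/seedfile.dedup"

# Split into pieces of at most 2 lines each: chunk_aa, chunk_ab, ...
# (-l 2 is the short form of --lines=2)
split -l 2 "$workdir/seedfile.dedup" "$workdir/chunk_"
```

For a real seed file, replace the 2 with the number of videos each volunteer wants to take, for example 500.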
Announcement: Uploaded video content no longer available
On April 29, 2011 videos that have been uploaded to Google Video will no longer be available for playback. We’ve added a Download button to the Video Status page, so you can download videos that you want to save. If you don’t want to download your videos, you don’t need to do anything. (The Download feature will be disabled after May 13, 2011.)
How do I download videos that I've uploaded?
On the Video Status page, click Download Video located on the right side of each of your videos in the "Actions" column. Once a video has been downloaded, an "Already Downloaded" message will appear. If you have many videos on Google Video, you may need to use the paging controls located on the bottom right of the page to access them all. This download option will be available through May 13, 2011.
I've downloaded my videos. Now what do I do with these FLV files?
FLV files are videos that have been encoded in the Flash Video Format. You can upload your videos in FLV format to other video hosting sites like YouTube or Picasa Web Albums. If you would like to play back your videos on your computer and they don’t seem to be working, you might need to install an FLV player. In order to find an FLV player to install, try doing a Google search for [ FLV player ].