Early projects
Some archives are available for download, made by Archive Team or by other volunteers or groups.
See also the Archive Team Collection at the Internet Archive.
This page is also embedded into the Projects page.
Available for download
Title/Download link | Description | Size |
---|---|---|
Geocities - The PATCHED Torrent (IA) | The popular web hosting service founded in 1994. It was closed by Yahoo! in 2009 | 641.4 GB |
URL Shortener Backup Torrent v4 | URLTeam compressed backups of various URL shorteners (README) | 75 GB |
URL Shortener Backup Torrent v3 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 50 GB |
URL Shortener Backup Torrent v2 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 48 GB |
URL Shortener Backup Torrent v1 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 41.1 GB |
Papers from Philosophical Transactions of the Royal Society | This archive contains 18,592 scientific publications totaling 33 GiB, all from Philosophical Transactions of the Royal Society. They should be available to everyone at no cost, but most have previously been made available only at high prices through paywall gatekeepers like JSTOR. | 32.48 GB |
The May 2011 Calufa Twitter Scrape | 90+ million tweets from more than 6 million users | 14.9 GB |
Internet Gopher Archive 2007 (IA) | Archive of gopher sites | 14.8 GB |
Encyclopedia Dramatica January 2010 Mirror | lulz | 11.7 GB |
The TEXTFILES.COM Time Capsule | This collection comprises all the major text-based sets of the TEXTFILES.COM site | 11 GB |
Salon Table Talk | Threads of this talk site | +6.0 GB |
Usenet Archive of UTZOO Tapes | Collection of .TGZ files of very early USENET posted data | 2.0 GB |
Quux.org Gopher Mirror Collection 2006 (IA) | This is a collection of mirrors maintained by gopher.quux.org. These mirrors were taken offline in 2006 due to bandwidth constraints | 1.5 GB |
full-history-linux.git.tar | Git repository of the Linux kernel from 1991 to 2010 (details) | 594 MB |
Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape | Almost 10 million tweets | 425 MB |
The 2010 Reddit Research Project | Dataset on affinities of 60,000+ Reddit users, recorded in 2010 | ~360 MB |
Archive Team Starwars.Yahoo.Com Panic Download | This is a panic download of the starwars.yahoo.com forums and profiles, done before the closure of same by Yahoo on December 15, 2009. This includes as many messages, profiles, and pages related to the site as could be easily brought in. | ~250 MB |
Social Structure of Facebook Networks Facebook Data Scrape | Facebook data scrape related to paper "The Social Structure of Facebook Networks", by Amanda L. Traud, Peter J. Mucha, Mason A. Porter | 197 MB |
Archive Team's Etherpad Time Capsule | This archive contains roughly 6,400 Etherpads, in their final state | 125 MB |
WikiTeam archives | Archives about wikis. See WikiTeam | +100 MB |
Archive Team | ArchiveTeam.org site rip from August 3, 2011 | 75 MB |
Boing Boing Posts Archive (2000-2011) | Two collections of Boing Boing postings provided by the cultural website boingboing.net on its 5th and 11th anniversaries | 42 MB |
Archive Team Quotes Database Backup | Amusing snatches of conversation from IRC and other online gathering places | 5 MB |
Mirror of Revelation Passage Series Website | wget of a small author's website. | ~500 KB |
Archive Team Powerblogs Shutdown Snapshot | This is a 108-blog snapshot of the final month of Powerblogs, before their shutdown | ? |
BBC Closing Panic Archives | Some BBC sites | ? |
stillflying.net | A Firefly fan fiction site that produced PDF scripts for the rest of season 1 and for season 2, imagining what would have been had Firefly not been canceled. | 408.1 MB |
Google Reader | Text for 46M feeds, per-feed statistics, Reader Directory search results | ~8800 GB |
Earbits | Website, ~130,000 MP3s and metadata. | ~650 GB |
SciMag | 38 million scientific articles | ~28 TB |
Google Video | | |
Yahoo! Video | | |
Archived but not available
- The following three sections have been moved here without modification from the old Projects page.
Finished projects
This is a list of completed projects which do not have their own page on this wiki.
See Category:Rescued Sites for projects which do have their own page on this wiki.
- Archive Team founded by Jason Scott [1]
- An archiveteam user on thepiratebay.org was created by bbot.
- Get the password from him or Jason. (Not really a project, per se.)
- (mirror | 4.5 MB archive) The infoAnarchy wiki was archived by Scott.
- infoAnarchy was down for several months in the first part of 2011, but is back up as of May 2011. There is now very little content updating on the site. As of 2014-06-02, infoAnarchy shows a "Revive infoanarchy.org blog & wiki" notice and a request for donations, suggesting it may not have a future. As of the same date, logged-in users see a "database is locked" message.
- If there are future updates to that archive, they may be found at http://sdboyd56.com/archives/
- FIXME - This archive has non-relative links, requiring it to be in /infoanarchy. It needs to be redone or edited to have relative links.
- FIXME - This archive does not include the complete history, which is absolutely essential in this case, as significant editing history exists.
- (mirror) The Cyberpunk Project was archived by Scott
- Note that this wiki does not allow the Russian TLD, so the URL will have to be edited to be visited.
- Most pages have not changed since 2007, and the site has not been updated since April 2010.
- FIXME - This mirror is incomplete, or its links point to the live website.
- (archive) Emulation Zone was archived by Start
- FIXME - vgaa.emulationzone.org-2014-0708.warc.gz was interrupted by a crash and needs to be re-archived.
Other projects
- FanFiction.Net is being pre-emptively archived.
- seanp2k is running somaseek.com and tracking all the song history for all of the internet radio stations on somafm.com since March 2010.
- Ross is interviewing the sites of 2008.
- l.m.orchard is starting work on some self-hosted web apps that will migrate and archive content from other sites (e.g. FriendFeed, Delicious).
- sungo is archiving etherpad.
- Tsp is attempting to archive the stories from fanfiction.net and fictionpress.
- emijrp is a member of WikiTeam and is also downloading albums from Jamendo. You can learn more about his projects on his user page.
- Jean-Claude Bradley and Andrew Lang are archiving the Open Notebook Science projects Reaction Attempts and the ONS Solubility Challenge. This includes the lab notebooks and all associated raw data files.
- Hydriz is currently archiving all available dumps and downloads generated by Wikimedia and uploading them to the Internet Archive (see collection).
Dead projects
- EmuWikiAdmin created EmuWiki, a collection of all existing emulators, emulator documents, and hardware information, gathered in a referenced database. Unfortunately, it shut down in May 2010 due to copyright issues. A 20 GB torrent was released, and its contents are available at https://archive.org/details/EmuWiki_Collection.
Some more
You'll find traces of some other old projects on the historical IRC channel list: IRC/Old.