Projects
Projects status |
---|
Online (332) · Special cases (55) · Endangered (77) · Closing (19) · Offline (448)
|
Rescued Sites (507) · Self-Saved (17) · Partially Rescued Sites (222) · In Progress (48) · Upcoming (10) · Not Saved Yet (416) · On hiatus (11) · Lost Sites (100)
|
Unknown Status (74) |
This page should contain, or directly link to, almost all ArchiveTeam archiving endeavours, categorized.
- Current projects: currently active, upcoming and recently finished grandiose ArchiveTeam projects. (Extract of the next two categories.)
- Warrior projects: projects that utilize(d) ArchiveTeam's distributed archiving system.
- Manual projects: projects that need(ed) much more effort than just pushing a button.
- Small projects: small-scale website archiving projects usually done by a single individual.
- Early projects: the first archiving endeavours from the dawn of ArchiveTeam, in a format nobody apparently is able or dares to touch.
(The box at the top counts only projects that have dedicated wiki pages, so those numbers are incomplete and are far from covering all projects mentioned in the sections below.)
If you know of a website in danger, let us know on IRC. If it's a larger site, please also mention it on the Deathwatch page. After a decision is made on IRC (or if no decision is needed), you are encouraged to help keep things documented and up to date by adding projects or modifying their status
- in the appropriate section(s),
- on the project's dedicated wiki page (if any),
- on Deathwatch and/or on Alive... OR ARE THEY.
The box at the top is generated automatically from projects' dedicated wiki pages, so it shouldn't be edited by hand.
Important: The contents of the sections below are embedded from other pages. Don't edit a section or this page directly; use the "Edit this list" link instead! (That opens the corresponding page for editing, and after editing you'll be forwarded to a page containing only that list. Don't worry, you didn't delete the others.)
Current projects
Currently active team projects you can get involved in.
Last updated: 2024-12-06 (Edit this list)
Archive Team recruiting
- Help us: ☞ Download and run your warrior ☜.
- What's on: online tracker.
- Donate to keep our projects going.
- Want to code for Archive Team? Here's a starting point.
Warrior-based projects
Short-term, urgent projects
Medium-term projects
(none currently)
Long-term projects
- Telegram: Archiving public messages in various newsworthy and/or otherwise notable Telegram channels. IRC Channel #telegrab (on hackint).
- URLTeam: URL shorteners were a fucking awful idea. IRC Channel #urlteam (on hackint).
- URLs: A random collection of stuff. IRC Channel #// (on hackint).
- YouTube: Archiving selected videos. IRC Channel #down-the-tube (on hackint).
Long-term, slower-paced projects
These are projects that are actively running but generally only have small numbers of items available to complete at a time.
- Blogger: Grabbing inactive Blogger blogs since Google began a mass purge of inactive Google accounts on or after 2023-12-01. IRC Channel #frogger (on hackint).
- GitHub: Embraced-uh, I mean, bought by Microsoft. IRC Channel #gitgud (on hackint).
- Imgur: Unregistered users' "old" and "inactive" images will be purged, and all NSFW content is being shown the door on 2023-05-15. IRC Channel #imgone (on hackint).
- MediaFire: Not 'at-risk', but grabbing speculatively to save historic files. IRC Channel #mediaonfire (on hackint).
- Pastebin: Archiving the pastas. IRC Channel #pastalavista (on hackint).
An updated Warrior virtual appliance (v3.2, v4.0) is now available with better support for newer projects that utilize wget-at.
Manual projects
- ArchiveBot: For those with lots of disk space, bandwidth and long-term commitment. IRC Channel #archivebot (on hackint).
- Codearchiver: Dumping and archival of source code repositories and associated version control systems. IRC Channel #codearchiver (on hackint).
- Dead people: When people die, their webpages and/or social media might go "Poof!" due to fees and other knick-knack. IRC Channel #archiveteam (on hackint)
- Wikibot and WikiTeam: Saving wiki dumps (XML) and their external links for the Wayback Machine (WARC), as well as exporting MediaWiki databases. Permanent effort, everyone can help (you choose the size of your downloads). IRC Channels #wikibot (on hackint) #wikiteam (on hackint).
Upcoming & proposed projects
- Chrome Web Store: Google has announced a timeline of policy changes that will lead to content being removed between 2021-12-01 and 2025. IRC Channel #chromeweblore (on hackint).
- Photobucket: Finally following through on over a year of email threats that free accounts are going to be mass deactivated if they don't pay up. IRC Channel #photosucket (on hackint).
- Abandoned iOS App Store & Google Play apps: Both Apple and Google are slimming down on abandoned apps, with an estimated ~1.5M of them at risk. IRC Channel #appocalypse (on hackint).
- Twitter: General instability; deleting inactive accounts sometime (previously: 2019-12-11). IRC Channel #archiveteam-bs (on hackint).
- VKontakte: A Russian equivalent of Facebook that risks tumbling down under the weight of sanctions imposed as a result of the government's invasion of Ukraine. IRC Channel #lostkontakt (on hackint).
- JamiiForums: the Tanzanian government would like this gone. IRC Channel #archiveteam-bs (on hackint).
- LiveJournal: Very old, widely regarded as in decline, and has a lot of important stuff buried in it. IRC Channel #archiveteam-bs (on hackint).
- The Pirate Bay: Recently came back up, grabbing an archive for sanity's sake. IRC Channel #archiveteam-bs (on hackint).
- Valhalla: Where to store what even the Internet Archive doesn't have space for? IRC Channel #huntinggrounds (on hackint).
- Giphy: Bought by Shutterstock (previously by Facebook, to be "integrated" (assimilated) into Instagram). https://news.knowyourmeme.com/news/facebook-to-buy-giphy
Recently finished projects
- ASK.fm: A question/answer hosting site shutting down on 2024-12-01. IRC Channel #dontaskfm (on hackint)
- マンガ図書館Z (Manga Library Z): A site that distributed old and out-of-print manga is shutting down on 2024-11-26. IRC Channel #mangoes (on hackint)
- Garnek.pl: A Polish photo sharing and blogging site is shutting down on 2024-11-25. IRC Channel #garnie (on hackint)
- Veoh: A video hosting site shutting down on 2024-11-11. IRC Channel #veohnah (on hackint)
- Nhentai: A hoster of hentai (pornographic manga) has repeatedly been in legal trouble and is being sued in the US. IRC Channel #177013 (on hackint)
On Hiatus
- Angelfire: Angelfire is a web hosting service that contains big chunks of early WWW history and has no proper backup. IRC Channel #angelonfire (on hackint).
- Audit 2014: It's time to verify our shit. IRC Channel #auditteam (on hackint).
- Flickr: SmugMug decided to kill it after finding Yahoo!'s plans to do so before they were bought by Verizon. IRC Channel #flickrfckr (on hackint).
- FTP: Help us find and download all FTP sites! IRC Channel #effteepee (on hackint).
- Google Drive: Same as MediaFire. IRC Channel #googlecrash (on hackint). Currently on hiatus.
- Google Groups: "Gone within a year" (SketchCow, 2016-06-07).
- Google News Archive: Let's store all newspapers at Google, WCGW? IRC Channel #papersplease (on hackint).
- INTERNETARCHIVE.BAK: Grab a slice of the big cake of The Archive! IRC Channel #internetarchive.bak (on hackint).
- ISP Hosting: Finding ISP web hosting services before the Grim Reaper finds them. IRC Channel #webroasting (on hackint).
- Miraheze: Was going to shut down sometime between 2023-09-01 and 2023-10-31. Rescued by new volunteers!
- Project Newsletter: Archiving e-newsletters, currently in development. IRC Channel #projectnewsletter (on hackint).
- Quizlet: Flashcards and other learning tools. IRC Channel #quizletusin (on hackint).
- Tinkercad: Autodesk announced its intent to put designs from inactive OAuth accounts back into minds around 2021-05-24. IRC Channel #tinkerhad (on hackint).
- Tumblr: Yahoo! considered killing it, now Yahoo has been acquired and Verizon declared war on NSFW blogs. Tumblr has since been sold to Automattic. IRC Channel #tumbledown (on hackint).
- Reddit: Banning communities that generate bad PR for Reddit Inc. Restricted access to APIs and data on 2023-06-19. IRC Channel #shreddit (on hackint).
ArchiveTeam uses the hackint IRC network – ircs://irc.hackint.org:6697 (TLS required) – webchat: https://chat.hackint.org/ – More info
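As a rough illustration of what the connection details above mean in practice, here is a minimal Python sketch that parses the `ircs://` URL and opens a TLS connection using only the standard library. The helper names (`parse_irc_url`, `registration_lines`, `connect`) are made up for this example and are not part of any ArchiveTeam tooling. It only sends the initial `NICK`/`USER` registration; a real client would also answer `PING` and wait for the server's welcome reply before sending `JOIN`, so in most cases an ordinary IRC client or the webchat is the better choice.

```python
import socket
import ssl
from urllib.parse import urlsplit

# Connection URL as given on this page.
HACKINT_URL = "ircs://irc.hackint.org:6697"


def parse_irc_url(url):
    """Split an irc:// or ircs:// URL into (host, port, use_tls)."""
    parts = urlsplit(url)
    if parts.scheme not in ("irc", "ircs"):
        raise ValueError(f"not an IRC URL: {url!r}")
    use_tls = parts.scheme == "ircs"
    # 6697 and 6667 are the conventional TLS and plaintext IRC ports.
    port = parts.port or (6697 if use_tls else 6667)
    return parts.hostname, port, use_tls


def registration_lines(nick):
    """Build the raw IRC lines that register a nickname (hypothetical helper)."""
    return [
        f"NICK {nick}\r\n",
        f"USER {nick} 0 * :{nick}\r\n",
    ]


def connect(url, nick, timeout=30):
    """Open the socket and send the registration lines (needs network access)."""
    host, port, use_tls = parse_irc_url(url)
    sock = socket.create_connection((host, port), timeout=timeout)
    if use_tls:
        # The default context verifies the server certificate and hostname.
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)
    for line in registration_lines(nick):
        sock.sendall(line.encode("utf-8"))
    return sock
```

Calling `connect(HACKINT_URL, "some_nick")` returns the connected socket; everything after registration (reading replies, joining a channel such as `#archiveteam-bs`) is left out of this sketch.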
Warrior projects
ArchiveTeam's past, current and future Warrior projects with details, in a table form.
Last updated: 2021-10-31 (Edit this list)
Project | IRC channel | Status | Began | Finished | Result | Archive Location |
---|---|---|---|---|---|---|
Fotoalbum (script-only) | #lookatthisfotograph (on hackint) | Active | ||||
Google Sites (script-only) | #nearlylostmygoogles (on hackint) | Active | ||||
Github (script-only) | #gitgud (on hackint) | Active | ||||
Bitbucket (Mercurial repositories) | #kickthebucket (on hackint) | In Development | ||||
Reddit | #shreddit (on hackint) | In Development | ||||
Pastebin | #pastalavista (on hackint) | Active | May 30, 2020 | |||
Google+ | #googleminus (on EFnet) (abandoned) | Downloads Finished | March 5, 2019 | April 2, 2019 | Qualified Success | archive |
Flickr | #flickrfckr (on hackint) | Active | January 9, 2019 | archive | ||
Tumblr | #tumbledown (on hackint) | Archive Posted | December 8, 2018 | December 17, 2018 | Qualified Success | archive |
NUjij | Archive Posted | August 25, 2016 | Success | archive | ||
Yahoo! Answers | #noanswers (on hackint) | Archive Posted | August 21, 2016 | archive | ||
Orkut | #throatkut (on EFnet) (abandoned) | Archive Posted | August 6, 2016 | archive | ||
Portalgraphics.net | Archive Posted | July 23, 2016 | July 27, 2016 | Success | archive | |
DNS History | #greatlookup (on EFnet) (abandoned) | Aborted | July 4, 2016 | August 22, 2016 | Failure | |
THOMAS | Archive Posted | July 3, 2016 | July 5, 2016 | Qualified Success | archive | |
Coursera | #cursera (on EFnet) (abandoned) | Archive Posted | June 26, 2016 | June 30, 2016 | Success | archive |
Olympe | Downloads Finished | June 5, 2016 | June 6, 2016 | Qualified Success | ||
ZippCast | Archive Posted | June 3, 2016 | June 10, 2016 | Qualified Success | archive | |
Arto | Archive Posted | May 8, 2016 | June 29, 2016 | Success | archive | |
Bayimg | Archive Posted | April 28, 2016 | archive | |||
PDF 2016 | #pdflush (on EFnet) (abandoned) | Active | April 8, 2016 | archive | ||
Virgin Media | #virginsacrifice (on EFnet) (abandoned) | Downloads Finished | March 30, 2016 | April 28, 2016 | Qualified Success | |
LiveJournal | #recordedjournal (on EFnet) (abandoned) | Active | March 12, 2016 | |||
GameTrailers | #unhitchedtrailer (on EFnet) (abandoned) | Archive Posted | February 9, 2016 | February 18, 2016 | Qualified Success | archive |
Fotolog.com | #fotologout (on EFnet) (abandoned) | Active | February 8, 2016 | archive | ||
Friends Reunited | #friendsununited (on EFnet) (abandoned) | Archive Posted | February 5, 2016 | February 26, 2016 | Qualified Success | archive |
myVIP (script-only) | #byevip (on EFnet) (abandoned) | Archive Posted | January 24, 2016 | August 30, 2016 | Success | archive |
MusicBrainz (external links) | Archive Posted | January 8, 2016 | January 9, 2016 | Success | archive | |
OldFriends | Archive Posted | December 29, 2015 | January 20, 2016 | Success | archive | |
Google Code | #googlecodeblue (on EFnet) (abandoned) | Active | December 18, 2015 | archive | ||
Docstoc | #docstop (on EFnet) (abandoned) | Archive Posted | November 24, 2015 | December 1, 2015 | Qualified Success | archive |
FTP (script-only) | #effteepee (on hackint) | Active | November 30, 2015 | archive | ||
aDrive | #bdrive (on EFnet) (abandoned) | Archive Posted | November 15, 2015 | November 16, 2015 | Qualified Success | archive |
Telenor personal websites | #nohome (on EFnet) (abandoned) | Archive Posted | October 29, 2015 | October 31, 2015 | Qualified Success | archive |
WikiTeam (WARC format) | #wikiteam (on hackint) | Active | October 26, 2015 | archive | ||
Yuku | Active | October 25, 2015 | archive | |||
GameFront | #grillfront (on EFnet) (abandoned) | Archive Posted | October 20, 2015 | April 29, 2016 | Success | archive |
RuTracker | #rutrasher (on EFnet) (abandoned) | Archive Posted | October 5, 2015 | May 31, 2016 | Success | archive |
Thingiverse | Archive Posted | September 23, 2015 | January 24, 2016 | Success | archive | |
Skillfeed | #skillessfeed (on EFnet) (abandoned) | Archive Posted | September 14, 2015 | September 20, 2015 | Success | archive |
Blingee | #tragedee (on EFnet) (abandoned) | Archive Posted | August 16, 2015 | October 8, 2015 | Qualified Success | archive |
Google Moderator | #moderhater (on EFnet) (abandoned) | Archive Posted | July 21, 2015 | July 22, 2015 | Success | archive |
Toshiba Support | #toshibah (on EFnet) (abandoned) | Archive Posted | June 24, 2015 | July 5, 2015 | Success | archive |
Xfire Social Website | #xfired (on EFnet) (abandoned) | Archive Posted | June 19, 2015 | July 9, 2015 | Qualified Success | archive |
Zoocasa | #zoohouse (on EFnet) (abandoned) | Archive Posted | June 18, 2015 | June 25, 2015 | Success | archive |
SourceForge | #coldstorage (on EFnet) (abandoned) | Aborted | June 17, 2015 | June 19, 2015 | ||
Pomf.se | #pomfret (on EFnet) (abandoned) | Archive Posted | June 9, 2015 | June 17, 2015 | Success | archive |
Google Baraza | #bonanza (on EFnet) (abandoned) | Archive Posted | April 28, 2015 | May 7, 2015 | Success | archive |
Google Helpouts | #helpus (on EFnet) (abandoned) | Archive Posted | April 16, 2015 | April 21, 2015 | Success | archive |
LayerVault | #layersalt (on EFnet) (abandoned) | Archive Posted | April 6, 2015 | April 11, 2015 | Success | archive |
FriendFeed | #humancentifeed (on EFnet) (abandoned) | Archive Posted | April 2, 2015 | April 9, 2015 | Qualified Success | archive |
Last.fm | #lastchance.fm (on EFnet) (abandoned) | Archive Posted | March 30, 2015 | August 28, 2015 | Qualified Success | archive |
FurAffinity | #iceking (on EFnet) (abandoned) | Archive Posted | March 26, 2015 | June 15, 2015 | Success | archive |
Madden GIFERATOR | #jiferator (on EFnet) (abandoned) | Archive Posted | March 21, 2015 | March 23, 2015 | Success | archive |
RapidShare | #rapidscare (on EFnet) (abandoned) | Archive Posted | March 20, 2015 | March 29, 2015 | Qualified Success | archive |
Trovebox | #treasuretrove (on EFnet) (abandoned) | Archive Posted | March 14, 2015 | June 27, 2015 | Success | archive |
Google Business Sitebuilder | #sitebreaker (on EFnet) (abandoned) | Archive Posted | March 9, 2015 | March 10, 2015 | Success | archive |
Blogger | #frogger (on EFnet) (abandoned) | Aborted | February 25, 2015 | May 6, 2015 | ||
TestFlight | #crashed (on EFnet) (abandoned) | Archive Posted | February 13, 2015 | February 25, 2015 | Success | archive |
Cobook | #cookbook (on EFnet) (abandoned) | Archive Posted | February 9, 2015 | February 11, 2015 | Success | archive |
Ovi Store | #downlovi (on EFnet) (abandoned) | Archive Posted | February 3, 2015 | February 15, 2015 | Qualified Success | archive |
Inkblazers | #inkerasers (on EFnet) (abandoned) | Archive Posted | January 18, 2015 | January 31, 2015 | Success | archive |
Brace.io | #braceyourself (on EFnet) (abandoned) | Archive Posted | January 12, 2015 | January 18, 2015 | Success | archive |
Vstreamers | #destreamers (on EFnet) (abandoned) | Archive Posted | January 6, 2015 | January 10, 2015 | Success | archive |
Nokia Memories | #backtorubber (on EFnet) (abandoned) | Archive Posted | December 30, 2014 | December 30, 2014 | Success | archive |
Microsoft Clip Art | #clipfart (on EFnet) (abandoned) | Archive Posted | December 23, 2014 | December 29, 2014 | Success | archive |
Roon | #rooined (on EFnet) (abandoned) | Archive Posted | December 20, 2014 | December 21, 2014 | Success | archive |
ZipList | #zipyourlips (on EFnet) (abandoned) | Archive Posted | December 2, 2014 | December 4, 2014 | Success | archive |
Viddy | #viddiot (on EFnet) (abandoned) | Archive Posted | December 2, 2014 | December 15, 2014 | Success | archive |
Halo (Halo 2 & 3 stuff) | #yolohalo (on EFnet) (abandoned) | Archive Posted | November 6, 2014 | June 23, 2015 | Success | archive |
GameMaker Sandbox | Archive Posted | October 15, 2014 | October 19, 2014 | Success | archive | |
Qwiki | #quickie (on EFnet) (abandoned) | Archive Posted | September 28, 2014 | November 1, 2014 | Qualified Success | archive |
Quizilla | #fizzilla (on EFnet) (abandoned) | Archive Posted | September 4, 2014 | October 1, 2014 | Success | archive |
Ancestry.com | #ancienthistory (on EFnet) (abandoned) | Archive Posted | September 19, 2014 | November 5, 2014 | Success | archive |
TwitPic | #quitpic (on EFnet) (abandoned) | Archive Posted | September 4, 2014 | January 2, 2015 | Qualified Success | archive |
Verizon Personal Web Space | #verizoff (on EFnet) (abandoned) | Archive Posted | September 2, 2014 | October 1, 2014 | Qualified Success | archive |
Swipnet | #swiped (on EFnet) (abandoned) | Archive Posted | August 19, 2014 | September 1, 2014 | Success | archive |
Canv.as | #canvas (on EFnet) (abandoned) | Archive Posted | August 11, 2014 | August 12, 2014 | Success | archive |
Twitch.tv | #burnthetwitch (on EFnet) (abandoned) | Archive Posted | August 9, 2014 | August 24, 2014 | Qualified Success | archive |
Fotopedia | #fotofinished (on EFnet) (abandoned) | Archive Posted | August 5, 2014 | August 7, 2014 | Success | archive |
Yahoo! Voices | #shutup (on EFnet) (abandoned) | Archive Posted | July 28, 2014 | July 31, 2014 | Success | archive |
Justin.tv | #justouttv (on EFnet) (abandoned) | Archive Posted | June 5, 2014 | June 15, 2014 | Success | archive |
Viddler | #fiddler (on EFnet) (abandoned) | Cancelled | February 21, 2014 | February 27, 2014 | Qualified Success | archive |
Bebo | #cockandballs (on EFnet) (abandoned) | Hiatus | February 18, 2014 | archive | ||
My Opera | #fatlady (on EFnet) (abandoned) | Archive Posted | February 16, 2014 | March 3, 2014 | Success | archive |
Dogster | #rawdogster (on EFnet) (abandoned) | Archive Posted | February 7, 2014 | February 16, 2014 | Success | archive |
Wretch & Yahoo! Blog | #shipwretched (on EFnet) (abandoned) | Archive Posted | December 17, 2013 | January 9, 2014 | Qualified Success | archives: Wretch, Yahoo Blog |
Hyves | #angerthehyve (on EFnet) (abandoned) | Archive Posted | November 10, 2013 | December 2, 2013 | Success | archive |
Blip.tv | #blooper.tv (on EFnet) (abandoned) | Archive Posted | October 11, 2013 | August 27, 2015 | Qualified Success | archive 1 archive 2 |
Zapd | #crapd (on EFnet) (abandoned) | Archive Posted | October 1, 2013 | October 8, 2013 | Success | archive |
Xanga | #jenga (on EFnet) (abandoned) | Downloads Paused | June 21, 2013 | August 31, 2013 | archive | |
Streetfiles.org | #streetsoffire (on EFnet) (abandoned) | Archive Posted | April 28, 2013 | April 30, 2013 | Qualified Success | archive |
Yahoo! Upcoming | #outgong (on EFnet) (abandoned) | Archive Posted | April 20, 2013 | April 25, 2013 | archive | |
Formspring | #firespring (on EFnet) (abandoned) | Archive Posted | March 24, 2013 | September 19, 2013 | Success | archive |
Yahoo! Messages | #BurnTheMessenger (on EFnet) (abandoned) | Archive Posted | March 20, 2013 | March 31, 2013 | archive | |
Storylane | Archive Posted | March 8, 2013 | March 15, 2013 | archive | ||
Posterous | #preposterous (on EFnet) (abandoned) | Archive Posted | February 23, 2013 | June 29, 2013 | archive | |
Xanga | #jenga (on EFnet) (abandoned) | Downloads Paused | January 22, 2013 | February 16, 2013 | archive, user lookup, user list | |
Punchfork | Archive Posted | January 11, 2013 | March 6, 2013 | archive, user lookup | ||
URLTeam | #urlteam (on hackint) | Active | all releases | |||
weblog.nl | Archive Posted | January 19, 2013 | February 2, 2013 | archive, user lookup | ||
Yahoo! Blog | #yahooblah (on EFnet) (abandoned) | Archive Posted | January 8, 2013 | January 19, 2013 | archive | |
GitHub Downloads | Archive Posted | December 13, 2012 | December 17, 2012 | Success | archive, index | |
Daily Booth | Archive Posted | November 19, 2012 | December 29, 2012 | archive, user lookup | ||
BT Internet | Archive Posted | October 10, 2012 | November 2, 2012 | Success | archive | |
Webshots | #webshots (on EFnet) (abandoned) | Archive Posted | October 4, 2012 | November 18, 2012 | archive, user lookup | |
City of Heroes | Archive Posted | September 3, 2012 | December 1, 2012 | Success | archive | |
Cinch.FM | Archive Posted | August 20, 2012 | August 22, 2012 | Success | archive | |
Tumblr (test project) | Archive Posted | August 9, 2012 | August 19, 2012 | archive (tar), archive (warc) | ||
Picplz | Archive Posted | June 3, 2012 | June 15, 2012 | archive, user lookup, index | ||
Tabblo | Archive Posted | May 23, 2012 | May 26, 2012 | Success | archive, user lookup | |
FortuneCity | #fortuneshitty (on EFnet) (abandoned) | Archive Posted | April 4, 2012 | April 11, 2012 | Qualified Success | archive, user lookup |
MobileMe | Archive Posted | April 3, 2012 | Aug 8, 2012 | Success | archive, user lookup, index |
Status
- In Development: a future project
- Active: start up a Warrior and join the fun; this one is in progress right now
- Active (paused): not running currently but stay tuned!
- On Hold: project suspended indefinitely but not given up
- Downloads Finished: we've finished downloading the data
- Archived: the collected data has been properly archived
- Archive Posted: the archive is available for download
Result
- Success: downloaded all of the data and posted the archive publicly
- Qualified Success: either we couldn't get all of the data, or the archive can't be made public
- Failure: the site closed before we could download anything
Manual projects
Difficult, discussion-intensive, human-resource-intensive and audit projects.
Last updated: 2021-10-31 (Edit this list)
Project | IRC channel | Description | Status | Started | Finished | Archives/Results |
---|---|---|---|---|---|---|
Yahoogroups-joiner | #yahoosucks (on hackint) | Filling out captchas to archive Yahoo Groups | Active | 2019-10-19 | leaderboard | |
Project Newsletter | #projectnewsletter (on hackint) | Archiving all the email newsletters | Active | 2015-03-27 | ||
Woohoo | #woohoo (on EFnet) (abandoned) | Doing a census of all of Yahoo!'s products | Active | 2015-03-13 | result | |
Froogle | #froogle (on EFnet) (abandoned) | Doing a census of all of Google's products | Active | 2015-03-13 | result | |
INTERNETARCHIVE.BAK | #internetarchive.bak (on hackint) | Backing up the Internet Archive | Active | 2015-03-02 | stats | |
ISP Hosting | #webroasting (on hackint) | Finding ISP web hosting services before the Grim Reaper finds them. | Active | 2014-12-30 | see there | |
Project Valhalla | #huntinggrounds (on hackint) | Discussing where and how to store archives that are too big for the Internet Archive at the moment. | Active | 2014-09-18 | see there | |
Audit2014 | #auditteam (on hackint) | We've uploaded a bunch of stuff. Let's go through the list and make sure it's categorized, has decent metadata, etc. | Active | 2014-07-16 | list, the content | |
ArchiveBot | #archivebot (on hackint) | IRC bot designed to automate the archival of smaller websites | Active | 2013-09-06 | archives, search | |
AOL | #aohell (on hackint) | Archiving the original AOL, not AOL's current website | Active | 2013-01-28 | [1] | |
WikiTeam | #wikiteam (on hackint) | Exporting Mediawiki databases in XML dumps | Active | 2011-04-05 | [2] | |
FTP | #effteepee (on hackint) | Downloading all the FTP sites | Active | e.g. [3] |
Small projects
List of smaller website rescuing projects, usually done by single individuals.
Last updated: 2016-02-12 (Edit this list)
See also what's been crawled by ArchiveBot: browse here.
For Hungarian websites, see bzc6p's userpage.
You should also try searching on http://archive.org including keyword archiveteam, or for browsing, directly in the Wayback Machine.
Website | Site status | Closure date | Archiving status | Archived by | Started | Finished | Archives |
---|---|---|---|---|---|---|---|
Wikispot | Closed | 2015-07-27 | Partially saved | bzc6p | 2015-06-30 | 2015-07-31 | [4] |
Pastebin | Online | In progress... | joepie91 | 2014-09-09 | |||
TechNet | Closing | 2014-03-28 | Partially saved | Arkiver, Mithrandir, Darkstar | |||
Widgetbox | Closed | 2014-09-30 | Saved | Arkiver | 2013-12-19 | ||
Quick.io | Closed | 2013-12-31 | Saved | Arkiver | 2013-12-13 | 2013-12-13 | |
winamp.com | Saved | 2013-11 | 2013-11 | [5] |
Early projects
List of ArchiveTeam's early endeavours, kept for historical interest and no longer edited.
Last updated: 2023-12-10 (Edit this list)
Some archives are available for download, made by Archive Team or by other volunteers or groups.
See also the Archive Team Collection at the Internet Archive.
Available for download
Title/Download link | Description | Size |
---|---|---|
Geocities - The PATCHED Torrent (IA) | The popular web hosting service founded in 1994. It was closed by Yahoo! in 2009 | 641.4 GB |
URL Shortener Backup Torrent v4 | URLTeam compressed backups of various URL shorteners (README) | 75 GB |
URL Shortener Backup Torrent v3 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 50 GB |
URL Shortener Backup Torrent v2 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 48 GB |
URL Shortener Backup Torrent v1 outdated, use v4 | URLTeam compressed backups of various URL shorteners (README) | 41.1 GB |
Papers from Philosophical Transactions of the Royal Society | This archive contains 18,592 scientific publications totaling 33GiB, all from Philosophical Transactions of the Royal Society and which should be available to everyone at no cost, but most have previously only been made available at high prices through paywall gatekeepers like JSTOR. | 32.48 GB |
The May 2011 Calufa Twitter Scrape | 90+ million tweets from more than 6 million users | 14.9 GB |
Internet Gopher Archive 2007 (IA) | Archive of gopher sites | 14.8 GB |
Encyclopedia Dramatica January 2010 Mirror | lulz | 11.7 GB |
The TEXTFILES.COM Time Capsule | This collection comprises all the major text-based sets of the TEXTFILES.COM site | 11 GB |
Salon Table Talk | Threads of this talk site | +6.0 GB |
Usenet Archive of UTZOO Tapes | Collection of .TGZ files of very early USENET posted data | 2.0 GB |
Quux.org Gopher Mirror Collection 2006 (IA) | This is a collection of mirrors maintained by gopher.quux.org. These mirrors were taken offline in 2006 due to bandwidth constraints | 1.5 GB |
full-history-linux.git.tar | GIT repository of Linux Kernel from 1991 to 2010 (details) | 594 MB |
Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape | Almost 10 million tweets | 425 MB |
The 2010 Reddit Research Project | Dataset on affinities of 60,000+ Reddit users, recorded in 2010 | ~360 MB |
Archive Team Starwars.Yahoo.Com Panic Download | This is a panic download of the starwars.yahoo.com forums and profiles, done before the closure of same by Yahoo on December 15, 2009. This includes as many messages, profiles, and pages related to the site as could be easily brought in. | ~250 MB |
Social Structure of Facebook Networks Facebook Data Scrape | Facebook data scrape related to paper "The Social Structure of Facebook Networks", by Amanda L. Traud, Peter J. Mucha, Mason A. Porter | 197 MB |
Archive Team's Etherpad Time Capsule | This archive contains roughly 6,400 Etherpads, in their final state | 125 MB |
WikiTeam archives | Archives about wikis. See WikiTeam | +100 MB |
Archive Team | Archive Team.org Site Rip from August 03, 2011 | 75 MB |
Boing Boing Posts Archive (2000-2011) | Two collections of Boing Boing postings provided by the cultural website boingboing.net on its 5th and 11th anniversaries | 42 MB |
Archive Team Quotes Database Backup | Amusing snatches of conversation from IRC and other online gathering places | 5 MB |
Mirror of Revelation Passage Series Website | wget of a small author's website. | ~500 KB |
Archive Team Powerblogs Shutdown Snapshot | This is a 108-blog snapshot of the final month of Powerblogs, before their shutdown | ? |
BBC Closing Panic Archives | Some BBC sites | ? |
stillflying.net | A Firefly fan fiction site that made PDF scripts for the rest of season 1 and for season 2, showing what would have been if Firefly hadn't been canceled. | 408.1 MB |
Google Reader | Text for 46M feeds, per-feed statistics, Reader Directory search results | ~8800 GB |
Earbits | Website, ~130,000 MP3s and metadata. | ~650 GB |
SciMag | 38 million scientific articles | ~28 TB |
Google Video | ||
Yahoo! Video |
Archived but not available
- The following three sections have been moved here without modification from the old Projects page.
Finished projects
This is a list of completed projects which do not have their own page on this wiki.
See Category:Rescued Sites for projects which do have their own page on this wiki.
- Archive Team founded by Jason Scott [6]
- an archiveteam thepiratebay.org user created by bbot
- Get the password from him or Jason. (Not really a project, per se.)
- (mirror | 4.5MB archive) The infoAnarchy wiki was archived by Scott.
- infoAnarchy was down for several months in the first part of 2011, but is back up as of May 2011. There is now very little content updating on the site. As of 2014-06-02, infoAnarchy has a "Revive infoanarchy.org blog & wiki" notice and a request for donations, suggesting it may not have a future. As of 2014-06-02, a "database is locked" message will be given to logged-in users.
- If there are future updates to that archive, they may be found at http://sdboyd56.com/archives/
- FIXME - This archive has non-relative links, requiring it to be in /infoanarchy. It needs to be redone or edited to have relative links.
- FIXME - This archive does not include the complete history, which is absolutely essential in this case, as significant editing history exists.
- (mirror) The Cyberpunk Project was archived by Scott
- Note that this wiki does not allow the Russian TLD, so the URL will have to be edited to be visited.
- Most pages haven't been changed since 2007. It hasn't been updated or changed since April 2010.
- FIXME - this mirror is incomplete, or its links are pointing to the live website.
- (archive) Emulation Zone was archived by Start
- FIXME - vgaa.emulationzone.org-2014-0708.warc.gz got interrupted by a crash and needs to be re-archived
Other projects
- FanFiction.Net is being pre-emptively archived.
- seanp2k is running somaseek.com and tracking all the song history for all of the internet radio stations on somafm.com since March 2010.
- Ross is interviewing the sites of 2008.
- l.m.orchard is starting work on some self-hosted web apps that will migrate and archive from other sites (e.g. FriendFeed, Delicious).
- sungo is archiving etherpad.
- Tsp is attempting to archive the stories from fanfiction.net and fictionpress.
- emijrp is a member of WikiTeam and is also downloading albums from Jamendo. You can learn more about his projects on his userpage.
- Jean-Claude Bradley and Andrew Lang are archiving the Open Notebook Science projects Reaction Attempts and the ONS Solubility Challenge. This includes the lab notebooks and all associated raw data files.
- Hydriz is currently archiving all available dumps and downloads generated by Wikimedia and uploading them to the Internet Archive (see collection).
Dead projects
- EmuWikiAdmin created EmuWiki, a collection of all emulators, emulator documents, and hardware information that exists, regrouped in a referenced database. Unfortunately, it shut down in May 2010 due to copyright issues. A 20GB torrent was released, and its contents are available at https://archive.org/details/EmuWiki_Collection.
Some more
You'll find traces of some other old projects on the historical IRC channel list: IRC/Old.