Difference between revisions of "Main Page/Current Projects"

From Archiveteam
(390 intermediate revisions by 42 users not shown)
Line 1: Line 1:
__NOTOC__
== Archive Team recruiting ==
* [[Dev|Want to code for Archive Team? Here's a starting point.]]
* Help us: '''[[ArchiveTeam_Warrior|☞ Download and run your warrior ☜]]'''. What's on: [https://tracker.archiveteam.org/ online tracker].
* Help us: '''[[warrior|☞ Download and run your warrior ☜]]'''.<br>
* '''[[Donate]]''' to keep our projects going.
* What's on: [http://tracker.archiveteam.org/ online tracker].<br>
* Anything shutting down? Put it on the '''[[Deathwatch]]''' or tell us on '''[[IRC]]'''!
<!--Combined project activity graphs [http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/project_items.html here].-->
* Want to code for Archive Team? [[Dev|Here's a starting point.]]


== Warrior based projects ==
== Warrior-based projects ==
{{:CurrentWarriorProject}}


<!-- Urgent projects -->
=== Short-term, urgent projects ===
<!-- Long-term projects -->
<!-- Projects with strong deadline, deadline is in the future -->
* [[NewsGrabber]]: Saving all news articles. Help with server power or by finding more news sites. '''IRC Channel {{IRC|newsgrabber}}'''.
<!-- sorted by deadline (soonest on top) -->
* [[Tistory]]: Will delete inactive blogs on {{datetime|2025-09-22}} '''IRC Channel {{IRC|tatteredstory}}'''
* [[Typepad]]: A blogging service that will cease to exist by the end of September 2025. '''IRC Channel {{IRC|typebad}}'''
=== Medium-term projects ===
<!-- Projects for which the deadline has passed, deadline is unclear, but there is a moment they are "finished" -->
<!-- sorted alphabetically -->
* [[Meta Ad Library]]: Database for advertisements for Facebook and other products by Meta. '''IRC Channel {{IRC|fads}}'''
* [[Peing]]: A Japanese question/answer service that was slated to shut down on {{datetime|2025-08-29}}. '''IRC Channel {{IRC|peingpong}}'''
* [[US Government]]: Archiving the US government. '''IRC Channel {{IRC|UncleSamsArchive}}'''
** [[Radio Free Asia]]: Non-profit media organization owned by USAGM.
** [[Radio Free Europe|Radio Free Europe/Radio Liberty]]: Non-profit media organization owned by USAGM.
** [[Voice of America]]: An internationally-broadcasting state media network at risk of closure.
=== Long-term projects ===
<!-- Ongoing projects. No deadline, no moment of "finishing" -->
<!-- sorted alphabetically -->
* [[Microsoft Update]]: Removal of legacy Windows drivers announced. '''IRC Channel {{IRC|windowfixer}}'''
* [[Telegram]]: Archiving public messages in various newsworthy and/or otherwise notable Telegram channels. '''IRC Channel {{IRC|telegrab}}'''.
* [[Twitch]]: Archiving metadata and select videos. '''IRC Channel {{IRC|burnthetwitch}}'''
* [[URLTeam]]: URL shorteners were a fucking awful idea. '''IRC Channel {{IRC|urlteam}}'''.
* [[WikiTeam]]: Saving wiki dumps (XML) and their external links for the Wayback Machine (WARC), as well as exporting MediaWiki databases. Permanent effort, [https://github.com/WikiTeam/wikiteam/wiki/Tutorial#I_have_no_shell_access_to_server everyone can help] (you choose the size of your downloads). '''IRC Channel {{IRC|wikiteam}}'''.
* [[URLs]]: A random collection of stuff. '''IRC Channel {{IRC|//}}'''.
=== Scripts only ===
* [[YouTube]]: Archiving [[YouTube#Scope|selected videos]]. '''IRC Channel {{IRC|down-the-tube}}'''.
* [[FTP]]: Download all the FTP sites! '''IRC Channel {{IRC|effteepee}}'''.
=== Long-term, slower-paced projects ===
These are projects that are actively running but generally only have small numbers of items available to complete at a time.
<!-- sorted alphabetically -->
* [[Blogger]]: Grabbing inactive Blogger blogs since Google began a mass purge of inactive Google accounts on or after {{datetime|2023-12-01}}. '''IRC Channel {{IRC|frogger}}'''.
* [[GitHub]]: Embraced-uh, I mean, bought by Microsoft. '''IRC Channel {{IRC|gitgud}}'''.
* [[Imgur]]: Unregistered users' "old" and "inactive" images will be purged, and all NSFW content is being shown the door on {{datetime|2023-05-15}}. '''IRC Channel {{IRC|imgone}}'''.
* [[MediaFire]]: [https://twitter.com/textfiles/status/1349516443654758401 Not 'at-risk' but grabbing speculatively to save historic files] '''IRC Channel {{IRC|mediaonfire}}'''.
* [[Pastebin]]: Archiving the pastas. '''IRC Channel {{IRC|pastalavista}}'''.


== Manual projects ==
* [[ArchiveBot]]: For those with lots of disk space, bandwidth and long-term commitment. '''IRC Channel {{IRC|archivebot}}'''.
* [[FTP]]: Help us find all FTP sites! '''IRC Channel {{IRC|effteepee}}'''.
* [[Codearchiver]]: Dumping and archival of source code repositories and associated version control systems. '''IRC Channel {{IRC|codearchiver}}'''.
* [[INTERNETARCHIVE.BAK]]: Grab a slice of the big cake of [[Internet Archive|The Archive]]! '''IRC Channel {{IRC|internetarchive.bak}}'''.
* Dead people: When people die, their webpages and/or social media might go "Poof!" due to fees and other knick-knack. '''IRC Channel {{IRC|archiveteam}}'''
* [[ISP Hosting]]: Finding ISP web hosting services before the Grim Reaper finds them. '''IRC Channel {{IRC|webroasting}}'''.
* [[Wikibot]] and [[WikiTeam]]: Saving wiki dumps (XML) and their external links for the Wayback Machine (WARC), as well as exporting MediaWiki databases. Permanent effort, [https://github.com/WikiTeam/wikiteam/wiki/Tutorial#I_have_no_shell_access_to_server everyone can help] (you choose the size of your downloads). '''IRC Channels {{IRC|wikibot}} {{IRC|wikiteam}}'''.
* [[Mozilla Addons]] (aka addons.mozilla.org, AMO): deleting legacy addons in October 2018. '''IRC Channel {{IRC|outofammo}}'''.
* [[Formats|File Formats]] and [[Just Solve the Problem 2012|Just Solve]]: Let's document all the file formats! Content also lives on subdomains, e.g. [http://fileformats.archiveteam.org fileformats.archiveteam.org] and [http://justsolve.archiveteam.org justsolve.archiveteam.org]. '''IRC Channel {{IRC|justsolve}}'''
* [[Project Newsletter]]: Archiving e-newsletters, currently in development. '''IRC Channel {{IRC|projectnewsletter}}'''.


* [[GeoCities Japan]]: It's GeoCities all over again... The Japanese version shuts down in March 2019. '''IRC Channel {{IRC|notagain}}'''.
== Recently finished projects ==
<!-- Projects that have finished in the last 30 days go here in reverse-chronological order, so they are easy to find and showcase recent work. Also keep projects here that are still in the tracker but not yet deleted, so it won't confuse people. -->
* [[Oshiete! Goo]]: A Q&A service closed on September 17, 2025. '''IRC Channel {{IRC|itsgoone}}'''


== Active Projects Not Yet Ingesting ==
== Upcoming & proposed projects ==
<!-- Websites you would like to have archived. Please create a wikipage about the project with information about the website (shutting down? (when), why should it be archived, etc.). -->
<!-- Top priority: could disappear anytime now -->
<!-- Shutting down, definite deadline given -->
* [[Flickr]]: <s>[[Yahoo!]]</s> SmugMug decided to kill it after finding Yahoo!'s plans to do so before they were bought by Verizon. '''IRC Channel {{IRC|flickrfckr}}'''.
* [[Goo Blog]]: A blogging service closing on {{datetime|2025-11-25}}. '''IRC Channel {{IRC|itsgoone}}'''
* [[Chrome Web Store]]: Google has announced a timeline of policy changes that will lead to content being removed between {{datetime|2021-12-01}} and 2025. '''IRC Channel {{IRC|chromeweblore}}'''.
== Proposed projects ==
<!-- Websites you would like to have archived. Please create a wikipage about the project with information about the website (shutting down? (when), why should it be archived, etc.). -->
<!-- Shutting down, vague deadline given -->
* [[AccuWeather#AccuWeather_Forums|AccuWeather Forums]]: discussion forum for meteorologists is going down the drain sometime in 2018. '''IRC Channel {{IRC|accio-weather}}'''.
* [[Photobucket]]: Finally following through on over a year of email threats that free accounts are going to be mass deactivated if they don't pay up. '''IRC Channel {{IRC|photosucket}}'''.
* [[Google+]]: Google finally realised they failed. Shutting down by August 2019. '''IRC Channel {{IRC|googleminus}}'''.
* Abandoned iOS App Store & Google Play apps: Both Apple and Google are slimming down on abandoned apps, [https://www.cnet.com/tech/mobile/one-third-of-apple-and-google-apps-are-so-outdated-they-could-get-removed/ with an estimated ~1.5M of them at risk]. '''IRC Channel {{IRC|appocalypse}}'''.
* [[Twitter]]: General instability; deleting inactive accounts <s>{{datetime|2019-12-11}}</s> sometime. '''IRC Channel {{IRC|twitterdead|EFnet|abandoned}}'''.
<!-- Shutting down, no deadline given -->
* [[Imgur]]: Image hoster decided that using it for hosting images is not permitted. '''IRC Channel {{IRC|imgone}}'''.
<!-- Archiving the archives -->
<!-- Misc. projects (unmaintained sites, distrust in owners) -->
* [[JamiiForums]]: the Tanzanian government would like this gone. '''IRC Channel {{IRC|jammedforums}}'''.
* [[Dailymotion]]: Archiving inactive videos. '''IRC Channel {{IRC|DailyDemotion}}'''
* [[ZAM Network]]: Setting several sites on fire in autumn 2018.
* [[VKontakte]]: A Russian equivalent of Facebook carries the risk of tumbling down under the weight of sanctions as a result of the government's invasion of Ukraine. '''IRC Channel {{IRC|lostkontakt}}'''.
* [[GitHub]]: Reportedly embraced-uh, I mean, bought by Microsoft. '''IRC Channel {{IRC|getgit}}'''.
* [[JamiiForums]]: the Tanzanian government would like this gone. '''IRC Channel {{IRC|jammedforums|EFnet|abandoned}}'''.
* [[LiveJournal]]: Very old, widely regarded as in decline, and has a lot of important stuff buried in it. '''IRC Channel {{IRC|recordedjournal}}'''.
* [[LiveJournal]]: Very old, widely regarded as in decline, and has a lot of important stuff buried in it. '''IRC Channel {{IRC|recordedjournal|EFnet|abandoned}}'''.
* [[Tumblr]]: [[Yahoo!]] considered killing it, now Yahoo has been acquired. '''IRC Channel {{IRC|stumblr}}'''.
* [[The Pirate Bay]]: Recently came back up, grabbing an archive for sanity's sake. '''IRC Channel {{IRC|yarharfiddlededee|EFnet|abandoned}}'''.
* [[Ownlog]]: Ownlog is losing popularity and support from its owners. '''IRC Channel {{IRC|pwnlog}}'''.
* [[Valhalla]]: Where to store what even the [[Internet Archive]] doesn't have space for? '''IRC Channel {{IRC|huntinggrounds}}'''.
* [[The Pirate Bay]]: Recently came back up, grabbing an archive for sanity's sake. '''IRC Channel {{IRC|yarharfiddlededee}}'''.
* [[Giphy]]: Bought by <s>Facebook</s> Shutterstock, to be "integrated" (assimilated) into <s>Instagram</s> https://news.knowyourmeme.com/news/facebook-to-buy-giphy
== On hiatus ==
* [[Angelfire]]: Angelfire is a web hosting service that contains big chunks of early WWW history and has no proper backup. '''IRC Channel {{IRC|angelonfire}}'''.
* [[Valhalla]]: Where to store what even the [[Internet Archive]] doesn't have space for?
* [[Audit2014|Audit 2014]]: It's time to verify our shit. '''IRC Channel {{IRC|auditteam}}'''.
* [[Flickr]]: <s>[[Yahoo!]]</s> SmugMug decided to kill it after finding Yahoo!'s plans to do so before they were bought by Verizon. '''IRC Channel {{IRC|flickrfckr}}'''.
== Recently finished ==
* [[FTP]]: Help us find and download all FTP sites! '''IRC Channel {{IRC|effteepee}}'''.
<!-- put projects here that are still in the tracker but not yet deleted so it won't confuse people -->
* [[Google Drive]]: Same as MediaFire. '''IRC Channel {{IRC|googlecrash}}'''. Currently on hiatus.
* [[TF2 Outpost]]: Setting itself and TF2 trades on fire after September in brazen act of insurance fraud due to costs. '''IRC Channel TBD'''.
* [[Google Groups]]: "Gone within a year" ([[User:Jscott|SketchCow]], {{datetime|2016-06-07}}).
* [[Tindeck]]: Creative Commons audio hosting site going silent on 2018-08-01. '''IRC Channel {{IRC|findeck}}'''.
== Hiatus / Missed the Mark ==
* [[Bazaar.tf]]: TF2 item trading site <s>is calling it quits and closing its shop for good in December</s> changed its mind on quitting after TF2 Outpost's closure. '''IRC Channel TBD'''.
* [[Google News Archive]]: Let's store all newspapers at Google, WCGW? '''IRC Channel {{IRC|papersplease}}'''.
* [[INTERNETARCHIVE.BAK]]: Grab a slice of the big cake of [[Internet Archive|The Archive]]! '''IRC Channel {{IRC|internetarchive.bak}}'''.
* [[ISP Hosting]]: Finding ISP web hosting services before the Grim Reaper finds them. '''IRC Channel {{IRC|webroasting}}'''.
* [[Livestream]]: A video stream site merging with Vimeo in {{datetime|2025-01}}. '''IRC Channel {{IRC|deadtrickle}}'''
* [[Miraheze]]: <s>Shutting down sometime between {{datetime|2023-09-01}} and {{datetime|2023-10-31}}.</s> Rescued by new volunteers!
* [[Project Newsletter]]: Archiving e-newsletters, currently in development. '''IRC Channel {{IRC|projectnewsletter}}'''.
* [[Quizlet]]: Flashcards and other learning tools '''IRC Channel {{IRC|quizletusin}}'''.
* [[yuku]]: yuku hosts thousands of forums and has lately been very unstable. Project currently paused. '''IRC Channel {{IRC|archiveteam}}'''.
* [[Tinkercad]]: Autodesk announced its intent to put designs from inactive OAuth accounts back into minds around {{datetime|2021-05-24}}. '''IRC Channel {{IRC|tinkerhad}}'''.
* [[Audit2014|Audit 2014]]: It's time to verify our shit. '''IRC Channel {{IRC|auditteam}}'''. THIS PROJECT IS ON HIATUS AND WILL BE RETURNED TO AS AUDIT2018.
* [[Tumblr]]: [[Yahoo!]] considered killing it, now Yahoo has been acquired and Verizon declared war on NSFW blogs. Tumblr has since been sold to Automattic. '''IRC Channel {{IRC|tumbledown}}'''.
* [[Woohoo]]: Yahoo is untrustworthy, let's do a census of all their products. '''IRC Channel {{IRC|woohoo}}'''. WELL YAHOO IS SORTA DEADISH.
* [[Reddit]]: Banning communities that generate bad PR for Reddit Inc. Restricted access to APIs and data on {{datetime|2023-06-19}}. '''IRC Channel {{IRC|shreddit}}'''.
* [[Google Groups]]: "Gone within a year" ([[User:Jscott|SketchCow]], 2016-06-07).
* [[The Escapist]]: Staff activity dwindling due to layoffs, remaining supporters are doing it for free. '''IRC Channel TBD'''.
* [[YTMND]]: Supposed to be "shutting down soon" in 2016, but still up 2 years later. '''IRC Channel TBD'''.
* [[DevPort]]: This [http://developerportfolio.com/ portfolio SaaS provider] has [http://www.lowendtalk.com/discussion/65135/need-some-help-saas-provider-is-dead-but-my-site-is-still-up-how-should-i-grab-it reportedly] been having infrastructure issues, and removed their social media accounts. Possible impending shutdown.


<small>ArchiveTeam uses the EFnet IRC network – irc://irc.efnet.org – webchat: http://chat.efnet.org:9090 – [[IRC|More info]]</small>
<small>ArchiveTeam uses the hackint IRC network – ircs://irc.hackint.org:6697 (TLS required) – webchat: https://chat.hackint.org/#/connect?join=archiveteam-bs – [[Archiveteam:IRC|More info]]</small>

Latest revision as of 14:42, 20 September 2025

Archive Team recruiting

Warrior-based projects

ArchiveTeam's Choice: Tistory

Short-term, urgent projects

  • Tistory: Will delete inactive blogs on 2025-09-22 IRC Channel #tatteredstory (on hackint)
  • Typepad: A blogging service that will cease to exist by the end of September 2025. IRC Channel #typebad (on hackint)

Medium-term projects

Long-term projects

Long-term, slower-paced projects

These are projects that are actively running but generally only have small numbers of items available to complete at a time.

Manual projects

Recently finished projects

Upcoming & proposed projects

On hiatus

  • Angelfire: Angelfire is a web hosting service that contains big chunks of early WWW history and has no proper backup. IRC Channel #angelonfire (on hackint).
  • Audit 2014: It's time to verify our shit. IRC Channel #auditteam (on hackint).
  • Flickr: SmugMug decided to kill it after finding Yahoo!'s plans to do so before they were bought by Verizon. IRC Channel #flickrfckr (on hackint).
  • FTP: Help us find and download all FTP sites! IRC Channel #effteepee (on hackint).
  • Google Drive: Same as MediaFire. IRC Channel #googlecrash (on hackint). Currently on hiatus.
  • Google Groups: "Gone within a year" (SketchCow, 2016-06-07).
  • Google News Archive: Let's store all newspapers at Google, WCGW? IRC Channel #papersplease (on hackint).
  • INTERNETARCHIVE.BAK: Grab a slice of the big cake of The Archive! IRC Channel #internetarchive.bak (on hackint).
  • ISP Hosting: Finding ISP web hosting services before the Grim Reaper finds them. IRC Channel #webroasting (on hackint).
  • Livestream: A video stream site merging with Vimeo in 2025-01. IRC Channel #deadtrickle (on hackint)
  • Miraheze: Was to shut down sometime between 2023-09-01 and 2023-10-31. Rescued by new volunteers!
  • Project Newsletter: Archiving e-newsletters, currently in development. IRC Channel #projectnewsletter (on hackint).
  • Quizlet: Flashcards and other learning tools IRC Channel #quizletusin (on hackint).
  • Tinkercad: Autodesk announced its intent to put designs from inactive OAuth accounts back into minds around 2021-05-24. IRC Channel #tinkerhad (on hackint).
  • Tumblr: Yahoo! considered killing it, now Yahoo has been acquired and Verizon declared war on NSFW blogs. Tumblr has since been sold to Automattic. IRC Channel #tumbledown (on hackint).
  • Reddit: Banning communities that generate bad PR for Reddit Inc. Restricted access to APIs and data on 2023-06-19. IRC Channel #shreddit (on hackint).
