Difference between revisions of "Current Projects"
(→Upcoming & proposed projects: Add YouTube metadata)
(Move YouTube to current projects, all projects work in Warrior these days)
Revision as of 02:20, 18 July 2021
Archive Team recruiting
- Want to code for Archive Team? Here's a starting point.
- Help us: ☞ Download and run your warrior ☜.
- What's on: online tracker.
Warrior-based projects
- YouTube: Archiving all older unlisted YouTube videos (360p) and metadata (including webpages, comments, and captions) before YouTube makes them private starting July 23, 2021. Will shift to archiving all YouTube metadata and selected videos afterwards. IRC Channel #down-the-tube (on hackint).
- Webs: Vistaprint is killing off the Freewebs you knew from the 2000s on June 30, 2021 (originally March 31), unless you pay up. IRC Channel #webbed (on hackint).
- Periscope: Another Twitter acquisition, another shutdown. This time, its live-streamer gets to join Vine in the bin at the end of March. IRC Channel #microscope (on hackint).
- URLTeam: URL shorteners were a fucking awful idea. IRC Channel #urlteam (on hackint).
- Reddit: Banning communities that generate bad PR for Reddit Inc. Currently grabbing new material. IRC Channel #shreddit (on hackint).
- MediaFire: Not 'at-risk' but grabbing speculatively to save historic files. IRC Channel #mediaonfire (on hackint).
- Classic Google Sites: Making more sites inaccessible to the public starting September 1, 2021. IRC Channel #nearlylostmygoogles (on hackint).
- GitHub: Embraced-uh, I mean, bought by Microsoft. IRC Channel #gitgud (on hackint).
- URLs: A random collection of stuff. IRC Channel #// (on hackint).
An updated Warrior virtual appliance (v3.2) is now available with better support for newer projects that utilize wget-at. Please download it using the link above.
Manual projects
- 2019-2021 coronavirus outbreak: Documenting and preserving data, events, and impacts of the virus on society. IRC Channel #coronarchive (on hackint).
- ArchiveBot: For those with lots of disk space, bandwidth and long-term commitment. IRC Channel #archivebot (on hackint).
- WikiTeam: Saving wiki dumps (XML) and their external links for the Wayback Machine (WARC), as well as exporting MediaWiki databases. Permanent effort, everyone can help (you choose the size of your downloads). IRC Channel #wikiteam (on hackint).
Upcoming & proposed projects
- Chrome Web Store: Google has announced a timeline of policy changes that will lead to content being removed between December 1, 2020 and June 2022. IRC Channel #chromeweblore (on hackint).
- Kinja: Deleting all user pages, maybe? IRC Channel #gokinjagokinjago (on hackint).
- Twitter: Deleting inactive accounts sometime (originally 2019-12-11). IRC Channel #twitterdead (on hackint).
- Imgur: Image hoster decided that using it for hosting images is not permitted. IRC Channel #imgone (on hackint).
- JamiiForums: The Tanzanian government would like this gone. IRC Channel #jammedforums (on hackint).
- LiveJournal: Very old, widely regarded as in decline, and has a lot of important stuff buried in it. IRC Channel #recordedjournal (on hackint).
- Ownlog: Ownlog is losing popularity and support from its owners. IRC Channel #pwnlog (on hackint).
- The Pirate Bay: Recently came back up, grabbing an archive for sanity's sake. IRC Channel #yarharfiddlededee (on hackint).
- Valhalla: Where to store what even the Internet Archive doesn't have space for? IRC Channel #huntinggrounds (on hackint).
- Giphy: Bought by Facebook, to be "integrated" (assimilated) into Instagram: https://news.knowyourmeme.com/news/facebook-to-buy-giphy
Recently finished projects
- CodePlex: Microsoft's self-archive will be permanently removed from its Recycle Bin after July 1, 2021. IRC Channel #plexicode (on hackint).
- Google Poly: A 3D art repository that Google will send to the trash compactors on June 30, 2021. New uploads cease April 30. IRC Channel #polygone (on hackint).
- Bintray: JFrog is dismantling the software distribution platform used by numerous projects in May. IRC Channel #binnedtray (on hackint).
Hiatus / Missed the Mark
- Tinkercad: Autodesk announced its intent to put designs from inactive OAuth accounts back into minds around May 24, 2021. IRC Channel #tinkerhad (on hackint).
- Angelfire: Angelfire is a web hosting service that contains big chunks of early WWW history and has no proper backup. IRC Channel #angelonfire (on hackint).
- Audit 2014: It's time to verify our shit. IRC Channel #auditteam (on hackint). THIS PROJECT IS ON HIATUS AND WILL BE RETURNED TO AS AUDIT2018.
- Flickr: SmugMug decided to kill it after finding Yahoo!'s plans to do so before they were bought by Verizon. IRC Channel #flickrfckr (on hackint).
- FTP: Help us find and download all FTP sites! IRC Channel #effteepee (on hackint).
- Google Groups: "Gone within a year" (SketchCow, 2016-06-07).
- Google News Archive: Let's store all newspapers at Google, WCGW? IRC Channel #papersplease (on hackint).
- DevPort: This portfolio SaaS provider has reportedly been having infrastructure issues, and removed their social media accounts. Possible impending shutdown.
- INTERNETARCHIVE.BAK: Grab a slice of the big cake of The Archive! IRC Channel #internetarchive.bak (on hackint).
- ISP Hosting: Finding ISP web hosting services before the Grim Reaper finds them. IRC Channel #webroasting (on hackint).
- NewsGrabber: Saving all news articles. Currently paused. IRC Channel #newsgrabber (on hackint).
- Project Newsletter: Archiving e-newsletters, currently in development. IRC Channel #projectnewsletter (on hackint).
- Quizlet: Flashcards and other learning tools. IRC Channel #quizletusin (on hackint).
- Tumblr: Yahoo! considered killing it; now Yahoo! has been acquired and Verizon has declared war on NSFW blogs. IRC Channel #tumbledown (on hackint).
- yuku: yuku has been very unstable lately and hosts thousands of forums. Project currently paused. IRC Channel #archiveteam (on hackint).
ArchiveTeam primarily uses the hackint IRC network – ircs://irc.hackint.org:6697 (TLS required) – webchat: https://webirc.hackint.org/ – More info
ArchiveTeam also has some channels left on the EFnet IRC network – irc://irc.efnet.org – webchat: http://chat.efnet.org:9090 – More info