URLs
URLs | |
URL | https://url.spec.whatwg.org/ |
Status | Special case |
Archiving status | In progress... |
Archiving type | Unknown |
Project source | urls-grab urls-sources |
Project tracker | urls |
IRC channel | #// (on hackint) |
Data? | archiveteam_urls |
The URLs project is a continuous generic project to archive random URLs from various sources, e.g. external links discovered in other projects or in older archives. As of early 2021, projects that send outlinks to URLs include the Reddit and Yahoo! Answers projects. URLs and attachments extracted from User:TheTechRobo's Discord dumps are usually sent there too, and lists of URLs can also be submitted.
Important note: If you run this project, you'll likely see your IP get banned from Facebook, Instagram, YouTube, etc., and using those sites may become difficult (e.g. constant captchas, forced login). Also, if you run at significant speed, you'll likely see abuse notices, IP blacklists, and so on.
How to help if you have lists of URLs
This project requires lists of URLs for content on the target website. If you have a source of URLs, please:
- If the list exceeds a few megabytes, please compress it, preferably using zstd -10.
- Upload the file to https://transfer.archivete.am/.
- Share the resulting URL in the project IRC channel.
- If you would prefer to keep the list non-public, e.g. for privacy reasons or because you do not want to be publicly associated with it, please get in touch with a channel op (e.g. User:Arkiver or User:JustAnotherArchivist). Note that the items generated from your list would still be processed publicly, but they would be mixed in with everything else.
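The preparation steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the 3 MiB cutoff standing in for "a few megabytes", and the choice to deduplicate and keep only http/https entries are illustrative and not project requirements. It also assumes the zstd command-line tool is installed; the upload and IRC steps are left to be done by hand.

```python
import shutil
import subprocess
from pathlib import Path
from urllib.parse import urlsplit

# Assumption: "a few megabytes" is taken to mean roughly 3 MiB here.
SIZE_THRESHOLD = 3 * 1024 * 1024


def clean_urls(lines):
    """Strip whitespace, drop blanks and duplicates, keep http(s) URLs only.

    The filtering rules are an illustrative choice, not a project rule.
    """
    seen = set()
    cleaned = []
    for line in lines:
        url = line.strip()
        if not url or url in seen:
            continue
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # malformed entry, e.g. a broken IPv6 host
        if parts.scheme in ("http", "https") and parts.netloc:
            seen.add(url)
            cleaned.append(url)
    return cleaned


def zstd_command(path):
    """Build the compression command suggested on this page (zstd -10)."""
    return ["zstd", "-10", str(path)]


def prepare_list(src, dst):
    """Write a cleaned copy of src to dst and compress it if it is large.

    Returns the path of the file that should be uploaded.
    """
    dst = Path(dst)
    urls = clean_urls(Path(src).read_text(encoding="utf-8").splitlines())
    dst.write_text("".join(u + "\n" for u in urls), encoding="utf-8")
    if dst.stat().st_size > SIZE_THRESHOLD and shutil.which("zstd"):
        # zstd writes <name>.zst next to the input and keeps the original.
        subprocess.run(zstd_command(dst), check=True)
        return dst.with_name(dst.name + ".zst")
    return dst
```

Calling prepare_list("raw.txt", "urls.txt") (both file names hypothetical) returns the path to upload: the compressed file if the cleaned list is over the cutoff, otherwise the plain text file.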
See also Category:Projects requiring URL lists for other ArchiveTeam projects that necessitate URL lists.