User:Start
I like preserving the web.
I also go by Start+Select and Pressstart.
Archives
- FoxyTunes
- safeway.ca
- Emulation Zone
- Battle for the Net (July 18, 2014, September 12, 2014)
- The Open Internet
- Startups for Net Neutrality
- net.net
- WWDC Timer
- Option V Mac
- Chromercise
- Hidden From Google
- orteil.dashnet.org
- Pingus
- Tux4Kids
- TuxKart
- Minecraft Assets Server
- https://archive.org/details/bmf.*rustedmagick.com-cr-panicgrab-20140808 (remove asterisk, spam filter doesn't like this link) - The Original Cutting Room Floor
- TPPX logs
- Misc. Nintendo sites
- mojang.com/notch
- legowracers.4t2portfolio.co.uk
Website Crawls
- cache.lego.com
Public HTTP/FTP Server List
Searching intitle:"index of /" inurl:"ftp" on Google gives millions of results.
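As a rough illustration, the query above can be turned into a search URL. This is only a sketch: the function name `index_search_url` and the optional `extra_terms` argument are made up for this example, and it builds the URL without fetching anything.

```python
from urllib.parse import urlencode

# The search operators quoted in the text above.
BASE_QUERY = 'intitle:"index of /" inurl:"ftp"'


def index_search_url(extra_terms=""):
    """Return a Google search URL for open directory listings.

    extra_terms is a hypothetical convenience for narrowing the
    search (for example to a company or file type).
    """
    query = BASE_QUERY
    if extra_terms:
        query = query + " " + extra_terms
    # urlencode percent-encodes the quotes, colon and slash,
    # and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(index_search_url())
    print(index_search_url("games"))
```

Pasting the printed URL into a browser should show the same results as typing the query by hand.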
- ftp://ftp.3drealms.com/ - 3D Realms
- ftp://ftp.adobe.com/ - Adobe
- ftp://ftp.amanda.org/ - Amanda Network Backup
- http://staticky.com/mirrors/ftp.apple.com/developer/ - Apple's former developer FTP (mirror)
- ftp://ftp.atari.com/ - Atari
- http://ftp.blizzard.com/pub/ - Blizzard (only works through HTTP)
- ftp://ftp.mrunix.net/ - Borg: The Collective
- http://media.codeweavers.com/ - CodeWeavers
- ftp://ftp.debian.org/ - Debian
- ftp://ftp.eggheads.org/ - Eggdrop
- ftp://ftp.ea.com/ - Electronic Arts
- http://largedownloads.ea.com - Electronic Arts (large downloads)
- ftp://ftp.gnu.org/ - GNU
- ftp://ftp.gnus.org/ - GNUS
- ftp://ftp.software.ibm.com/ - IBM
- ftp://ftp.idsoftware.com/ - id Software
- ftp://ftp.isc.org/ - Internet Systems Consortium
- ftp://ftp.kochmedia.com/ - Koch Media
- ftp://ftp.kernel.org/ - Linux Kernel Archives
- ftp://ftp.lyx.org/ - LyX
- ftp://ftp.microsoft.com/ - Microsoft (sometimes up, sometimes down)
- ftp://ftp.research.microsoft.com/ - Microsoft Research
- ftp://ftp.research.microsoft.com/downloads - hidden directory
- http://assets.minecraft.net/ - Minecraft (no longer used)
- [1] - Mozilla
- http://releases.mozilla.org/pub/mozilla.org/
- http://download.cdn.mozilla.net/pub/ - Mozilla (older software)
- ftp://ftp.ncftp.com/ - NcFTP
- ftp://ftp.netscape.com/ - Netscape
- ftp://ftp.oldskool.org/ - Oldskool PC Network
- ftp://ftp.opera.com/pub/ - Opera
- http://get.geo.opera.com/ - Opera (alt)
- ftp://pingus.seul.org - Pingus
- ftp://ftp.pgpi.com/ - PGP
- ftp://ftp.iso.pld-linux.org/ - PLD Linux
- ftp://ftp.povray.org/ - POV-Ray
- ftp://ftp.sangoma.com/ - Sangoma
- ftp://ftp.scriptics.com/ - Scriptics
- ftp://ftp.slackware.com/ - Slackware Linux
- http://download.sonymediasoftware.com/ - Sony Creative Software
- ftp://ftp.sunet.se/ - Sunet
- ftp://ftp.suse.com/ - SUSE Linux
- ftp://ftp.ubisoft.com/ - Ubisoft
- ftp://ftp.bluebyte.com/ - Ubisoft Blue Byte
- http://releases.ubuntu.com/ - Ubuntu
- http://cdimage.ubuntu.com/ - "Unsupported Ubuntu Images"
- ftp://ftp.snt.utwente.nl/ - University of Twente
- ftp://ftp.westwood.com/ - Westwood
- http://wdl2.winworldpc.com - WinWorld
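Most entries in the list above follow a "- URL - description" pattern, so they are easy to split up before feeding them to a crawler. The sketch below assumes that pattern; `parse_entry` and `scheme_of` are hypothetical helper names, and entries without a description simply get an empty label.

```python
from urllib.parse import urlparse


def parse_entry(line):
    """Split one list line into (url, label).

    Assumes the "- URL - description" layout used in the list;
    the label is empty when the line has no description.
    """
    text = line.strip()
    if text.startswith("- "):
        text = text[2:]
    # Split on the first " - " only, so hyphens inside the
    # description are left alone.
    url, _, label = text.partition(" - ")
    return url.strip(), label.strip()


def scheme_of(url):
    """Return the URL scheme ("ftp", "http", ...) or "" if none."""
    return urlparse(url).scheme


if __name__ == "__main__":
    sample = [
        "- ftp://ftp.gnu.org/ - GNU",
        "- http://ftp.blizzard.com/pub/ - Blizzard (only works through HTTP)",
        "- http://releases.mozilla.org/pub/mozilla.org/",
    ]
    for line in sample:
        url, label = parse_entry(line)
        print(scheme_of(url), url, label)
```

Grouping by scheme is useful here because the FTP and HTTP servers generally need different tools to mirror.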
blah blah blah ignore
Items
- TODO: Scrape Google
- TODO: Scrape Bing
- TODO: Scrape DuckDuckGo
- TODO: Scrape Twitter
- TODO: Scrape Reddit
- TODO: Scrape links from MediaWiki wikis
- TODO: Scrape the Open Directory Project
- TODO: Scrape the Common Crawl Index
- TODO: Scrape the Wayback Machine
- TODO: Scrape URLTeam dumps
- TODO: Scrape a list of subdomains from DNSdumpster.com (if applicable)