Fileplanet

From Archiveteam
FilePlanet
Website host of game content, 1999-2012
URL http://www.fileplanet.com
Status Closing
Archiving status In progress...
Archiving type Unknown
IRC channel #fireplanet (on hackint)

FilePlanet is no longer hosting new content, and "is in the process of being archived [by IGN]."

FilePlanet hosts 87,190 download pages of game-related material (demos, patches, mods, promo stuff, etc.), all of which need to be archived. These tend to be larger files, ranging from 10MB patches to 3GB clients. We'll want all the help we can get for this one, since it gets harder the farther the archiving goes (files are numbered chronologically, and Skyrim mods are bigger than Doom ones).

What We Need

How to help

  • Have bash, wget, grep, rev, cut
  • >100 gigabytes of space, just to be safe
  • Put https://raw.github.com/SpiritQuaddicted/fileplanet-file-download/master/download_pages_and_files_from_fileplanet.sh somewhere (I'd suggest ~/somepath/fileplanetdownload/ ) and "chmod +x" it
  • Pick a free increment (e.g. 110000-114999) and tell people about it (#fireplanet in EFnet or post it here). Be careful: in lower ranges a 5k range might work, but they get HUGE later. In the 220k range, and probably lower too, we had better use 100 IDs per chunk.
      • Keep the chunk sizes small. <30G would be nice. The smaller the better.
  • Run the script with your start and end IDs as arguments. Eg "./download_pages_and_files_from_fileplanet.sh 110000 114999"
  • Take a walk for half a day.
  • You can tail the .log files if you are curious. See right below.
  • Once you are done with your chunk, you will have a directory named after your range, e.g. 110000-114999/ . Inside it are pages_xx000-xx999.log and files_xx000-xx999.log plus the www.fileplanet.com/ directory.
  • Done! GOTO 10
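
If the range you picked turns out to be too big for one run, the steps above can be looped over smaller chunks. The following is only a sketch: chunk_ranges is a hypothetical helper (it is not part of the download script), and the chunk size of 100 is just the value suggested above for the high ranges.

```shell
#!/usr/bin/env bash
# chunk_ranges START END SIZE
# Hypothetical helper: prints one "LOW HIGH" pair per line, splitting the
# inclusive ID range START..END into chunks of at most SIZE IDs.
# Pass plain decimal IDs without leading zeros (write 9999, not 09999).
chunk_ranges() {
    local start=$1 end=$2 size=$3 lo hi
    lo=$start
    while [ "$lo" -le "$end" ]; do
        hi=$((lo + size - 1))
        # The last chunk may be shorter than SIZE.
        if [ "$hi" -gt "$end" ]; then hi=$end; fi
        echo "$lo $hi"
        lo=$((hi + 1))
    done
}

# Example use (commented out so nothing is downloaded by accident):
# chunk_ranges 220500 220999 100 | while read -r lo hi; do
#     ./download_pages_and_files_from_fileplanet.sh "$lo" "$hi"
# done
```

Each pass of the loop should leave its own range directory and its own pair of .log files, so the chunks can be checked and uploaded one at a time.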

In the end we'll upload all the parts to archive.org. If you have an account, you can use eg s3cmd.

s3cmd --add-header x-archive-auto-make-bucket:1 --add-header "x-archive-meta-description:Files from Fileplanet (www.fileplanet.com), all files from the ID range 110000 to 114999." put 110000-114999/*.log 110000-114999.tar s3://FileplanetFiles_110000-114999/

The log files are important! Make sure they are saved!
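
For other ranges, the names in the s3cmd call above can be derived from the start and end IDs. This is a rough sketch, not an official upload script: bucket_url, meta_description_header and upload_range are hypothetical helper names, and the tar step assumes the .tar simply holds the www.fileplanet.com/ directory of the chunk, which this page does not spell out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that rebuild the names used in the s3cmd example
# from a start ID ($1) and an end ID ($2).
bucket_url() {
    echo "s3://FileplanetFiles_$1-$2/"
}

meta_description_header() {
    echo "x-archive-meta-description:Files from Fileplanet (www.fileplanet.com), all files from the ID range $1 to $2."
}

# upload_range START END
# Run from the directory that contains the START-END/ chunk directory.
upload_range() {
    local start=$1 end=$2
    local range="$start-$end"
    # Assumption: the tarball contains only the downloaded site tree.
    tar -cf "$range.tar" -C "$range" www.fileplanet.com
    # Same headers and file list as the example command, logs included.
    s3cmd --add-header x-archive-auto-make-bucket:1 \
          --add-header "$(meta_description_header "$start" "$end")" \
          put "$range"/*.log "$range.tar" "$(bucket_url "$start" "$end")"
}

# Example (commented out): upload_range 110000 114999
```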

Notes

  • For planning a good range to download, check http://www.quaddicted.com/stuff/temp/file_IDs_from_sitemaps.txt but be aware that it apparently does not cover all IDs we can get by simply incrementing by 1. For example, Schbirid downloaded file 75059, which is not listed in the sitemaps. So you cannot trust that ID list to be complete.
  • The range 175000-177761 (weird end number since that's when the server ran out of space...) had ~1100 files and 69G. We will need to use 1k ID increments for those ranges.
  • Schbirid mailed to FPOps@IGN.com on the 3rd of May, no reply.
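
The 1k figure in the note above can be checked with some quick arithmetic. This is a back-of-the-envelope sketch: ids_per_chunk is a hypothetical helper, and it assumes data is spread roughly evenly over the IDs of a measured range, which will not hold exactly.

```shell
#!/usr/bin/env bash
# ids_per_chunk FIRST_ID LAST_ID MEASURED_GB TARGET_GB
# Hypothetical helper: given a finished range and its size in gigabytes,
# estimate how many IDs fit into a chunk of TARGET_GB (integer math,
# rounded down).
ids_per_chunk() {
    local first=$1 last=$2 measured_gb=$3 target_gb=$4
    local ids=$((last - first + 1))   # the range is inclusive
    echo $((target_gb * ids / measured_gb))
}

# 175000-177761 was 69G; aim for chunks under 30G.
ids_per_chunk 175000 177761 69 30   # prints 1200
```

So about 1200 IDs per 30G in that neighbourhood, which is where the 1k increments come from. Later ranges are denser, so rerun the estimate with numbers from the nearest finished range in the status table.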

Status

Range Status Number of files Size in gigabytes Downloader
00000-09999 Done, archived 1991 1G Schbirid
10000-19999 Done, archived 3159 9G Schbirid
20000-29999 Done, archived 6453 7G Schbirid
30000-39999 Done, archived 4085 9G Schbirid
40000-49999 Done, archived 5704 18G Schbirid
50000-54999 Done, archived 2707 24G Schbirid
55000-59999 Done, archived (bad URL) 2390 24G Schbirid
60000-64999 Done, archived 2349 24G Schbirid
65000-69999 Done, archived 305 4G Schbirid
70000-79999 Done, archived 59 0.2G Schbirid
80000-84999 Done, archived 2822 31G Debianer
85000-89999 Done, archived 1869 29G Schbirid
90000-109999 Done, empty 0 0 Schbirid
110000-114999 Done, archived 2139 35G Schbirid
115000-115999 Done, archived 932 1.9G codebear
116000-116999 Done, archived 694 11G codebear
117000-117999 Done, archived 752 16G codebear
118000-118999 Done, archived 726 16G codebear
119000-119999 Done, locally 718 28G codebear
120000-124999 Done, locally 3463 68G codebear
125000-129999 Done, archived [1] [2] [3] [4] [5] (bad URL) 3384 78G S[h]O[r]T
130000-130999 Done, archived 603 24G codebear
131000-131999 Done, archived 640 22G codebear
132000-132999 Done, archived 626 17G codebear
133000-133999 Done, archived 602 25G codebear
134000-134999 Done, archived 551 19G codebear
135000-135999 Done, archived 763 21G codebear
136000-136999 Done, archived 728 27G codebear
137000-137999 Done, archived 601 18G codebear
138000-138999 Done, archived 689 26G codebear
139000-139999 Done, archived 705 18G codebear
140000-140999 Done, archived 750 26G S[h]O[r]T
141000-141999 Done, archived 586 30G S[h]O[r]T
142000-142999 Done, archived 337 19G S[h]O[r]T
143000-143999 Done, archived 292 14G S[h]O[r]T
144000-144999 Done, archived 328 20G S[h]O[r]T
145000-145999 Done, archived 216 25G Schbirid
146000-146999 Done, archived 383 30G Schbirid
147000-147499 Done, archived 279 20G Schbirid
147500-147999 Done, archived 309 17G Schbirid
148000-148499 Done, archived 311 15G Schbirid
148500-148999 Done, archived 229 14G Schbirid
149000-149499 Done, archived 202 8G Schbirid
149500-149999 Done, archived 221 9G Schbirid
150000-150499 Done, archived 216 15G S[h]O[r]T
150500-150999 Done, archived 270 13G S[h]O[r]T
151000-151499 Done, archived 310 19G S[h]O[r]T
151500-151999 Done, archived 244 17G S[h]O[r]T
152000-152499 Done, archived 234 19G S[h]O[r]T
152500-152999 Done, archived 255 13G S[h]O[r]T
153000-153499 Done, archived 287 19G S[h]O[r]T
153500-153999 Done, archived 269 17G S[h]O[r]T
154000-154499 Done, archived 248 18G S[h]O[r]T
154500-154999 Done, archived 173 8.7G S[h]O[r]T
155000-155499 In progress S[h]O[r]T
155500-155999 In progress S[h]O[r]T
156000-156499 Done, archived 238 13G S[h]O[r]T
156500-156999 Done, archived 185 15G S[h]O[r]T
157000-157499 In progress S[h]O[r]T
157500-157999 In progress S[h]O[r]T
158000-158499 In progress S[h]O[r]T
158500-158999 In progress S[h]O[r]T
159000-159499 In progress S[h]O[r]T
159500-159999 In progress S[h]O[r]T
160001-160500 Done, locally 214 15G NotGLaDOS
160501-161000 Done, locally 154 17G NotGLaDOS
161001-161500 Done, locally 232 22G NotGLaDOS
161501-162000 Done, locally 38G NotGLaDOS
162000-164999 In progress NotGLaDOS
165000-179999 open better use ranges of 500 here.
180000-180499 Done, archived 179 37G Schbirid
180500-180999 In progress Schbirid
181000-199999 open better use ranges of 100-500 here.
200000-200999 Done, archived (bad URL) 247 41G Schbirid
201000-219999 open better use ranges of 100-500 here.
220000-220499 Done, archived (bad URL) 250 35G Schbirid
220500-221999 In progress Debianer
222000+ open better use ranges of 100-500 here.

Graphs

Fileplanet number of IDs from the sitemaps per 1k range.png
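
A count like the one in this graph could be recomputed from the sitemap ID list roughly as follows. This is a sketch under assumptions: ids_per_1k is a hypothetical helper, and the list is assumed to hold one numeric ID per line (lines that are not plain numbers are skipped).

```shell
#!/usr/bin/env bash
# ids_per_1k [FILE...]
# Hypothetical helper: reads IDs (one per line) from the given files or
# from stdin and prints "RANGE_START COUNT" for every 1k range that
# contains at least one ID, sorted by range.
ids_per_1k() {
    awk '/^[0-9]+$/ { count[int($1 / 1000) * 1000]++ }
         END { for (b in count) print b, count[b] }' "$@" | sort -n
}

# Example (commented out):
# ids_per_1k file_IDs_from_sitemaps.txt
```

Keep in mind that the sitemap list is known to miss IDs, so these counts are a lower bound.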

