Difference between revisions of "Fileplanet"

From Archiveteam
m (broken headline)
 
(209 intermediate revisions by 11 users not shown)
Line 2: Line 2:
| title = FilePlanet
| title = FilePlanet
| logo = Fileplanet_logo.jpg
| logo = Fileplanet_logo.jpg
| description = Website host of game content, 1996-2012
| description = Website host of game content, 1999-2012
| URL = http://www.fileplanet.com
| URL = http://www.fileplanet.com
| image = Fileplanet_snap.png
| image = Fileplanet_snap.png
| project_status = {{closing}}
| project_status = {{specialcase}} (no longer being updated)
| archiving_status = {{inprogress}}
| archiving_status = {{saved}}
| irc = fireplanet
| irc = fireplanet
| irc_network = EFnet
| irc_abandoned = true
| data = {{IA collection|archiveteam-fileplanet}}
}}
}}


[http://www.fileplanet.com FilePlanet] is no longer hosting new content, and "is in the process of being archived [by IGN]."
In 2012 [http://www.fileplanet.com FilePlanet] announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."


FilePlanet hosts 87,190 download pages of game-related material (demos, patches, mods, promo stuff, etc.), which need to be archived. These tend to be larger files, ranging from 10MB patches to 3GB clients.  We'll want all the help we can get for this one, since it gets harder the further the archiving goes (files are numbered chronologically, and Skyrim mods are bigger than Doom ones).
FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.


To help: Download and run [https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh Schbirid's script].  This grabs the files and the source page, for its metadata.  You'll need to know what the start and stop parameters are - get those numbers from Schbirid in IRC.
===The archival===


[http://www.quaddicted.com/forum/viewtopic.php?pid=251 Here] is a thread keeping track of current status (currently on 60k).
After first downloading files [[Fileplanet/Status_of_by_id_grab|by iterating IDs on the public website fileplanet.com and uploading them in chunks to archive.org]] as well as [[Fileplanet/non-id-urls|scouting the web for other public URLs]], we were given FTP access to the storage servers by the staff. Thanks!


===What We Need===
https://archive.org/details/archiveteam-fileplanet is the collection. Unpacked and sorted it amounts to ~TODO files at ~TODO TB. The ftp2 files (another ~300k files at a total of ~1.2TB) cannot be shared publicly since there are private files mixed in. We save them to IA anyway, so maybe in the future we can sort them out.


* Files! (approx. 15% done 5/10/12)
There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
* /fileinfo/ pages - get URLs from sitemaps (Schbirid is downloading these)
* [http://blog.fileplanet.com http://blog.fileplanet.com]
* A list of all "site:www.fileplanet.com inurl:hosteddl" URLs since these files seem not to be in the simple ID range


===How to help===
===Related items===
* /fileinfo/ pages and the embedded images/thumbnails from the grab by IDs: https://archive.org/details/FileplanetFiles_fileinfo_pages_images
* /download/ pages and download logs from the grab by IDs: https://archive.org/details/Fileplanet_index.htmls_and_logs_scraped_by_id
* http://blog.fileplanet.com: https://archive.org/details/FileplanetBlogFileplanetCom
* http://www.fileplanet.com/fileblog/archives/: https://archive.org/details/FileplanetFileblog


* Have bash, wget, grep, rev, cut
{{navigation box}}
* >30 gigabytes of space per 5k increment
* Put https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh somewhere and "chmod +x" it
* Pick a free 5k increment (eg 110000-114999) and tell people about it (#fireplanet in EFnet or post it here)
* Create a new working directory for your download, named after your chunk. Eg 110000-114999/
* INSIDE that directory, run the script with your start and end IDs as arguments. Eg "<code>./download_pages_and_files_from_fileplanet.sh 110000 114999</code>"
* Take a walk for half a day.
 
* Once you are done with your chunk, you will have pages_xx000-xx999.log and files_xx000-xx999.log plus the www.fileplanet.com/ directory.
* Do a "<code>grep -i error *.log</code>" first and see if there were error messages. If so, tell us.
* "<code>cd ..</code>" and "<code>tar -cf 110000-114999.tar 110000-114999/</code>"
* "<code>du -hs 110000-114999/www.fileplanet.com/</code>" and "<code>ls -1 110000-114999/www.fileplanet.com/ | wc -l</code>" and add those numbers to the table below.
* Done! GOTO 10
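
The bookkeeping in the steps above can be sketched as a few small bash helpers. This is only an illustration: the function names (<code>chunk_name</code>, <code>count_log_errors</code>, <code>pack_chunk</code>) are hypothetical and are not part of Schbirid's script; only the underlying <code>grep</code> and <code>tar</code> commands come from the list.

```shell
#!/usr/bin/env bash
# Sketch of the per-chunk bookkeeping described in the list above.
# The function names are hypothetical; only the commands come from the page.

# Print "START-END" for a 5k increment beginning at START.
chunk_name() {
    local start=$((10#$1))   # force base 10 so "05000" is not read as octal
    printf '%d-%d\n' "$start" "$((start + 4999))"
}

# Count log lines that mention "error" (case-insensitive) in a chunk directory.
count_log_errors() {
    # grep exits non-zero when the count is 0, so ignore its exit status.
    cat "$1"/*.log 2>/dev/null | grep -ci error || true
}

# Pack a finished chunk directory into CHUNK.tar; run this from its parent directory.
pack_chunk() {
    local chunk=$1
    tar -cf "$chunk.tar" "$chunk/"
}
```

For example, <code>chunk_name 110000</code> prints <code>110000-114999</code>, which can then be used as the working directory name and as the argument to <code>pack_chunk</code>.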
 
In the end we'll upload all the parts to archive.org. If you have an account, you can use eg s3cmd.
 
<code>s3cmd --add-header x-archive-auto-make-bucket:1 --add-header "x-archive-meta-description:Files from Fileplanet (www.fileplanet.com), all files from the ID range 110000 to 114999." put 110000-114999.tar s3://FileplanetFiles_110000-114999</code>
 
<code>s3cmd put 110000-114999/*.log s3://FileplanetFiles_110000-114999/</code>
 
Mind the trailing slash.
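
To avoid typos in the item name, the two uploads above could be generated from the chunk's start and end IDs. The helper below is a hypothetical sketch that only prints the commands (a dry run) instead of running s3cmd; the options and the item naming scheme are copied from the example above, and everything else is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the two s3cmd uploads for one chunk.
# upload_commands is a hypothetical helper; the s3cmd options are the ones quoted above.
upload_commands() {
    local start=$1 end=$2
    local chunk="$start-$end"
    local bucket="s3://FileplanetFiles_$chunk"
    local desc="Files from Fileplanet (www.fileplanet.com), all files from the ID range $start to $end."

    # First upload: the tarball, creating the item and setting its description.
    printf 's3cmd --add-header x-archive-auto-make-bucket:1 --add-header "x-archive-meta-description:%s" put %s.tar %s\n' \
        "$desc" "$chunk" "$bucket"

    # Second upload: the logs. The trailing slash on the target is required.
    printf 's3cmd put %s/*.log %s/\n' "$chunk" "$bucket"
}
```

Running <code>upload_commands 110000 114999</code> prints the same two commands shown above, ready to be reviewed and pasted into a shell.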
 
===Status===
{| class="wikitable"
|-
! Range
! Status
! Number of files
! Size in gigabytes
! Downloader
|-
| 00000-09999
| Done, [http://archive.org/details/FileplanetFiles_00000-09999 archived]
| 1991
| 1G
| Schbirid
|-
| 10000-19999
| Done, [http://archive.org/details/FileplanetFiles_10000-19999 archived]
| 3159
| 9G
| Schbirid
|-
| 20000-29999
| Done, locally
| 6453
| 7G
| Schbirid
|-
| 30000-39999
| Done, locally
| 4085
| 9G
| Schbirid
|-
| 40000-49999
| Done, [http://archive.org/details/FileplanetFiles_40000-49999 archived]
| 5704
| 18G
| Schbirid
|-
| 50000-54999
| Done, locally
| 2706
| 24G
| Schbirid
|-
| 55000-59999
| Done, [http://archive.org/details/FileplanetFiles_50000-559999 archived] (bad URL)
| 2390
| 24G
| Schbirid
|-
| 60000-64999
| Done, locally
| 2349
| 24G
| Schbirid
|-
|}

Latest revision as of 15:29, 20 April 2022

FilePlanet
Fileplanet logo
Website host of game content, 1999-2012
URL http://www.fileplanet.com
Status Special case (no longer being updated)
Archiving status Saved!
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)
(formerly #fireplanet (on EFnet))
Data? archiveteam-fileplanet

In 2012 FilePlanet announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."

FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.

The archival

After first downloading files by iterating IDs on the public website fileplanet.com and uploading them in chunks to archive.org as well as scouting the web for other public URLs, we were given FTP access to the storage servers by the staff. Thanks!

https://archive.org/details/archiveteam-fileplanet is the collection. Unpacked and sorted it amounts to ~TODO files at ~TODO TB. The ftp2 files (another ~300k files at a total of ~1.2TB) cannot be shared publicly since there are private files mixed in. We save them to IA anyway, so maybe in the future we can sort them out.

There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
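
The two URLs above differ only in their query parameter, so they can be built with a pair of small bash helpers. This is a minimal sketch: the function names are hypothetical, no URL-encoding is done (an assumption that the filename or directory contains no special characters), and whether the interface is still online is not guaranteed.

```shell
#!/usr/bin/env bash
# Sketch: build query URLs for the search interface and directory browser.
# Function names are hypothetical; the base URL and parameter names come from the text above.
FILEPLANET_INDEX='https://www.quaddicted.com/stuff/fileplanet/fileplanet.php'

# URL that searches the index for a filename.
fileplanet_search_url() {
    printf '%s?filename=%s\n' "$FILEPLANET_INDEX" "$1"
}

# URL that lists the contents of a directory.
fileplanet_directory_url() {
    printf '%s?directory=%s\n' "$FILEPLANET_INDEX" "$1"
}
```

The resulting URL can be opened in a browser or fetched with a tool such as curl, for example <code>curl -s "$(fileplanet_search_url yourfilenamehere)"</code>.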

Related items

