Fileplanet

FilePlanet

Website host of game content, 1996-2012
URL	http://www.fileplanet.com
Status	Closing
Archiving status	In progress...
Archiving type	Unknown
IRC channel	#fireplanet (on hackint)

FilePlanet is no longer hosting new content, and "is in the process of being archived [by IGN]."

FilePlanet hosts 87,190 download pages of game-related material (demos, patches, mods, promo stuff, etc.), all of which need to be archived. These tend to be larger files, ranging from 10MB patches to 3GB clients. We'll want all the arms we can get for this one, since it gets harder the farther the archiving goes (files are numbered chronologically, and Skyrim mods are bigger than Doom ones).

To help: Download and run Schbirid's script. This grabs the files and the source page, for its metadata. You'll need to know what the start and stop parameters are - get those numbers from Schbirid in IRC.

Here is a thread keeping track of current status (currently on 60k).

What We Need

Files! (approx. 15% done as of 5/10/12)
/fileinfo/ pages - get URLs from sitemaps (Schbirid is downloading these)
http://blog.fileplanet.com
A list of all "site:www.fileplanet.com inurl:hosteddl" URLs since these files seem not to be in the simple ID range

How to help

Have bash, wget, grep, rev, cut
>30 gigabytes of space per 5k increment
Put https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh somewhere and "chmod +x" it
Pick a free 5k increment (eg 110000-114999) and tell people about it (#fireplanet in EFnet or post it here)
Create a new working directory for your download, named after your chunk. Eg 110000-114999/
INSIDE that directory, run the script with your start and end IDs as arguments. Eg "./download_pages_and_files_from_fileplanet.sh 110000 114999"
Take a walk for half a day.
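The checks in the list above can be sketched as a few shell helpers. This is only a rough sketch and not part of Schbirid's script: the function names (missing_tools, chunk_name, has_free_gib) are made up for illustration, and the free-space check assumes the POSIX "df -Pk" output layout.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; none of these names come from the
# download script itself.

# Print the name of every listed tool that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Print the working-directory name for a chunk, e.g. "110000-114999".
chunk_name() {
    printf '%s-%s\n' "$1" "$2"
}

# Succeed if the filesystem holding the current directory has at least
# $1 GiB available (assumes the POSIX "df -Pk" column layout).
has_free_gib() {
    local need_kib=$(( $1 * 1024 * 1024 ))
    local avail_kib
    avail_kib=$(df -Pk . | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge "$need_kib" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   missing_tools bash wget grep rev cut      # prints nothing if all are present
#   has_free_gib 30 || echo "less than 30G free here"
#   dir=$(chunk_name 110000 114999) && mkdir "$dir" && cd "$dir"
```

If missing_tools prints anything, install those tools before picking a chunk.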

Once you are done with your chunk, you will have pages_xx000-xx999.log and files_xx000-xx999.log plus the www.fileplanet.com/ directory.
Do a "grep -i error *.log" first and see if there were error messages. If so, tell us.
"cd .." and "tar -cf 110000-114999.tar 110000-114999/"
"du -hs 110000-114999/www.fileplanet.com/" and "ls -1 110000-114999/www.fileplanet.com/ | wc -l" and add those numbers to the table below.
Done! GOTO 10
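The wrap-up steps above could be wrapped in small functions like these. It is a sketch only: the function names are invented here, and it assumes the chunk directory contains the *.log files and the www.fileplanet.com/ directory exactly as described above.

```shell
#!/usr/bin/env bash
# Hypothetical wrap-up helpers for a finished chunk directory.

# Print how many log lines in the chunk directory contain "error"
# (case-insensitive). Anything other than 0 is worth reporting.
count_log_errors() {
    cat "$1"/*.log 2>/dev/null | grep -ci error
}

# Print the number of entries under www.fileplanet.com/ for the chunk.
count_entries() {
    ls -1 "$1/www.fileplanet.com/" | wc -l | tr -d ' '
}

# Print the disk usage of www.fileplanet.com/ for the chunk.
chunk_size() {
    du -hs "$1/www.fileplanet.com/"
}

# Create CHUNK.tar from CHUNK/ ; run this from the parent directory.
pack_chunk() {
    tar -cf "$1.tar" "$1/"
}

# Example use, from the directory that contains 110000-114999/ :
#   count_log_errors 110000-114999
#   chunk_size 110000-114999
#   count_entries 110000-114999
#   pack_chunk 110000-114999
```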

In the end we'll upload all the parts to archive.org. If you have an account, you can use eg s3cmd.

s3cmd --add-header x-archive-auto-make-bucket:1 --add-header "x-archive-meta-description:Files from Fileplanet (www.fileplanet.com), all files from the ID range 110000 to 114999." put 110000-114999.tar s3://FileplanetFiles_110000-114999

s3cmd put 110000-114999/*.log s3://FileplanetFiles_110000-114999/

Mind the trailing slash.
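To avoid retyping the range in several places, the two commands above could be generated from the start and end IDs. The sketch below only prints the commands so you can check them before running anything; the function name is made up, and the bucket naming simply follows the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the two s3cmd commands for a chunk.
print_upload_commands() {
    local range="$1-$2"
    local bucket="s3://FileplanetFiles_${range}"
    local desc="x-archive-meta-description:Files from Fileplanet (www.fileplanet.com), all files from the ID range $1 to $2."
    # First command: create the item and upload the tarball.
    printf 's3cmd --add-header x-archive-auto-make-bucket:1 --add-header "%s" put %s.tar %s\n' \
        "$desc" "$range" "$bucket"
    # Second command: upload the logs; note the trailing slash on the bucket.
    printf 's3cmd put %s/*.log %s/\n' "$range" "$bucket"
}

# Example use:
#   print_upload_commands 110000 114999
```

Copy and paste the printed lines once they look right.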

Status

Range	Status	Number of files	Size in gigabytes	Downloader
00000-09999	Done, archived	1991	1G	Schbirid
10000-19999	Done, archived	3159	9G	Schbirid
20000-29999	Done, locally	6453	7G	Schbirid
30000-39999	Done, locally	4085	9G	Schbirid
40000-49999	Done, archived	5704	18G	Schbirid
50000-54999	Done, locally	2706	24G	Schbirid
55000-59999	Done, archived (bad URL)	2390	24G	Schbirid
60000-64999	Done, locally	2349	24G	Schbirid
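For a quick running total, the table can be summed with awk. This is a sketch that assumes the table has been saved as tab-separated text with the header row shown above; the function name and the file name in the usage comment are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the "Number of files" and "Size in gigabytes"
# columns of the tab-separated status table read from stdin.
sum_status_table() {
    awk -F '\t' '
        NR > 1 {
            files += $3          # third column: number of files
            sub(/G$/, "", $4)    # strip the trailing "G" from the size
            gigs += $4
        }
        END { printf "%d files, %dG\n", files, gigs }
    '
}

# Example use (status.tsv is an assumed file name):
#   sum_status_table < status.tsv
```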
