Difference between revisions of "Fileplanet"

From Archiveteam
(3 intermediate revisions by the same user not shown)
In 2012 [http://www.fileplanet.com FilePlanet] announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."


FilePlanet hosted at least 87,190 download pages of game-related material (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.
FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.


===Current Situation===
===The archival===


IGN gave us direct access to the files, hooray! Mirroring is done, see https://archive.org/details/archiveteam-fileplanet for the tarballs. The ftp2 files cannot be shared publicly since there are private files mixed in; we save them to IA anyway so that we can maybe sort them out in the future. A detailed writeup and user-friendly interface will be available later. No help needed, everything below is outdated. Thanks for your interest!
After first downloading files [[Fileplanet/Status_of_by_id_grab|by iterating IDs on the public website fileplanet.com and uploading those in chunks to archive.org]] as well as [[Fileplanet/non-id-urls|scouting the web for other public URLs]], we were given FTP access to the storage servers by the staff. Thanks!
 
https://archive.org/details/archiveteam-fileplanet is the collection. Unpacked and sorted, it amounts to ~TODO files at ~TODOTB. The ftp2 files (another ~300k files totaling ~1.2TB) cannot be shared publicly since there are private files mixed in; we save them to IA anyway so that we can maybe sort them out in the future.


There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/


===What We Need===
===Related items===
 
* /fileinfo/ pages and the embedded images/thumbnails from the grab by IDs: https://archive.org/details/FileplanetFiles_fileinfo_pages_images
* More file URLs, see https://archiveteam.org/index.php?title=Fileplanet/non-id-urls
* /download/ pages and download logs from the grab by IDs: https://archive.org/details/Fileplanet_index.htmls_and_logs_scraped_by_id
* Where do links like http://dl.fileplanet.com/dl/dl.asp?classicgaming/o2home/rtl.zip come from and can we rescue those too?
* http://blog.fileplanet.com: https://archive.org/details/FileplanetBlogFileplanetCom
** The non-IDed files are stuck behind the download manager - any clever way past it?  URLs to the files are of the form [http://download.direct2drive.com/ftp2/planetannihilation/mercilesscreations/opflash/opflash_-_uber_editor_tutorial.pdf?clientid=781894158 http://download.direct2drive.com/ftp2/planetannihilation/mercilesscreations/opflash/opflash_-_uber_editor_tutorial.pdf?clientid=781894158] and seem to require a valid ID to fetch.
* http://www.fileplanet.com/fileblog/archives/: https://archive.org/details/FileplanetFileblog
*** Those URLs are the ones we currently fetch too. The script "visits" the download page and extracts such a URL. The problem with these files is that they open the download link in a new window, and I have not yet found out how to "open" that window correctly with wget. Haven't really tried though. -Schbirid
* Files! (approx. ??% done 22 June 2012)
** The easy part (incrementing a fileID and downloading it) is pretty much done, we got ~7 Terabytes through that.
* /fileinfo/ pages - get URLs from sitemaps (Schbirid is downloading these)
** Afterwards, extract all thumbnail image links and grab the full size images (strip _sm2 from the basename)
*** grep -hPo 'http.*?_sm2.jpg' fileinfo*/fileinfo.log | sed 's/_sm2//' > fileinfo_fullsizeimages_URLs; # wgot those
**** Done! https://archive.org/details/FileplanetFiles_fileinfo_pages_images
** Schbirid is re-downloading all the fileinfo pages by incrementing the ID, the sitemaps were missing URLs
* [http://blog.fileplanet.com http://blog.fileplanet.com]
** Done! https://archive.org/details/FileplanetBlogFileplanetCom
* Schbirid mirrored http://www.fileplanet.com/fileblog/archives/ (starting from a URL like http://www.fileplanet.com/fileblog/archives/10-24-2010_10-30-2010.shtml ).
** Done! https://archive.org/details/FileplanetFileblog
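
The thumbnail step in the list above (find every `_sm2.jpg` link in the fileinfo logs, then strip `_sm2` to get the full-size image URL) can also be sketched in Python. This is only an illustration, not the script that was actually used: the function name `full_size_urls` and the example URL are made up, and unlike the grep pattern this version escapes the dot and does not match across whitespace.

```python
import re

# Thumbnail URLs end in "_sm2.jpg". Unlike the grep pattern above, the dot is
# escaped and the match cannot span whitespace (both are choices of this sketch).
THUMBNAIL_URL = re.compile(r"http\S*?_sm2\.jpg")


def full_size_urls(log_text):
    """Return the full-size image URLs derived from thumbnail URLs in log_text."""
    urls = []
    for match in THUMBNAIL_URL.finditer(log_text):
        thumbnail = match.group(0)
        # Drop the trailing "_sm2.jpg" and put the plain ".jpg" extension back.
        urls.append(thumbnail[: -len("_sm2.jpg")] + ".jpg")
    return urls


if __name__ == "__main__":
    # Hypothetical log line; real fileinfo.log contents will look different.
    sample = "GET http://example.com/images/shot_sm2.jpg 200"
    for url in full_size_urls(sample):
        print(url)
```

The resulting list could be written to a file, one URL per line, and fed to a downloader in the same way as the `fileinfo_fullsizeimages_URLs` file.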
 
=== Grabbing files by iterating the IDs on the website ===
First we tried to archive all the files by iterating the IDs in the URLs on the website. See https://wiki.archiveteam.org/index.php?title=Fileplanet/Status_of_by_id_grab for details and what we achieved.
 
=== Grabbing files from the FTP ===
* Schbirid emailed FPOps@IGN.com and at some point got FTP access to archive the files.
* TODO
 


{{navigation box}}

Revision as of 15:29, 20 April 2022

FilePlanet
Fileplanet logo
Website host of game content, 1999-2012
URL http://www.fileplanet.com
Status Special case (no longer being updated)
Archiving status Saved!
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)
(formerly #fireplanet (on EFnet))
Data archiveteam-fileplanet

In 2012 FilePlanet announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."

FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.

The archival

After first downloading files by iterating IDs on the public website fileplanet.com and uploading those in chunks to archive.org as well as scouting the web for other public URLs, we were given FTP access to the storage servers by the staff. Thanks!

https://archive.org/details/archiveteam-fileplanet is the collection. Unpacked and sorted, it amounts to ~TODO files at ~TODOTB. The ftp2 files (another ~300k files totaling ~1.2TB) cannot be shared publicly since there are private files mixed in; we save them to IA anyway so that we can maybe sort them out in the future.

There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
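
For scripted lookups, the two URL forms above can be built with a small helper. This is a minimal sketch that only assumes the `filename` and `directory` query parameters shown in those URLs; the function names are made up, and how the server treats encoded characters (for example spaces sent as `+`) is not known.

```python
from urllib.parse import urlencode

# Base URL taken from the search interface links above.
BASE = "https://www.quaddicted.com/stuff/fileplanet/fileplanet.php"


def search_url(filename):
    """Build a search URL for a file name (hypothetical helper)."""
    return BASE + "?" + urlencode({"filename": filename})


def browse_url(directory):
    """Build a directory browser URL, keeping slashes unescaped (hypothetical helper)."""
    return BASE + "?" + urlencode({"directory": directory}, safe="/")


if __name__ == "__main__":
    print(search_url("yourfilenamehere"))
    print(browse_url("/some/dir/here/"))
```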

Related items