|URL||• • •|
|Archiving status||In progress...|
|IRC channel||(on hackint)|
Friendster is an early social networking site. On April 25th, 2011, it announced that most of the user-generated content on the site would be deleted on May 31st, 2011. It's estimated that Friendster has over 115 million registered users.
How to help
Alard has written a script that will download a single profile; he'll post it shortly once we've tested it a little.
We're going to break up the user ids into ranges and let individuals claim a range to download. Use this table to mark your territory:
You need a Friendster account to use this script. Add your login details to two files, username.txt and password.txt, and save them in the same directory as the download script.
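Before starting a long run, it can help to check that both credential files are in place and non-empty. The following is only a sketch of such a check, not part of Alard's script; the helper name `read_credentials` is hypothetical, and it assumes each file holds a single value on its first line.

```python
from pathlib import Path


def read_credentials(script_dir="."):
    """Return (username, password) read from username.txt and password.txt.

    Hypothetical helper: assumes each file contains just the value,
    optionally followed by a trailing newline.
    """
    base = Path(script_dir)
    # strip() removes the trailing newline that most editors append
    username = (base / "username.txt").read_text(encoding="utf-8").strip()
    password = (base / "password.txt").read_text(encoding="utf-8").strip()
    if not username or not password:
        raise ValueError("username.txt and password.txt must both be non-empty")
    return username, password
```

A missing file raises `FileNotFoundError`, which is easier to diagnose than a failed login halfway through a range.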
Run the script with the numeric profile id of a Friendster user. It downloads:
- the main profile page (profiles.friendster.com/$PROFILE_ID)
- the user's profile image from that page
- the list of public albums (www.friendster.com/viewalbums.php?uid=$PROFILE_ID)
- each of the album pages (www.friendster.com/viewphotos.php?a=$id&uid=$PROFILE_ID)
- the original photos from each album
- the list of friends (www.friendster.com/fans.php?uid=$PROFILE_ID)
- the shoutoutstream (www.friendster.com/shoutoutstream.php?uid=$PROFILE_ID) and the associated comments
- the Friendster blog, if any
It does not download any of the widgets.
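The fixed entry points in the list above can be written down as a small URL builder. This is a sketch for illustration, not the download script itself: the function names are hypothetical, the `http://` scheme is assumed, and the blog URL is left out because its form is not given here.

```python
def profile_urls(profile_id):
    """Return the per-profile URLs listed above, keyed by a short label."""
    pid = int(profile_id)  # profile ids are numeric
    return {
        "profile": f"http://profiles.friendster.com/{pid}",
        "albums": f"http://www.friendster.com/viewalbums.php?uid={pid}",
        "friends": f"http://www.friendster.com/fans.php?uid={pid}",
        "shoutoutstream": f"http://www.friendster.com/shoutoutstream.php?uid={pid}",
    }


def album_url(album_id, profile_id):
    """Return the page for one album; album ids come from the album list."""
    return (
        "http://www.friendster.com/viewphotos.php"
        f"?a={album_id}&uid={int(profile_id)}"
    )
```

The album pages, photos, and shoutout comments still have to be discovered by parsing the downloaded pages, so a builder like this only covers the starting points.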
Downloading one profile takes between 6 and 10 seconds and generates 200-400 kB of data (for normal profiles).
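These per-profile figures give a rough sense of the total effort. The sketch below simply multiplies them out; it assumes, for the sake of the estimate, that all of the roughly 115 million registered accounts behave like normal profiles, which is almost certainly too pessimistic for empty or deleted accounts. The function name `estimate` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimate(profiles, seconds_per_profile, kb_per_profile, workers=1):
    """Return wall-clock days and total size for downloading `profiles` profiles.

    Assumes every profile costs the same time and space, and that
    `workers` downloaders run in parallel without slowing each other down.
    """
    total_seconds = profiles * seconds_per_profile / workers
    return {
        "days": total_seconds / SECONDS_PER_DAY,
        "terabytes": profiles * kb_per_profile / 1e9,  # 1 TB = 10**9 kB
    }


if __name__ == "__main__":
    for seconds, kb in ((6, 200), (10, 400)):
        result = estimate(115_000_000, seconds, kb)
        print(f"{seconds} s, {kb} kB per profile: "
              f"{result['days']:.0f} days, {result['terabytes']:.0f} TB")
```

Under these assumptions a single downloader would need about 22 to 36 years and produce 23 to 46 TB, so finishing in the roughly 36 days between the announcement and the deletion date would take on the order of 220 to 370 downloaders running at once.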
Content on Friendster seems to be organized primarily by user id, and ids were assigned sequentially starting at 1. This should make it fairly easy for wget to scrape the site and for us to break it up into convenient work units. The main components we need to scrape are the profile pages, photo albums and blogs, but there may be others. More research is needed.
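Because the ids are sequential, splitting the work is a matter of cutting the id space into fixed-size blocks. A minimal sketch follows; the function name and the block size in the usage note are illustrative, not the ranges actually handed out.

```python
def id_ranges(first_id, last_id, size):
    """Yield inclusive (start, end) id ranges covering first_id..last_id."""
    if size <= 0:
        raise ValueError("size must be positive")
    start = first_id
    while start <= last_id:
        # The final range is shortened so it never runs past last_id.
        end = min(start + size - 1, last_id)
        yield start, end
        start = end + 1
```

For example, `list(id_ranges(1, 10, 4))` gives `[(1, 4), (5, 8), (9, 10)]`. Each tuple is one unit a volunteer could claim.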
Profile pages have URLs of the form 'http://profiles.friendster.com/<userid>'. Many pictures on these pages are hosted at URLs that look like 'http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>.jpg', but these folders aren't directly browsable. Profiles will not be easy to scrape with wget.
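One reading of the photo URL pattern is that 'nnnnnijkl' is the user id and that the two directory levels are its last four digits taken in reverse pairs: the last two digits reversed ('lk'), then the two before them reversed ('ji'). The sketch below encodes that reading only. It is an assumption drawn from the placeholder letters, not confirmed behaviour, and the function name is hypothetical.

```python
def photo_dir(user_id):
    """Return the assumed photo directory URL for a user id.

    Assumes the id has the form nnnnnijkl and maps to
    photos/<lk>/<ji>/<id>/. Ids shorter than four digits are rejected
    because the pattern does not say how they are handled.
    """
    digits = str(int(user_id))
    if len(digits) < 4:
        raise ValueError("pattern is only described for ids of 4+ digits")
    i, j, k, l = digits[-4:]
    return f"http://photos-p.friendster.com/photos/{l}{k}/{j}{i}/{digits}/"
```

Under this reading, id 123456789 would map to 'http://photos-p.friendster.com/photos/98/76/123456789/'. Since the folders are not browsable, the image ids themselves still have to come from the profile and album pages.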
On the individual album pages, the photo thumbnails are stored under paths similar to those of the main images: if the album thumbnail is at http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>m.jpg, just drop the final 'm' to get the main photo (or replace it with a 't' to get an even tinier version).
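The thumbnail rule can be applied mechanically to every thumbnail URL found on an album page. A minimal sketch, with a hypothetical function name, assuming thumbnail URLs always end in 'm.jpg':

```python
def photo_variants(thumb_url):
    """Derive the main and tiny image URLs from an album thumbnail URL."""
    suffix = "m.jpg"
    if not thumb_url.endswith(suffix):
        raise ValueError(f"not an album thumbnail URL: {thumb_url}")
    stem = thumb_url[: -len(suffix)]  # everything up to the size letter
    return {
        "main": stem + ".jpg",   # drop the final 'm'
        "tiny": stem + "t.jpg",  # replace 'm' with 't'
    }
```

Rejecting URLs that do not end in 'm.jpg' keeps the function from silently mangling a URL that is already a main photo.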