Difference between revisions of "Friendster"

From Archiveteam

Revision as of 05:25, 5 May 2011

Friendster
URL http://www.friendster.com/
Status Closing
Archiving status In progress...
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

Friendster is an early social networking site which announced on April 25th, 2011 that most of the user-generated content on the site would be deleted on May 31st, 2011. It's estimated that Friendster has over 115 million registered users.

How to help

Scrape profiles

We're going to break up the user ids into ranges and let individuals claim a range to download. Use this table to mark your territory:

Start End Status Size (Uncompressed) Claimant
1 999 Claimed db48x
1000 1999 Claimed alard
2000 2999 Claimed DoubleJ
3000 3999 Claimed Teaspoon
4000 4999 Done 183MB Paradoks
5000 5999 Done 202MB robbiet48/Robbie Trencheny (Amsterdam)
6000 9999 Claimed Sketchcow/Jason Scott
10000 29999 Claimed Sketchcow/Jason Scott
30000 31999 Done 485MB Sketchcow/Jason Scott
32000 32999 Done 201MB Paradoks
33000 33999 Done 241MB closure
34000 100000 Claimed closure
100000 101000 Claimed xlene
101001 102000 Claimed robbiet48/Robbie Trencheny (Florida)
102001 103000 Claimed robbiet48/Robbie Trencheny (Amsterdam)
103001 104000 Claimed yipdw
104001 105000 Claimed Coderjoe
105001 114999 Claimed Paradoks
124328261 Pool
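
As a rough illustration of carving the pool into claimable work units, the sketch below prints non-overlapping start/end pairs. The function name `split_ranges` and the chunk size are hypothetical; nothing here is part of the existing scripts, and the ranges it prints still have to be claimed by hand in the table.

```shell
# Hypothetical helper: print "start end" pairs that cover first..last
# in non-overlapping chunks of at most `size` ids.
split_ranges() {
    local first="$1" last="$2" size="$3" start end
    # A chunk size below 1 would never advance, so reject it.
    [ "$size" -ge 1 ] || { echo "size must be at least 1" >&2; return 1; }
    start=$first
    while [ "$start" -le "$last" ]; do
        end=$(( start + size - 1 ))
        if [ "$end" -gt "$last" ]; then
            end=$last
        fi
        printf '%s %s\n' "$start" "$end"
        start=$(( end + 1 ))
    done
}

# Example: work units of 1000 ids covering 115000 through 117499.
split_ranges 115000 117499 1000
```

Starting each unit one past the previous end avoids the shared boundary id that appears in a few of the hand-entered rows.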

Tools

friendster-scrape-profile

A script that downloads a single Friendster profile.

You need a Friendster account to use this script. Put your Friendster e-mail address in a file named username.txt and your password in a file named password.txt, and save both in the same directory as the download script.

Run with a numeric profile id of a Friendster user: ./friendster-scrape-profile PROFILE_ID

Currently downloads:

  • the main profile page (profiles.friendster.com/$PROFILE_ID)
  • the user's profile image from that page
  • the list of public albums (www.friendster.com/viewalbums.php?uid=$PROFILE_ID)
  • each of the album pages (www.friendster.com/viewphotos.php?a=$id&uid=$PROFILE_ID)
  • the original photos from each album
  • the list of friends (www.friendster.com/fans.php?uid=$PROFILE_ID)
  • the shoutoutstream (www.friendster.com/shoutoutstream.php?uid=$PROFILE_ID) and the associated comments
  • the Friendster blog, if any

It does not download any of the widgets.
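
For reference, the fixed per-profile URLs from the list above can be generated from a profile id. This is only a sketch: the function name `profile_urls` is hypothetical, the `http://` scheme is assumed for the entries listed without one, and album pages, photos and blogs are left out because their URLs depend on the content of the fetched pages.

```shell
# Hypothetical helper: print the fixed URLs for one profile id.
profile_urls() {
    local id="$1"
    # Reject anything that is not a plain decimal number.
    case "$id" in
        ''|*[!0-9]*) echo "profile id must be numeric: $id" >&2; return 1 ;;
    esac
    printf 'http://profiles.friendster.com/%s\n' "$id"
    printf 'http://www.friendster.com/viewalbums.php?uid=%s\n' "$id"
    printf 'http://www.friendster.com/fans.php?uid=%s\n' "$id"
    printf 'http://www.friendster.com/shoutoutstream.php?uid=%s\n' "$id"
}

# Example: list the URLs for profile 4000.
profile_urls 4000
```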

Downloading one normal profile takes 6 to 10 seconds and generates 200-400 kB of data.
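
Those per-profile figures give a quick way to size a range before claiming it. The sketch below multiplies them out; the function name `estimate_range` is hypothetical, and it assumes every id in the range is a normal profile downloaded one at a time, so treat the result as a ballpark only.

```shell
# Hypothetical helper: rough time and size estimate for an id range,
# using 6-10 seconds and 200-400 kB per profile.
estimate_range() {
    local start="$1" end="$2" count
    count=$(( end - start + 1 ))
    echo "profiles: $count"
    echo "time: $(( count * 6 ))-$(( count * 10 )) seconds"
    echo "size: $(( count * 200 ))-$(( count * 400 )) kB"
}

# Example: a 1000-id range such as 4000 to 4999.
estimate_range 4000 4999
```

The finished 1000-id ranges in the table came in at roughly 183 to 241 MB, which is near the low end of this estimate.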

Automating the process

(This is all unix-only; it won't work in Windows.)
1. Create a Friendster account.
2. Download the script from http://pastebin.com/raw.php?i=pnXJy3AC and name it 'bff.sh'.
3. In the directory where you put bff.sh, create a username.txt file containing your Friendster e-mail address.
4. In the same directory, create a password.txt file containing your Friendster password.
5. Choose your profile range (see "Scrape profiles" above).
6. Edit the table in that section to record the range you are claiming.
7. On the command line, run the following, replacing each '#' with the start and end of your range:
for i in {#..#}; do bash bff.sh $i; done
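
If you would rather keep the range in variables, a small wrapper along these lines may help. It is a sketch, not part of bff.sh: the names `scrape_range`, `check_credentials` and `SCRAPER` are hypothetical, and the only things taken from the steps above are bff.sh, username.txt and password.txt.

```shell
# Command run for each profile id. Set SCRAPER=echo for a dry run
# that only prints the ids without touching the network.
SCRAPER="${SCRAPER:-bash bff.sh}"

# Hypothetical helper: make sure both credential files exist and are non-empty.
check_credentials() {
    local f
    for f in username.txt password.txt; do
        [ -s "$f" ] || { echo "missing or empty $f" >&2; return 1; }
    done
}

# Hypothetical helper: run the scraper once per id from start to end, inclusive.
scrape_range() {
    local start="$1" end="$2" i
    i=$start
    while [ "$i" -le "$end" ]; do
        # $SCRAPER is left unquoted on purpose so "bash bff.sh" splits into words.
        $SCRAPER "$i" || echo "profile $i failed" >&2
        i=$(( i + 1 ))
    done
}

# Example (commented out so that pasting this block downloads nothing):
# check_credentials && scrape_range 4000 4999
```

A counting loop is used instead of `{#..#}` because bash performs brace expansion before variable substitution, so the brace form only accepts literal numbers.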

Site Organization

Content on Friendster seems to be primarily organized by the id number of the users, which were assigned sequentially starting at 1. This should make it fairly easy for wget to scrape the site and for us to break it up into convenient work units. The main components we need to scrape are the profile pages, photo albums and blogs, but there may be others. More research is needed.

Profiles

Profile pages have URLs of the form 'http://profiles.friendster.com/<userid>'. Many pictures on these pages are hosted at URLs that look like 'http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>.jpg', but these folders aren't browsable directly. Profiles will not be easy to scrape with wget.
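
One reading of the photo URL pattern is that 'nnnnnijkl' is the user id and the two folders are its last four digits taken in reverse pairs. The sketch below builds the folder URL on that assumption; the function name `photo_dir` is hypothetical, the pattern has not been confirmed against the site, and ids shorter than four digits are rejected because the pattern does not say how they are handled.

```shell
# Hypothetical helper: derive the photo folder URL for a user id,
# assuming the folders are <lk>/<ji> for an id ending in the digits i j k l.
photo_dir() {
    local id="$1" last4 i j k l tmp
    case "$id" in
        ''|*[!0-9]*) echo "user id must be numeric: $id" >&2; return 1 ;;
    esac
    [ "${#id}" -ge 4 ] || { echo "id too short for this pattern: $id" >&2; return 1; }
    # Keep only the last four digits, then pick them apart one by one.
    last4=${id#"${id%????}"}
    i=${last4%???}
    l=${last4#???}
    tmp=${last4#?};  j=${tmp%??}
    tmp=${last4#??}; k=${tmp%?}
    printf 'http://photos-p.friendster.com/photos/%s%s/%s%s/%s\n' "$l" "$k" "$j" "$i" "$id"
}

# Example with a made-up user id.
photo_dir 123456789
```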

Photo Albums

A user's photo albums are at URLs that look like 'http://www.friendster.com/viewalbums.php?uid=<userid>', with individual albums at 'http://www.friendster.com/viewphotos.php?a=<album id>&uid=<userid>'. It appears that the individual photo pages use JavaScript to load the images, so they will be very hard to scrape.

On the individual album pages, the photo thumbnails are stored under paths similar to those of the main images. For example, if the album thumbnail is at http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>m.jpg, just drop the final 'm' to get the main photo (or replace it with a 't' to get an even tinier version).
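
The thumbnail rule is a plain suffix rewrite, sketched here with shell parameter expansion. The function names `full_photo_url` and `tiny_photo_url` are hypothetical and the example URL is made up; both functions assume their input really is an album thumbnail URL ending in 'm.jpg'.

```shell
# Hypothetical helper: album thumbnail URL -> main photo URL (drop the final 'm').
full_photo_url() {
    local url="$1"
    case "$url" in
        *m.jpg) printf '%s.jpg\n' "${url%m.jpg}" ;;
        *) echo "not a thumbnail URL: $url" >&2; return 1 ;;
    esac
}

# Hypothetical helper: album thumbnail URL -> tiny version (replace 'm' with 't').
tiny_photo_url() {
    local url="$1"
    case "$url" in
        *m.jpg) printf '%st.jpg\n' "${url%m.jpg}" ;;
        *) echo "not a thumbnail URL: $url" >&2; return 1 ;;
    esac
}

# Example with a made-up thumbnail URL.
full_photo_url 'http://photos-p.friendster.com/photos/12/34/4321/555m.jpg'
tiny_photo_url 'http://photos-p.friendster.com/photos/12/34/4321/555m.jpg'
```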

Blogs

Unknown.