Difference between revisions of "Xfire"

From Archiveteam
Revision as of 10:30, 11 June 2015

June 10, #xfired:

<bzc6p> So, I've spent some hours on some algorithm despite the fact that an ArchiveTeam go wouldn't be appropriate at the moment.
<bzc6p> Users won't be served even until the deadline.
<bzc6p> And the site has a fuckton of content, discovery would be necessary – All for which we don't have the time.
<bzc6p> If nothing changes, we must see it burn away. Users, save your content while you can.
<bzc6p> In case some change comes, e.g. deadline changes, and we have a chance to archive, a good start may be what I've written on the wiki page.
<bzc6p> But until that, I don't waste a second on it – hope you'll understand, it's all in vain now.
<bzc6p> (It does not prevent anyone from working on it if he has nothing better to do, though.)

Shutdown

Shut down on June 12, 2015; the shutdown notice was posted on June 10.

http://www.reddit.com/r/Games/comments/39a41v/xfire_social_profiles_shutdown_save_your/

Archiving

Users have to wait days for their content to be exported. An ArchiveTeam effort probably wouldn't be appropriate.

For scale: there may be hundreds of millions of videos, tens of millions of profiles, and who knows how many screenshots.

Probably too late for ArchiveTeam.

bzc6p's quick-and-dirty bash script (may be buggy)

This only sketches the algorithm. It should be rewritten in Lua/Python/whatever if we actually attempt a save. It is barely tested and may be incomplete.

There may be other things worth saving (e.g. games, communities (database broken?)), but this covers just the most important items: videos, screenshots, friends, the profile page, and the avatar. It spits out a file named LIST containing the URLs to be downloaded (WARCed). Favorite servers and gaming history may already be on the profile page. For those to work, and for the screenshots to be shown, the JavaScript files must be available; the links are broken, though: the JS files are not found on xfire.com, but are found e.g. on 208.whatever.

The sole parameter is the user ID.
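A hypothetical invocation might look as follows (the script name, user ID, and WARC file name are made up for illustration; only the shape of the first two LIST entries is taken from the script itself):

```shell
# Hypothetical usage:
#   ./xfire-list.sh 123456
#   wget -i LIST --warc-file=xfire-123456
# The first two entries the script writes to LIST would look like this:
HOST="208.88.178.38"
USERID=123456
printf "http://%s/profile/%s/\nhttp://%s/friends/%s/\n" "$HOST" "$USERID" "$HOST" "$USERID" > LIST
first=$(head -n 1 LIST)
echo "$first"
rm LIST
```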

Content should be modified afterwards: some addresses need replacing, e.g. the 208.whatever host to social.xfire.com or classic.xfire.com. See the reddit thread.
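The address replacement can be done with sed. A minimal sketch, assuming the raw IP should become social.xfire.com (the sample HTML line is fabricated):

```shell
# Fabricated sample of a downloaded HTML fragment containing the raw IP
html='<a href="http://208.88.178.38/profile/42/">profile</a>'
# Rewrite the IP to the assumed hostname; on real files this would be
# something like: sed -i 's|208\.88\.178\.38|social.xfire.com|g' file.html
rewritten=$(echo "$html" | sed 's|208\.88\.178\.38|social.xfire.com|g')
echo "$rewritten"
```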

The script does not fetch page requisites (JS, CSS, etc.); those must be downloaded separately, once.
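One way to grab the shared requisites once would be wget's --page-requisites option. A sketch only; the host is the one used in the script below, while the profile URL and WARC name are assumptions, and the command is shown as a string rather than executed (it needs network access):

```shell
HOST="208.88.178.38"
# Hypothetical one-off fetch of JS/CSS requisites into a WARC; not run here
cmd="wget --page-requisites --warc-file=xfire-requisites http://$HOST/profile/1/"
echo "$cmd"
```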

#!/bin/bash
HOST="208.88.178.38"
rm LIST 2>/dev/null
echo "http://$HOST/profile/$1/" >> LIST
echo "http://$HOST/friends/$1/" >> LIST
wget "http://$HOST/profile/$1/" -O - | grep "src='http://screenshot.xfire.com/avatar/" | cut -d"'" -f 6 >> LIST
wget "http://$HOST/profile/$1/screenshots/" -O - | grep "href=\"/profile/$1/screenshots/" | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > screenshots
while read line
do
  MAXPAGE=32000
  OLDMAXPAGE=0
  while [ $OLDMAXPAGE -ne $MAXPAGE ]
  do
    OLDMAXPAGE=$MAXPAGE
    MAXPAGE=`wget $line -O - | grep "page=" | tail -1 | cut -d"=" -f 3 | cut -d"&" -f 1`
    if [ -z "$MAXPAGE" ]; then
      MAXPAGE=0
      break
    fi
  done
  rm -f albumpages
  echo "$line" >> albumpages
  [ $MAXPAGE -ge 1 ] && echo "$line?page=0&count=24" >> LIST
  for (( i=1; i<=$MAXPAGE; i++))
  do
    echo "$line?page=$i&count=24" >> albumpages
  done
  cat albumpages >> LIST
  while read line2
  do
    rm -f album
    wget "$line2" -O album
    grep 'src="http://screenshot.xfire.com/s/' album | cut -d'"' -f 2 > thumbnames
    cat thumbnames >> LIST
    cat thumbnames | sed "s/-1/-3/g" >> LIST
    cat thumbnames | sed "s/-1/-4/g" >> LIST
    rm thumbnames
    # Prepend the album id first, then the host prefix, to the relative "?view#" links
    grep 'href="?view#' album | cut -d'"' -f 2 | sed "s/^/`echo $line2 | cut -d'/' -f 7`\//g" | sed "s/^/http:\/\/$HOST\/profile\/$1\/screenshots\//g" >> LIST
    rm album
    # TODO: support for comments
  done < albumpages
  rm albumpages
done < screenshots
rm screenshots

echo "http://$HOST/profile/$1/videos/" >> LIST
wget "http://$HOST/profile/$1/videos/" -O - | grep "href=\"/profile/$1/videos/" | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > videos
while read line
do
  # This is just speculative, but probably it works just as with the screenshots.
  MAXPAGE=32000
  OLDMAXPAGE=0
  while [ $OLDMAXPAGE -ne $MAXPAGE ]
  do
    OLDMAXPAGE=$MAXPAGE
    MAXPAGE=`wget $line -O - | grep "page=" | tail -1 | cut -d"=" -f 3 | cut -d"&" -f 1`
    if [ -z "$MAXPAGE" ]; then
      MAXPAGE=0
      break
    fi
  done
  rm -f albumpages
  echo "$line" >> albumpages
  [ $MAXPAGE -ge 1 ] && echo "$line?page=0&count=24" >> LIST
  for (( i=1; i<=$MAXPAGE; i++))
  do
    echo "$line?page=$i&count=24" >> albumpages
  done
  cat albumpages >> LIST
  while read line2
  do
    rm -f album
    wget "$line2" -O album
    grep "video.xfire.com" album | cut -d'"' -f 4 >> LIST
    # Use a separate file name here: the outer loop is still reading from "videos"
    grep "video.xfire.com" album | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > videolinks
    rm album
    cat videolinks >> LIST
    while read line3
    do
      wget "$line3" -O - | grep "\.mp4" | cut -d"'" -f 2 >> LIST
      # TODO: support for comments
    done < videolinks
    rm videolinks
  done < albumpages
  rm albumpages
done < videos
rm videos
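The MAXPAGE extraction used above can be checked offline. The pager line below is a fabricated example of the link format the grep/cut chain assumes:

```shell
# Fabricated sample of an Xfire pager link
pager='<a href="/profile/42/screenshots/abc/?page=7&count=24">7</a>'
# Same pipeline as in the script: the third '='-field is "7&count", then cut at '&'
MAXPAGE=$(echo "$pager" | grep "page=" | tail -1 | cut -d"=" -f 3 | cut -d"&" -f 1)
echo "$MAXPAGE"
```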