Tabblo
Tabblo | |
URL | http://www.tabblo.com/ |
Status | Offline |
Archiving status | Saved! User lookup |
Archiving type | Unknown |
Project source | https://github.com/ArchiveTeam/tabblo-grab |
Project tracker | http://tabb.heroku.com |
IRC channel | #archiveteam-bs (on hackint) |
See also Goodbye Tabblo, a post by Ned Batchelder (former Tabblo employee).
Tabblo Lifeboat
Ned Batchelder (former Tabblo employee) wrote Tabblo Lifeboat, a Python script that helps users to download their tabblos.
We have everything. Thanks!
Thanks to your help we've downloaded almost everything. We're currently organizing the files before uploading them to Archive.org.
How to help archiving
Easy option: use the ArchiveTeam Warrior, a virtual machine you can download from [1]. Install the appliance, boot it, and choose the Tabblo project from the menu.
There is a distributed download script that gets usernames from a tracker and downloads the data.
Make sure you are on Linux and have curl, git, and a recent version of Bash. Your system must also be able to compile wget with the Lua extensions.
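A quick way to check these requirements before starting is sketched below. The function name missing_tools is made up for this example, and the inclusion of gcc and make is an assumption about what compiling wget needs on your system; the project itself does not ship this check.

```shell
#!/usr/bin/env bash
# Report which of the given commands are missing from PATH.
# Prints one missing name per line; returns non-zero if any are missing.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# curl, git and bash come from the requirements above.
# gcc and make are an assumption: some compiler toolchain is needed to build wget.
if missing_tools curl git bash gcc make; then
  echo "All required tools found."
else
  echo "Install the tools listed above before continuing." >&2
fi
echo "Bash version: ${BASH_VERSION}"
```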
- Get the code:
git clone git://github.com/ArchiveTeam/tabblo-grab.git
- Get and compile the latest version of wget-warc-lua:
./get-wget-warc-lua.sh
- Think of a nickname for yourself (preferably use your IRC name).
- Run the download script with
./seesaw.sh "<YOURNICK>"
- To stop the script gracefully, run
touch STOP
in the script's working directory. It will finish the current task and stop.
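The STOP file works as a simple signal between you and the running script. The sketch below shows the general pattern, not the actual contents of seesaw.sh: run_one_task is a hypothetical stand-in for downloading one user, and max_tasks exists only so the example terminates on its own.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "get a username from the tracker and download it".
run_one_task() {
  echo "task $1 done"
}

# Keep working until a STOP file appears in the given directory.
# The file is only checked between tasks, so a running task always finishes.
work_loop() {
  local dir=$1 max_tasks=$2 n=0
  while [ ! -f "$dir/STOP" ] && [ "$n" -lt "$max_tasks" ]; do
    n=$((n + 1))
    run_one_task "$n"
  done
  echo "stopped after $n tasks"
}
```

Because the check happens at the top of the loop, touching STOP never interrupts a download halfway through.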
OS X
Note that these instructions require Homebrew.
brew tap ArchiveTeam/tools
brew install tabblo
cd `brew --prefix tabblo`
- Run the download script with
./seesaw.sh "<YOURNICK>"
- To stop the script gracefully, run
touch STOP
in the script's working directory. It will finish the current task and stop.
Notes
- Compiling wget-warc requires the dev packages for the libraries it depends on. Most questions have been about gnutls; install the gnutls-devel or gnutls-dev package with your favorite package manager. You'll also need the Lua development library: liblua5.1-0-dev on Ubuntu, or lua-devel on RPM-based distributions.
- Downloading one user's data can take between 10 seconds and a few hours.
- The size of one user's data varies just as much, from a few kB to several MB.
- The downloaded data will be saved in the ./data/ subdirectory.
- Download speeds from Tabblo.com are not that high. You can run multiple clients to speed things up.
Downloading ZIPs
There is a script to download the Tabblo ZIP files. (This includes pictures and text, but no comments, profile pages et cetera.) The script downloads a range of 1000 Tabblos and uploads the ZIP files to Archive.org. For example, see the first range.
To participate:
- Get the code from [2]. You need Bash and curl to run it.
- Claim one or more ranges (each range includes up to 1,000 Tabblos, so try claiming one or two ranges first). Add your name to the table below.
- Run the script:
./dld-tabblo-zip.sh $RANGE
e.g. ./dld-tabblo-zip.sh 12 to download and upload Tabblos 12,000 to 12,999.
To speed things up, a range can be divided into 10 parts (of 100 Tabblos each), so you can download several parts at the same time. For example:
for i in 0 1 2 3 4 5 6 7 8 9 ; do ./dld-tabblo-zip.sh $RANGE $i & done
After the first run, rerun the script to check that everything was downloaded and uploaded successfully.
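To see which Tabblo IDs a range or part covers, the arithmetic can be sketched as follows. The helper range_ids is made up for this illustration, and it assumes that part N (0 to 9) covers the Nth block of 100 IDs inside the range, which matches the description above but has not been checked against dld-tabblo-zip.sh itself.

```shell
#!/usr/bin/env bash
# Print the first and last Tabblo ID covered by a range,
# or by one part of it when a second argument (0-9) is given.
range_ids() {
  local range=$1 part=${2:-}
  local first=$((range * 1000))
  if [ -z "$part" ]; then
    echo "$first $((first + 999))"
  else
    first=$((first + part * 100))
    echo "$first $((first + 99))"
  fi
}

range_ids 12      # prints "12000 12999"
range_ids 12 3    # prints "12300 12399"
```

When the ten parts are started in the background as in the for loop above, adding the Bash builtin wait on the next line makes the shell block until all of them have finished.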
Ranges | Downloader | Status |
---|---|---|
0 - 9 | alard | Done |
10 | underscor | Downloading |
11 - 15 | bsmith093 | Done |
16 - 99 | alard | Done |
100 - 499 | alard | Done |
500 - 549 | underscor | Downloading |
550 - 599 | Short | Downloading |
600 - 699 | Hydriz | Done |
700 - 720 | closure | Done |
721 - 999 | Short | Downloading |
1000 - 1099 | closure | Done |
1100 - 1399 | alard | Done |
1400 - 1499 | Hydriz | Done |
1500 - 1845 | alard | Downloading |
1846 - 1850 | Wait... | This is the newest range, please download other ranges first |
Site structure
Tabblos
Tabblos have a URL of the form http://www.tabblo.com/studio/stories/view/#ID#/, where #ID# is the numeric id of the tabblo. Tabblos are numbered sequentially; the last number at the time of writing is 1843370.
A tabblo consists of one HTML page with some text and one or more images. You can click on the images to get a large version, but apart from the larger image that won't give you more than is on the tabblo page. Most tabblos have comments, which are included in the page's HTML.
Running wget --page-requisites on a tabblo URL will probably save all available information.
From the Tabblo Lifeboat we learn that Tabblo offers a nice way to download a tabblo as a zip file, which also contains the original photo files. Download URL: http://www.tabblo.com/studio/stories/zip/#ID#/?orig=1. You have to log in before you can download this zip file (but once you're in you can download any tabblo, not just your own).
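As a small illustration of the URL scheme, the helpers below build both URLs from a numeric ID. The function names are made up for this sketch. tabblo_range additionally assumes that the 1,000-tabblo ranges from the ZIP section are simply the ID divided by 1000.

```shell
#!/usr/bin/env bash
# Build the public view URL for a tabblo ID.
tabblo_view_url() {
  echo "http://www.tabblo.com/studio/stories/view/$1/"
}

# Build the zip URL; orig=1 asks for the original photo files.
# Fetching this URL requires a logged-in session.
tabblo_zip_url() {
  echo "http://www.tabblo.com/studio/stories/zip/$1/?orig=1"
}

# Which 1,000-tabblo range an ID falls into (integer division).
# Assumption: range 12 covers IDs 12,000 to 12,999, and so on.
tabblo_range() {
  echo $(( $1 / 1000 ))
}

tabblo_view_url 1843370   # prints http://www.tabblo.com/studio/stories/view/1843370/
tabblo_range 1843370      # prints 1843
```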
There's one other catch: the zip download will fail first. The first time you download it you'll get an incomplete zip file, the next time you try it you'll get a little bit more. Repeat downloading until you get the complete zip file. (Probably has something to do with caching.)
In conclusion, to download a tabblo we'll probably want to do something like this:
wget --page-requisites --warc-file tabblo http://www.tabblo.com/studio/stories/view/#ID#/
while ! unzip -t all.zip ; do
  wget -O all.zip --header="Cookie: tabblosesh=###" "http://www.tabblo.com/studio/stories/zip/#ID#/?orig=1"
done
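The loop above retries forever if a zip never becomes complete. A bounded variant might look like the sketch below. The names retry_until_ok, fetch_zip and check_zip are made up for this example, the limit of 20 attempts in the usage comment is arbitrary, and the #ID# and ### placeholders still have to be filled in as before.

```shell
#!/usr/bin/env bash
# Run fetch_cmd repeatedly until check_cmd succeeds,
# giving up after max_tries attempts.
retry_until_ok() {
  local max_tries=$1 fetch_cmd=$2 check_cmd=$3 try=0
  while [ "$try" -lt "$max_tries" ]; do
    try=$((try + 1))
    # A failed or partial download is expected here, so ignore its exit status.
    "$fetch_cmd" || true
    if "$check_cmd"; then
      echo "complete after $try attempts"
      return 0
    fi
  done
  echo "gave up after $try attempts" >&2
  return 1
}

# Download step, same command as in the loop above (placeholders unchanged).
fetch_zip() {
  wget -O all.zip --header="Cookie: tabblosesh=###" \
    "http://www.tabblo.com/studio/stories/zip/#ID#/?orig=1"
}

# Integrity test: unzip -t exits non-zero for an incomplete archive.
check_zip() {
  unzip -tq all.zip >/dev/null 2>&1
}

# Usage: retry_until_ok 20 fetch_zip check_zip
```

Passing the download and check steps as function names keeps the retry logic separate from the wget and unzip details.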
Users
TODO: The user pages (e.g. http://www.tabblo.com/studio/person/chilla/) have everything you'd expect from a social network: comments, photos, friends, favorites, messages.