Tabblo

From Archiveteam
Jump to navigation Jump to search
Tabblo
Tabblo logo
Tabblo-com.png
URL http://www.tabblo.com/[IAWcite.todayMemWeb]
Status Offline
Archiving status Saved! User lookup
Archiving type Unknown
Project source https://github.com/ArchiveTeam/tabblo-grab
Project tracker http://tabb.heroku.com
IRC channel #archiveteam-bs (on hackint)

A post called Goodbye Tabblo by Ned Batchelder (former Tabblo employee).

Tabblo Lifeboat

Ned Batchelder (former Tabblo employee) wrote Tabblo Lifeboat, a Python script that helps users to download their tabblos.

We have everything. Thanks!

Thanks to your help we've downloaded almost everything. We're currently organizing the files before uploading them to Archive.org.

How to help archiving

Easy option: You can also do this with the ArchiveTeam Warrior, a virtual machine you can download from [1]. Install the appliance, boot and choose the Tabblo project from the menu.

There is a distributed download script that gets usernames from a tracker and downloads the data.

Make sure you are on Linux, that you have curl, git, a recent version of Bash. Your system must also be able to compile wget with the Lua extensions.

  • Get the code:
    git clone git://github.com/ArchiveTeam/tabblo-grab.git
  • Get and compile the latest version of wget-warc-lua:
    ./get-wget-warc-lua.sh
  • Think of a nickname for yourself (preferably use your IRC name).
  • Run the download script with
    ./seesaw.sh "<YOURNICK>"
  • To stop the script gracefully, run
    touch STOP
    in the script's working directory. It will finish the current task and stop.

OS X

Note that these instructions require Homebrew

  • brew tap ArchiveTeam/tools
  • brew install tabblo
  • cd `brew --prefix tabblo`
  • Run the download script with
    ./seesaw.sh "<YOURNICK>"
  • To stop the script gracefully, run
    touch STOP
    in the script's working directory. It will finish the current task and stop.

Notes

  • Compiling wget-warc will require dev packages for the various libraries that it needs. Most questions have been about gnutls; install the gnutls-devel or gnutls-dev package with your favorite package manager. You'll also need the liblua library (liblua5.1-0-dev on Ubuntu) or lua-devel on RPM based distributions.
  • Downloading one user's data can take between 10 seconds and a few hours.
  • The data for one user is equally varied, from a few kB to several MB.
  • The downloaded data will be saved in the ./data/ subdirectory.
  • Download speeds from Tabblo.com are not that high. You can run multiple clients to speed things up.

Downloading ZIPs

There is a script to download the Tabblo ZIP files. (This includes pictures and text, but no comments, profile pages et cetera.) The script downloads a range of 1000 Tabblos and uploads the ZIP files to Archive.org. For example, see the first range.

To participate:

  1. Get the code from [2]. You need Bash and Curl to run it.
  2. Claim one or more ranges (each range includes up to 1,000 Tabblos, so try claiming one or two ranges first). Add your name to the table below.
  3. Run the script: ./dld-tabblo-zip.sh $RANGE, e.g. ./dld-tabblo-zip.sh 12 to download and upload Tabblos 12,000 to 12,999.

To speed things up a range can be divided in 10 parts (of 100 Tabblos each), so you can download several parts at the same time. For example:

for i in 0 1 2 3 4 5 6 7 8 9 ; do
  ./dld-tabblo-zip.sh $RANGE $i &
done

Once you've run the script once, rerun it to check if everything was down- and uploaded successfully.

Ranges Downloader Status
0 - 9 alard Done
10 underscor Downloading
11 - 15 bsmith093 Done
16 - 99 alard Done
100 - 499 alard Done
500 - 549 underscor Downloading
550 - 599 Short Downloading
600 - 699 Hydriz Done
700 - 720 closure Done
721 - 999 Short Downloading
1000 - 1099 closure Done
1100 - 1399 alard Done
1400 - 1499 Hydriz Done
1500 - 1845 alard Downloading
1846 - 1850 Wait... This is the newest range, please download other ranges first

Site structure

Tabblos

Tabblos have an url of the form http://www.tabblo.com/studio/stories/view/#ID#/, where #ID# is the numeric id of the tabblo. Tabblos are numbered sequentially, the last number at the time of writing is 1843370.

A tabblo consists of one HTML page with some text and one or more images. You can click on the images to get a large version, but apart from the larger image that won't give you more than is on the tabblo page. Most tabblos have comments, which are included in the page's HTML.

Running wget --page-requisites on a tabblo url will probably save all available information.

From the Tabblo Lifeboat we learn that Tabblo offers a nice way to download a tabblo in a zip file. This zip file will also give you the original photo files. Download url: http://www.tabblo.com/studio/stories/zip/#ID#/?orig=1. You have to log in before you can download this zip file (but once you're in you can download any tabblo, not just your own).

There's one other catch: the zip download will fail first. The first time you download it you'll get an incomplete zip file, the next time you try it you'll get a little bit more. Repeat downloading until you get the complete zip file. (Probably has something to do with caching.)

Conclusion, to download a tabblo we'll probably want to do something like this:

  wget --page-requisites --warc-file tabblo http://www.tabblo.com/studio/stories/view/#ID#/
  while ! unzip -t all.zip ; do
    wget -O all.zip --header="Cookie: tabblosesh=###" http://www.tabblo.com/studio/stories/zip/#ID#/?orig=1
  done

Users

TODO The user pages (e.g. http://www.tabblo.com/studio/person/chilla/) have everything you'd expect from a social network: comments, photos, friends, favorites, messages.


v · t · e         Archive Team
Current events

Alive... OR ARE THEY · Deathwatch · Projects

Archiveteam.jpg
Archiving projects

APKMirror · Archive.is · BetaArchive · Government Backup (#datarefuge · ftp-gov· Gmane · Internet Archive · It Died · Megalodon.jp · OldApps.com · OldVersion.com · OSBetaArchive · TEXTFILES.COM · The Dead, the Dying & The Damned · The Mail Archive · UK Web Archive · WebCite · Vaporwave.me

Blogging

Blog.pl · Blogger · Blogster · Blogter.hu · Freeblog.hu · Fuelmyblog · Jux · LINE BLOG · LiveJournal · My Opera · Nolblog.hu · Open Diary · ownlog.com · Posterous · Powerblogs · Proust · Roon · Splinder · Tumblr · Vox · Weblog.nl · Windows Live Spaces · Wordpress.com · Xanga · Yahoo! Blog · Zapd

Cloud hosting/file sharing

aDrive · AnyHub · Box · Dropbox · Docstoc · Fast.io · Google Drive · Google Groups Files · iCloud · Fileplanet · LayerVault · MediaCrush · MediaFire · Mega · MegaUpload · MobileMe · OneDrive · Pomf.se · RapidShare · Ubuntu One · Yahoo! Briefcase

Corporations

Apple · IBM · Google · Loblaw · Lycos Europe · Microsoft · Yahoo!

Events

Arab Spring · Great Ape-Snake War · Spanish Revolution

Font Repos

DaFont · Google Web Fonts · GNU FreeFont · Fontspace

Forums/Message boards

4chan · Captain Luffy Forums · College Confidential · Discourse · DSLReports · ESPN Forums · Facepunch Forums · forums.starwars.com · HeavenGames · JamiiForums · Invisionfree · NeoGAF · Textream · The Classic Horror Film Board · Yahoo! Messages · Yahoo! Neighbors · Yuku.com · Zetaboards

Gaming

Atomicgamer · Bazaar.tf · City of Heroes · Club Nintendo · Clutch · Counter-Strike: Global Offensive · CS:GO Lounge · Desura · Dota 2 · Dota 2 Lounge · Emulation Zone · ESEA · GameBanana · GameMaker Sandbox · GameTrailers · Halo · Heroes of Newerth · HLTV.org · HQ Trivia · Infinite Crisis · joinDOTA · League of Legends · Liquipedia · Minecraft.net · Player.me · Playfire · Raptr · SingStar · Steam · SteamDB · SteamGridDB · Team Fortress 2 · TF2 Outpost · Warhammer · Xfire

Image hosting

500px · AOL Pictures · Blipfoto · Blingee · Canv.as · Camera+ · Cameroid · DailyBooth · Degree Confluence Project · DeviantART · Demotivalo.net · Flickr · Fotoalbum.hu · Fotolog.com · Fotopedia · Frontback · Geograph Britain and Ireland · Giphy · GTF Képhost · ImageShack · Imgh.us · Imgur · Inkblazers · Instagram · Kepfeltoltes.hu · Kephost.com · Kephost.hu · Kepkezelo.com · Keptarad.hu · Madden GIFERATOR · MLKSHK · Microsoft Clip Art · Microsoft Photosynth · Nokia Memories · noob.hu · Odysee · Panoramio · Photobucket · Picasa · Picplz · Pixiv · Portalgraphics.net · PSharing · Ptch · puu.sh · Rawporter · Relay.im · ScreenshotsDatabase.com · Sketch · Smack Jeeves · Snapjoy · Streetfiles · Tabblo · Tinypic · Trovebox · TwitPic · Wallbase · Wallhaven · Webshots · Wikimedia Commons

Knowledge/Wikis

arXiv · Citizendium · Clipboard.com · Deletionpedia · EditThis · Encyclopedia Dramatica · Etherpad · Everything2 · infoAnarchy · GeoNames · GNUPedia · Google Books (Google Books Ngram· Horror Movie Database · Insurgency Wiki · Knol · Lost Media Wiki · Neoseeker.com · Notepad.cc · Nupedia · OpenCourseWare · OpenStreetMap · Orain · Pastebin · Patch.com · Project Gutenberg · Puella Magi · Referata · Resedagboken · SongMeanings · ShoutWiki · The Internet Movie Database · TropicalWikis · Uncyclopedia · Urban Dictionary · Urban Exploration Resource · Webmonkey · Wikia · Wikidot · WikiHow · Wikkii · WikiLeaks · Wikipedia (Simple English Wikipedia· Wikispaces · Wikispot · Wik.is · Wiki-Site · WikiTravel · Word Count Journal

Magazines/Blogs/News

Cyberpunkreview.com · Game Developer Magazine · Gigaom · Hardware Canucks · Helium · JPG Magazine · Make Magazine · The Escapist · Polygamia.pl · San Fransisco Bay Guardian · Scoop · Regretsy · Yahoo! Voices

Microblogging

Heello · Identi.ca · Jaiku · Mommo.hu · Plurk · Sina Weibo · Tencent Weibo · Twitter · TwitLonger

Music/Audio

8tracks · AOL Music · Audimated.com · Cinch · digCCmixter · Dogmazic.net · Earbits · exfm · Free Music Archive · Gogoyoko · Indaba Music · Instacast · Instaudio · Jamendo · Last.fm · Music Unlimited · MOG · PureVolume · Reverbnation · ShareTheMusic · SoundCloud · Soundpedia · Spotify · This Is My Jam · TuneWiki · Twaud.io · WinAmp

People

Aaron Swartz · Michael S. Hart · Steve Jobs · Mark Pilgrim · Dennis Ritchie · Len Sassaman Project

Protocols/Infrastructure

FTP · Gopher · IRC · Usenet · World Wide Web
BitTorrent DHT

Q&A

Askville · Answerbag · Answers.com · Ask.com · Askalo · Baidu Knows · Blurtit · ChaCha · Experts Exchange · Formspring · GirlsAskGuys · Google Answers · Google Baraza · JustAnswer · MetaFilter · Quora · Retrospring · StackExchange · The AnswerBank · The Internet Oracle · Uclue · WikiAnswers · Yahoo! Answers

Recipes/Food

Allrecipes · Epicurious · Food.com · Foodily · Food Network · Punchfork · ZipList

Social bookmarking

Addinto · Backflip · Balatarin · BibSonomy · Bkmrx · Blinklist · BlogMarks · BookmarkSync · CiteULike · Connotea · Delicious · Designer News · Digg · Diigo · Dir.eccion.es · Evernote · Excite Bookmark · Faves · Favilous · folkd · Freelish · Getboo · GiveALink.org · Gnolia · Google Bookmarks · Hacker News · HeyStaks · IndianPad · Kippt · Knowledge Plaza · Licorize · Linkwad · Menéame · Microsoft Developer Network · myVIP · Mister Wong · My Web · Mylink Vault · Newsvine · Oneview · Pearltrees · Pinboard · Pocket · Propeller.com · Reddit · sabros.us · Scloog · Scuttle · Simpy · SiteBar · Slashdot · Squidoo · StumbleUpon · Twine · Voat · Vizited · Yummymarks · Xmarks · Yahoo! Buzz · Zootool · Zotero

Social networks

Bebo · BlackPlanet · Classmates.com · Cyworld · Dogster · Dopplr · douban · Ello · Facebook · Flixster · FriendFeed · Friendster · Friends Reunited · Gaia Online · Google+ · Habbo · hi5 · Hyves · iWiW · LinkedIn · Miiverse · mixi · MyHeritage · MyLife · Myspace · myVIP · Netlog · Odnoklassniki · Orkut · Plaxo · Qzone · Renren · Skyrock · Sonico.com · Storylane · Tagged · tvtag · Upcoming · Viadeo · Vine · VK · WeeWorld · Weibo · Wretch · Xuite · Yahoo! Groups · Yahoo! Stars India · Yahoo! Upcoming · more sites...

Shopping/Retail

Alibaba · AliExpress · Amazon · Apple Store · Barnes & Noble · DirectCanada · eBay · Kmart · NCIX · Printfection · RadioShack · Sears · Sears Canada · Target · The Book Depository · ThinkGeek · Toys "R" Us · Walmart

Software/code hosting

Android Development · Alioth · Assembla · BerliOS · Betavine · Bitbucket · BountySource · Codecademy · CodePlex · Freepository · Free Software Foundation · GNU Savannah · GitHost  · GitHub · GitHub Downloads · Gitorious · Gna! · Google Code · ibiblio · java.net · JavaForge · KnowledgeForge · Launchpad · LuaForge · Maemo · mozdev · OSOR.eu · OW2 Consortium · Openmoko · OpenSolaris · Ourproject.org · Ovi Store · Project Kenai · RubyForge · SEUL.org · SourceForge · Stypi · TestFlight · tigris.org · Transifex · TuxFamily · Yahoo! Downloads

Television/Radio

ABC · Austin City Limits · BBC · CBC · CBS · Computer Chronicles · CTV · Fox · G4 · Global TV · Jeopardy! · NBC · NHK · PBS · Penn & Teller: Bullshit! · The Howard Stern Show · TV News Archive (Understanding 9/11)

Torrenting/Piracy

ExtraTorrent · EZTV · isoHunt · KickassTorrents · The Pirate Bay · Torrentz · Library Genesis

Video hosting

Academic Earth · Bambuser · Blip.tv · Epic · Freshlive · Google Video · Justin.tv · Mixer · Niconico · Nokia Trailers · Oddshot.tv · Periscope · Plays.tv · Qwiki · Skillfeed · Stickam · TED Talks · Ticker.tv · Twitch.tv · Ustream · Videoplayer.hu · Viddler · Viddy · Vidme · Vimeo · Vine · Vstreamers · Yahoo! Video · YouTube · Famous Internet videos (Me at the zoo)

Web hosting

Angelfire · Brace.io · BT Internet · CableAmerica Personal Web Space · Claranet Netherlands Personal Web Pages · Comcast Personal Web Pages · Extra.hu · FortuneCity · Free ProHosting · GeoCities (patch· Google Business Sitebuilder · Google Sites · Internet Centrum · MBinternet · MSN TV · Nifty · Nwnyet · Parodius Networking · Prodigy.net · Saunalahti Iso G · Swipnet · Telenor · Tripod · University of Michigan personal webpages · Verizon Mysite · Verizon Personal Web Space · Webs · Webzdarma · Virgin Media

Web applications

Mailman · MediaWiki · phpBB · Simple Machines Forum · vBulletin

Information

A Million Ways to Die on the Web · Backup Tips · Cheap storage · Collecting items randomly · Data compression algorithms and tools · Dev · Discovery Data · DOS Floppies · Fortress of Solitude · Keywords · Naughty List · Nightmare Projects · Rescuing floppy disks · Rescuing optical media · Site exploration · The WARC Ecosystem · Working with ARCHIVE.ORG

Projects

ArchiveCorps · Audit2014 · Emularity · Faceoff · FlickrFckr · Froogle · INTERNETARCHIVE.BAK (Internet Archive Census· IRC Quotes · JSMESS · JSVLC · Just Solve the Problem · NewsGrabber · Project Newsletter · Valhalla · Web Roasting (ISP Hosting · University Web Hosting· Woohoo

Tools

ArchiveBot · ArchiveTeam Warrior (Tracker· Google Takeout · HTTrack · Video downloaders · Wget (Lua · WARC)

Teams

Bibliotheca Anonoma · LibreTeam · URLTeam · Yahoo Video Warroom · WikiTeam

Other

800notes · AOL · Akoha · Ancestry.com · April Fools' Day · Amplicate · AutoAdmit · Bre.ad · Circavie · Cobook · Co.mments · Countdown · Discourse · Distill · Dmoz · Easel · Eircode · Electronic Frontier Foundation · FanFiction.Net · Feedly · Ficlets · Forrst · FunnyExam.com · FurAffinity · Google Helpouts · Google Moderator · Google Poly · Google Reader · ICQmail · IFTTT · Jajah · JuniorNet · Lulu Poetry · Mobile Phone Applications · Mochi Media · Mozilla Firefox · MyBlogLog · NBII · Newgrounds · Neopets · Quantcast · Quizilla · Salon Table Talk · Shutdownify · Slidecast · Stack Overflow · SOPA blackout pages · starwars.yahoo.com · TechNet · Toshiba Support · USA-Gov · Volán · Widgetbox · Windows Technical Preview · Wunderlist · YTMND · Zoocasa

About Archive Team

Introduction · Philosophy · Who We Are · Our stance on robots.txt · Why Back Up? · Software · Formats · Storage Media · Recommended Reading · Films and documentaries about archiving · Talks · In The Media · FAQ