Facebook

From Archiveteam
Revision as of 20:27, 5 March 2023 by Random (talk | contribs) (Proposal for end-to-end prototype Facebook Group archiver)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search
Facebook
Facebook Logo
Facebook Logo
URL https://www.facebook.com/
Status Online!
Archiving status Not saved yet
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

Facebook is a social networking site whose popularity has exploded in recent years. As of January 2014, there are more than 1.2 billion active users of the site.[1] Facebook hosts untold billions of users' photos, videos, thoughts, conversations, and other content. The total size of Facebook's databases is easily in the hundreds of petabytes.[2][3]

The judicious user will have a well-designed backup plan for all that content that they retain full control over, but it is a reasonable assumption that the majority of users rely totally on Facebook to safeguard their data. This is a mistake.

It might seem completely unthinkable that a site as massive and as popular as Facebook could ever disappear, taking your data with it. The reality is that websites, even hugely popular ones, can decline in popularity over time and eventually go away, taking your data with them with little or no warning. We've seen it happen.

While Facebook may not be in any immediate danger, you should consider that the data you put on Facebook may be immensely important to you in 10 or 20 years, similar to your family's photo albums. Facebook could be long dead by then. Start planning for this eventuality right now.

Download Your Data From Facebook

Facebook has created a tool to download an incomplete archive of your Facebook account. This includes your own photos and videos, chat conversations, status updates and wall posts. Please bear in mind that the archive is INCOMPLETE. It does NOT include any content you posted on any Facebook group. It does NOT include any comments from your wall posts. It also does NOT include photos and videos belonging to other people even if you are tagged in them.

To create your archive, click the little down arrow next to your name in the upper right area of the page, click on "Settings", then click on "Your Facebook Information", then click on "Download Your Information".

The next screen will explain what's going on. Press "Create File" and you will be presented with a popup telling you that this will take some time - around one hour is not unheard of. This may indeed take several minutes. In the mean time you can continue using Facebook as usual. They will email you when the archive is ready for download.

Your email will contain a link to download your archive. Follow that link and enter your Facebook password to continue. The next page presents you with a download button and an estimate of the archive's size. Download that somewhere convenient for you. This file contains highly personal and potentially sensitive information so keep it safe!. The easiest way to browse the information is to extract the contents of the zip file, and then open the index.html file with your browser of choice. From there you can look at your profile, your wall posts, photos and videos, and private messages.

Note that as of April, 2012, this download tool seems to have some bugs -- in my tests it failed to completely back up all of my conversations, for example. It's better than nothing but for now at least I don't trust that it is perfect.

Unofficial Backup tools

  • friendstogmail exports your contacts to CSV files, ostensibly for import to Gmail. Claims to work only in-browser. Retrieves your friends' birthdays, locations, bios, work history, and hometowns.
  • Facedown downloads photo albums from Facebook. Official distribution of the software has stopped.
  • Give Me My Data (gmmd) was a Facebook application that helped users export their data out of Facebook for reuse in visualizations, archives, or any possible method of digital storytelling. Data could be exported in common formats like CSV, XML, and JSON as well as customized network graph formats.
  • Photograbber was an open-source photo album downloader, it was capable of retrieving ragged photos, albums, tags, comments and likes. It could also download the albums of your friends.

Public pages and groups

While Facebook contains personal profiles and Facebook versions of many blogs/news sites, it also contains blogs that only exist on Facebook. It could be nice to archive these pages, which are the most endangered:

  • Facebook is known to have banned totally reasonable pages in the past. They usually come back with a new profile, but the old content is gone.
  • Admins could delete pages and groups.
  • Pages can be bought and repurposed for commercial/spammy purposes.
  • Archival of pages and groups can be difficult (especially for general-purpose archivers), because Facebook is based on a (quite messy) implementation of AJAX. APIs are tightly locked and it is no longer possible to use them for interacting with/reading posts from groups.[4]
  • Alternatively, one could disable JavaScript and download web pages from the mobile website, m.facebook.com. Archival of pages and groups using this method is much easier.

Proposal

We could start archiving the most popular public pages that 1) are not backed by an external website and 2) are not simply used for boring promotional purposes (eg. pages belonging to brands).

Another proposal

  • In software engineering terms, the above proposal is a "waterfall". It will never get started because an agreement about which pages are important will never be reached.
  • Instead an "end-to-end" archive of a single Facebook page/group needs to be achieved.
  • Start by archiving just one test group, including all posts, all comments, all images, etc.
  • The archive should be browseable, or at least a viewer needs to exist, otherwise it will be difficult to verify its success.
  • After one good quality archive has been completed, and the bugs have been ironed out, then allow the public to submit requests for pages/groups to be archived, similar to the current process for other general websites.

Portability

In 2020 it will become possible to export one's own Facebook photos to Google Photos.[5] In December 2019 the feature was made available in Ireland.[6] The move comes after the GDPR requirements for data portability and may be extended beyond EU and to further target services.

Facebook ostensibly participates in the Data transfer project by the GAFA + Twitter.

Vital Signs

Currently stable.

References

External links


v · t · e         Archive Team
Current events

Alive... OR ARE THEY · Deathwatch · Projects

Archiveteam.jpg
Archiving projects

APKMirror · Archive.is · BetaArchive · Government Backup (#datarefuge · ftp-gov· Gmane · Internet Archive · It Died · Megalodon.jp · OldApps.com · OldVersion.com · OSBetaArchive · TEXTFILES.COM · The Dead, the Dying & The Damned · The Mail Archive · UK Web Archive · WebCite · Vaporwave.me

Blogging

Blog.pl · Blogger · Blogster · Blogter.hu · Freeblog.hu · Fuelmyblog · Jux · LiveJournal · My Opera · Nolblog.hu · Open Diary · ownlog.com · Posterous · Powerblogs · Proust · Roon · Splinder · Tumblr · Vox · Weblog.nl · Windows Live Spaces · Wordpress.com · Xanga · Yahoo! Blog · Zapd

Cloud hosting/file sharing

aDrive · AnyHub · Box · Dropbox · Docstoc · Fast.io · Google Drive · Google Groups Files · iCloud · Fileplanet · LayerVault · MediaCrush · MediaFire · Mega · MegaUpload · MobileMe · OneDrive · Pomf.se · RapidShare · Ubuntu One · Yahoo! Briefcase

Corporations

Apple · IBM · Google · Loblaw · Lycos Europe · Microsoft · Yahoo!

Events

Arab Spring · Great Ape-Snake War · Spanish Revolution

Font Repos

DaFont · Google Web Fonts · GNU FreeFont · Fontspace

Forums/Message boards

4chan · Captain Luffy Forums · College Confidential · Discourse · DSLReports · ESPN Forums · Facepunch Forums · forums.starwars.com · HeavenGames · JamiiForums · Invisionfree · NeoGAF · Textream · The Classic Horror Film Board · Yahoo! Messages · Yahoo! Neighbors · Yuku.com · Zetaboards

Gaming

Atomicgamer · Bazaar.tf · City of Heroes · Club Nintendo · Clutch · Counter-Strike: Global Offensive · CS:GO Lounge · Desura · Dota 2 · Dota 2 Lounge · Emulation Zone · ESEA · GameBanana · GameMaker Sandbox · GameTrailers · Halo · Heroes of Newerth · HLTV.org · HQ Trivia · Infinite Crisis · joinDOTA · League of Legends · Liquipedia · Minecraft.net · Player.me · Playfire · Raptr · SingStar · Steam · SteamDB · SteamGridDB · Team Fortress 2 · TF2 Outpost · Warhammer · Xfire

Image hosting

500px · AOL Pictures · Blipfoto · Blingee · Canv.as · Camera+ · Cameroid · DailyBooth · Degree Confluence Project · DeviantART · Demotivalo.net · Flickr · Fotoalbum.hu · Fotolog.com · Fotopedia · Frontback · Geograph Britain and Ireland · Giphy · GTF Képhost · ImageShack · Imgh.us · Imgur · Inkblazers · Instagram · Kepfeltoltes.hu · Kephost.com · Kephost.hu · Kepkezelo.com · Keptarad.hu · Madden GIFERATOR · MLKSHK · Microsoft Clip Art · Microsoft Photosynth · Nokia Memories · noob.hu · Odysee · Panoramio · Photobucket · Picasa · Picplz · Pixiv · Portalgraphics.net · PSharing · Ptch · puu.sh · Rawporter · Relay.im · ScreenshotsDatabase.com · Sketch · Smack Jeeves · Snapjoy · Streetfiles · Tabblo · Tinypic · Trovebox · TwitPic · Wallbase · Wallhaven · Webshots · Wikimedia Commons

Knowledge/Wikis

arXiv · Citizendium · Clipboard.com · Deletionpedia · EditThis · Encyclopedia Dramatica · Etherpad · Everything2 · infoAnarchy · GeoNames · GNUPedia · Google Books (Google Books Ngram· Horror Movie Database · Insurgency Wiki · Knol · Lost Media Wiki · Neoseeker.com · Notepad.cc · Nupedia · OpenCourseWare · OpenStreetMap · Orain · Pastebin · Patch.com · Project Gutenberg · Puella Magi · Referata · Resedagboken · SongMeanings · ShoutWiki · The Internet Movie Database · TropicalWikis · Uncyclopedia · Urban Dictionary · Urban Exploration Resource · Webmonkey · Wikia · Wikidot · WikiHow · Wikkii · WikiLeaks · Wikipedia (Simple English Wikipedia· Wikispaces · Wikispot · Wik.is · Wiki-Site · WikiTravel · Word Count Journal

Magazines/Blogs/News

Cyberpunkreview.com · Game Developer Magazine · Gigaom · Hardware Canucks · Helium · JPG Magazine · Make Magazine · The Escapist · Polygamia.pl · San Fransisco Bay Guardian · Scoop · Regretsy · Yahoo! Voices

Microblogging

Heello · Identi.ca · Jaiku · Mommo.hu · Plurk · Sina Weibo · Tencent Weibo · Twitter · TwitLonger

Music/Audio

8tracks · AOL Music · Audimated.com · Cinch · digCCmixter · Dogmazic.net · Earbits · exfm · Free Music Archive · Gogoyoko · Indaba Music · Instacast · Instaudio · Jamendo · Last.fm · Music Unlimited · MOG · PureVolume · Reverbnation · ShareTheMusic · SoundCloud · Soundpedia · Spotify · This Is My Jam · TuneWiki · Twaud.io · WinAmp

People

Aaron Swartz · Michael S. Hart · Steve Jobs · Mark Pilgrim · Dennis Ritchie · Len Sassaman Project

Protocols/Infrastructure

FTP · Gopher · IRC · Usenet · World Wide Web
BitTorrent DHT

Q&A

Askville · Answerbag · Answers.com · Ask.com · Askalo · Baidu Knows · Blurtit · ChaCha · Experts Exchange · Formspring · GirlsAskGuys · Google Answers · Google Baraza · JustAnswer · MetaFilter · Quora · Retrospring · StackExchange · The AnswerBank · The Internet Oracle · Uclue · WikiAnswers · Yahoo! Answers

Recipes/Food

Allrecipes · Epicurious · Food.com · Foodily · Food Network · Punchfork · ZipList

Social bookmarking

Addinto · Backflip · Balatarin · BibSonomy · Bkmrx · Blinklist · BlogMarks · BookmarkSync · CiteULike · Connotea · Delicious · Designer News · Digg · Diigo · Dir.eccion.es · Evernote · Excite Bookmark · Faves · Favilous · folkd · Freelish · Getboo · GiveALink.org · Gnolia · Google Bookmarks · Hacker News · HeyStaks · IndianPad · Kippt · Knowledge Plaza · Licorize · Linkwad · Menéame · Microsoft Developer Network · myVIP · Mister Wong · My Web · Mylink Vault · Newsvine · Oneview · Pearltrees · Pinboard · Pocket · Propeller.com · Reddit · sabros.us · Scloog · Scuttle · Simpy · SiteBar · Slashdot · Squidoo · StumbleUpon · Twine · Voat · Vizited · Yummymarks · Xmarks · Yahoo! Buzz · Zootool · Zotero

Social networks

Bebo · BlackPlanet · Classmates.com · Cyworld · Dogster · Dopplr · douban · Ello · Facebook · Flixster · FriendFeed · Friendster · Friends Reunited · Gaia Online · Google+ · Habbo · hi5 · Hyves · iWiW · LinkedIn · Miiverse · mixi · MyHeritage · MyLife · Myspace · myVIP · Netlog · Odnoklassniki · Orkut · Plaxo · Qzone · Renren · Skyrock · Sonico.com · Storylane · Tagged · tvtag · Upcoming · Viadeo · Vine · VK · WeeWorld · Weibo · Wretch · Yahoo! Groups · Yahoo! Stars India · Yahoo! Upcoming · more sites...

Shopping/Retail

Alibaba · AliExpress · Amazon · Apple Store · Barnes & Noble · DirectCanada · eBay · Kmart · NCIX · Printfection · RadioShack · Sears · Sears Canada · Target · The Book Depository · ThinkGeek · Toys "R" Us · Walmart

Software/code hosting

Android Development · Alioth · Assembla · BerliOS · Betavine · Bitbucket · BountySource · Codecademy · CodePlex · Freepository · Free Software Foundation · GNU Savannah · GitHost  · GitHub · GitHub Downloads · Gitorious · Gna! · Google Code · ibiblio · java.net · JavaForge · KnowledgeForge · Launchpad · LuaForge · Maemo · mozdev · OSOR.eu · OW2 Consortium · Openmoko · OpenSolaris · Ourproject.org · Ovi Store · Project Kenai · RubyForge · SEUL.org · SourceForge · Stypi · TestFlight · tigris.org · Transifex · TuxFamily · Yahoo! Downloads

Television/Radio

ABC · Austin City Limits · BBC · CBC · CBS · Computer Chronicles · CTV · Fox · G4 · Global TV · Jeopardy! · NBC · NHK · PBS · Penn & Teller: Bullshit! · The Howard Stern Show · TV News Archive (Understanding 9/11)

Torrenting/Piracy

ExtraTorrent · EZTV · isoHunt · KickassTorrents · The Pirate Bay · Torrentz · Library Genesis

Video hosting

Academic Earth · Bambuser · Blip.tv · Epic · Freshlive · Google Video · Justin.tv · Mixer · Niconico · Nokia Trailers · Oddshot.tv · Periscope · Plays.tv · Qwiki · Skillfeed · Stickam · TED Talks · Ticker.tv · Twitch.tv · Ustream · Videoplayer.hu · Viddler · Viddy · Vidme · Vimeo · Vine · Vstreamers · Yahoo! Video · YouTube · Famous Internet videos (Me at the zoo)

Web hosting

Angelfire · Brace.io · BT Internet · CableAmerica Personal Web Space · Claranet Netherlands Personal Web Pages · Comcast Personal Web Pages · Extra.hu · FortuneCity · Free ProHosting · GeoCities (patch· Google Business Sitebuilder · Google Sites · Internet Centrum · MBinternet · MSN TV · Nifty · Nwnyet · Parodius Networking · Prodigy.net · Saunalahti Iso G · Swipnet · Telenor · Tripod · University of Michigan personal webpages · Verizon Mysite · Verizon Personal Web Space · Webs · Webzdarma · Virgin Media

Web applications

Mailman · MediaWiki · phpBB · Simple Machines Forum · vBulletin

Information

A Million Ways to Die on the Web · Backup Tips · Cheap storage · Collecting items randomly · Data compression algorithms and tools · Dev · Discovery Data · DOS Floppies · Fortress of Solitude · Keywords · Naughty List · Nightmare Projects · Rescuing floppy disks · Rescuing optical media · Site exploration · The WARC Ecosystem · Working with ARCHIVE.ORG

Projects

ArchiveCorps · Audit2014 · Emularity · Faceoff · FlickrFckr · Froogle · INTERNETARCHIVE.BAK (Internet Archive Census· IRC Quotes · JSMESS · JSVLC · Just Solve the Problem · NewsGrabber · Project Newsletter · Valhalla · Web Roasting (ISP Hosting · University Web Hosting· Woohoo

Tools

ArchiveBot · ArchiveTeam Warrior (Tracker· Google Takeout · HTTrack · Video downloaders · Wget (Lua · WARC)

Teams

Bibliotheca Anonoma · LibreTeam · URLTeam · Yahoo Video Warroom · WikiTeam

Other

800notes · AOL · Akoha · Ancestry.com · April Fools' Day · Amplicate · AutoAdmit · Bre.ad · Circavie · Cobook · Co.mments · Countdown · Discourse · Distill · Dmoz · Easel · Eircode · Electronic Frontier Foundation · FanFiction.Net · Feedly · Ficlets · Forrst · FunnyExam.com · FurAffinity · Google Helpouts · Google Moderator · Google Poly · Google Reader · ICQmail · IFTTT · Jajah · JuniorNet · Lulu Poetry · Mobile Phone Applications · Mochi Media · Mozilla Firefox · MyBlogLog · NBII · Newgrounds · Neopets · Quantcast · Quizilla · Salon Table Talk · Shutdownify · Slidecast · Stack Overflow · SOPA blackout pages · starwars.yahoo.com · TechNet · Toshiba Support · USA-Gov · Volán · Widgetbox · Windows Technical Preview · Wunderlist · YTMND · Zoocasa

About Archive Team

Introduction · Philosophy · Who We Are · Our stance on robots.txt · Why Back Up? · Software · Formats · Storage Media · Recommended Reading · Films and documentaries about archiving · Talks · In The Media · FAQ