Picplz

From Archiveteam
Revision as of 15:55, 3 June 2012 by Alard (talk | contribs)
Picplz
URL http://picplz.com/
Status Closing
Archiving status Not saved yet
Archiving type Unknown
Project source https://github.com/ArchiveTeam/picplz-grab
Project tracker http://picplz.heroku.com/
IRC channel #archiveteam-bs (on hackint)

On July 3, 2012, picplz will shut down permanently and all photos and data will be deleted. We have provided download links for existing users to save their photos. Thank you for your support of picplz and we apologize for any inconvenience this may cause you.

Screengrab of farewell blog post

API

Picplz has an API that returns data when given numerical IDs. The IDs are incremental (!), so they can simply be enumerated.

https://sites.google.com/site/picplzapi/

http://api.picplz.com/api/v2/pic.json?ids=11183559
http://api.picplz.com/api/v2/user.json?id=1515537
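
Because the IDs are incremental, the request URLs can be generated by counting. The following is a minimal sketch, not the code from the download script: the function names are made up for illustration, and only the two endpoints and parameter names visible in the example URLs above are assumed.

```python
from urllib.parse import urlencode

# Base path taken from the two example URLs above.
API_BASE = "http://api.picplz.com/api/v2"


def pic_url(pic_id: int) -> str:
    """Build the pic.json URL for one picture ID (hypothetical helper)."""
    return f"{API_BASE}/pic.json?{urlencode({'ids': pic_id})}"


def user_url(user_id: int) -> str:
    """Build the user.json URL for one user ID (hypothetical helper)."""
    return f"{API_BASE}/user.json?{urlencode({'id': user_id})}"


def pic_urls(start: int, stop: int):
    """Yield pic.json URLs for every ID in the half-open range [start, stop)."""
    for pic_id in range(start, stop):
        yield pic_url(pic_id)


if __name__ == "__main__":
    print(pic_url(11183559))
    print(user_url(1515537))
```

This only builds URLs. Fetching them, rate limiting and handling IDs that do not exist are left out.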

Image formats: the API offers several image formats, and we may not want all of them. The web interface uses 56s, 64s, 100s, 400r, 640r and 1024r (1024r being the largest version the API provides), so those are the ones we probably want to get.

Download scripts

A first version of a download script is available in the GitHub repository. Given a numerical user ID, the script uses the API to download all data and images of that user.

Something to decide: we can get all the information from the API, but what else should we save? Should the WARC file include any of the web pages, even though that doesn't provide extra information? At the moment the script downloads the user's homepage and the page of every picture.

The infinite scrolling of the user page poses a problem. The JavaScript functions add a _ parameter containing a timestamp to each API request. We can't usefully archive the timestamped URLs: the resurrected page would generate new timestamps and request URLs that aren't in the archive. The download script therefore archives the API URLs without the timestamp. To make the archived page work, remove the timestamp parameter from the JavaScript.
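
Dropping the timestamp from an API URL could look roughly like this. It is a sketch under assumptions, not the download script's actual code: the helper name is hypothetical, and the timestamped URL in the usage example is invented to show the shape of such a request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_timestamp(url: str, param: str = "_") -> str:
    """Return url with every occurrence of the query parameter `param` removed.

    Other parameters keep their original order.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


if __name__ == "__main__":
    # Invented example of a timestamped request made by the page's JavaScript.
    example = "http://api.picplz.com/api/v2/user.json?id=1515537&_=1338738900000"
    print(strip_timestamp(example))
```

A URL that has no _ parameter passes through without losing any of its other parameters, so the function can be applied to every API URL before it is archived.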