Mapillary

From Archiveteam
Revision as of 19:43, 18 June 2020 by Nicolas17v2 (talk | contribs) (Update to current photo count, and to the last data-size figure I heard from staff (last year))
Mapillary
URL https://www.mapillary.com
Status Online!
Archiving status Not saved yet
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)


Mapillary is a crowdsourced platform similar to Google Street View. Users take photos of their local area with the supplied smartphone app (or their own equipment), and Mapillary assembles them into sequences in a semi-intelligent way for easy navigation and use. Photos are shared publicly under a Creative Commons license (CC BY-SA 4.0 International at the time of writing).

Many OpenStreetMap users rely on Mapillary images as the source and documentation for their mapping (a "source of truth"), so it would be tremendously useful to be able to count on those images being there "forever". Mapillary is a startup and may not be around forever, so having a safe backup would be a great relief to these mappers.

The publicly accessible photos aren't the original uploads; they are processed versions. Sensitive information (faces and license plates) is blurred out, a watermark is added in the corner, the width is capped at 2048px, and the files are (apparently) saved at 75% JPEG quality. Only the user who uploaded a photo can re-download the original unmodified file.

However, Mapillary are very supportive of openness and freedom and it may be possible to get a lot of cooperation from them, such as access to the original photos, or a dump of the processed photos without having to scrape them.

Mapillary currently (as of June 2020) has 1195 million photos. The 2048px processed photos seem to be around 300 KB each, which would add up to roughly 359 TB. Staff have indicated they are storing 2.5 PB of data (as of April 2019), but it's unclear whether this is only the originals or also includes the processed images in multiple resolutions. All data is stored in the AWS S3 service in Europe (which may add costs if we want to transfer it to the Internet Archive DCs in North America).
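
The size estimate for the processed photos is just the photo count multiplied by the average file size. A minimal sketch of that arithmetic, using the figures quoted above (1195 million photos, about 300 KB each) and assuming decimal units (1 KB = 1000 bytes, 1 TB = 1000^4 bytes); the function names are made up for illustration, and the average size is a rough observation rather than a measured value:

```python
# Back-of-the-envelope storage estimate for the processed 2048px photos.
# Both inputs are the rough figures quoted on this page, not measurements.
PHOTO_COUNT = 1_195_000_000      # photos, as of June 2020
AVG_PHOTO_BYTES = 300 * 1000     # ~300 KB per processed photo (assumed decimal KB)


def estimate_total_bytes(photo_count: int, avg_bytes: int) -> int:
    """Total size in bytes if every photo had the average size."""
    return photo_count * avg_bytes


def to_terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (1 TB = 1000**4 bytes)."""
    return num_bytes / 1000**4


total = estimate_total_bytes(PHOTO_COUNT, AVG_PHOTO_BYTES)
print(f"{to_terabytes(total):.1f} TB")  # prints "358.5 TB"
```

Changing `AVG_PHOTO_BYTES` shows how sensitive the total is to the per-photo guess: every extra 100 KB on the average adds about 120 TB at this photo count.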

See also