DocumentCloud

From Archiveteam

DocumentCloud is a document-sharing website that allows users to upload, analyze, annotate, collaborate on, and publish primary-source documents, such as court filings. It seems to be associated with the FOIA-supporting nonprofit MuckRock (https://www.muckrock.com/).

Each document page has, in the right-hand column, a "Download File" link to the corresponding PDF URL on S3. Alternatively, you can convert the preview URL to a PDF URL like this:

Given a URL like:

You can use sed 's,www.,s3.,;s,-,/,;s,/$,,;s,$,.pdf,' to convert such URLs into S3 URLs for archival via ArchiveBot.

That gives the URLs:
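
For anyone scripting this over a list of URLs, the sed expression can be sketched in Python, one step per substitution. This is only an illustration: the function name preview_to_pdf_url and the sample URL (with its placeholder ID and slug) are hypothetical and not taken from a real document, and the result has not been checked against the live site.

```python
import re


def preview_to_pdf_url(url: str) -> str:
    """Mirror the four sed substitutions, in order (hypothetical helper)."""
    # s,www.,s3.,  -> first "www" plus any one character becomes "s3."
    # (the unescaped "." is a regex wildcard in sed, so it is one here too)
    url = re.sub(r"www.", "s3.", url, count=1)
    # s,-,/,       -> only the first hyphen becomes a slash
    url = url.replace("-", "/", 1)
    # s,/$,,       -> drop a single trailing slash, if present
    if url.endswith("/"):
        url = url[:-1]
    # s,$,.pdf,    -> append the .pdf extension
    return url + ".pdf"


# Made-up example URL; the ID and slug are placeholders.
sample = "https://www.documentcloud.org/documents/0000000-example-title/"
print(preview_to_pdf_url(sample))
```

Because only the first hyphen is replaced, any later hyphens in the title slug are left untouched, which matches the behaviour of the sed command.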
