MS Paint Fan Adventures

From Archiveteam
Revision as of 17:17, 6 March 2018 by Riking (talk | contribs) (add instructions for running, add more status entries)

Stub for notes on a project in progress.

Archiving the contents of - uses a JS app to read text-based stories with embedded images and Flash content. Stories often link to YouTube alternates for their Flash content.

Custom archiver code is at . Giant tangled mess of Go scripts.

Contact @riking to get upload permissions for the Archive collection

Operating the custom archiver


  1. Clone into $GOPATH - there will probably be updates to the script
  2. Get wpull: pip3 install wpull==1.2.3; pip3 install -r
  3. Symlink it into the current directory: ln -s $(which wpull)
  4. Get youtube-dl; run youtube-dl -U to update it as needed
  5. If needed, symlink ./target to a bigger drive, or specify -o=folder every time you run the archiver
  6. Build the code (TODO: patches to datatogether/warc): go build -v .
  7. Put IAS3 credentials into ./ias3.json - {"AccessKeyID": "...", "SecretAccessKey": "..."}
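
The setup steps above are easy to get partly wrong, so a preflight check can help before a long run. The following is only a sketch: the function name check_setup is made up for illustration, and it assumes the built binary is named mspfa-archiver (as in the basic usage line) and that wpull was symlinked as ./wpull.

```shell
#!/bin/sh
# Preflight check for the setup steps (a sketch, not part of the archiver).
# Prints one "missing: ..." line per problem; returns non-zero if any are found.
check_setup() (
    dir=${1:-.}          # directory the archiver is run from
    status=0

    # Steps 2 and 4: tools that must be on PATH.
    for tool in wpull youtube-dl; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool on PATH"
            status=1
        fi
    done

    # Step 3: wpull symlinked into the working directory.
    if [ ! -e "$dir/wpull" ]; then
        echo "missing: $dir/wpull"
        status=1
    fi

    # Step 6: the built binary (name assumed from the basic usage line).
    if [ ! -x "$dir/mspfa-archiver" ]; then
        echo "missing: $dir/mspfa-archiver"
        status=1
    fi

    # Step 7: credentials file with both keys present.
    if [ ! -f "$dir/ias3.json" ]; then
        echo "missing: $dir/ias3.json"
        status=1
    else
        for key in AccessKeyID SecretAccessKey; do
            if ! grep -q "\"$key\"" "$dir/ias3.json"; then
                echo "missing: $key in $dir/ias3.json"
                status=1
            fi
        done
    fi

    return $status
)
```

Calling check_setup with no argument checks the current directory; it only looks for the presence of files and key names and does not validate the credentials themselves.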


While testing, make sure to include both -test and -ident MSPFA_Test_12345 so that the uploaded Archive items go into test_collection. When testing changes to the script, use -devScript.

Basic usage: ./mspfa-archiver -dl -ident auto -s 1234

If a download step fails (e.g. a broken URL passed to the Photobucket step), run the archiver again with the -fu ("F"orce "U"pload) flag.
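
The rerun can be wrapped in a small helper. This is a minimal sketch under two assumptions that the notes do not confirm: that the archiver reports a failed download step through a non-zero exit status, and that -fu can simply be appended to the flags of the first run. ARCHIVER and archive_with_retry are hypothetical names.

```shell
# Run the archiver once; if it exits non-zero, run it again with -fu appended.
# ASSUMPTION: a failed download step shows up as a non-zero exit status,
# and -fu can be added to the same flags as the first run.
# ARCHIVER is a hypothetical override so the sketch can be tried with a stand-in.
ARCHIVER=${ARCHIVER:-./mspfa-archiver}

archive_with_retry() {
    if "$ARCHIVER" "$@"; then
        return 0
    fi
    echo "first run failed, retrying with -fu: $*" >&2
    "$ARCHIVER" "$@" -fu
}
```

For example, archive_with_retry -dl -ident auto -s 1234 would fall back to a forced upload for story 1234. Since a forced upload stores an incomplete capture, it may be preferable to look at the failure first and rerun by hand.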

If you encounter a dead domain and don't want to wait for wpull, include -wpullArgs '--exclude-domains,' etc etc.

TODO - Script to automatically run on each story ID and save a list of failures
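
One possible shape for this script, as a sketch only: it loops over a numeric range of story IDs with the basic usage flags and appends each failing ID to a file. It assumes the archiver exits non-zero on failure, which the notes do not state; ARCHIVER, archive_range, and the default failures.txt path are hypothetical names.

```shell
# Run the archiver over a range of story IDs and record the IDs that fail.
# ASSUMPTIONS: the archiver exits non-zero on failure; the flags are the
# "basic usage" ones; ARCHIVER and the failure-list path are hypothetical.
ARCHIVER=${ARCHIVER:-./mspfa-archiver}

archive_range() {
    first=$1
    last=$2
    failures=${3:-failures.txt}

    : > "$failures"                      # start with an empty failure list
    id=$first
    while [ "$id" -le "$last" ]; do
        # stdin is redirected so the archiver cannot block waiting for input
        if ! "$ARCHIVER" -dl -ident auto -s "$id" </dev/null; then
            echo "$id" >> "$failures"
        fi
        id=$((id + 1))
    done

    # Succeed only if no story failed.
    [ ! -s "$failures" ]
}
```

Running archive_range 1 100 would attempt stories 1 through 100 and leave the failed IDs, one per line, in failures.txt. IDs that do not correspond to a story would presumably also land in that list, so it is a starting point for manual review, not a final status report.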

Work Division


Archiver: riking

Stage: Initial archiving

Problematic story IDs:

(note: this table should not be taken as an example; it's translated from my bad notes)

Story ID  Archived OK?  Problems
1         Yes           Single failed Photobucket URL.
4         Yes           Photobucket
12        Yes           Photobucket
14        Yes           404s
17        No            Dead domain:
19        Mostly        broken URL: pasted twice in a row
21        No            Contact Author "410 Gone" from; Dead domain:
22        Mostly        broken URL: pasted twice in a row
24        No            Photobucket
25        No            404s
26        No            404s: imageshack
33        No            Dead domain:
35        No            broken URL
45        No            404s
46        No            Dead domain: myfrogbag
48        No            404s: imageshack
53        No            Dead domain:
55        No            404s: imageshack
59        No            Photobucket
61        No            404s
63        No            404s
66        No            Dead domain: myfrogbag
67        No            404s: photobucket
76        No            404s: photobucket
77        No            404s: imageshack
78        No            404s: imageshack
81        No            Dead domain: TBD
87        No            404s; Pulling HTML pages
89        Yes           404s: imageshack
93        No            Dead domain: TBD
104       No            Dead domain: suspended webhost
106       No            Dead domain: TBD
108       No            404s: imagebin
109       No            404s; Dead domain
111       No            404s: imageshack
135       No            404s: photobucket
158       No            404s
160       Yes           404s: imageshack
227       No            uses blob: urls??
241       No            Dropbox public folder
263       No            HTML: imageshack homepage
270       No            Dropbox public folder
277       No            Dead domain:
285       No            HTML
307       No            404s
308       No            404s
314       No            Dead domain:
319       No            Dropbox public folder
323       No
325       No            Dead domain:
331       No            Dropbox public folder
339       Mostly        A Single Photobucket BWE
341       No            Dropbox public folder
350       No            Dead domain:
351       Some          Imageshack 404s
352       Mostly        SWFs hosted at
353       Mostly        SWFs hosted at
728       No            Dropbox public folder