User:Vitzli

Saved stuff

JBG Travels YouTube channel, partial download, 847 videos total: part 1, part 2, part 3.
Several videos were either marked private or removed at the request of his employer, although they contained only road video.
Encyclopedia Astronautica snapshot (2015-10-22); according to Alive... OR ARE THEY, the site is on the watchlist.
Pole shift survival library: not updated since 2013, it was quite popular among survival/prepping folks. Not endangered, as the website is still online, but the torrent is decaying.
Amazon reviews webdata 1995-2013: still available, but the links were hidden.
CGP Grey YouTube channel, tar archive per year: 2010, 2011, 2012, 2013, 2014, 2015
SmarterEveryDay YouTube channel, tar archive per year: 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015

Prospecting IA.BAK collections

Tools required: the Python 3 packages internetarchive and ia-mine; jq for JSON processing; parallel for running multiple programs in for-each fashion.

An archive.org account (S3 keys) is required for the ia-mine and internetarchive (ia) tools.

2016-02-03 census

10 shards
79 collections
142462 items total, 106054 unique items (my mistake: run uniq before starting a large batch)
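
The gap between the two counts comes from items that appear in more than one collection. A minimal sketch of the deduplication step, assuming the identifiers have already been collected as plain strings, one per item; the function name `unique_identifiers` and the sample identifiers are hypothetical:

```python
from typing import Iterable, List


def unique_identifiers(identifiers: Iterable[str]) -> List[str]:
    """Return identifiers with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in identifiers:
        ident = raw.strip()
        # Skip blank lines and anything already seen.
        if not ident or ident in seen:
            continue
        seen.add(ident)
        unique.append(ident)
    return unique


# Hypothetical identifiers gathered from several collections.
sample = ["item-a", "item-b", "item-a\n", "", "item-c", "item-b"]
deduplicated = unique_identifiers(sample)
print(len(sample), "lines,", len(deduplicated), "unique identifiers")
```

Running this before a large batch keeps each item from being processed more than once.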

jq code

Remove 'collection' items:

parallel --jobs 4 "jq -r 'select(.mediatype != \"collection\") | .identifier' $F_PREFIX/{}.col.json > $F_PREFIX/{}.items.json"
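
For readers without jq, the same filter can be sketched in Python. This assumes each .col.json file holds one JSON object per line with `mediatype` and `identifier` keys; the function name `item_identifiers` and the sample records are hypothetical:

```python
import json
from typing import Iterable, Iterator


def item_identifiers(lines: Iterable[str]) -> Iterator[str]:
    """Yield the identifier of every record that is not itself a collection."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Same test as the jq filter: drop records whose mediatype is "collection".
        if record.get("mediatype") == "collection":
            continue
        identifier = record.get("identifier")
        # Unlike the jq filter, records without an identifier are skipped
        # instead of being printed as null.
        if identifier is not None:
            yield identifier


# Hypothetical search results, one JSON object per line.
sample_lines = [
    '{"identifier": "some-collection", "mediatype": "collection"}',
    '{"identifier": "some-video", "mediatype": "movies"}',
    '{"identifier": "some-text", "mediatype": "texts"}',
]
for identifier in item_identifiers(sample_lines):
    print(identifier)
```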

Remove 'uploader' field:

parallel --jobs 4 "jq -c 'del(.metadata.uploader)' $F_PREFIX/{}.mined.json > SHARDS-20160203-cleaned/$F_PREFIX/{}.cleaned.json"
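
A rough Python equivalent of the `del(.metadata.uploader)` step, assuming each mined record is a JSON object with an optional `metadata` object; the function name `strip_uploader` and the sample record are hypothetical:

```python
import json


def strip_uploader(record: dict) -> dict:
    """Return a copy of the record without metadata.uploader."""
    cleaned = dict(record)
    metadata = cleaned.get("metadata")
    if isinstance(metadata, dict):
        # Copy the nested dict so the caller's record is left untouched.
        cleaned["metadata"] = {
            key: value for key, value in metadata.items() if key != "uploader"
        }
    return cleaned


# Hypothetical mined record.
mined = {
    "metadata": {
        "identifier": "some-video",
        "uploader": "someone@example.com",
        "title": "Some video",
    }
}
# Compact separators give one-line output, similar in spirit to jq -c.
print(json.dumps(strip_uploader(mined), separators=(",", ":")))
```

Records that have no `uploader` field, or no `metadata` object at all, pass through unchanged.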
