User talk:Meteor

From Archiveteam
Revision as of 17:27, 15 August 2016 by Bzc6p (talk | contribs) (→‎Re: Chelyabinsk meteor: done)

Re: Chelyabinsk meteor

Hi. Thanks for collecting the sources; that must have been the harder part. Compiling a list of download URLs and then bulk downloading them seems relatively easy. I'll look into it in the coming days and arrange their upload to the Internet Archive, with the help of Archive Team. bzc6p (talk) 17:08, 27 January 2016 (EST)

Yes, I've noticed that. I'm going through the URLs and categorizing them, as different kinds of URLs need different treatment. bzc6p (talk) 06:26, 2 February 2016 (EST)

So the sources you gave link to many different kinds of websites, and many of them need different treatment and special care to be saved properly and completely (e.g. collecting the links to image pages from Yandex galleries, collecting URLs to full-resolution images on image pages on; checking whether the video sites are all supported by youtube-dl, etc.). Also, once I've compiled the list of links, I'll need to do test grabs, check them, and, if something didn't go well, fix the problems arising from today's "responsive" (i.e. shitload of JavaScript that automatic grabbers can't do anything with) websites. Only after everything is fine can I start the stage of finally downloading everything, partially with the help of ArchiveBot.
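To illustrate the categorizing step described above, here is a minimal Python sketch that groups a list of URLs by the kind of site they point to. The domain-to-category table and the names `CATEGORY_BY_DOMAIN`, `categorize` and `group_by_category` are assumptions made up for this example, not the actual lists or tools used; the real categories would depend on what the sources contain.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical mapping from a site's domain to the treatment it needs.
# These entries are illustrative assumptions, not the real working list.
CATEGORY_BY_DOMAIN = {
    "yandex.ru": "yandex_gallery",
    "vk.com": "vk_photo",
    "youtube.com": "video",
    "youtu.be": "video",
}


def categorize(url):
    """Return the category for a URL, or "other" if the site is unknown."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, category in CATEGORY_BY_DOMAIN.items():
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return category
    return "other"


def group_by_category(urls):
    """Group URLs into lists keyed by category."""
    groups = defaultdict(list)
    for url in urls:
        groups[categorize(url)].append(url)
    return dict(groups)
```

Each resulting group can then be handled separately, for example by passing the video URLs to youtube-dl and the rest to a crawler, and anything that ends up under "other" can be reviewed by hand.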

What I want to say is that I'll make quality archives containing everything they should, and I'll gladly do this, but the preparation takes time. I'm halfway through it and will be able to finish, but I probably won't have much free time in the near future, so it will be a while. It will definitely be done in the foreseeable future, though. I hope this is okay for you. bzc6p (talk) 14:37, 13 February 2016 (EST)

I'll do it in July this year. bzc6p (talk) 15:29, 4 May 2016 (EDT)

I'm sorry for the delay. I've now started to download, and then upload, the content. As for images, there may be a few that won't be saved correctly. (I simply have no capacity to check and correct them all one by one.) However, most of them are from VK and Yandex, which I treated with special care, so they, and most of the others too, should be okay. As for videos, probably not all of them will be saved: those that are not supported by youtube-dl or just don't work for some reason can be declared "unsavable" with current tools, and I can't do anything about that. Also, especially in the case of videos, they may not play back correctly in the Wayback Machine, but if they are saved, they can be extracted later regardless, or the Wayback Machine might even support them one day. By the way, you may have seen that vitzli already saved the YouTube videos in a different way, so that they can be downloaded directly from the Archive. I'll try to save them into WARCs too, regardless.

I said that the result of this saving process may not be 100%, but I think it will be above 90%.

I'll inform you when I've finished uploading. Thank you again for your patience. bzc6p (talk) 14:27, 29 July 2016 (EDT)

I simply wasn't able to download some videos, even with ArchiveBot. However, 12 GB of videos have been saved, and although I haven't counted how many that is, probably the larger part of the videos could still be saved. Their WARC files can be found in this item.

I downloaded the photos myself; I've already explained how successful that was. They come to 12 GB in total, but that also includes other photos by the same users. They are here, in 3 batches. The uploads will be finished in about 10 days from now.

The photos will be ingested into the Wayback Machine later; the videos should already be there. However, there may be difficulties with the playback of the videos, as I already mentioned. The files are there in some form, though, and that is the most important thing.

I can say the archiving process is over. Thank you again for your patience. I tried to do quality work, but I didn't have time back then and was able to finish only now.


bzc6p (talk) 13:27, 15 August 2016 (EDT)