Talk:YouTube

From Archiveteam

youtube2internetarchive

https://github.com/emijrp/youtube2internetarchive used to contain a script that also handled uploading to the Internet Archive, but I can't find it any longer. --Nemo 06:28, 26 January 2015 (EST)

I've found something with Google. bzc6p (talk) 12:25, 26 January 2015 (EST)

If YouTube needed to be quickly captured for some unforeseen reason, it might make sense to download only the XML and SRT files, so that at least some record would be saved. Google's automatic subtitle recognition is currently far from accurate, but it's certainly improving. wtron 06:48, 12 June 2015 (EST)
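A metadata-and-subtitles-only grab like that can be expressed with youtube-dl's subtitle flags. A sketch (the flags exist in youtube-dl, but the URL is a placeholder, and the function only prints the command so you can inspect it before running):

```shell
#!/bin/sh
# Metadata-only grab: save the info JSON and subtitle files, skip the video.
# --write-auto-sub also captures Google's automatic speech-recognition track.
build_subs_only_cmd() {
    # $1: video or channel URL
    echo "youtube-dl --skip-download --write-info-json" \
         "--write-sub --write-auto-sub --sub-format srt $1"
}

# Print the command rather than running it; pipe to sh to actually rip.
build_subs_only_cmd "https://www.youtube.com/watch?v=VIDEO_ID"
```

Note that youtube-dl converts YouTube's raw timedtext XML; if the original XML is wanted verbatim, that would need a separate fetch.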

Options

Is it really necessary to explicitly request best A/V when youtube-dl selects it by default?

Also, why not embed the subs and thumbnail instead of making separate files? And why not xattrs, for those of us on Unix filesystems? Xattrs are only one extra flag.

My command is currently

youtube-dl -t --embed-subs --add-metadata --xattrs --console-title --embed-thumbnail

although I'm going to be incorporating elements from the suggested one into mine. The reasoning behind this is that it's one file to send. That command is how I archive currently, though it's changing.

I'd appreciate hearing your input about why I may be wrong though. Thanks in advance,

--Vxbinaca 21:24, 29 May 2015 (EDT)

On your second note, I strongly believe it's better to have different things (video, thumbnail, subtitle) in separate files. Easier to access, process, categorize, recognize. I think it's worth the "trouble" of having three files (with the same name) instead of one.

bzc6p (talk) 07:08, 31 May 2015 (EDT)
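For reference, the separate-files layout bzc6p describes is what youtube-dl's --write-* options produce: the video, thumbnail, subtitles, and metadata all share one basename. A sketch (URL and output template are illustrative; the function prints the command rather than running it):

```shell
#!/bin/sh
# Keep video, thumbnail, subtitles, and metadata as sibling files sharing one
# basename, e.g. Title-ID.mp4 / Title-ID.jpg / Title-ID.en.srt / Title-ID.info.json,
# instead of embedding everything into a single container.
separate_files_cmd() {
    # $1: video or channel URL
    echo "youtube-dl -o '%(title)s-%(id)s.%(ext)s' --write-thumbnail" \
         "--write-sub --write-info-json $1"
}

separate_files_cmd "https://www.youtube.com/watch?v=VIDEO_ID"
```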

xattrs are not portable and will get lost when copying to a file system that doesn't support them (or when uploading them somewhere, like to IA) --Darkstar 08:53, 31 May 2015 (EDT)

Solid reasoning. I've now switched to your way of doing things. --Vxbinaca 19:32, 2 August 2015 (EDT)

--ignore-errors shouldn't be a youtube-dl archiving best practice by default

There's a myriad of reasons why having this on by default isn't a good idea. Downloads cut off mid-channel-rip could go unnoticed (I search for these with ls *.part), and problems with various versions of youtube-dl could leave a channel rip with half-processed videos; see this issue on GitHub.

Perhaps --ignore-errors is appropriate for a well-tested version running against YouTube in a warrior, but for an attended rip we should by default suggest that people not use it. Instead, make sure everything got ripped, and if there's an error, try to resolve it for that particular video; only if the problem can't be worked around should they fall back to --ignore-errors.

I'm open to being told why I may be wrong though. --Vxbinaca 19:32, 2 August 2015 (EDT)
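The manual check described above (searching for leftover .part files after an attended rip) can be scripted. A sketch, assuming the rip landed in a single directory:

```shell
#!/bin/sh
# After a channel rip without --ignore-errors, look for leftover .part
# files, which indicate downloads that were cut off and never completed.
count_partial_downloads() {
    # $1: directory holding the rip
    find "$1" -maxdepth 1 -name '*.part' | wc -l
}

dir=${1:-.}
n=$(count_partial_downloads "$dir")
if [ "$n" -gt 0 ]; then
    echo "$n partial download(s) found; re-run youtube-dl to resume them"
fi
```

Re-running the same youtube-dl command resumes the .part files, since youtube-dl continues partial downloads by default.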

On September 1, 2015, The Verge reported on an upcoming paid subscription option (ad-free + premium videos). Paid ad-free viewing may not be a nail in the coffin by itself, but previously free content could become premium. According to the article, the transition could happen within a few months. -- Vitzli 06:46, 2 September 2015 (EDT)

YouTube has never been profitable. bzc6p (talk) 08:46, 2 September 2015 (EDT)

As we don't have 500 petabytes of free storage...

... a solution may be to discard "low-value" videos. I mean, if we discard duplicates (films, music, etc.) and set a limit of 1 PB (1000 TB) for quality content, which lucky videos would be downloaded and preserved? We could work on an approach like this. Just because we don't have space for everything doesn't mean we shouldn't download anything. Emijrp 09:33, 20 October 2015 (EDT)
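To put the 1 PB figure in perspective, a rough back-of-the-envelope count (the 100 MB average per video is an assumption for illustration, not a measured number):

```shell
#!/bin/sh
# Rough capacity estimate: how many videos fit in the 1 PB budget?
budget_tb=1000          # 1 PB expressed in TB
avg_video_mb=100        # assumed average size per video (not measured)

# 1 TB = 1000 GB = 1,000,000 MB, so convert the budget to MB and divide.
videos=$(( budget_tb * 1000 * 1000 / avg_video_mb ))
echo "~$videos videos fit in ${budget_tb} TB at ${avg_video_mb} MB each"
```

At those assumed numbers the budget holds about ten million videos, so the selection problem is which ten million out of YouTube's far larger total.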

Google will probably give notice long before that date, so we'll have time to find stuff worth saving. We could create a "warroom" or such, where users (ArchiveTeam members and other people) could suggest channels worth saving, with a description, average views, and a size estimate, say, in a clear-cut table. If someone suggests videos that are otherwise available or not popular enough, they can be struck out (with proper reasoning). (The reviewing of suggestions can be done by everyone continuously.) A deadline for suggestions could be set (say, 2 months before the end), and after that a Committee could select the "lucky" 1000 TB that would end up in the Archive.

  • In the meanwhile, the Archive itself would be queried for already saved videos, and those wouldn't be saved again.
  • The Archive and the Team should take other preservation efforts into account, and ours should be in accordance with those (no duplicates).
  • There could be national limits, i.e. not only a global limit but also language- or country-specific ones, say, 500 TB for English videos and 10–50 TB per other country (just ad-hoc numbers, to illustrate the concept).

The importance of the last point, and of saving some of YouTube at all, lies, I think, in the fact that, without too much exaggeration, a substantial part of today's culture is stored and represented there, on the most popular video sharing site on Earth. bzc6p (talk) 15:41, 20 October 2015 (EDT)

Saving YouTube Comments

YouTube comments are a surefire sign of just how awful the internet can be at times. Shouldn't they be archived as well? There's already a script for it, youtube-comment-downloader. --Powerkitten (talk) 16:49, 26 October 2016 (EDT)