Difference between revisions of "YouTube/Technical details"

From Archiveteam

Revision as of 10:41, 28 November 2021

This page documents some of the publicly known technical details of YouTube.

ID formats

In most places, IDs are expressed as base64 using the modified character set A-Za-z0-9-_.

Videos

Videos have a 64-bit ID. An 11-character base64 string encodes 66 bits, so the two lowest bits of the last character are always zero and that character can only take one of 16 values.

Video ID regex pattern: [A-Za-z0-9_-]{10}[048AEIMQUYcgkosw]
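
As a rough illustration of the bit arithmetic above, the following Python sketch validates a video ID against the pattern and converts it to and from an integer. The function names are made up for this example, and reading the 8 decoded bytes as a big-endian unsigned integer is an assumption; nothing on this page says how YouTube interprets the value internally.

```python
import base64
import re

# Pattern copied from the text above.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{10}[048AEIMQUYcgkosw]")


def video_id_to_int(video_id: str) -> int:
    """Decode an 11-character video ID into a 64-bit integer (hypothetical helper)."""
    if not VIDEO_ID_RE.fullmatch(video_id):
        raise ValueError(f"not a well-formed video ID: {video_id!r}")
    # 11 characters need one '=' of padding to reach a multiple of 4.
    raw = base64.urlsafe_b64decode(video_id + "=")
    # Assumption: big-endian byte order; the choice is arbitrary for this sketch.
    return int.from_bytes(raw, "big")


def int_to_video_id(value: int) -> str:
    """Encode a 64-bit integer as an 11-character ID (inverse of the above)."""
    raw = value.to_bytes(8, "big")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
```

Because the two padding bits are always zero, only every fourth character of the base64 alphabet can appear in the last position, which is where the 16-character class in the pattern comes from.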

Channels

Channels have a 128-bit ID. In base64, this becomes a 22-character string (132 bits), so the last character carries only two bits of the ID and can take one of 4 values.

Channel ID regex pattern: [A-Za-z0-9_-]{21}[AQgw]

(Note that this does not include the UC prefix used e.g. in /channel/ URLs. The channel ID appears without that prefix in several places, most notably some playlist IDs.)
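
The prefix handling described in the note can be sketched in Python as follows. The helper names are invented for this example, and the big-endian integer interpretation is again only an assumption.

```python
import base64
import re

# Pattern copied from the text above (no UC prefix).
CHANNEL_ID_RE = re.compile(r"[A-Za-z0-9_-]{21}[AQgw]")


def bare_channel_id(value: str) -> str:
    """Return the 22-character channel ID, dropping a leading UC if present."""
    # Only strip UC when the total length is 24, so that a bare ID which
    # happens to start with "UC" is left alone.
    candidate = value[2:] if len(value) == 24 and value.startswith("UC") else value
    if not CHANNEL_ID_RE.fullmatch(candidate):
        raise ValueError(f"not a well-formed channel ID: {value!r}")
    return candidate


def channel_id_to_int(value: str) -> int:
    """Decode a channel ID into a 128-bit integer (assumed big-endian)."""
    # 22 characters need two '=' of padding to reach a multiple of 4.
    raw = base64.urlsafe_b64decode(bare_channel_id(value) + "==")
    return int.from_bytes(raw, "big")
```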

Playlists

Over the years, there have been a large number of playlist types, many of which have since gone the way of the dodo. This section attempts to document them all.

In the table below, the pattern is a regex extended with placeholders such as <videoID>, which indicates that the ID contains a video ID (and likewise for channel and playlist IDs).

Pattern || Purpose || Status || Examples || Notes
AL[A-Za-z0-9_-]+ || || Broken || ||
AV[A-Za-z0-9_-]{32} || || Broken || || IDs do not appear to be PL ones
CL<videoID> || || || ||
EC([0-9A-F]{16}|[A-Za-z0-9_-]{32}) || Courses || Unviewable, playnext and watch functional || || IDs are also fully functional with PL
EL<videoID> || || || ||
FL<channelID> || Favourites || Fully functional || ||
HL[0-9]{10} || || Unviewable, playnext broken || ||
LE[A-Za-z0-9_-]{23} || || || ||
LL<channelID> || Likes || Dead || || Made private in December 2019
LP<videoID> || || Broken || ||
MC[0-9]{8} || || || || Values appear to be dates
MCUS || || || ||
ML[A-Za-z0-9_-]{32} || || Broken || || IDs do not appear to be PL ones
OLAK5uy_[klmn][A-Za-z0-9_-]{32} || || || ||
PL[0-9A-F]{16} || Normal playlist (old) || Fully functional || ||
PL[A-Za-z0-9_-]{32} || Normal playlist || Fully functional || ||
PU<channelID> || Popular uploads || Unviewable, playnext and watch functional || ||
RD<videoID> || Mix aka radio || Unviewable, playnext and watch functional || ||
RD[0-4][0-9]<videoID> || || || ||
RDAMVM[A-Za-z0-9_-]{22} || Artist mix? || || ||
RDAO[A-Za-z0-9_-]{22} || Artist mix || || ||
RDAMPL<playlistID> || || || ||
RDCLAK5uy_[klmn][A-Za-z0-9_-]{32} || || || ||
RDCMUC<channelID> || Channel mix || || ||
RDEM[A-Za-z0-9_-]{22} || Artist mix? || || ||
RDGMEM[A-Za-z0-9_-]{22} || Genre mix? || || ||
RDGMEM[A-Za-z0-9_-]{22}VM<videoID> || Genre mix? || || ||
RDHC<videoID> || || || ||
RDKM[A-Za-z0-9_-]{22} || || || ||
RDLV<videoID> || || || ||
RDMM || My Mix || || || Tied to the account accessing YouTube
RDMM<videoID> || || || ||
RDQM<videoID> || || || ||
RDTMAK5uy_[klmn][A-Za-z0-9_-]{32} || || || ||
SL || || || ||
SL<videoID> || || Broken || ||
SP([0-9A-F]{16}|[A-Za-z0-9_-]{32}) || || Fully functional || || Behaves exactly as the same ID with PL
TLGG[A-Za-z0-9_-]{22} || Temporary list || || || Produced by https://www.youtube.com/watch_videos?video_ids=<videoID>,<videoID>,... and https://www.youtube-nocookie.com/embed/<videoID>?playlist=<videoID>,<videoID>,...
TLPQ[A-Za-z0-9_-]{22} || || || ||
UL<videoID> || User uploads || Watch functional || || Only valid on watch pages. Triggers display of uploads by the same user as the watched video. The ID in the list parameter must be a valid video ID from any channel; in the past, any 11 characters were accepted, and even further in the past, a sole UL worked as well.
UU<channelID> || User/channel uploads || Fully functional || ||
UUSH<channelID> || User shorts || Fully functional || ||
WL || Watch Later || || || Tied to the account accessing YouTube
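
To show how the placeholder syntax in the table can be turned into ordinary regexes, here is a minimal Python sketch. It covers only a handful of rows; the names PLACEHOLDERS, SAMPLE_ROWS, compile_pattern and classify are invented for this example. The <playlistID> placeholder is deliberately left out because the table does not pin down what exactly it expands to.

```python
import re

# Sub-patterns copied from the "Videos" and "Channels" sections.
PLACEHOLDERS = {
    "<videoID>": r"[A-Za-z0-9_-]{10}[048AEIMQUYcgkosw]",
    "<channelID>": r"[A-Za-z0-9_-]{21}[AQgw]",
}

# A small subset of the table: pattern -> purpose.
SAMPLE_ROWS = {
    "PL[0-9A-F]{16}": "Normal playlist (old)",
    "PL[A-Za-z0-9_-]{32}": "Normal playlist",
    "FL<channelID>": "Favourites",
    "UU<channelID>": "User/channel uploads",
    "UUSH<channelID>": "User shorts",
    "RD<videoID>": "Mix aka radio",
    "WL": "Watch Later",
}


def compile_pattern(pattern: str) -> re.Pattern:
    """Expand the table's placeholders and compile the result."""
    for name, regex in PLACEHOLDERS.items():
        pattern = pattern.replace(name, f"(?:{regex})")
    return re.compile(pattern)


COMPILED = [(compile_pattern(p), purpose) for p, purpose in SAMPLE_ROWS.items()]


def classify(playlist_id: str) -> list[str]:
    """Return the purposes of all sample rows whose pattern matches the full ID."""
    return [purpose for regex, purpose in COMPILED if regex.fullmatch(playlist_id)]
```

classify returns a list rather than a single value because some patterns in the full table are loose enough that more than one row could match the same ID.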

Further prefixes that are known or suspected to have existed but whose exact format isn't known yet: BP, MLCA, MQ, TT
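
The two URL forms listed in the TLGG row of the table can be assembled as in the following sketch. It only builds the URL strings; requesting them and reading back the resulting TLGG ID is not shown. The function names are hypothetical, and the sketch assumes that at least one video ID must be supplied for each comma-separated list.

```python
import re

# Pattern copied from the "Videos" section.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{10}[048AEIMQUYcgkosw]")


def _checked(video_ids):
    """Validate every ID and return them as a list (assumes a non-empty list is required)."""
    ids = list(video_ids)
    if not ids:
        raise ValueError("at least one video ID is required")
    for video_id in ids:
        if not VIDEO_ID_RE.fullmatch(video_id):
            raise ValueError(f"not a well-formed video ID: {video_id!r}")
    return ids


def watch_videos_url(video_ids):
    """Build the watch_videos form of the URL."""
    # Video IDs only contain URL-safe characters, so no escaping is needed
    # and the commas stay literal as in the table.
    return "https://www.youtube.com/watch_videos?video_ids=" + ",".join(_checked(video_ids))


def embed_playlist_url(first_id, other_ids):
    """Build the youtube-nocookie.com embed form of the URL."""
    (first,) = _checked([first_id])
    return (
        f"https://www.youtube-nocookie.com/embed/{first}?playlist="
        + ",".join(_checked(other_ids))
    )
```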

URL formats

A large number of URL formats have been in use over the years, too many to be listed here. User:JustAnotherArchivist's youtube-extract script contains regex patterns for most of the ones that were at least somewhat common.

Domains

Domains actively serving content as of 2021:

  • www.youtube.com (main site)
  • m.youtube.com (mobile site)
  • youtu.be (short URLs)
  • www.youtube-nocookie.com (embeds only)
  • music.youtube.com (YouTube Music)
  • www.youtubekids.com (YouTube Kids)
  • tv.youtube.com (YouTube TV)
  • i.ytimg.com (static images like thumbnails; avatars are on subdomains of ggpht.com as they are shared between Google services)
  • subdomains of googlevideo.com (video content)

A long time ago, there were YouTube domains under a number of ccTLDs. Nowadays, these all redirect to the main site: youtube.at, by, ca, co.uk, cz, de, dk, ee, es, fi, fr, gr, hr, hu, it, lt, lv, no, pl, pt, ro, rs, ru, se, si, sk.

In addition, there are some subdomains under youtube.com which redirect to the main site but add the gl parameter to the query string, which changes the interface language: br.youtube.com and likewise for es, it, jp, pl, ru, uk.
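
The domain lists in this section can be turned into a simple hostname classifier, sketched below in Python. The category labels and the function name classify_host are invented for this example, and treating a www. prefix on the ccTLD domains the same as the bare domain is an assumption rather than something stated above.

```python
# Hosts listed above as actively serving content as of 2021.
ACTIVE_HOSTS = {
    "www.youtube.com", "m.youtube.com", "youtu.be", "www.youtube-nocookie.com",
    "music.youtube.com", "www.youtubekids.com", "tv.youtube.com", "i.ytimg.com",
}

# ccTLDs listed above whose youtube.<tld> domains redirect to the main site.
LEGACY_CCTLDS = (
    "at by ca co.uk cz de dk ee es fi fr gr hr hu it lt lv "
    "no pl pt ro rs ru se si sk"
).split()
LEGACY_CCTLD_HOSTS = {f"youtube.{tld}" for tld in LEGACY_CCTLDS}

# Subdomains listed above that redirect and add the gl parameter.
REGIONAL_SUBDOMAINS = {f"{p}.youtube.com" for p in "br es it jp pl ru uk".split()}


def classify_host(host: str) -> str:
    """Map a hostname to one of the (made-up) category labels used here."""
    host = host.lower().rstrip(".")
    if host in ACTIVE_HOSTS:
        return "active"
    if host.endswith(".googlevideo.com"):
        return "video content"
    if host.endswith(".ggpht.com"):
        return "avatars"
    if host in REGIONAL_SUBDOMAINS:
        return "regional redirect"
    # Assumption: www.youtube.<tld> behaves like youtube.<tld>.
    bare = host[4:] if host.startswith("www.") else host
    if bare in LEGACY_CCTLD_HOSTS:
        return "cctld redirect"
    return "unknown"
```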