Difference between revisions of "The Pirate Bay"

From Archiveteam
(Add note about official dumps)
 
(39 intermediate revisions by 11 users not shown)
Line 1: Line 1:
{{Infobox project
{{Infobox project
| title = The Pirate Bay
| title = The Pirate Bay
| image =  
| logo = ThePirateBay.png
| image = Thepiratebay_homepage_screenshot.png
| description =  
| description =  
| URL = http://www.thepiratebay.org/
| URL = http://www.thepiratebay.org/
| project_status = {{online}}
| project_status = {{online}}
| archiving_status = {{nosavedyet}}
| archiving_status = {{partiallysaved}}, complete grab {{upcoming}}
| irc = yarharfiddlededee
}}
}}


{{expand}}
'''[[The Pirate Bay]]''' is one of the largest and most popular torrent search engines.


'''[[The Pirate Bay]]''' is one of the largest and most popular torrent search engines. It's still having persistent legal problems. The tracker went down in November, but the site still serves torrents and magnet links. If a torrent is lost, it becomes impossible to connect to other computers distributing the shared files. Considering that there are links to TPB on '''THIS VERY PAGE''', this site is pretty dang important.  
It's still having persistent legal problems. The tracker went down in November 2012, but the site still serves torrents and magnet links. If a torrent is lost, it becomes impossible to connect to other computers distributing the shared files. Considering that there are links to TPB ''all over this wiki'', this site is pretty dang important.


==Archival Methods==
In December 2014, the website went offline due to an alleged raid<ref>https://torrentfreak.com/swedish-police-raid-the-pirate-bay-site-offline-141209/</ref>. A few weeks later, the site was updated with a banner page and a mysterious key string<ref>https://torrentfreak.com/pirate-bay-domain-back-online-waving-a-pirate-flag-141221/</ref>, later revealed to be a cryptographic puzzle with the solution "I'll be back"<ref>https://www.reddit.com/r/piratebayaes/comments/2rvn1c/observation_weired_css_for_the_pipe_class/cnllylk</ref><ref>https://torrentfreak.com/pirate-bays-mystery-aes-code-finally-decrypted-150112/</ref>. The website came back online on February 1, 2015<ref>https://torrentfreak.com/pirate-bay-back-online-150131/</ref>.


Since the Pirate Bay is only a search engine nowadays, it would be preferable to simply scrape the magnet links, descriptions, and comments. Since the magnet links also include a tracker address, torrent files might just be a thing of the past (might be wrong though, but [https://torrentfreak.com/the-pirate-bay-will-stop-serving-torrents-120112/ the Pirate Bay itself] seems to be betting on it)
==In case of Fire==


==Archival Tools==
To prevent damage to the Archive Team if The Pirate Bay ever goes down, we should include a Magnet Link next to every TPB link we have.


We're going to have to make our own tools if we ever want to backup the entire Pirate Bay. To make it usable, we would also have to make it easily editable and searchable, perhaps with website scripts...
=== Archival Methods ===


A good idea would be to put magnet links, comments, and the descriptions into a Markdown-formatted wikipage, and then check the files into a git repository. Then, we could simply run the [https://github.com/github/gollum Gollum] wiki engine on it, and voila, Pirate Bay on the go.
We can simply scrape the magnet links, descriptions, and comments. The hard part would probably be keeping it all updated... (Maybe we could use a git repository, and pull as necessary?)


Of course, we would have to make a few modifications to Gollum, such as getting it to support tags, categories, and other metadata essential to searching such a huge database. It shouldn't be too hard, though.
Magnet links are provided in the Pirate Bay Magnet Archive below, and descriptions and comments are in the siterip.


===Existing Archival Tools===
=== Archival Tools ===


* [http://kat.ph/the-pirate-bay-scraper-t2833005.html The Pirate Bay Scraper] - An old set of scripts that saves torrents and their metadata from the Pirate Bay into PHP websites, complete with a SQL database and bzipped torrents. It was instrumental to the creation of the [https://thepiratebay.se/torrent/5051418/Pirate_Bay_Archive_-_884004_torrents___MySQL_db___PHP_demo_site 2009 Pirate Bay Archive]
* [http://pastebin.com/8RXXthXB Magnet link Dumper]: A Perl script that dumps magnet links into a single text file. It was used to make the magnet archive below.
 
* [http://github.com/andronikov/tpb2csv tpb2csv]: scrapes the pirate bay website and strips out all the html crap, leaving only pure sweet metadata.
** details.csv: Title, Type, Files, Size, IMDB, Spoken Languages, Texted Languages, Tags, Quality (+), Quality (-), Uploaded, By, User Type, Seeders, Leechers, Info Hash, Picture, Capture Date
** description.txt
** comments.csv: User Type, Username, Date, Text
 
== Backups ==
 
* [https://archive.org/details/PirateBayComplete20130219 rich.xml.7z] 662MB 7z database dump from 2013-02-19
** torrent:urn:sha1:e4b6f847647211b930219492ecf1a9c7bc696d29
** '''Magnet link''': ''magnet:?xt=urn:btih:e4b6f847647211b930219492ecf1a9c7bc696d29''
 
* [http://thepiratebay.se/torrent/7016365/The_whole_Pirate_Bay_magnet_archive The entire Pirate Bay Magnet Archive]: Every magnet link on the Pirate Bay, all in a tiny little text file. No comments, though.
** '''Magnet link''': ''magnet:?xt=urn:btih:938802790a385c49307f34cca4d30f80b03df59c&dn=The+whole+Pirate+Bay+magnet+archive&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
* [https://thepiratebay.se/torrent/8156416 Updated (February 2013) Listing]
** '''Magnet Link''': ''magnet:?xt=urn:btih:277e1afa0038db7299cd8274310556526599f67c&dn=Small+pirate+bay+archive+%28february+2013%29&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.istole.it%3A6969&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
* [http://thepiratebay.se/torrent/7046494/Pirate_bay_Magnet_Archive_viewer Magnet Archive Viewer]: Parsing text files can be a pain, so this program makes it easy to search and look at the magnet links.
** '''Magnet Link''': ''magnet:?xt=urn:btih:f7a08a62a11ba6dfe39f1cd0b7e8a5a50d5379aa&dn=Pirate+bay+Magnet+Archive+viewer&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
* [http://thepiratebay.se/torrent/7028505/The_Pirate_Bay_full_siterip_2012 Siterip]: This 3GB archive saves all the HTML pages from the Pirate Bay, including comments. No torrent files, as usual.
** '''Magnet Link''': ''magnet:?xt=urn:btih:3ab8dd096aea63ddf668a127b81ba7fb6799364d&dn=The+Pirate+Bay+full+siterip+2012&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
* [https://thepiratebay.se/torrent/7706886/Backup_of_The_Pirate_Bay_%28IDs__3200000_-_7700000%29 IDs 3200000-7699999]: tpb2csv 1.23GB 7z Backup as of 2012-10-06. It includes all comments, filelists, and details in csv format.
** '''Magnet Link''': ''magnet:?xt=urn:btih:0dfe31d5d91058bcbe5cfbcf98646700890afea0&dn=Backup+of+The+Pirate+Bay+%28IDs%3A+3200000+-+7700000%29&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.istole.it%3A6969&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
* [https://thepiratebay.se/torrent/8044295/Backup_of_The_Pirate_Bay_%28IDs__7700000_-_7999999%29 IDs 7700000-7999999]: tpb2csv 69MB 7z Backup as of 2013-01-09. It includes all comments, filelists, and details in csv format.
** '''Magnet Link''': ''magnet:?xt=urn:btih:9f9c8cab8b68956a25d6e8c190e5e8dc8cf7186c&dn=Backup+of+The+Pirate+Bay+%28IDs%3A+7700000+-+7999999%29&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.istole.it%3A6969&tr=udp%3A%2F%2Ftracker.ccc.de%3A80''
 
 
* '''Pirate Bay Metadata Git Repo''': https://github.com/tpb-archive 596MB zip of tpb2csv-scraped metadata, including comments, file lists, descriptions, and details. The link is now broken (GitHub censorship?).
** [https://github.com/tpb-archive/8xxxxxx IDs 8000000-8999999] tpb2csv scraped metadata not included by "Backup" torrents listed above, fetched on 2013-06-02
 
* TPB used to make official CSV dumps available under {{URL|https://thepiratebay.org/static/dump/}}, but these seem to have disappeared sometime between early March and mid-May 2020.
 
At the end of 2014, there were roughly 11,000,000 torrent IDs.
 
== FAQ ==
 
=== Is there a complete backup before The Pirate Bay was shut down? ===
 
We don't know. If we did, it would already be listed on this page.
 
=== I have incomplete/complete archives. What do I do? ===
 
Upload them to the Internet Archive and add a link to the list. If you can't upload them yourself, ask someone to do it for you.
 
=== What's with the IsoHunt Old Pirate Bay? ===
 
We don't know. However, you can read the FAQ on the [[IsoHunt]] wiki page.
 
== References ==
<references/>


{{Navigation box}}
{{Navigation box}}
[[Category:Peer to peer]]

Latest revision as of 16:44, 7 September 2020

The Pirate Bay
URL http://www.thepiratebay.org/
Status Online!
Archiving status Partially saved, complete grab Upcoming...
Archiving type Unknown
IRC channel #yarharfiddlededee (on hackint)

The Pirate Bay is one of the largest and most popular torrent search engines.

It's still having persistent legal problems. The tracker went down in November 2012, but the site still serves torrents and magnet links. If a torrent is lost, it becomes impossible to connect to other computers distributing the shared files. Considering that there are links to TPB all over this wiki, this site is pretty dang important.

In December 2014, the website went offline due to an alleged raid[1]. A few weeks later, the site was updated with a banner page and a mysterious key string[2], later revealed to be a cryptographic puzzle with the solution "I'll be back"[3][4]. The website came back online on February 1, 2015[5].

In case of Fire

To prevent damage to the Archive Team if The Pirate Bay ever goes down, we should include a Magnet Link next to every TPB link we have.

Archival Methods

We can simply scrape the magnet links, descriptions, and comments. The hard part would probably be keeping it all updated... (Maybe we could use a git repository, and pull as necessary?)

Magnet links are provided in the Pirate Bay Magnet Archive below, and descriptions and comments are in the siterip.
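As a rough illustration of the scraping step described above, here is a minimal Python sketch that pulls magnet links out of a saved HTML page (for example, one page from the siterip). The page fragment, the class name MagnetLinkParser, and the function name extract_magnets are all made up for this example; real Pirate Bay markup may differ, so treat this as a starting point rather than a working scraper.

```python
from html.parser import HTMLParser


class MagnetLinkParser(HTMLParser):
    """Collect every <a href> value that starts with 'magnet:?'."""

    def __init__(self):
        super().__init__()
        self.magnets = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # HTMLParser has already unescaped entities such as &amp; here.
            if name == "href" and value and value.startswith("magnet:?"):
                self.magnets.append(value)


def extract_magnets(html_text):
    """Return all magnet links found in one HTML document, in page order."""
    parser = MagnetLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.magnets


# Hypothetical page fragment (an assumption, not copied from the real site).
SAMPLE_PAGE = """
<div id="details">
  <a href="/user/example">example</a>
  <a href="magnet:?xt=urn:btih:938802790a385c49307f34cca4d30f80b03df59c&amp;dn=The+whole+Pirate+Bay+magnet+archive">Get this torrent</a>
</div>
"""

if __name__ == "__main__":
    for link in extract_magnets(SAMPLE_PAGE):
        print(link)
```

Only the standard library is used, so the same function could be run over every HTML file in an unpacked siterip without extra dependencies.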

Archival Tools

  • Magnet link Dumper: A Perl script that dumps magnet links into a single text file. It was used to make the magnet archive below.
  • tpb2csv: scrapes the pirate bay website and strips out all the html crap, leaving only pure sweet metadata.
    • details.csv: Title, Type, Files, Size, IMDB, Spoken Languages, Texted Languages, Tags, Quality (+), Quality (-), Uploaded, By, User Type, Seeders, Leechers, Info Hash, Picture, Capture Date
    • description.txt
    • comments.csv: User Type, Username, Date, Text
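Since details.csv carries an Info Hash column, a magnet link can in principle be rebuilt from the metadata alone. The sketch below assumes the CSV has a header row whose column names match the list above (Title, Info Hash); that layout is an assumption and has not been checked against actual tpb2csv output. The function names and the sample data are hypothetical.

```python
import csv
import io
from urllib.parse import quote_plus

HEX_DIGITS = set("0123456789abcdef")


def magnet_from_row(row):
    """Build a tracker-less magnet URI from one details.csv row (a dict)."""
    info_hash = row["Info Hash"].strip().lower()
    if len(info_hash) != 40 or not set(info_hash) <= HEX_DIGITS:
        raise ValueError(f"not a 40-character hex info hash: {info_hash!r}")
    # 'dn' is the optional display name; quote_plus encodes spaces as '+'.
    return f"magnet:?xt=urn:btih:{info_hash}&dn={quote_plus(row['Title'])}"


def magnets_from_csv(text):
    """Read CSV text with a header row and return one magnet URI per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [magnet_from_row(row) for row in reader]


# Hypothetical two-column excerpt; a real details.csv has many more fields.
SAMPLE_CSV = (
    "Title,Info Hash\n"
    "The whole Pirate Bay magnet archive,938802790A385C49307F34CCA4D30F80B03DF59C\n"
)

if __name__ == "__main__":
    for magnet in magnets_from_csv(SAMPLE_CSV):
        print(magnet)
```

The resulting links carry no tracker (tr) parameters, so a client would have to find peers through DHT or a tracker added by hand.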

Backups


  • Pirate Bay Metadata Git Repo: https://github.com/tpb-archive 596MB zip of tpb2csv-scraped metadata, including comments, file lists, descriptions, and details. The link is now broken (GitHub censorship?).
    • IDs 8000000-8999999 tpb2csv scraped metadata not included by "Backup" torrents listed above, fetched on 2013-06-02

At the end of 2014, there were roughly 11,000,000 torrent IDs.
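For anyone processing the magnet archives, it may help to split a magnet link into its parts. The following Python sketch uses only the standard library; the function name parse_magnet is made up for this example, and the sample URI is the Magnet Archive link recorded in this page's wikitext.

```python
from urllib.parse import parse_qs, urlsplit

BTIH_PREFIX = "urn:btih:"


def parse_magnet(uri):
    """Split a magnet URI into info hash, display name, and tracker list."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    # parse_qs percent-decodes values and turns '+' into spaces.
    params = parse_qs(parts.query)
    info_hash = None
    for xt in params.get("xt", []):
        if xt.startswith(BTIH_PREFIX):
            info_hash = xt[len(BTIH_PREFIX):].lower()
            break
    return {
        "info_hash": info_hash,
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }


EXAMPLE = (
    "magnet:?xt=urn:btih:938802790a385c49307f34cca4d30f80b03df59c"
    "&dn=The+whole+Pirate+Bay+magnet+archive"
    "&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80"
    "&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80"
    "&tr=udp%3A%2F%2Ftracker.ccc.de%3A80"
)

if __name__ == "__main__":
    info = parse_magnet(EXAMPLE)
    print(info["info_hash"])
    print(info["name"])
    for tracker in info["trackers"]:
        print(tracker)
```

The info hash is the only part needed to identify the torrent; the name and trackers are conveniences that can go stale.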

FAQ

Is there a complete backup before The Pirate Bay was shut down?

We don't know. If we did, it would already be listed on this page.

I have incomplete/complete archives. What do I do?

Upload them to the Internet Archive and add a link to the list. If you can't upload them yourself, ask someone to do it for you.

What's with the IsoHunt Old Pirate Bay?

We don't know. However, you can read the FAQ on the IsoHunt wiki page.

References