4shared

From Archiveteam

4shared is a file-sharing service. It does not seem to have an inactivity time bomb, as files from 2009 are still there. Downloading will not be simple: the site enforces a JavaScript-driven 60-second limit that seems to be bypassable only with a paid subscription, and it also seems to require a login in the first place. NetSurf, a browser without JavaScript support, does not bypass the time limit. User agents such as Googlebot have not been tested yet, nor has the JavaScript been reverse-engineered, at least not by User:TheTechRobo, who plans to try this. The service doesn't seem to have any relationship with 4chan.

There appears to be a sitemap at https://www.4shared.com/web/sitemap.xml. This should come in handy when archiving, but the sitemap is incomplete: as of October 2021, it lists about 62.4 million files, whereas https://blog.4shared.com/infographic-4shared-2020-review/ reports 193 million uploaded files as of December 2020. The statistics box on the blog page further claims, as of October 2021, that the hosted files total 940 TB, but the same figure was already displayed there in 2010, so it is presumably not being updated.
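
As a starting point for working with the sitemap, here is a minimal sketch that pulls the `<loc>` entries out of a sitemap document. It assumes the file follows the standard sitemaps.org schema (either a `sitemapindex` or a `urlset`); whether 4shared's sitemap actually does so has not been verified here. The function name `extract_locs` and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemaps.org schema (assumed, not verified
# against the real 4shared sitemap).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_data):
    """Return (root_kind, locs) for a sitemap document.

    root_kind is the root element name without its namespace, e.g.
    "sitemapindex" (locs point to child sitemaps) or "urlset" (locs
    point to individual pages).
    """
    root = ET.fromstring(xml_data)
    kind = root.tag.replace(SITEMAP_NS, "")
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, locs


# Hypothetical sample data; the URLs are placeholders, not real 4shared paths.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.invalid/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://example.invalid/sitemap-2.xml</loc></sitemap>
</sitemapindex>"""

kind, locs = extract_locs(SAMPLE_INDEX)
print(kind, len(locs))
```

If the top-level file turns out to be an index, the same function could be applied to each child sitemap in turn and the `urlset` entries summed to re-check the file count.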

While logged in, TheTechRobo found a few things:

  • The download link is stored in the DOM as an input element with the ID "baseDownloadLink".
  • baseDownloadLink seems to be used in d2Script.js, on line 96.
  • It looks like cookies are still needed when requesting the download link, as nothing else about the request appears special, though something may have been missed.
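
The findings above could be turned into a rough extraction step like the sketch below. It makes several assumptions: that the link sits in the input element's `value` attribute, that a saved copy of the logged-in page's HTML is available, and that the session can be replayed by sending a `Cookie` header. The helper names (`find_base_download_link`, `build_download_request`) and the sample markup are hypothetical, and nothing here addresses the 60-second limit.

```python
from html.parser import HTMLParser
from urllib.request import Request


class _BaseLinkParser(HTMLParser):
    """Records the value of the first <input id="baseDownloadLink">."""

    def __init__(self):
        super().__init__()
        self.link = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if (
            self.link is None
            and tag == "input"
            and attr_map.get("id") == "baseDownloadLink"
        ):
            # Assumption: the link is kept in the "value" attribute.
            self.link = attr_map.get("value")


def find_base_download_link(page_html):
    """Return the baseDownloadLink value from page HTML, or None."""
    parser = _BaseLinkParser()
    parser.feed(page_html)
    parser.close()
    return parser.link


def build_download_request(link, cookie_header):
    """Build (but do not send) a request that carries the session cookies."""
    return Request(link, headers={"Cookie": cookie_header})


# Hypothetical markup and cookie; real pages and cookie names may differ.
SAMPLE_PAGE = (
    '<form><input type="hidden" id="baseDownloadLink" '
    'value="https://example.invalid/download/abc?x=1&amp;y=2"></form>'
)

link = find_base_download_link(SAMPLE_PAGE)
request = build_download_request(link, "session=PLACEHOLDER")
print(request.full_url)
```

The request is only constructed, not sent, so the sketch can be tried offline against saved pages before pointing anything at the live site.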