4shared
| 4shared | |
| URL | http://4shared.com |
| Status | Unknown |
| Archiving status | Unknown |
| Archiving type | Unknown |
| IRC channel | #archiveteam-bs (on hackint) |
4shared is a file-sharing service. It does not seem to have an inactivity time bomb, as files from 2009 are still available. Downloading will not be simple: the site enforces a 60-second wait in JavaScript that seems to be bypassable only with a paid subscription, and it appears to require a login in the first place. NetSurf, a browser without JavaScript support, does not bypass the time limit. Neither user agents such as Googlebot nor reverse-engineering the JavaScript have been tested yet, at least by User:TheTechRobo, who plans to try this. The service does not seem to have a relationship with 4chan.
There appears to be a sitemap at https://www.4shared.com/web/sitemap.xml, which should come in handy when archiving. However, this sitemap is incomplete. As of October 2021, it lists about 62.4 million files, whereas https://blog.4shared.com/infographic-4shared-2020-review/ reports 193 million uploaded files as of December 2020, so the sitemap covers only about a third of that figure. The statistics box on the blog page further claims, as of October 2021, that the hosted files total 940 TB, but the same figure was already shown in 2010, so it is probably not being updated.
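Working through the sitemap could start with something like the following minimal sketch. It assumes the sitemap follows the standard sitemaps.org schema and may be either a sitemap index (pointing to further sitemap files) or a plain URL set; the actual layout of the 4shared sitemap has not been verified here. The function name `parse_sitemap` and the sample document are made up for illustration, and fetching the XML is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def parse_sitemap(xml_bytes):
    """Return (kind, locs): kind is 'index' or 'urlset', locs is a list of URLs."""
    root = ET.fromstring(xml_bytes)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    # Both sitemap indexes and URL sets keep their URLs in <loc> elements.
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, locs

# Hypothetical sample; the real sitemap may be structured differently.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://www.example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>"""

kind, locs = parse_sitemap(SAMPLE)
print(kind, len(locs))
```

If the result is an index, each listed URL would have to be fetched and parsed the same way; counting the collected entries gives a figure to compare against the numbers quoted above.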
While logged in, TheTechRobo has found a few things.
- The download link is stored in the DOM as an input element with the ID "baseDownloadLink".
- baseDownloadLink seems to be used in d2Script.js, on line 96.
- It looks like cookies are still needed when requesting the download link, as I see nothing else special about the request. I might be missing something, though.
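The findings above could be turned into a small extraction step, sketched below. It assumes the link is held in the `value` attribute of the `baseDownloadLink` input element, which is a guess and not confirmed by the notes; the class name, the helper `find_base_download_link`, and the sample page are all hypothetical. Logging in and sending the session cookies with the follow-up request are not shown.

```python
from html.parser import HTMLParser

class BaseDownloadLinkParser(HTMLParser):
    """Collects the value of the first <input id="baseDownloadLink"> element."""

    def __init__(self):
        super().__init__()
        self.link = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("id") == "baseDownloadLink" and self.link is None:
            # Assumption: the download URL is stored in the "value" attribute.
            self.link = attrs.get("value")

def find_base_download_link(html):
    """Return the stored download link, or None if the element is absent."""
    parser = BaseDownloadLinkParser()
    parser.feed(html)
    return parser.link

# Hypothetical page fragment; the real markup may differ.
SAMPLE_PAGE = (
    '<html><body>'
    '<input type="hidden" id="baseDownloadLink" value="https://example.invalid/download/abc">'
    '</body></html>'
)

print(find_base_download_link(SAMPLE_PAGE))
```

The page would need to be fetched within a logged-in session, and the same cookies would presumably have to accompany the request to the extracted link.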