Difference between revisions of "4shared"

From Archiveteam
(Add a quick analysis of 4shared.)
 
(Add stuff that I just found out through experimentation.)
 
(4 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Infobox project
| title = 4shared
| URL = http://4shared.com
}}


'''4shared''' is a file-sharing service. It does not seem to have an inactivity time bomb, as {{URL|https://www.4shared.com/file/119613160/34b859ec/_3__StartPatch41_USA.html|files from 2009}} are still there. Downloading will not be simple, as it uses a JavaScripty 60-second limit that seems to be bypassable only with a paid subscription. In addition, it seems to require login in the first place. NetSurf, a browser without JavaScript, '''does not bypass the time limit'''. User agents such as Googlebot have not been tested yet, and the JavaScript has not been reverse-engineered, at least by [[User:TheTechRobo]], but he plans to try this. It doesn't ''seem'' to have a relationship with 4chan.


'''4shared''' is a file-sharing service. It does not seem to have an inactivity time bomb, as {{URL|https://www.4shared.com/file/119613160/34b859ec/_3__StartPatch41_USA.html|files from 2009}} are still there. Downloading will not be simple, as it uses a JavaScripty 60-second limit that seems to only be bypassable with a paid subscription. It has not been further tested (mobile version, noscript version, Googlebot?, etc...), at least by [[User:TheTechRobo]]. A more thorough analysis should be performed by him soon. It doesn't ''seem'' to have a relationship with 4chan.
There appears to be a sitemap at https://www.4shared.com/web/sitemap.xml. This should come in handy when archiving. However, this sitemap is incomplete. As of 2021-10-31, it lists about 62.4 million files. According to {{URL|https://blog.4shared.com/infographic-4shared-2020-review/}}, there were 193 million uploaded files as of December 2020. The statistics box on the blog page further claims that the total size of the hosted files is 940 TB as of 2021-10-31, but the same figure [https://web.archive.org/web/20101223183944/https://blog.4shared.com/about/ was already there in 2010].
 
While logged in, TheTechRobo has found a few things.
 
* The download link is stored in the DOM as an input element with the ID "baseDownloadLink".
* baseDownloadLink seems to be used in d2Script.js, on line 96.
* <!--The baseDownloadLink seems to be inaccessible by direct URL;--> It looks like you still need cookies when requesting the download link, as I see nothing else special about the request. I might be missing something though.

Latest revision as of 20:27, 22 November 2021

4shared
URL http://4shared.com
Status Unknown
Archiving status Unknown
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

4shared is a file-sharing service. It does not seem to have an inactivity time bomb, as files from 2009 are still there. Downloading will not be simple, as it uses a JavaScripty 60-second limit that seems to be bypassable only with a paid subscription. In addition, it seems to require login in the first place. NetSurf, a browser without JavaScript, does not bypass the time limit. User agents such as Googlebot have not been tested yet, and the JavaScript has not been reverse-engineered, at least by User:TheTechRobo, but he plans to try this. It doesn't seem to have a relationship with 4chan.

There appears to be a sitemap at https://www.4shared.com/web/sitemap.xml. This should come in handy when archiving. However, this sitemap is incomplete. As of 2021-10-31, it lists about 62.4 million files. According to https://blog.4shared.com/infographic-4shared-2020-review/, there were 193 million uploaded files as of December 2020. The statistics box on the blog page further claims that the total size of the hosted files is 940 TB as of 2021-10-31, but the same figure was already there in 2010.
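
For anyone wanting to re-check the sitemap count, here is a rough sketch of how the entries in one sitemap document could be listed. It assumes the file follows the standard sitemaps.org format; whether 4shared's sitemap.xml is a sitemap index or a plain URL list has not been verified here, so the sketch handles both. The function name is made up for illustration, and fetching the file is left out on purpose.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol. That 4shared's sitemap
# uses it is an assumption, not something confirmed on this page.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SITEMAP_NS}


def list_sitemap_locations(xml_data):
    """Return (kind, locations) for one sitemap document.

    kind is "index" when the document is a <sitemapindex> (locations are
    child sitemaps) and "urlset" when it is a <urlset> (locations are
    page URLs). Any other root element raises ValueError.
    """
    root = ET.fromstring(xml_data)
    if root.tag == "{%s}sitemapindex" % SITEMAP_NS:
        kind, path = "index", "sm:sitemap/sm:loc"
    elif root.tag == "{%s}urlset" % SITEMAP_NS:
        kind, path = "urlset", "sm:url/sm:loc"
    else:
        raise ValueError("unexpected root element: " + root.tag)
    # Collect the text of every <loc>, skipping empty ones.
    locations = [loc.text.strip() for loc in root.iterfind(path, NS) if loc.text]
    return kind, locations
```

If the top-level file turns out to be an index, each returned location would have to be downloaded and passed through the same function, summing the lengths of the "urlset" results to get a total.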

While logged in, TheTechRobo has found a few things.

  • The download link is stored in the DOM as an input element with the ID "baseDownloadLink".
  • baseDownloadLink seems to be used in d2Script.js, on line 96.
  • It looks like you still need cookies when requesting the download link, as I see nothing else special about the request. I might be missing something though.
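
The findings above could be turned into something like the following sketch, which pulls the input element with the ID "baseDownloadLink" out of an already-downloaded file page. It assumes the link sits in the element's value attribute, which has not been confirmed on this page; the class and function names are made up for illustration. Fetching the page (while logged in) and requesting the link with the session cookies are not shown.

```python
from html.parser import HTMLParser


class BaseDownloadLinkParser(HTMLParser):
    """Remember the first <input id="baseDownloadLink"> seen in a page."""

    def __init__(self):
        super().__init__()
        self.link = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching input element is of interest.
        if tag != "input" or self.link is not None:
            return
        attributes = dict(attrs)
        if attributes.get("id") == "baseDownloadLink":
            # Assumption: the URL is stored in the "value" attribute.
            self.link = attributes.get("value")


def find_base_download_link(html_text):
    """Return the baseDownloadLink value from page HTML, or None."""
    parser = BaseDownloadLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.link
```

Since the request for the link apparently still needs cookies, whatever HTTP client is used for the actual download would have to carry over the cookies from the logged-in session.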