Search results

  • ...dn't be able to complete. ''Note: The original data was not lost, just the compressed files.'' In total, this was less than 300mb of files, but one of them was p ... A new set of files are being compressed to replace these broken files, and are currently being hosted [http://www.a
    967 bytes (146 words) - 02:46, 5 December 2017
  • ...ile|ia-bak-census_20150304|metamgr-norm-ids-20150304205357.txt.gz}}(135.7M compressed; 372M uncompressed). It contains 14,926,080 item identifiers (including exa ...0304|public-file-size-md_20150304205357.json.gz}} (6073671780 bytes (5.7G) compressed; 22522862598 bytes (21G) uncompressed). It contains one item without any id
    10 KB (1,460 words) - 00:27, 5 December 2017
  • Partially saved, 223 MB compressed WARC on [https://archive.org/details/warc-notepad_cc archive.org]. There's
    2 KB (249 words) - 18:58, 31 October 2021
  • ...s/files.myopera.com-initialgrab Initial grab of files.myopera.com], 6.2 GB compressed. ...chive.org/details/my.opera.com-wallpapers-20140115 Wallpaper grab], 1.7 GB compressed.
    8 KB (1,285 words) - 07:37, 8 January 2022
  • The dataset consists of 125 compressed text files, each containing the data ... The dataset consists of one compressed text file.
    5 KB (860 words) - 16:06, 17 January 2017
  • ...ndCommentsDump Dump of Hacker News stories and comments up to 2014-05-29] (compressed JSON files), made through one of HN's official API. Total: ~10.7GB uncompre
    3 KB (369 words) - 16:27, 10 September 2017
  • Size of archives (compressed): 2.4 TB. The [[WARC]] files can be found in the {{IA item|archiveteam_frie ...'t share its content without the user's consent!) Upload them – preferably compressed – to an item (or several items) to the Internet Archive, add metadata (de
    5 KB (843 words) - 01:09, 18 November 2021
  • ...ter.s3.amazonaws.com_20140616_partial Rawporter.com Panic Download] 27.8GB compressed (all images, some videos, in a WARC file)
    3 KB (435 words) - 21:40, 28 June 2015
  • | 1.5GB compressed
    5 KB (605 words) - 13:09, 24 November 2023
  • ...ILES.COM Time Capsule] - 17024875023 bytes uncompressed, 11591761683 bytes compressed, 285552 files, 5608 folders)
    2 KB (389 words) - 16:07, 17 January 2017
  • ...2013-07-20/urlteam.torrent URL Shortener Backup Torrent v4] || [[URLTeam]] compressed backups of various URL shorteners ([http://urlte.am/releases/2013-07-20/REA ...nt URL Shortener Backup Torrent v3] '''outdated, use v4''' || [[URLTeam]] compressed backups of various URL shorteners ([http://urlte.am/releases/2013-01-02/REA
    12 KB (1,777 words) - 07:01, 10 December 2023
  • ...rs to data in the half to a few gigabytes range; CD-ROMs, DVD-ROMs, highly compressed movies, WARCs of small to medium websites, etc. This is 1000 gigabytes; a pile of losslessly compressed movies, WARCs of large (or very inefficient) websites, a library of printed
    6 KB (1,088 words) - 09:45, 27 April 2019
  • * All quotes should be compressed into an archive
    5 KB (738 words) - 03:09, 12 August 2023
  • ...ll size. When decompressed, however, the XML document points at ~10k other compressed XML sitemap documents, enumerating all photos on the site with limited meta
    4 KB (556 words) - 02:02, 15 November 2021
  • ...issued & uploaded this page will be updated. 31 million items totalling a compressed size of 648GB was captured (the complete bucket).</blockquote>
    4 KB (576 words) - 13:08, 24 April 2022
  • Data is compressed before it is uploaded. The ratio depends on the project, but the upload sho ... Media heavy sites will typically be more equal, while text is compressed efficiently.
    9 KB (1,535 words) - 16:39, 14 February 2024
  • ...he (at least one) .gz file, which contains the actual drawing data in zlib-compressed AMF3 format. The .xml only exists if the drawing was made in multiple sessi
    5 KB (808 words) - 06:34, 15 November 2021
  • ...ve.org/details/ubuntuone-panicgrab-20140405 Preliminary Bing crawl], 2.9GB compressed
    4 KB (742 words) - 18:50, 31 October 2021
  • ...what they wish. Site operators may release users' text contributions as a compressed archive, since text compresses efficiently, and may send users who have pro
    4 KB (683 words) - 12:01, 31 December 2022
  • ...finalized in May 2011, underscor provided a server to upload the 40 GB of compressed data files that had been collected. The upload finished on May 31st, and un ... Previously all data was stored in sorted and unsorted text files, often compressed to save space. Using the name tinyarchive, soultcer created some tools to m
    12 KB (1,942 words) - 23:04, 4 December 2017
  • ...n URLs across the various domains were grabbed, resulting in 102.6 GiB of (compressed) WARCs.
    6 KB (1,017 words) - 23:41, 29 December 2023
  • ...an be exported with the "Google Takeout" interface which sends a series of compressed archives with data from the various services. It's not always reliable.<ref
    11 KB (1,629 words) - 06:01, 25 November 2023
  • ...ch record is compressed via gzip. A gzip file supports multiple "members"; compressed warcs end in .warc.gz. According to the guidelines, WARC files should top o
    18 KB (2,481 words) - 01:00, 24 March 2024
  • ...ML text, but doesn't help at all when downloading material that is already compressed, like JPEG or PNG files. To enable compression, use:
    7 KB (1,114 words) - 16:27, 17 January 2017
  • ...| bigint(20) | NO | | NULL | | (length of the (compressed) individual record)
    13 KB (1,827 words) - 16:45, 14 November 2021
  • ...13_common_crawl_index_urls Common Crawl index] is a very big (21 gigabytes compressed) list of URLs in the Common Crawl corpus. Grepping this list may well revea
    9 KB (1,436 words) - 02:35, 18 September 2023
  • | 2,200,001 || 2,300,000 || '''Uploaded''' || 50gb compressed || Darkstar ... | 2,300,001 || 2,400,000 || '''Uploaded''' || 70gb compressed || Darkstar
    54 KB (6,859 words) - 16:44, 14 November 2021
  • The file is a tar archive compressed with [http://tukaani.org/xz/ `xz(1)`] from 674MB to 39MB. It contains the c
    12 KB (1,788 words) - 20:15, 14 March 2021
  • ! Archive Name !! Archive Type !! Size (Compressed) !! Size (Uncompressed) !! # of Profiles !! Volunteer
    10 KB (1,143 words) - 01:09, 15 November 2021
  • ...9/8c0e7aae4607412f82bf4a7a4486fe36/fat.jpg~tplv-banciyuan-obj.image is the compressed version of <!-- Referer ACL is enabled on img5, so don't make it a hyperlin
    20 KB (2,985 words) - 21:02, 16 July 2023
  • project pages and random other files wget got. Size: 400 mb compressed.
    14 KB (2,057 words) - 01:47, 11 November 2018
  • ...cly available Reddit comment for research. ~ 1.7 billion comments @ 250 GB compressed. Any interest in this?]
    18 KB (2,797 words) - 21:11, 11 September 2023
  • ...es, with the largest ones being a few tens (less than 100) megabytes (WARC compressed). Note that this is a rough estimate with a small sample. (That would mean
    22 KB (3,273 words) - 00:34, 5 December 2017
  • ...in our torrents too, just in a different format (we use pipe-delimited, xz-compressed files while 301works uses comma-delimited uncompressed files). ... | divided up into 3,835 files in the last old-style dump, totaling 39 GB (compressed!). Also worked on as a Warrior job, see below.
    82 KB (13,434 words) - 16:01, 25 April 2024
  • ...the original video files in (semi-)offline storage, and store transcoded (compressed) versions on the Internet Archive.
    32 KB (4,950 words) - 22:40, 30 October 2023
  • ...it. So we put ourselves up on The Pirate Bay, we have a 641GB - because it compressed well - torrent, with 7,854 files that were basically 7zs, and we put that s
    41 KB (7,606 words) - 02:37, 12 December 2017
  • ...tation archive is available at {{IA collection|youtubeannotations}}, and a compressed copy can be found at {{IA item|youtubeannotations.tar.zstd}}. 16GB of just
    53 KB (7,698 words) - 07:32, 26 March 2024
  • we have a 641GB - because it compressed well
    64 KB (8,282 words) - 04:09, 25 June 2015