Difference between revisions of "Valhalla"

From Archiveteam

Revision as of 22:46, 18 September 2014

This wiki page is a collection of ideas for Project Valhalla.

<SketchCow> Basically, we have this situation where we have stuff that is being threatened,
and it's huge, and then it's either not so threatened or it's in a weird quantum state.
So, this really stretches the bounds of what IA does. It's a huge amount of data, it's not likely 
to be overly touched if the originals are up, and IA will spend/lose a lot of money pulling it into their infrastructure.
So maybe we can discuss actual, not pie-in-the-sky possibilities of what we can do to have some sort of not-IA pile of storage.

Join the discussion in #huntinggrounds.

Options

Storage type | Cost ($/TB/year) | Theoretical lifespan | Practical, tested lifespan | Notes
Hard drives | | | | These would have to be live. HDDs decay quickly, and if they're not spinning, you can't detect failures. Possible software for this kind of thing: syncthing, Tahoe-LAFS, ...?
Commercial / archival-grade tapes | | | |
Consumer tape systems (VHS, Betamax, cassette tapes, ...) | | | |
Vinyl | | | |
PaperBack | | | |
Optar | | | |
Blu-Ray | | | | Lasts a LOT longer than CD/DVD, but should not be assumed to last more than a decade. Raidz3 with Blu-rays: doing a backup in groups of 15 disks.
M-DISC | | | | Unproven technology, but potentially interesting.
Flash media | | | | Wears out quickly, not-so-good long term storage. Soliciting donations for old flash media from people, or sponsorship from flash companies?
Glass/metal etching | | | |

Blu-Ray:

<Drevkevac> still, if its true, you could do, perhaps, raidz3s in groups of 15 disks or so?
<SketchCow> Please add paperbak to the wiki page.
<SketchCow> Fuck Optical Media. not an option;.
<Drevkevac> that would give you ~300GB per disk group, with 3 disks

Non-options

  • Dye-based Consumer Optical Media (CDs, DVDs, etc.)
    • Differences between Blu-Ray and DVD? DVDs do not last very long.
  • BitTorrent Sync
    • Proprietary (currently), so not a good idea to use as an archival format/platform
  • Amazon Glacier
    • Amazon Glacier seems like a great idea, until you realize they mean 1 cent per gigabyte per month. This is $120 per terabyte per year. The transfer out of 100TB would also run over $10,000 in the month it's pulled from the system.
  • Floppies
    • "Because 1.4 trillion floppies exists less than 700 billion floppies. HYPOTHETICALLY, if you set twenty stacks side by side, figure a quarter centimeter per floppy thickness, excluded the size of the drive needed to read the floppies you would still need a structure 175,000 ft. high to house them. Let's also assume that the failure rate for floppies is about 5% (everyone knows that varies by brand, usage, time of manufacture, materials used, etc, but lets say 5% per year). 70 million of those 1.4 trillion floppies are unusuable. Figuring 1.4 MB per floppy disk, you are losing approximately 100MB of porn each year. Assuming it takes 5 seconds to replace a bad floppy, you would have to spend 97,222 hrs/yr to replace them. Considering there are only 8,760 hrs per year, you would require a staff of 12 people replacing floppies around the clock or 24 people on 12 hr shifts. Figuring $7/hr you would spend $367,920 on labor alone. Figuring a nickel per bad floppy, you would need $3,500,000 annually in floppy disks, bringing your 1TB floppy raid operating costs (excluding electricity, etc) to $3,867, 920 and a whole landfill of corrupted porn. Thank you for destroying the planet and bankrupting a small country with your floppy based porn RAID." (source)
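
The Amazon Glacier numbers in the list above can be checked with a little arithmetic. This is a rough sketch, not a pricing tool: it assumes decimal units (1 TB = 1000 GB), and the function names are made up for illustration. The page does not quote a per-GB retrieval rate, so the second function only derives the rate that the "over $10,000 for 100TB" figure would imply.

```python
# Assumption (not from the page): decimal units, 1 TB = 1000 GB.
GB_PER_TB = 1000
MONTHS_PER_YEAR = 12


def yearly_storage_cost_per_tb(price_per_gb_month: float) -> float:
    """Convert a $/GB/month storage price into $/TB/year."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR


def implied_rate_per_gb(total_cost: float, terabytes: float) -> float:
    """Back out the $/GB rate implied by a total bill for a given volume."""
    return total_cost / (terabytes * GB_PER_TB)


if __name__ == "__main__":
    # 1 cent per GB per month, as stated on the page.
    storage = yearly_storage_cost_per_tb(0.01)
    print(f"storage: about ${storage:.0f} per TB per year")

    # "Over $10,000" to pull 100TB implies at least this much per GB.
    rate = implied_rate_per_gb(10_000, 100)
    print(f"implied retrieval rate: at least ${rate:.2f} per GB")
```

At 1 cent per GB per month this gives the $120 per terabyte per year stated above, and the retrieval figure corresponds to an effective rate of at least 10 cents per GB.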

From IRC

<Drevkevac> we are looking to store 100TB+ of media offline for 25+ years
<Drevkevac> if anyone wants to drop in, I will pastebin the chat log
<rat> DVDR and BR-R are not high volume. When you have massive amounts of data, raid arrays have too many points of failure.
<rat> Drevkevac: I work in a tv studio. We have 30+ years worth of tapes. And all of them are still good.
<rat> find a hard drive from 30 years ago and see how well it hooks up ;)
<brousch_> 1500 Taiyo Yuden Gold CD-Rs http://www.mediasupply.com/taiyo-yuden-gold-cd-rs.html

Costs

These are just estimates. Calculation: $/TB = Total Cost / Total Capacity

Purpose | Cost (USD per TB)
Tape media | $36.4
Hard drives | $43
BD-R (1) | ~$110

(1) Assuming ~$2 per disk, 25 GB per disk, and raidz3 with 15 disks per array (3 of the 15 hold parity, leaving 12 for data).
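
The BD-R estimate can be reproduced from the $/TB = Total Cost / Total Capacity formula. The sketch below is only an illustration under the footnote's assumptions ($2 per disk, 25 GB per disk, 15 disks per raidz3 group with 3 parity disks). The function name and the `binary_units` switch are hypothetical, and filesystem overhead is ignored, so real usable capacity would be somewhat lower.

```python
def raidz3_cost_per_tb(
    disks_per_group: int = 15,
    parity_disks: int = 3,        # raidz3 tolerates three failed members
    disk_gb: float = 25.0,        # single-layer BD-R, decimal gigabytes
    price_per_disk: float = 2.0,  # USD, the footnote's estimate
    binary_units: bool = False,   # True: report per TiB (2**40 bytes)
) -> float:
    """Cost per usable terabyte for one raidz3 group of optical disks.

    Simplification: usable capacity is (members - parity) * disk size,
    with no allowance for filesystem metadata or padding.
    """
    data_disks = disks_per_group - parity_disks
    usable_bytes = data_disks * disk_gb * 1e9
    bytes_per_tb = 2**40 if binary_units else 1e12
    usable_tb = usable_bytes / bytes_per_tb
    total_cost = disks_per_group * price_per_disk
    return total_cost / usable_tb


if __name__ == "__main__":
    print(f"decimal TB: ${raidz3_cost_per_tb():.2f} per TB")
    print(f"binary TiB: ${raidz3_cost_per_tb(binary_units=True):.2f} per TiB")
```

With decimal terabytes this comes out at roughly $100 per TB (12 data disks give the ~300GB per group mentioned in the chat), and with binary terabytes roughly $110, which may be how the table's figure was reached. That reading is a guess, not something the page states.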