From Archiveteam
Revision as of 14:55, 22 January 2014 by Sfan00 IMG (talk | contribs) (Draft)

USENET is a distributed collection of assorted discussion groups (newsgroups), similar in spirit to mailing lists, accessed via NNTP (the Network News Transfer Protocol).

Currently the major archive of this important forum is Google Groups...

However, this raises some concerns.

1. Google could pull Groups at its whim, with no clear policy for donating the data to other archives.
2. Despite Google's claims not to routinely monitor the service, postings are removed or suppressed for various reasons.
3. Google currently offers little to distinguish USENET from its own proprietary groups.
4. The interface used for Groups has issues with some browsers, and accessing text versions of postings is an involved process.

Therefore there should be an alternative.

It is suggested that Archiveteam members begin an effort to build an archive parallel to Google Groups.

Such an alternative should offer a credible search facility, with indexing by header fields and over date ranges, broadly similar to that offered by the Groups UI. The search could also go beyond the current Google offering by allowing grep-style expressions (subject to appropriate limits on resource usage).
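As a rough illustration of the kind of search described above, the following sketch filters raw article texts by a header field, a date range, and a grep-style regular expression. It is only a sketch under stated assumptions: the function names `search` and `posted_at`, the two limits, and the choice to scan articles linearly rather than use a real index are all hypothetical, and articles are assumed to be single-part plain text.

```python
import re
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumed resource limits; the proposal does not specify any values.
MAX_PATTERN_LEN = 200
MAX_RESULTS = 100


def posted_at(msg):
    """Return the article's Date header as an aware datetime, or None."""
    try:
        when = parsedate_to_datetime(msg["Date"])
    except (TypeError, ValueError):
        return None  # missing or unparseable Date header
    if when.tzinfo is None:
        # Dates without a usable zone are treated as UTC (an assumption).
        when = when.replace(tzinfo=timezone.utc)
    return when


def search(articles, header=None, value=None, since=None, until=None,
           pattern=None):
    """Return Message-IDs of articles matching every filter that is given.

    `since` and `until` must be timezone-aware datetimes.
    """
    if pattern is not None and len(pattern) > MAX_PATTERN_LEN:
        raise ValueError("pattern exceeds the allowed length")
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    hits = []
    for raw in articles:
        msg = message_from_string(raw)
        # Header filter: case-insensitive substring match on one field.
        if header is not None:
            if (value or "").lower() not in msg.get(header, "").lower():
                continue
        # Date-range filter on the Date header.
        if since is not None or until is not None:
            when = posted_at(msg)
            if when is None:
                continue
            if since is not None and when < since:
                continue
            if until is not None and when > until:
                continue
        # grep-style filter on the body (single-part text only).
        if regex is not None:
            body = msg.get_payload()
            if not isinstance(body, str) or not regex.search(body):
                continue
        hits.append(msg.get("Message-ID", ""))
        if len(hits) >= MAX_RESULTS:
            break  # cap the amount of work per query
    return hits
```

Capping the pattern length and the number of results is only a crude form of resource limiting; Python's `re` module has no built-in timeout, so a production service would probably need something stronger, such as running queries in a time-limited worker.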

Technical issues:

  • How big is the current USENET corpus?
  • By how much is this likely to grow?
  • How should postings be stored? (Ideally, text postings should be stored as plain text plus headers, as they would typically be on a news server.)
  • Should NNTP-style direct access be allowed, or should postings only be accessible via a negotiated read-only API?
  • Binaries - Leave them as encoded, or decode them?
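One possible answer to the storage question above is to keep each posting byte-for-byte as received (headers plus body, with any binary attachment left in its posted encoding) and derive the file location from the Message-ID. The sketch below assumes exactly that; the helper names `article_path`, `store_article`, and `load_headers`, the SHA-1 based directory layout, and the `.txt` suffix are hypothetical choices made for illustration, not an agreed design.

```python
import hashlib
from email import message_from_bytes
from pathlib import Path


def article_path(root, message_id):
    """Map a Message-ID to a file path under root (hypothetical layout)."""
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    # Two levels of fan-out keep any single directory from growing too large.
    return Path(root) / digest[:2] / digest[2:4] / (digest + ".txt")


def store_article(root, raw):
    """Write the raw article bytes unchanged and return the file path."""
    msg = message_from_bytes(raw)
    message_id = msg["Message-ID"]
    if not message_id:
        raise ValueError("article has no Message-ID header")
    path = article_path(root, message_id.strip())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)  # no re-encoding: the stored copy is the posted copy
    return path


def load_headers(path):
    """Read a stored article back and return its headers as a dict.

    Repeated header fields collapse to the last value in this simple view.
    """
    msg = message_from_bytes(Path(path).read_bytes())
    return dict(msg.items())
```

Storing the bytes untouched leaves the decode-or-not question for binaries open: decoded copies could always be derived later from the stored originals, whereas the reverse is not guaranteed.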

Logistic issues:

  • How to recover pre-2014 material from alternative sources?
  • How to upload and index?

Non-Technical Issues:

  • Spam - Some less-used groups are in effect mostly spam. Is it worth archiving the spam along with genuine postings?
  • Cancelmsg - Google Groups doesn't respond to them, but some news servers will respond to genuine cancel messages, as well as issuing their own in respect of material found to be in breach of applicable laws.
  • Impersonation of headers - Misattribution of sources is an issue because of the potential for legal consequences.
  • Legally questionable material - Should an archive of USENET respect archival principles (and challenge legal threats) or have a takedown procedure?

  • The New York 22 Banned list - No responsible archive would support the deliberate inclusion of clearly illegal 'child abuse' images, but these are not always easy to identify as such. Should an archive be the one to report previously unknown crimes?
  • Libel (i.e. defamation) - In some countries the 'publisher' of a libel (i.e. an archive) can be held liable for it, as well as the original source. Some postings which would be libellous are nonetheless retained in the archive, as they form part of the public debate (this is especially true of high-profile cases). However, libel of course has to be proven in court.

  • Infringement of copyright - Whilst the DMCA has a takedown procedure, it is sometimes overreaching, meaning materials posted in good faith are removed unfairly. Precursors to the DMCA takedown have also been used for SLAPP purposes and to suppress