IRC Quotes

From Archiveteam
Historical content

This page or section is no longer actively edited, probably because the project was abandoned or the information is now collected elsewhere in a different form.

However, it is a good and important record of ArchiveTeam's early days and must be preserved. Merging it into another article would be difficult, or some of the information needed for a new format is missing.

Feel free to read it, but there is probably nothing left to add. If you resurrect the project or find a way to move this data somewhere current, you can remove this template.

What's this, then?

In 2011, Auguste, BlueMax and Dr-Spangle scraped IRC quote databases (e.g. Bash.org). The channel was #bashup on EFNet.

Project Hosting

Auguste hosted the scrapes here before that site went down. As far as we can tell, although this wiki page previously encouraged people to mirror that directory, the closest thing to a mirror is this subset on the Internet Archive (IA).

QdbScraper

QdbScraper is a Perl script for scraping QdbS-powered quote databases (e.g. Auguste's ArchiveTeam QDB, which ironically does not seem to have been mirrored anywhere itself). At the moment, it can only scrape QdbS installations that use the default template or a minimally customized one. Support for other common templates is coming soon.

Helping Out

Scraping doesn't take a lot of work; the QDBs are all more or less the same. You only need to write one script, then make a few changes to adapt it to any other QDB you want to scrape. The actual scraping process should easily take under 10 minutes.
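
As a rough illustration of the "one script, a few changes" idea, the sketch below keeps everything site-specific in a small configuration object, so adapting to another QDB means adding one entry. It is not QdbScraper (which is written in Perl) and it is not based on any real site's markup: the `SiteConfig` class, the two example sites, and both regular expressions are hypothetical.

```python
import html
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteConfig:
    """Everything that differs between two quote databases (hypothetical fields)."""
    name: str
    quote_pattern: re.Pattern  # must define the named groups 'id' and 'text'


# Hypothetical markup: neither pattern is taken from a real site's HTML.
SITE_A = SiteConfig(
    name="site-a.example",
    quote_pattern=re.compile(
        r'<b>#(?P<id>\d+)</b>.*?<p class="qt">(?P<text>.*?)</p>', re.DOTALL
    ),
)
SITE_B = SiteConfig(
    name="site-b.example/qdb",
    quote_pattern=re.compile(
        r'<div id="q(?P<id>\d+)" class="quote">(?P<text>.*?)</div>', re.DOTALL
    ),
)


def extract_quotes(page_html: str, cfg: SiteConfig) -> dict[int, str]:
    """Return {quote_id: quote_text} for every quote found on one page."""
    quotes = {}
    for match in cfg.quote_pattern.finditer(page_html):
        # Turn <br> tags into newlines, then drop any remaining tags.
        text = re.sub(r"<br\s*/?>\s*", "\n", match.group("text"))
        text = re.sub(r"<[^>]+>", "", text)
        # Unescape last, so that "&lt;nick&gt;" is not mistaken for a tag.
        quotes[int(match.group("id"))] = html.unescape(text).strip()
    return quotes
```

The shared function never changes; only the per-site pattern does. A real scraper would probably want an HTML parser rather than regular expressions once templates get more elaborate.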

If you do want to help with the scraping, please follow the existing scrape format:

  • Each quote has its own file
  • Each file is named 'n.txt', where 'n' is the quote's ID number
  • All quotes should be compressed into an archive
  • The archive name should identify the original location (URL) and date of scraping (e.g. 'QuoteIRC.com Quote Collection 2011-04-04.7z', or 'DOMAIN.TLD Quote Collection YYYY-MM-DD.EXT').
    • If the original location (URL) has subdirectories (e.g. 'Foobar.com/baz'), replace forward slashes with hyphens: 'Foobar.com-baz'.
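
The format rules above can be sketched in a few lines of Python. This is only one possible reading of them: the function names are made up for illustration, trimming a trailing slash from the location is an assumption, and the archive is written as .zip purely because Python's standard library has no 7z support (the example name above uses .7z).

```python
import datetime
import zipfile
from pathlib import Path


def archive_stem(location: str, scraped_on: datetime.date) -> str:
    """Build 'DOMAIN.TLD Quote Collection YYYY-MM-DD' (without extension)."""
    # Assumption: drop any scheme and leading/trailing slashes first,
    # then replace the remaining forward slashes with hyphens.
    location = location.split("://", 1)[-1].strip("/").replace("/", "-")
    return f"{location} Quote Collection {scraped_on.isoformat()}"


def write_collection(quotes: dict[int, str], location: str,
                     scraped_on: datetime.date, out_dir: Path) -> Path:
    """Store each quote as 'n.txt' inside a single compressed archive."""
    archive_path = out_dir / (archive_stem(location, scraped_on) + ".zip")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for quote_id, text in sorted(quotes.items()):
            archive.writestr(f"{quote_id}.txt", text)
    return archive_path
```

For instance, `archive_stem("Foobar.com/baz", datetime.date(2011, 4, 4))` gives `'Foobar.com-baz Quote Collection 2011-04-04'`.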

Tips

  • Scrape from the browse page (e.g. http://bash.org/?browse). This way you can scrape 10-50 quotes per page request, rather than cycling through thousands of individual quote pages.
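
To put a number on the saving, here is a small back-of-the-envelope helper. The figure of 20,000 quotes is invented for the example and does not describe any particular database.

```python
def requests_needed(total_quotes: int, quotes_per_page: int) -> int:
    """Number of page requests needed to cover every quote (rounded up)."""
    return (total_quotes + quotes_per_page - 1) // quotes_per_page


total = 20_000  # hypothetical database size
print(requests_needed(total, 1))   # one request per quote page: 20000
print(requests_needed(total, 10))  # browse page with 10 quotes: 2000
print(requests_needed(total, 50))  # browse page with 50 quotes: 400
```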

Project Status

Database                     Has been scraped  Scraper     Notes
Bash.org                     Yes               Dr-Spangle  The quote database that pretty much created all others.
BombLol.net/ornot            Yes               Auguste     Generic QdbS QDB
DeadDyingDamned.com/qdb/     No                -           The unofficial ArchiveTeam QDB. I'll have the server automatically save these somewhere. --Auguste 13:36, 7 April 2011 (UTC)
DeanyDerkheiser.net/qdb      Yes               Auguste     Generic QdbS QDB
FreqBase.com/qdb             Yes               Auguste     Generic QdbS QDB
Frostfall-Guild.com/ff/qdb   Yes               Auguste     Generic QdbS QDB
german-bash.org              Yes (here)        Darkstar    German version of bash.org
I-Rox.com                    Yes               Auguste
ibash.de                     Yes (here)        Darkstar    Another German quotes DB
JDL.Host.HK-DIY.net/quote    Yes               Auguste     Generic QdbS QDB
LinuxCult.org/quotes         Yes               Auguste     Generic QdbS QDB
LolImBanned.com              Yes               Auguste     Generic QdbS QDB
Mandaliet.com/furcqdb/       Yes               Auguste     The Furcadia quote database
MoarPupr.com/quotes          Yes               Auguste     Generic QdbS QDB
NotSafeForSanity.com/quotes  Yes               Auguste     Generic QdbS QDB
Pilkipedia.co.uk/qdb         Yes               Auguste     Generic QdbS QDB
QDB.Honk-Honk.org            Yes               Auguste     Generic QdbS QDB
QDB.MIT.edu                  Yes               Auguste     The MIT quote database
QDB.PesterChum.net           Yes               Auguste     Generic QdbS QDB
QDB.us                       Yes               Auguste
QDBS.ChanOps.org             Yes               Auguste     Generic QdbS QDB
QuoteIRC.com                 Yes               Auguste
Quotes.BurntElectrons.org    Yes               Auguste     The IRC.Mozilla.org quote database
WarpDrive.se                 Yes               Auguste     Quotes are in Swedish
WQDB.org                     Yes               Auguste     The Worms quote database
xkcdb.com                    Yes               Auguste     The xkcd quote database