DokuWiki

From Archiveteam
DokuWiki
Status Special case
Archiving status In progress... (manual)
Archiving type Unknown
IRC channel #wikiteam (on hackint)
WikiBot
IRC bot run by DigitalDragon using Dokuwiki-Dumper tools
Status Special case
Archiving status In progress... (manual)
Archiving type other
Project source WikiBot GitHub
IRC channel #wikibot (on hackint)

DokuWiki is a wiki engine. It is often recommended as a more lightweight, easier-to-customize alternative to MediaWiki.

You can use dokuWikiDumper to archive DokuWiki sites and upload them to Internet Archive.

As of 2023-05-15, hundreds of DokuWikis on this sheet have not yet been archived.

You can check archived DokuWiki items at Internet Archive.

Archiving tools

WikiBot (#wikibot (on hackint)) is an interface to automatically save DokuWiki, PukiWiki and MediaWiki instances.

Strategy

Although ArchiveBot can be used to dump DokuWiki sites, the job tends to explode into a large number of URLs because every page links to its own copy of the media manager and sitemap, all with identical content (for instance, https://www.dokuwiki.org/dokuwiki?do=index and https://www.dokuwiki.org/manual?do=index give the same content, but are distinct URLs). The number of such URLs grows with the number of subpages and files. This can be resolved with one of the following ignores, which keep the sitemap and media manager only for the main page. The main page has different names on different wikis, so its name (dokuwiki in these examples) and the domain need to be replaced. Some wikis use doku.php directly and others don't, but generally only one of the two ignores is needed:

^https?://www\.dokuwiki\.org/(?!dokuwiki\?).*[?&]do=(index|media)(&|$)
^https?://www\.dokuwiki\.org/doku\.php\?id=(?!dokuwiki&).*&do=(index|media)(&|$)
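The effect of the two ignores can be checked locally before adding them to a job. The following is a minimal sketch that assumes the patterns behave as they do under Python's re module; the helper name ignored and the constant names are made up for illustration, and the doku.php URLs are constructed examples rather than links taken from the wiki.

```python
import re

# The two ignores above, copied verbatim
# (host www.dokuwiki.org, main page named "dokuwiki").
REWRITTEN_URLS = re.compile(
    r"^https?://www\.dokuwiki\.org/(?!dokuwiki\?).*[?&]do=(index|media)(&|$)"
)
DOKU_PHP_URLS = re.compile(
    r"^https?://www\.dokuwiki\.org/doku\.php\?id=(?!dokuwiki&).*&do=(index|media)(&|$)"
)


def ignored(pattern, url):
    """Return True if the URL would be dropped by the given ignore pattern."""
    return pattern.search(url) is not None


# Wiki with rewritten URLs: the sitemap of the main page is kept,
# the identical sitemap reached from any other page is ignored.
print(ignored(REWRITTEN_URLS, "https://www.dokuwiki.org/dokuwiki?do=index"))  # False
print(ignored(REWRITTEN_URLS, "https://www.dokuwiki.org/manual?do=index"))    # True
print(ignored(REWRITTEN_URLS, "https://www.dokuwiki.org/manual"))             # False

# Wiki that uses doku.php directly (constructed example URLs).
print(ignored(DOKU_PHP_URLS, "https://www.dokuwiki.org/doku.php?id=dokuwiki&do=media"))  # False
print(ignored(DOKU_PHP_URLS, "https://www.dokuwiki.org/doku.php?id=manual&do=media"))    # True
```

Each pattern is tested on its own because the first one also matches doku.php URLs, including the main page's, which is one reason to use only the ignore that fits the wiki's URL style.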

Additionally, the media manager has separate parameters for the namespace and the displayed image. This can lead to the whole list of namespaces being saved for each image. For instance, this URL is for the image devel:ideas:graphy-nosidebar-desktop.png, which is in the namespace devel:ideas, with the image and ns parameters matching. But this URL shows the same image with the (empty!) plugin:plugin namespace. Note that clicking the link to a namespace in the UI does clear the image, but this is done via JavaScript; opening the link in a new tab does not do this (and ArchiveBot does not run JavaScript). The following ignore resolves this, using (?=) to assert that both the image and ns parameters are present, and (?!) to keep the regex from ignoring URLs where image and ns have matching namespaces. It does not need a specific page name, as that is handled by the previous ignore, though the domain does need to be updated.

^https?://www\.dokuwiki\.org/(?!.*[?&]image=([^&]+)%3A(?![^&]*%3A).*&ns=\1|.*[?&]ns=([^&]+)&(.*&)?image=\2%3A(?![^&]*%3A))(?=.*[?&]image=[^&])(?=.*[?&]ns=[^&])
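To see what this ignore keeps and what it drops, it can be run against a few media manager URLs. This is a sketch under the assumption that the pattern behaves as it does under Python's re module; the URLs are constructed for illustration from the image and namespaces mentioned above (the exact set of query parameters on a real wiki may differ), and the name NS_MISMATCH is hypothetical.

```python
import re

# The ignore above, copied verbatim and split over several lines for readability.
NS_MISMATCH = re.compile(
    r"^https?://www\.dokuwiki\.org/"
    r"(?!.*[?&]image=([^&]+)%3A(?![^&]*%3A).*&ns=\1"
    r"|.*[?&]ns=([^&]+)&(.*&)?image=\2%3A(?![^&]*%3A))"
    r"(?=.*[?&]image=[^&])(?=.*[?&]ns=[^&])"
)

BASE = "https://www.dokuwiki.org/dokuwiki?do=media"
IMAGE = "image=devel%3Aideas%3Agraphy-nosidebar-desktop.png"

# image and ns agree (devel:ideas): the URL is kept.
matching = f"{BASE}&{IMAGE}&ns=devel%3Aideas"
# Same image shown next to an unrelated namespace: the URL is ignored.
mismatched = f"{BASE}&{IMAGE}&ns=plugin%3Aplugin"
# Namespace listing without an image: not affected by this ignore.
ns_only = f"{BASE}&ns=devel%3Aideas"
# Parameters in the other order, still agreeing: the URL is kept.
ns_first = f"https://www.dokuwiki.org/dokuwiki?ns=devel%3Aideas&do=media&{IMAGE}"

for url in (matching, mismatched, ns_only, ns_first):
    print(NS_MISMATCH.search(url) is not None, url)
```

Only the second URL is reported as ignored. The backreferences \1 and \2 are what tie the namespace part of image to the value of ns, in either parameter order.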

It is also not useful to save the upload and search pages in the media manager:

^https?://www\.dokuwiki\.org/.*[?&]tab_files=(upload|search)(&|$)

The taskrunner page produces an invisible image that is embedded in every page, with a timestamp in the URL to prevent caching. Its purpose is to update the server-side search index. It generally shouldn't be ignored, since it is actually embedded, but if the above ignores were not added and a large number of junk URLs were already saved, then an equal number of taskrunner URLs will also appear and should be ignored.

^https?://www\.dokuwiki\.org/lib/exe/taskrunner\.php
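Since every ignore in this section has to be adapted to the target wiki's domain and main page name, the substitution can be scripted. The following is a minimal sketch, not part of ArchiveBot or WikiBot: the function dokuwiki_ignores and its parameters are hypothetical, and wiki.example.org with a main page called start is an assumed example. It only fills the host and page name into the patterns given above.

```python
import re


def dokuwiki_ignores(host, start_page, uses_doku_php=False, include_taskrunner=False):
    """Build the ignore patterns from this section for one DokuWiki instance."""
    h = re.escape(host)        # e.g. "www.dokuwiki.org" -> "www\.dokuwiki\.org"
    p = re.escape(start_page)  # name of the wiki's main page
    base = rf"^https?://{h}/"

    # Keep the sitemap and media manager only for the main page.
    if uses_doku_php:
        index_media = rf"{base}doku\.php\?id=(?!{p}&).*&do=(index|media)(&|$)"
    else:
        index_media = rf"{base}(?!{p}\?).*[?&]do=(index|media)(&|$)"

    # Drop media manager URLs whose image and ns parameters disagree.
    ns_mismatch = (
        base
        + r"(?!.*[?&]image=([^&]+)%3A(?![^&]*%3A).*&ns=\1"
        + r"|.*[?&]ns=([^&]+)&(.*&)?image=\2%3A(?![^&]*%3A))"
        + r"(?=.*[?&]image=[^&])(?=.*[?&]ns=[^&])"
    )

    # Drop the upload and search tabs of the media manager.
    tabs = rf"{base}.*[?&]tab_files=(upload|search)(&|$)"

    patterns = [index_media, ns_mismatch, tabs]
    if include_taskrunner:
        # Only for jobs that have already collected many junk URLs.
        patterns.append(rf"{base}lib/exe/taskrunner\.php")
    return patterns


# Assumed example wiki: rewritten URLs, main page named "start".
for pattern in dokuwiki_ignores("wiki.example.org", "start"):
    print(pattern)
```

The taskrunner ignore is off by default in this sketch because, as noted above, those requests are genuinely embedded in each page.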

See also

Knowledge and Wikis
Software

DokuWiki · MediaWiki · MoinMoin · Oddmuse · PukiWiki · UseModWiki · YukiWiki

Wikifarms

atwiki · Battlestar Wiki · BluWiki · Communpedia · EditThis · elwiki.com · Fandom · Miraheze · Neoseeker.com · Orain · Referata · ScribbleWiki · Seesaa · ShoutWiki · SourceForge · TropicalWikis · Wik.is · Wiki.Wiki · Wiki-Site · Wikidot · WikiHub · Wikispaces · WikiForge · WikiTide · Wikkii · YourWiki.net

Wikimedia

Wikipedia · Wikimedia Commons · Wikibooks · Wikidata · Wikinews · Wikiquote · Wikisource · Wikispecies · Wiktionary · Wikiversity · Wikivoyage · Wikimedia Incubator · Meta-Wiki

Other

Anarchopedia · Citizendium · Conservapedia · Creation Wiki · EcuRed · Enciclopedia Libre Universal en Español · GNUPedia · Moegirlpedia · Nico Nico Pedia · Nupedia · OmegaWiki · OpenStreetMap · Pixiv Encyclopedia

Indexes and stats

WikiApiary · WikiIndex · Wikistats