Difference between revisions of "List of major MediaWiki wikis with the LinkSearch extension"

From Archiveteam
(add cs, sk wikipedias)
 
This is a short '''list of major MediaWiki wikis with the LinkSearch extension installed'''. It is designed to be machine-readable, so that automated Archive Team exploration tools can read it when looking for pages on dying websites. It is currently used by [https://github.com/lewiscollard/mwlinkscrape mwlinkscrape.py].


For wiki maintainers: This list needs expanding. Add the URL of each site's index.php script, one per line, between the <code>&lt;pre id="x_wiki_list"&gt;...&lt;/pre&gt;</code> tags below.


For automated tool writers: Fetch this page and parse out the text within the <code>pre</code> tag with the ID "x_wiki_list".
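
The parsing step can be sketched with the Python standard library alone. This is a minimal illustration, not the code used by mwlinkscrape.py: the names <code>WikiListParser</code> and <code>parse_wiki_list</code> are hypothetical, and skipping lines that begin with <code>#</code> is an assumption based on how the list itself is annotated. Fetching the page is left out; any HTTP client (for example <code>urllib.request.urlopen</code>) could supply the HTML string.

```python
from html.parser import HTMLParser


class WikiListParser(HTMLParser):
    """Collect the text inside the <pre> element whose id is "x_wiki_list"."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._inside = False   # True while we are inside the target <pre>
        self._chunks = []      # text fragments seen inside the target <pre>

    def handle_starttag(self, tag, attrs):
        if tag == "pre" and dict(attrs).get("id") == "x_wiki_list":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "pre":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def parse_wiki_list(page_html):
    """Return the index.php URLs from the list, one per non-comment line."""
    parser = WikiListParser()
    parser.feed(page_html)
    parser.close()
    urls = []
    for line in parser.text().splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with "#" carry no URL.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


# A small stand-in for the fetched page, using entries from the list.
SAMPLE = """<html><body><p>intro</p>
<pre id="x_wiki_list">
# major Wikimedia wikis
https://en.wikipedia.org/w/index.php
#https://encyclopediadramatica.se/index.php
http://archiveteam.org/index.php
</pre></body></html>"""

print(parse_wiki_list(SAMPLE))
```

Matching on the <code>id</code> attribute rather than on the first <code>pre</code> element keeps the sketch working even if other preformatted blocks are added to the page.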
 
<!-- The class here -->
<pre id="x_wiki_list">
# major Wikimedia wikis
https://en.wikipedia.org/w/index.php
https://commons.wikimedia.org/w/index.php
https://de.wikipedia.org/w/index.php
https://pl.wikipedia.org/w/index.php
https://fr.wikipedia.org/w/index.php
https://ja.wikipedia.org/w/index.php
https://es.wikipedia.org/w/index.php
https://ru.wikipedia.org/w/index.php
https://nl.wikipedia.org/w/index.php
https://sv.wikipedia.org/w/index.php
https://pt.wikipedia.org/w/index.php
https://zh.wikipedia.org/w/index.php
https://cs.wikipedia.org/w/index.php
https://sk.wikipedia.org/w/index.php
# more Wikimedia wikis
https://wikimediafoundation.org/w/index.php
https://en.wikibooks.org/w/index.php
https://en.wikiversity.org/w/index.php
https://en.wiktionary.org/w/index.php
https://en.wikiquote.org/w/index.php
https://en.wikivoyage.org/w/index.php
https://www.wikidata.org/w/index.php
https://www.mediawiki.org/w/index.php
https://species.wikimedia.org/w/index.php
https://wikisource.org/w/index.php
https://en.wikinews.org/w/index.php
https://meta.wikimedia.org/w/index.php
https://incubator.wikimedia.org/w/index.php
https://wikitech.wikimedia.org/w/index.php
# Japanese Wikimedia sites
https://ja.wikiquote.org/w/index.php
https://ja.wikisource.org/w/index.php
https://ja.wikinews.org/w/index.php
https://ja.wikiversity.org/w/index.php
https://ja.wiktionary.org/w/index.php
https://ja.wikibooks.org/w/index.php
# :)
http://archiveteam.org/index.php
# miscellaneous
#https://encyclopediadramatica.se/index.php
#ED crashes:
#Traceback (most recent call last):
#  File "mwlinkscrape.py", line 162, in <module>
#    ex_url = li.find("a", attrs={"class": "external"}).renderContents()
#AttributeError: 'NoneType' object has no attribute 'renderContents'
</pre>
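
Each entry above is the base URL of a wiki's index.php, so a tool can turn it into a LinkSearch query by appending a query string. The sketch below is one way that might look; it is an assumption about how a tool could build the request, not a description of mwlinkscrape.py. The function name <code>linksearch_url</code> and the default <code>limit</code> of 500 are hypothetical, and the <code>target</code>, <code>limit</code> and <code>offset</code> parameters are assumed to behave as they do on a stock Special:LinkSearch page.

```python
from urllib.parse import urlencode


def linksearch_url(index_php, target, limit=500, offset=0):
    """Build a Special:LinkSearch URL for one wiki and one target domain.

    index_php -- an entry from the list, e.g. https://en.wikipedia.org/w/index.php
    target    -- the domain (or URL pattern) whose inbound links are wanted
    limit, offset -- paging values (assumed standard MediaWiki paging)
    """
    query = urlencode({
        "title": "Special:LinkSearch",
        "target": target,
        "limit": limit,
        "offset": offset,
    })
    return f"{index_php}?{query}"


print(linksearch_url("https://en.wikipedia.org/w/index.php", "example.com"))
```

Using <code>urlencode</code> keeps the target properly escaped when it contains characters such as <code>*</code> or <code>/</code>.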

Latest revision as of 15:32, 18 January 2021
