Difference between revisions of "List of websites excluded from the Wayback Machine"

From Archiveteam
Line 1: Line 1:
- There are two ways webmasters keep the Wayback Machine out of their website: Through [[robots.txt]] ia_archiver exclusion (“user-agent:ia_archiver  disallow:/”) or through a manual exclusion request.
+ There are two ways webmasters keep the Wayback Machine out of their website: through [[robots.txt]] ia_archiver exclusion (“user-agent:ia_archiver  disallow:/”) or through a manual exclusion request.

  While the first, more common way of exclusion shows “This page cannot be crawled or displayed due to Robots.txt” when trying to access it through the Wayback Machine, the second way displays “This page has been excluded from the Wayback Machine”.
+ This page collects only the latter cases.

  * https://www.11alive.com/
Line 360: Line 362:
  * https://mdn.mainichi-msn.co.jp/
  * http://meaning-of-dreams.com/ <!--dead (Mar 8 2019)-->
+ * http://mediafire.com/robots.txt (robots.txt exclusion only) <!-- Maybe put robots.txt Archive enemies into a separate feud list?-->
  * https://www.medicom.de/
  * http://www.megrock.com/

Revision as of 14:04, 26 April 2019

There are two ways webmasters keep the Wayback Machine from archiving their website: through a robots.txt exclusion of the ia_archiver user agent (“User-agent: ia_archiver” followed by “Disallow: /”) or through a manual exclusion request.
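
As a rough illustration of the robots.txt route, the sketch below uses Python's standard `urllib.robotparser` to test whether a given robots.txt body shuts out the `ia_archiver` user agent. The function name `blocks_ia_archiver` and the sample strings are made up for this example. Note that robots.txt is line-based, so the two directives have to sit on separate lines.

```python
from urllib.robotparser import RobotFileParser


def blocks_ia_archiver(robots_txt: str) -> bool:
    """Return True if this robots.txt text disallows "/" for ia_archiver.

    Hypothetical helper for illustration; it only inspects the text it is
    given and does not fetch anything from the network.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, one directive per line.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("ia_archiver", "/")


# Sample inputs (assumed, not taken from any listed site).
EXCLUDING = "User-agent: ia_archiver\nDisallow: /\n"
NOT_EXCLUDING = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    print(blocks_ia_archiver(EXCLUDING))
    print(blocks_ia_archiver(NOT_EXCLUDING))
```

This only models how a standards-following crawler would read the file; whether the Wayback Machine currently honors such a rule is outside what the check can tell.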

With the first, more common method, trying to view the site through the Wayback Machine shows “This page cannot be crawled or displayed due to Robots.txt”; with the second, it shows “This page has been excluded from the Wayback Machine”.

This page collects only the latter cases.
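
When checking candidates for this list, the two messages quoted above can be told apart mechanically. The following is a minimal sketch that assumes the message wording is exactly as quoted on this page; the function name `classify_exclusion` and its return labels are hypothetical, and the real wording served by the Wayback Machine may differ or change.

```python
# Message texts as quoted on this page (assumed to be exact).
ROBOTS_MESSAGE = "This page cannot be crawled or displayed due to Robots.txt"
MANUAL_MESSAGE = "This page has been excluded from the Wayback Machine"


def classify_exclusion(page_text: str) -> str:
    """Label a Wayback Machine response by the exclusion message it contains.

    Returns "manual", "robots.txt", or "unknown". Only "manual" cases
    belong on this list.
    """
    if MANUAL_MESSAGE in page_text:
        return "manual"
    if ROBOTS_MESSAGE in page_text:
        return "robots.txt"
    return "unknown"
```

A site would qualify for the list only when the result is `"manual"`.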