Difference between revisions of "List of websites excluded from the Wayback Machine"

From Archiveteam
(Clarify irrelevance of www prefix and remove duplicate mentions of subdomains)
Line 1: Line 1:
This page collects sites that are manually excluded from the Wayback Machine. When a site is manually excluded, attempting to access it returns the error "This URL has been excluded from the Wayback Machine". This applies to all subdomains as well. This page does not track websites that disallow IA crawlers in their robots.txt file or block them. This list is not provided by the Internet Archive.
This page collects sites that are manually excluded from the Wayback Machine. When a site is manually excluded, attempting to access it returns the error "This URL has been excluded from the Wayback Machine". This applies to all subdomains as well, and as usual in the Wayback Machine, a leading <code>www.</code> is insignificant. This page does not track websites that disallow IA crawlers in their robots.txt file or block them. This list is not provided by the Internet Archive.


This page only collects entire websites (domains). For cases where only some parts of a domain are excluded, see the [[/Partial exclusions]] subpage.
This page only collects entire websites (domains). For cases where only some parts of a domain are excluded, see the [[/Partial exclusions]] subpage.
Line 257: Line 257:
* http://www.autohotkey.net/
* http://www.autohotkey.net/
* https://autoleht.ee/
* https://autoleht.ee/
* https://avalos.me/ and all subdomains
* https://avalos.me/
* http://www.avions-militaires.net/
* http://www.avions-militaires.net/
* http://awesomeandrew.net/ <!--dead (Sep 23 2019)-->
* http://awesomeandrew.net/ <!--dead (Sep 23 2019)-->
Line 1,698: Line 1,698:
* http://www.tvradiobits.co.uk/ <!--redirect to https://www.desandmick.co.uk/ (Mar 12 2019)-->
* http://www.tvradiobits.co.uk/ <!--redirect to https://www.desandmick.co.uk/ (Mar 12 2019)-->
* https://tvshowtranscripts.ourboard.org/
* https://tvshowtranscripts.ourboard.org/
* https://typeform.com/ and all subdomains
* https://typeform.com/
* http://www.typeit.org/
* http://www.typeit.org/
* https://ucp.nordvpn.com
* https://ucp.nordvpn.com

Revision as of 00:14, 27 February 2024

This page collects sites that are manually excluded from the Wayback Machine. When a site is manually excluded, attempting to access it returns the error "This URL has been excluded from the Wayback Machine". An exclusion also applies to all subdomains of the site, and, as usual in the Wayback Machine, a leading www. is ignored. This page does not track websites that disallow IA crawlers in their robots.txt file or otherwise block them. This list is not provided by the Internet Archive.
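
The matching rule described above (subdomains are covered, and a leading www. does not matter) can be sketched as a local lookup against entries copied from this list. This is only an illustration under stated assumptions: the names normalize_host, is_listed and EXAMPLE_HOSTS are hypothetical, the sample entries are a handful taken from this page, and nothing here reflects how the Internet Archive implements exclusions internally.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def normalize_host(url: str) -> str:
    """Return the lowercase hostname of `url` without a leading 'www.'."""
    # urlsplit only recognises the host when a scheme or '//' is present,
    # so bare entries such as 'example.org/path' get a '//' prefix first.
    if "//" not in url:
        url = "//" + url
    host = (urlsplit(url).hostname or "").rstrip(".")
    return host[4:] if host.startswith("www.") else host


def is_listed(url: str, excluded_hosts: set[str]) -> bool:
    """True if the host of `url`, or any parent domain of it, is in the set."""
    labels = normalize_host(url).split(".")
    # Test the full host first, then each parent domain in turn,
    # e.g. a.b.example.org -> b.example.org -> example.org -> org.
    return any(".".join(labels[i:]) in excluded_hosts for i in range(len(labels)))


# A few entries from this page, used purely as sample data (assumption:
# a real check would load the complete list instead).
EXAMPLE_HOSTS = {
    normalize_host(entry)
    for entry in (
        "https://avalos.me/",
        "https://typeform.com/",
        "https://ucp.nordvpn.com",
        "http://www.typeit.org/",
    )
}

if __name__ == "__main__":
    for candidate in ("https://admin.typeform.com/login", "https://nordvpn.com/"):
        print(candidate, is_listed(candidate, EXAMPLE_HOSTS))
```

Because parent domains are checked label by label, an entry for ucp.nordvpn.com does not match nordvpn.com itself, and typeform.com does not match an unrelated host that merely ends in the same characters.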

This page only collects entire websites (domains). For cases where only some parts of a domain are excluded, see the /Partial exclusions subpage.

Past exclusions that are no longer active are tracked on the /Former exclusions subpage.

This list currently contains 1818 URLs.