Difference between revisions of "DNS History"

From Archiveteam
(Lesson about causing issues to users of a site.)
m (date format)
 
(29 intermediate revisions by 11 users not shown)
Line 5: Line 5:
| URL = {{url|https://dnshistory.org/|dnshistory.org}}
| URL = {{url|https://dnshistory.org/|dnshistory.org}}
| description = "Here at DNS History we have been crawling DNS records since 2009, our database currently contains over 500 million domains and over 2 billion DNS records."
| description = "Here at DNS History we have been crawling DNS records since 2009, our database currently contains over 500 million domains and over 2 billion DNS records."
| project_status = {{closing}}
| project_status = {{endangered}}
| archiving_status = {{inprogress}}
| archiving_status = {{notsavedyet}}
| tracker = [http://tracker.archiveteam.org/dnshistory dnshistory]
| tracker = [http://tracker.archiveteam.org/dnshistory dnshistory]
| source = [https://github.com/ArchiveTeam/dnshistory-grab dnshistory-grab]
| source = [https://github.com/ArchiveTeam/dnshistory-grab dnshistory-grab]
| irc = greatlookup
| irc = greatlookup
| irc_network = EFnet
| irc_abandoned = true
}}
}}


'''DNS History''' is a [http://en.wikipedia.org/wiki/Domain_Name_System DNS] record archive provided by [http://8086.net 8086 Consulting].
'''DNS History''' is a [[wikipedia:Domain Name System|DNS]] record archive provided by {{URL|http://8086.net|2=8086 Consultancy}}.


On July 10th, 2016, DNS History will be shut down due to "funding issues".  
It was scheduled to close on {{datetime|2016-07-10}} due to "funding issues".  


== Announcement ==
== Announcement ==


On the main page, on the right:
According to a {{URL|https://twitter.com/dnshistory/status/747021758075703300|2=Twitter entry}} dated {{datetime|2016-06-26}}:
 
:"dnshistory.org will be closing down in a couple of weeks - it has been a fun project over the years thanks folks."
 
Also, on the {{URL|http://dnshistory.org|2=main page}}, on the right:


:"This site will shut down on the 10th July 2016 due to funding issues."
:"This site will shut down on the 10th July 2016 due to funding issues."


We have not been able to find any further announcement so far; the operators have not replied, and the forums are inaccessible.
== Final announcement ==
The following notice appeared on the main page on {{datetime|2016-08-23}}:
 
:"'''DNS History has now shutdown''', any updates are for my personal requirements only - the site may go up/down as I need the resources it uses. The number of servers dedicated to the DNS History project is now ZERO. Due to the ArchiveTeams self-righteous attitude CloudFlares DDoS protection has been enabled. They are quite open about their attitude of ignoring robots.txt files and work around blocks on their user agent - this as abuse."


== Discovery results ==
== Discovery results ==
A scrape of all TLDs which are crawled by DNS History currently is available [http://pastebin.com/AWnrkiEY here].
Out of the 1365 TLDs,
Out of the 1365 TLDs,
* 1114 do have pages
* 1114 do have pages
** 1110 are [http://pastebin.com/raw/hQ7bXcem here]
** 1110 are {{URL|https://pastebin.com/raw/hQ7bXcem|2=here}}
** .info: 251346 pages
** .info: 251346 pages
** .org: 288405 pages
** .org: 288405 pages
** .net: 422569 pages
** .net: 422569 pages
** .com: discovery still running (expected to be millions of pages)
** .com: unknown (expected to be millions of pages)
* 251 don't have a single domain: [http://pastebin.com/raw/JVXRDhru they]
* 251 don't have a single domain: {{URL|https://pastebin.com/raw/JVXRDhru|2=they}}
 
== Lack of cooperation ==
 
After ArchiveTeam started the project, dnshistory.org staff shortly activated CloudFlare in order to block our access to the site.
 
[[user:bzc6p]] contacted them asking for cooperation, but they [https://secure.8086.net/portal/viewticket.php?tid=NTM-405143&c=9JIYc46J replied] that ''"there is no way you can archive the >billion pages on the site and trying to do so is causing issues for other users on the site."''
 
== How can I help? ==
 
=== Running a Warrior ===
 
You can start up a [[Warrior]] and select ''DNSHistory'' there. (If you don't really care what you are archiving, select ''ArchiveTeam's Choice'' instead, as ArchiveTeam may at some point prioritize another project.)
 
=== Running the script manually ===
 
If you use Linux and you're a bit familiar with it, you can try running the script directly. - When it's written!
 
The instructions can be found at [https://github.com/ArchiveTeam/coursera-grab github.com/ArchiveTeam/coursera-grab].


{| class="mw-collapsible mw-collapsed" style="text-align:left;"
== Archiving ==
! Some additional information
|-
| Don't forget to replace YOURNICKHERE with your nickname.


The number after <code>--concurrent</code> determines how many threads run at the same time. You can increase this number if your resources (RAM, CPU, bandwidth) are sufficient. However, if you constantly see messages about rate limiting, there is no need to increase the concurrency.
After ArchiveTeam started the project on {{datetime|2016-07-04}}, dnshistory.org staff soon activated Cloudflare in order to block our access to the site.


If you want to stop the script, please do it gracefully if possible. To do so, create an empty file named '''STOP''' in the folder of the script (terminal command: <code>touch STOP</code>). The script finishes the current item(s) and stops only after that. (If you kill the script immediately, the items get broken, and they will need to be reassigned to another user.) – Before starting the script again, don't forget to remove the STOP file.
[[user:bzc6p|bzc6p]] contacted them asking for cooperation, but they {{URL|1=https://secure.8086.net/portal/viewticket.php?tid=NTM-405143&c=9JIYc46J|2=replied}} that ''"there is no way you can archive the >billion pages on the site and trying to do so is causing issues for other users on the site."''


If you see "Project code is out of date", kill the script, go to its folder (<code>cd coursera-grab</code>) and issue <code><nowiki>git pull https://github.com/ArchiveTeam/</nowiki>coursera-grab</code>. After the updating has finished, re-launch the script.
Later, Cloudflare seemed to have been deactivated, but the site was broken for a month, returning 500 errors. As of {{datetime|2016-08-17}}, however, the site was up again: various queries, including search, still worked, and the closure notice was still displayed.
|}


=== Donating to the Internet Archive ===
On {{datetime|2016-08-20}}, ArchiveTeam restarted its efforts. On {{datetime|2016-08-21}}, although the site seemed to cope with the load and had officially been closed for 42 days, its operators started to ban our crawlers. On {{datetime|2016-08-22}}, they deployed their impenetrable Cloudflare browser check again.


Content downloaded by ArchiveTeam will be uploaded to the [[Internet Archive]], where it will be stored and, hopefully, remain available forever. However, storing it costs thousands of dollars in the long run. So, if you can afford it, please consider donating to the Internet Archive, so that this piece of history can be kept for us all. http://archive.org/donate
On {{datetime|2016-08-23}}, a final shutdown notice appeared (see above).


=== Do you like our cause? ===
ArchiveTeam was hardly able to save anything.


If you want to help in other projects, want to learn more about ArchiveTeam, or even help in development in general, navigate to the [[Main Page]] of this wiki, from there you can reach a lot of information. The Team consists of volunteers working on the projects in their free time, so helping hands (and resources) are always welcome.
Since at least 2018, the site has had numerous prolonged downtimes returning blank HTTP 500 responses on all pages except the homepage.


{{Navigation box}}
{{Navigation box}}

Latest revision as of 08:33, 10 December 2023

DNS History
"Here at DNS History we have been crawling DNS records since 2009, our database currently contains over 500 million domains and over 2 billion DNS records."
URL dnshistory.org
Status Endangered
Archiving status Not saved yet
Archiving type Unknown
Project source dnshistory-grab
Project tracker dnshistory
IRC channel #archiveteam-bs (on hackint)
(formerly #greatlookup (on EFnet))

DNS History is a DNS record archive provided by 8086 Consultancy.

It was scheduled to close on 2016-07-10 due to "funding issues".

Announcement

According to a Twitter entry dated 2016-06-26:

"dnshistory.org will be closing down in a couple of weeks - it has been a fun project over the years thanks folks."

Also, on the right-hand side of the main page:

"This site will shut down on the 10th July 2016 due to funding issues."

Final announcement

The following notice appeared on the main page on 2016-08-23:

"DNS History has now shutdown, any updates are for my personal requirements only - the site may go up/down as I need the resources it uses. The number of servers dedicated to the DNS History project is now ZERO. Due to the ArchiveTeams self-righteous attitude CloudFlares DDoS protection has been enabled. They are quite open about their attitude of ignoring robots.txt files and work around blocks on their user agent - this as abuse."

Discovery results

Out of the 1365 TLDs,

  • 1114 do have pages
    • 1110 are listed at https://pastebin.com/raw/hQ7bXcem
    • .info: 251346 pages
    • .org: 288405 pages
    • .net: 422569 pages
    • .com: unknown (expected to be millions of pages)
  • 251 don't have a single domain; they are listed at https://pastebin.com/raw/JVXRDhru
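
The figures above can be cross-checked with a little arithmetic. The following is only a minimal sketch: the variable names are made up for illustration, and it assumes that the four individually itemised TLDs (.info, .org, .net, .com) are the ones not covered by the list of 1110.

```python
# Figures copied from the discovery results above; all names are illustrative.
total_tlds = 1365
tlds_with_pages = 1114
tlds_without_domains = 251
tlds_in_list = 1110

# Page counts for the individually itemised TLDs; the .com count is unknown.
itemised_pages = {".info": 251346, ".org": 288405, ".net": 422569}
itemised_tlds = list(itemised_pages) + [".com"]

# TLDs with pages plus TLDs without any domain should cover every TLD.
assert tlds_with_pages + tlds_without_domains == total_tlds

# Assumption: the itemised TLDs are exactly those missing from the list of 1110.
assert tlds_in_list + len(itemised_tlds) == tlds_with_pages

# Lower bound on known pages, excluding .com and the 1110 listed TLDs.
known_pages = sum(itemised_pages.values())
print(f"Known pages for .info, .org and .net: {known_pages}")
```

Under these assumptions the counts are consistent, and .info, .org and .net together account for 962320 pages before .com is considered.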

Archiving

After ArchiveTeam started the project on 2016-07-04, dnshistory.org staff soon activated Cloudflare in order to block our access to the site.

bzc6p contacted them asking for cooperation, but they replied that "there is no way you can archive the >billion pages on the site and trying to do so is causing issues for other users on the site."

Later, Cloudflare seemed to have been deactivated, but the site was broken for a month, returning 500 errors. As of 2016-08-17, however, the site was up again: various queries, including search, still worked, and the closure notice was still displayed.

On 2016-08-20, ArchiveTeam restarted its efforts. On 2016-08-21, although the site seemed to cope with the load and had officially been closed for 42 days, its operators started to ban our crawlers. On 2016-08-22, they deployed their impenetrable Cloudflare browser check again.

On 2016-08-23, a final shutdown notice appeared (see above).

ArchiveTeam was hardly able to save anything.

Since at least 2018, the site has had numerous prolonged outages during which every page except the homepage returned a blank HTTP 500 response.