Difference between revisions of "Comcast Personal Web Pages"

From Archiveteam
m
m (→‎Items: odp and common crawl scrapes are incomplete)
Line 24:
  * TODO: Scrape Reddit
  * TODO: Scrape links from MediaWiki wikis
- * [http://paste.archivingyoursh.it/raw/jawacafexo Open Directory Project scrape]
+ * TODO: Scrape the Open Directory Project
- * [http://paste.archivingyoursh.it/raw/busosagonu Common Crawl scrape]
+ * TODO: Scrape the Common Crawl Index
  * TODO: Scrape the Wayback Machine
  * TODO: Scrape URLTeam dumps
+ * TODO: Scrape a list of subdomains from DNSdumpster.com

  {{Navigation box}}
  [[Category:ISP hosting]]

Revision as of 03:53, 27 July 2015

Comcast Personal Web Pages
URL: home.comcast.net
Status: Online!
Archiving status: Upcoming...
Archiving type: Unknown
IRC channel: #comclose (on hackint)


Discovery

Sites follow two patterns:

Items

  • TODO: Scrape Google
  • TODO: Scrape Bing
  • TODO: Scrape DuckDuckGo
  • TODO: Scrape Twitter
  • TODO: Scrape Reddit
  • TODO: Scrape links from MediaWiki wikis
  • TODO: Scrape the Open Directory Project
  • TODO: Scrape the Common Crawl Index
  • TODO: Scrape the Wayback Machine
  • TODO: Scrape URLTeam dumps
  • TODO: Scrape a list of subdomains from DNSdumpster.com
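
As one possible starting point for the Wayback Machine item above, the sketch below builds a query URL for the Wayback Machine CDX server and deduplicates the plain-text result. It is an illustration only, not the project's actual tooling: the helper names `build_cdx_query` and `unique_urls` are hypothetical, and the choice of `matchType=domain` (to include subdomains of home.comcast.net) is an assumption about what discovery needs.

```python
from urllib.parse import urlencode

# Public CDX endpoint of the Wayback Machine.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=None):
    """Build a CDX query URL listing captured URLs under `domain`.

    Hypothetical helper: the parameter choices are assumptions, not
    settings documented on this page.
    """
    params = {
        "url": domain,
        "matchType": "domain",  # match the domain and its subdomains
        "fl": "original",       # return only the original URL field
        "collapse": "urlkey",   # collapse repeated captures of one URL
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urlencode(params)


def unique_urls(cdx_text):
    """Deduplicate plain-text CDX output (one URL per line), keeping order."""
    seen = set()
    result = []
    for line in cdx_text.splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            result.append(url)
    return result


if __name__ == "__main__":
    # Print the query URL; fetching it is left to the caller.
    print(build_cdx_query("home.comcast.net", limit=100))
```

The block only constructs the request and cleans the response text; the actual download (for example with `urllib.request.urlopen`) is left out so the sketch runs without network access.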