Nifty logo
Japanese ISP with web hosting
Project status: Offline
Archiving status: Saved!
Project source:
Project tracker: Unknown
IRC channel: #niftyjanai (on EFnet)
Project lead: User:Sanqui, User:DoomTay

Nifty is a Japanese ISP that provides web hosting. It will close about 140,000 unclaimed homepages by 2016-11-10 15:00. Termination notice (Japanese)

URL harvesting

Follow the approach described on the Site exploration page.

<polm> One thing I would recommend is searching Hatena Bookmarks, which is like a Japanese free Pinboard
<polm> Like so:
<polm> the "of" query parameter paginates like so:
<zout> there's some here.
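
The pagination polm describes can be sketched roughly as follows. This is a minimal illustration, not the exact request used: the base URL below is a placeholder (the real search URL is not given above), and treating "of" as a result offset that grows by a fixed page size is an assumption, as is the page size of 20.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def paginated_urls(base_url, page_size, pages):
    """Yield base_url once per page, with the "of" parameter set for that page.

    Assumption: "of" is an offset (0, page_size, 2 * page_size, ...).
    """
    parts = urlsplit(base_url)
    # Keep existing query parameters, dropping any earlier "of" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "of"
    ]
    for page in range(pages):
        page_query = query + [("of", str(page * page_size))]
        yield urlunsplit(parts._replace(query=urlencode(page_query)))


# Hypothetical base URL, used only to show the shape of the output.
EXAMPLE_BASE = "https://example.invalid/search?q=nifty"

if __name__ == "__main__":
    for url in paginated_urls(EXAMPLE_BASE, page_size=20, pages=3):
        print(url)
```

Each generated URL would then be fetched and scraped for outgoing links to Nifty homepages; the fetching and parsing steps are left out here because they depend on the actual page markup.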


Next steps

  • GoogleScraper is no good. Try scraping Bing and Twitter using the hints on Site exploration
  • Put chunks of up to 100k URLs onto high-speed (20160911.01) ArchiveBot pipelines
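
Splitting a harvested URL list into chunks of at most 100k could look something like the sketch below. The function name, the deduplication step, and the whitespace stripping are assumptions added for illustration; only the 100k limit comes from the step above.

```python
def chunk_urls(urls, chunk_size=100_000):
    """Yield lists of at most chunk_size unique, non-empty URLs, in input order."""
    seen = set()
    chunk = []
    for url in urls:
        url = url.strip()
        # Skip blank lines and URLs already emitted (an added convenience).
        if not url or url in seen:
            continue
        seen.add(url)
        chunk.append(url)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    # Emit the final, possibly shorter, chunk.
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Tiny demonstration with a chunk size of 2 instead of 100k.
    sample = ["http://a.example/", "http://b.example/", "http://a.example/", "http://c.example/"]
    for number, chunk in enumerate(chunk_urls(sample, chunk_size=2)):
        print(number, chunk)
```

Each chunk can then be written to its own file, one URL per line, and handed to a pipeline separately.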