Distributed recursive crawls

From Archiveteam
{{Infobox project
| project_status = {{specialcase}}
| archiving_status = {{in progress}}
| archiving_type = DPoS
| source = [https://github.com/ArchiveTeam/grab-grab grab-grab]
| tracker = [https://tracker.archiveteam.org/grab/ grab]
| irc = Y
| data = {{IA collection|archiveteam_grab}}
}}
This is a project to recursively crawl large websites that have no clear structure that can easily be split into work items the way we usually do on [[DPoS]] projects. It is somewhat comparable to [[ArchiveBot]] in that crawls are started manually for specific sites of interest.
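The distinction drawn above is between sites whose URL space can be partitioned into work items up front and sites that must be discovered recursively, link by link. A minimal sketch of the discovery side, assuming a hypothetical <code>fetch_links</code> helper and a stubbed link graph (this is an illustration, not the actual grab-grab implementation):

```python
from collections import deque
from urllib.parse import urljoin, urlparse

def crawl_frontier(seed, fetch_links, batch_size=3):
    """Breadth-first discovery; yields batches of same-host URLs.

    fetch_links(url) must return an iterable of href strings found
    on that page. Batches could then be handed out as work items.
    """
    host = urlparse(seed).netloc
    seen = {seed}
    queue = deque([seed])
    batch = []
    while queue:
        url = queue.popleft()
        batch.append(url)
        if len(batch) >= batch_size:
            yield batch
            batch = []
        for href in fetch_links(url):
            link = urljoin(url, href)
            # Stay on the same host; leave off-site links alone.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    if batch:
        yield batch

# Usage with a stubbed link graph instead of real HTTP fetches:
pages = {
    "https://example.com/":  ["/a", "/b"],
    "https://example.com/a": ["/b", "/c"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://other.site/x"],  # off-host, skipped
}
batches = list(crawl_frontier("https://example.com/",
                              lambda u: pages.get(u, [])))
```

The point of the sketch is that work items only exist after discovery, which is why such crawls cannot be pre-split the way ordinary DPoS projects are.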

Revision as of 18:45, 25 March 2022

