Difference between revisions of "Dev/Source Code"

From Archiveteam
(→‎Client code: update for a dockerized warrior3 setup)
Line 9: Line 9:
  Client code includes code that the [[Warrior]] executes.

− '''[https://github.com/ArchiveTeam/warrior-preseed warrior-preseed]''' - shell
− :For constructing the warrior virtual appliance image
− '''[https://github.com/ArchiveTeam/Ubuntu-Warrior warrior3]''' - shell
− :Bootstrap code that is pulled from GitHub by the appliance
− '''[https://github.com/ArchiveTeam/seesaw-kit seesaw-kit]''' - Python
+ ;[https://github.com/ArchiveTeam/Ubuntu-Warrior warrior3]''' - bootstrap and tools to build the image
+ :Bootstrap code that is pulled from GitHub by the appliance and starts a docker container
+ ;[https://github.com/ArchiveTeam/warrior-dockerfile archiveteam/warrior-dockerfile] - the container
+ :Instructions to bootstrap the docker container
+ ;[https://github.com/ArchiveTeam/warrior-code2 warrior2]''' - warrior runner code
+ :Main code that runs inside of the docker container
+ ;[https://github.com/ArchiveTeam/seesaw-kit seesaw-kit]'''
  :Library that helps build grab scripts, the web interface, and pipeline engine for the warrior. The name "seesaw" comes from its original behavior: download, upload, and repeat.



Revision as of 20:42, 30 June 2018

Fork me on GitHub! File and triage issues, fix bugs, refactor code, submit pull requests… all welcome!

See this link for all issues.

The warrior uses the following repos:

Client code

Client code includes code that the Warrior executes.

warrior3 - bootstrap and tools to build the image
Bootstrap code that is pulled from GitHub by the appliance and starts a docker container
archiveteam/warrior-dockerfile - the container
Instructions to bootstrap the docker container
warrior2 - warrior runner code
Main code that runs inside of the docker container
seesaw-kit
Library that helps build grab scripts, the web interface, and pipeline engine for the warrior. The name "seesaw" comes from its original behavior: download, upload, and repeat.
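
The "download, upload, and repeat" behavior behind the seesaw name can be pictured as a simple loop. The following is a minimal sketch only, not the seesaw-kit API: the function `seesaw_loop` and its callback parameters `get_item`, `download`, and `upload` are hypothetical names chosen for illustration.

```python
# Illustrative sketch of a download/upload/repeat cycle.
# None of these names come from seesaw-kit; they are assumptions.

def seesaw_loop(get_item, download, upload, max_items=None):
    """Fetch items one at a time, download each, then upload the result.

    get_item  -- callable returning the next item, or None when none remain
    download  -- callable taking an item and returning its downloaded data
    upload    -- callable taking (item, data) and sending it somewhere
    max_items -- optional cap on how many items to process
    Returns the number of items processed.
    """
    done = 0
    while max_items is None or done < max_items:
        item = get_item()
        if item is None:
            break  # nothing left to work on
        data = download(item)
        upload(item, data)
        done += 1
    return done


if __name__ == "__main__":
    # Stand-in callbacks so the sketch runs without any network access.
    queue = ["item-a", "item-b"]
    uploaded = {}
    count = seesaw_loop(
        get_item=lambda: queue.pop(0) if queue else None,
        download=lambda item: "data for " + item,
        upload=lambda item, data: uploaded.__setitem__(item, data),
    )
    print(count, uploaded)
```

In the real warrior, the items would come from the tracker and the work would be done by a project's grab scripts; the stand-in callbacks here only show the shape of the cycle.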

Projects

Projects live in separate repositories, typically named with a -grab suffix.

Item lists that are loaded into the tracker are sometimes saved into a repo with -items as a suffix. Scripts to build searchable index HTML pages are usually suffixed with -index.
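
The suffix conventions above can be captured in a small lookup. This is a sketch under stated assumptions: the helper `classify_repo` and the example repository names are hypothetical, and only the three suffixes and their meanings come from the text.

```python
# Suffix meanings are taken from the conventions described above;
# the helper name and example repository names are illustrative assumptions.
SUFFIX_ROLES = {
    "-grab": "project grab scripts",
    "-items": "item lists loaded into the tracker",
    "-index": "scripts that build searchable index pages",
}


def classify_repo(name):
    """Return the role implied by a repository name's suffix, or None."""
    for suffix, role in SUFFIX_ROLES.items():
        if name.endswith(suffix):
            return role
    return None


if __name__ == "__main__":
    for repo in ("example-grab", "example-items", "example-index", "seesaw-kit"):
        print(repo, "->", classify_repo(repo))
```

Because the text says these suffixes are used "typically" and "sometimes", a `None` result only means the name follows no known convention, not that the repository is unrelated to a project.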

Server code

Server code includes code that the Tracker executes.

universal-tracker - Ruby

The server that Seesaw clients contact

warrior-hq - Ruby

The server that the warrior appliances contact for project metadata

archiveteam-megawarc-factory - shell

The scripts that bundle the WARC files.

URLTeam code

URLTeam code is independent from the tracker and warrior.

Old:

tinyback

The client code that scrapes the shortlinks. It includes a pipeline shim to run the code.

tinyarchive

The server code for the tracker.

New:

terroroftinytown-client-grab

A pipeline shim to run the code.

terroroftinytown

The code for both the client library and tracker.

Misc

warrior-dockerfile

Dockerfile that runs the warrior inside a Docker container.

ArchiveBot - Ruby, Python, Lua

An IRC bot for archiving websites.

wget-lua - C, Lua

A patched version of Wget for web crawling.

standalone-readme-template - Markdown

A template for readme files included in grab repositories.

archiveteam-dev-env - Shell

Ubuntu preseed for a developer environment for ArchiveTeam projects.

