Difference between revisions of "Dev/Source Code"
Latest revision as of 23:02, 2 May 2019
Fork me on GitHub! File and triage issues, fix bugs, refactor code, submit pull requests… all welcome! Discussion in #archiveteam-dev (on hackint).
The warrior uses the following repos:
Client code
Client code includes code that the Warrior executes.
warrior3 - bootstrap and tools to build the image
- Bootstrap code that the appliance pulls from GitHub and that starts a Docker container
archiveteam/warrior-dockerfile - the container
- Instructions to bootstrap the Docker container
warrior2 - warrior runner code
- Main code that runs inside the Docker container
seesaw-kit
- Library that helps build grab scripts, the web interface, and pipeline engine for the warrior. The name "seesaw" comes from its original behavior: download, upload, and repeat.
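The "download, upload, and repeat" cycle behind the seesaw name can be illustrated with a toy loop. This is only a sketch of the idea, not seesaw-kit's actual API: the function `seesaw_loop`, its `download` and `upload` callables, and the item names are all hypothetical.

```python
def seesaw_loop(items, download, upload):
    """Download each item, upload the result, then move on to the next item."""
    finished = []
    for item in items:
        data = download(item)   # fetch the content for this item
        upload(item, data)      # hand the result to the upload target
        finished.append(item)   # then repeat with the next item
    return finished


# Stand-ins for real network calls (assumptions for illustration only).
store = {}
done = seesaw_loop(
    ["item-1", "item-2"],
    download=lambda item: f"contents of {item}",
    upload=lambda item, data: store.__setitem__(item, data),
)
print(done)
```

Passing the download and upload steps as callables keeps the loop itself independent of any particular project.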
Projects
Projects are in separate repositories, typically with -grab as a name suffix.
Item lists that are loaded into the tracker are sometimes saved into a repo with -items as a suffix. Scripts that build searchable index HTML pages usually live in repos suffixed with -index.
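The suffix conventions above could be checked mechanically, for example when sorting a list of repository names. The following is a minimal sketch: the helper `classify_repo`, the role labels, and the `example-*` repository names are assumptions made for illustration, and the page itself says these conventions hold only "typically" or "usually".

```python
# Hypothetical mapping from name suffix to the role described above.
SUFFIX_ROLES = {
    "-grab": "project grab scripts",
    "-items": "item lists for the tracker",
    "-index": "index page builders",
}


def classify_repo(name):
    """Return the role implied by a repository name's suffix, or None."""
    for suffix, role in SUFFIX_ROLES.items():
        if name.endswith(suffix):
            return role
    return None  # no known suffix, e.g. tools such as wget-lua


for repo in ["example-grab", "example-items", "example-index", "wget-lua"]:
    print(repo, "->", classify_repo(repo))
```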
Server code
Server code includes code that the Tracker executes.
universal-tracker - Ruby
- The server that Seesaw contacts
warrior-hq - Ruby
- The server that the warrior appliances contact for project metadata
archiveteam-megawarc-factory - Shell
- The scripts that bundle the WARC files.
URLTeam code
URLTeam code is independent from the tracker and warrior.
Old:
tinyback
- The client code that scrapes the shortlinks. It includes a pipeline shim to run the code.
tinyarchive
- The server code for the tracker.
New:
terroroftinytown-client-grab
- A pipeline shim to run the code.
terroroftinytown
- The code for both the client library and tracker.
Misc
- Dockerfile that runs the warrior inside a Docker container.
ArchiveBot - Ruby, Python, Lua
- An IRC bot for archiving websites.
wget-lua - C, Lua
- A patched version of Wget for web crawling.
standalone-readme-template - Markdown
- A template for readme files included in grab repositories.
archiveteam-dev-env - Shell
- Ubuntu preseed for a developer environment for ArchiveTeam projects.
wpull - Python
- A Wget-compatible web downloader/crawler.