Difference between revisions of "Talk:ArchiveBot"
Latest revision as of 18:45, 23 March 2022
ArchiveBot has star quality ☺ : ArchiveBot/ArchiveBot/commit/566aa53 --Chfoo 23:06, 23 July 2014 (EDT)
Feature Suggestions:
- Store the full original job ID (e.g. “dysua6jfe6y0t8k71i65ttx3o”) inside the saved .json file for each job (example file: https://ia801506.us.archive.org/10/items/archiveteam_archivebot_go_20190409170001/circumcisionmovie.com-inf-20190409-071623-dysua.json). (At the moment, the file name stores only the first 5 characters of the original job ID.)
- If a pipeline is running low on space or has run out of it, send an alert to the #ArchiveBot IRC channel (on hackint).
  - …and move the job to a different pipeline, if technically possible.
- Maybe operate the ArchiveBot Twitter account (https://twitter.com/ArchiveBot) again.
  - Ability to pick up URLs tweeted to it.
  - Ability to archive quoted tweets and replied-to tweets when the quote or reply mentions @ArchiveBot or @ATArchiveBot, so that Twitter users can easily archive a tweet by replying with an “@ArchiveBot” mention.
    - That could be a lot of fun.
    - Could make ArchiveTeam better known.
- Make PhantomJS work again (non-urgent because chromebot already exists for this purpose).
- !a scans sitemaps (i.e. the sitemap.xml file) and robots.txt for more URLs.
- !ao < http://example.com/URL-List.txt saves not only the URLs within the list but also the URL list itself.
- !a < http://example.com/Website-List.txt for big archivals (if not already possible).
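To make the sitemap / robots.txt suggestion a bit more concrete, here is a minimal sketch of how the extra URLs could be collected. This is not ArchiveBot's code and says nothing about how ArchiveBot would actually hook it in; the function names sitemap_urls_from_robots and urls_from_sitemap are hypothetical, and the sketch assumes the robots.txt and sitemap bodies have already been fetched as text.

```python
import xml.etree.ElementTree as ET


def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named by 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop comments, then split on the first colon only,
        # because the URL itself contains colons.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> value in a sitemap or sitemap index document."""
    root = ET.fromstring(sitemap_xml)
    # Compare only the local tag name so the sitemap XML namespace
    # does not have to be spelled out.
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text and el.text.strip()
    ]


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: http://example.com/sitemap.xml\n"
    )
    sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://example.com/</loc></url>"
        "<url><loc>http://example.com/about</loc></url>"
        "</urlset>"
    )
    print(sitemap_urls_from_robots(robots))
    print(urls_from_sitemap(sitemap))
```

A sitemap index uses <loc> as well, so the same function returns the nested sitemap URLs, which would then need to be fetched and parsed in turn. Note that stripping everything after "#" also cuts off URL fragments, which should rarely matter for sitemap locations.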
DigitalOcean price change
According to https://www.digitalocean.com/products/droplets a 2GB Memory / 60GB Storage / 3TB Bandwidth droplet now costs $15.