Difference between revisions of "Talk:ArchiveBot"

From Archiveteam

Latest revision as of 17:54, 4 May 2019

ArchiveBot has star quality ☺ : ArchiveBot/ArchiveBot/commit/566aa53 --Chfoo 23:06, 23 July 2014 (EDT)

Feature Suggestions:

  • Store the full original job ID (e.g. “dysua6jfe6y0t8k71i65ttx3o”) inside the saved .json file for each job. (At the moment, the file name stores only the first 5 characters of the original job ID.)
  • If a pipeline is running low on space or has run out of it, send an alert to the #ArchiveBot IRC channel (on hackint).
    • …and move the job to a different pipeline, if technically possible.
  • Maybe operate the ArchiveBot Twitter account again.
    • Ability to pick up URLs tweeted to it.
    • Ability to archive quoted tweets and replied-to tweets when the quote or reply mentions @ArchiveBot or @ATArchiveBot, so that Twitter users can easily archive a tweet by replying to it with an “@ArchiveBot” mention.
      • That could be a lot of fun.
      • Could make ArchiveTeam better known.
  • Make PhantomJS work again (non-urgent because chromebot already exists for this purpose).
  • !a scans sitemaps (i.e. the sitemap.xml file) and robots.txt for more URLs.
  • !ao < http://example.com/URL-List.txt saves not only the URLs within the list but also the URL list itself.
  • !a < http://example.com/Website-List.txt for big archivals (if not already possible).
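
To make the sitemap and robots.txt suggestion above more concrete, here is a minimal sketch of how extra URLs could be discovered. It is not ArchiveBot code: the function names are hypothetical, the input is assumed to be text that has already been fetched, and it only handles "Sitemap:" lines in robots.txt and <loc> elements in a sitemap file.

```python
import xml.etree.ElementTree as ET


def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs named in 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments; this would also cut a URL fragment, if any.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        # Field names in robots.txt are matched case-insensitively here.
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document.

    For a sitemap index, the returned URLs point at further sitemaps,
    which a caller would have to fetch and parse in turn.
    """
    root = ET.fromstring(sitemap_xml)
    found = []
    for element in root.iter():
        # Tags look like '{namespace}loc'; compare only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        text = (element.text or "").strip()
        if local_name == "loc" and text:
            found.append(text)
    return found
```

A job could queue the URLs returned by urls_from_sitemap alongside the ones it finds by following links. Whether that fits into how !a schedules URLs is an open question for the maintainers.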