A few days ago I mentioned in the chat some scripts I was working on to download Mozilla extensions and related pages. I uploaded the code to https://github.com/aaferrari/amo-links-getter for anyone interested in using it or integrating it with the Warrior.
After several days of running I ended up with a list of more than a million links to download (the list is here). I do not think I can download everything before Mozilla deactivates or deletes the classic extensions, so I would like more people to help with this.
To download the links you can use the following command:
wget --header "Cookie: mamo=off" -k -x -e robots=off -H -o messages.txt -nc -i "mozilla addons url list.txt" --mirror --warc-file="addons.mozilla.org
Explanation of some parameters:
- --header: Sends a cookie so that the pages are downloaded with the classic design. This is optional, but it makes parsing easier (if needed), and I am not sure the pages render correctly once they are downloaded with the new style.
- -k: Rewrites links in the downloaded HTML or CSS so that they point to local files.
- -x: Forces the creation of a directory hierarchy for the downloaded files.
- -H: Allows wget to span to other hosts, so that external resources of a page (such as CSS files or images) are also fetched.
- -o: Saves the program's log messages to a file.
- -nc: Avoids overwriting files that have already been downloaded.
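
Since the list is probably too big for one person, one possible way to share the work is to cut it into smaller chunks that different people can each download. This is only a rough sketch: the `split_list` helper, the chunk size and the `chunk_` prefix are names I made up for illustration and are not part of amo-links-getter.

```shell
#!/bin/sh
# Hypothetical helper (not part of amo-links-getter): split a URL list
# into chunks with a fixed number of lines each.
# Usage: split_list <url-list-file> <lines-per-chunk> <output-prefix>
split_list() {
    list_file=$1
    lines_per_chunk=$2
    prefix=$3
    # split names the output files <prefix>aa, <prefix>ab, <prefix>ac, ...
    split -l "$lines_per_chunk" "$list_file" "$prefix"
}

# Example call (chunk size and prefix are placeholders):
# split_list "mozilla addons url list.txt" 50000 chunk_
```

Each resulting file (chunk_aa, chunk_ab, and so on) can then be passed to -i in place of the full list, ideally with a different log file and WARC name per chunk so the outputs do not collide.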