Difference between revisions of "VBulletin"
InquilineKea (talk | contribs)
Latest revision as of 20:35, 9 September 2015
Archiving vBulletin (tested only with http://boards.cityofheroes.com/; you may have to adapt some steps for other forums):
1. Get a recent Wget+Lua version (it should include WARC support).
2. Get the vbulletin.lua script: https://raw.github.com/ArchiveTeam/cityofheroes-grab/master/vbulletin.lua
3. Collect the forum IDs (the f= parameter in the URLs) of the forums and subforums. Some pages have a "Forum Jump" dropdown list that gives you the numbers.
4. Run Wget with the Lua script and seed it with the forum URLs. Start with the URL to /external.php?type=RSS2 to get a session cookie (having a session cookie is necessary to remove the session ID from the URLs).
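As a rough illustration of step 3, forum IDs can also be pulled out of forum pages that have already been saved to disk. The sketch below is not part of the original procedure: the function name `forum_ids` is hypothetical, and it assumes the pages link to forums with `forumdisplay.php?...f=N` URLs, as the City of Heroes board does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the unique forum IDs (f= values) found in
# forumdisplay.php links. Reads the named HTML files, or stdin if none given.
forum_ids() {
  # 1. Isolate each forumdisplay.php URL (up to a quote, space or bracket).
  # 2. Keep only the f= parameter, whether it follows ?, & or &amp;.
  # 3. Keep only the digits, then sort numerically and drop duplicates.
  grep -ohE "forumdisplay\.php\?[^\"' <>]*" "$@" \
    | grep -oE '(\?|&|;)f=[0-9]+' \
    | grep -oE '[0-9]+' \
    | sort -n -u
}
```

For example, `forum_ids index.html` would list the IDs linked from a saved board index (`index.html` is a placeholder file name). Subforums that are only linked from deeper pages would still have to be collected separately.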
The Lua script will navigate the forum pages: it follows pagination links and goes from forumdisplay to threads, and from threads to posts and members. Use --page-requisites and --span-hosts to get the images. When preparing the seed URLs, be aware that the Lua script only crawls from forum to thread to post/member. It does not, for example, jump from one forum to another or from a thread back to the forum, so every forum and subforum you want archived must be included as a seed URL.
For example, this works for the City of Heroes forums:
./wget-lua \
  -U "$USER_AGENT" \
  -nv \
  -o wget.log \
  --directory-prefix files/ \
  --keep-session-cookies \
  --save-cookies cookies.txt \
  --force-directories \
  --adjust-extension \
  -e "robots=off" \
  --page-requisites --span-hosts \
  --lua-script vbulletin.lua \
  --timeout 10 \
  --tries 3 \
  --waitretry 5 \
  --warc-file forum \
  --warc-header "operator: Archive Team" \
  "http://boards.cityofheroes.com/external.php?type=RSS2" \
  "http://boards.cityofheroes.com/forumdisplay.php?f=547" \
  "http://boards.cityofheroes.com/forumdisplay.php?f=569" \
  "http://boards.cityofheroes.com/forumdisplay.php?f=660" \
  etc.
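With many forums, typing the seed URLs by hand gets tedious. The following is a minimal sketch, not part of the original instructions, that prints the seed list in the order used above: the RSS URL first, then one forumdisplay URL per forum ID. The function name `seed_urls` and the file name `seeds.txt` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print seed URLs for a vBulletin board.
# Usage: seed_urls BASE_URL FORUM_ID...
seed_urls() {
  local base="${1%/}"   # strip one trailing slash, if present
  shift
  # The RSS URL goes first so the session cookie is obtained before the crawl.
  printf '%s/external.php?type=RSS2\n' "$base"
  local id
  for id in "$@"; do
    printf '%s/forumdisplay.php?f=%s\n' "$base" "$id"
  done
}
```

Running `seed_urls http://boards.cityofheroes.com 547 569 660 > seeds.txt` writes the four seed URLs to a file. Wget's standard `--input-file seeds.txt` option should then be able to replace the quoted URLs at the end of the command, though this has not been tested with the Lua script.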
Here is a list of some old forums (many of which run vBulletin): http://web.archive.org/web/20061229181451/http://rankings.big-boards.com/?p=all
A very simple way to archive forums running recent vBulletin software is to loop over every post ID. For example, on Physics Forums, iterate from https://www.physicsforums.com/posts/1 to https://www.physicsforums.com/posts/5223616.
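The loop itself can be sketched in a few lines of shell. This is only an illustration under assumptions: the function name `post_urls` is hypothetical, and it simply prints one /posts/N URL per ID without checking whether that post still exists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one /posts/N URL for every ID in [FIRST, LAST].
# Usage: post_urls BASE_URL FIRST LAST
post_urls() {
  local base="${1%/}" first="$2" last="$3" id
  for ((id = first; id <= last; id++)); do
    printf '%s/posts/%d\n' "$base" "$id"
  done
}
```

The output can be piped into a Wget build with WARC support, for example `post_urls https://www.physicsforums.com 1 5223616 | wget --input-file=- --warc-file=posts --wait=1`, where `--input-file=-` makes Wget read URLs from standard input. The one-second wait is an arbitrary choice, and at that rate a range of about five million posts would take roughly two months.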
See also