Difference between revisions of "Zetaboards"

From Archiveteam
(corrected url format for discovering boards)
(add url descriptions)
Line 17: Line 17:




== URL format ==
== Board URLs ==


<s>ZetaBoards URLs have two formats:  
<s>ZetaBoards URLs have two formats:  
Line 29: Line 29:
Instead, board URLs should be discovered using the following format:
Instead, board URLs should be discovered using the following format:


'''http://BOARD.aSERVER.zetaboards.com/'''  where SERVER is 1 - 15 and BOARD is the numerical ID of the board. It is likely (but not confirmed) that the board IDs are unique across all the servers, rather than being unique to an individual server.
'''http://BOARD.SERVER.zetaboards.com/'''  where SERVER is 1 - 15 and BOARD is the numerical ID of the board. It is likely (but not confirmed) that the board IDs are unique across all the servers, rather than being unique to an individual server.


== Board Types ==
== Board Types ==
ZetaBoards provides two types of boards - public and private. Public boards can be accessed by anyone, whereas private boards can only be read by a logged in member, making their archiving harder. ZetaBoards has an optional captcha on the signup page and uses Cloudflare anti spam protection, increasing the difficulty.
ZetaBoards provides two types of boards - public and private. Public boards can be accessed by anyone, whereas private boards can only be read by a logged in member, making their archiving harder. ZetaBoards has an optional captcha on the signup page and uses Cloudflare anti spam protection, increasing the difficulty.
== URL Format ==
The URL formats are as below:
All boards have an '''/index/''' page
They then have many '''/forum/FORUMID/''' pages, which are the sub-boards of the board
Within each '''/forum/FORUMID/''' you have '''/topic/TOPICID/''' pages, one for each thread
The '''/topic/''' s can have multiple pages, which are selected by a number after '''/topic/''' , ie: '''/topic/10193784/1''' is page 1 of topic 10193784
Posts are shown under the '''/topic/''' pages, however each post also has an alternative view, '''/single/?p=POSTID&t=TOPICID'''. It's almost certainly unnecessary to archive these, as they are simply duplicates of the info on '''/topic/'''
There is a page '''/members/''' which lists all members. Like /topic/ and /forum/, this can have multiple pages which are done in the usual format.
User profiles are '''/profile/PROFILEID/''' pages.
PMs are done with the '''/comm/''' and '''/msg/''' pages. Not needed to archive.
Searching is handled by /search/. A search URL could look like ''http://BOARDURL/search/?c=1&q=SEARCHQUERY&type=post&sort=desc&forum%5B%5D=-1&s_m=3&s_d=8&s_y=2014&e_m=5&e_d=20&e_y=2018'' which looks for posts, in all forums, within all dates, sorted descending. Probably not necessary to archive these.
However, searching is also used for:
* listing active topics: http://BOARDURL/search/?c=5
* showing posts by certain member: http://BOARDURL/search/?c=2&mid=USERID
* showing topics by certain member: http://BOARDURL/search/?c=4&mid=USERID
There are probably more types of pages, but these are all I could see upon quick inspection.
Summary: a simple grab of the entire board should be sufficient, ignoring /single/* /comm/* /msg/* /search/?c=1* /stats/marktopics/ /stats/mark/ /home/?c=4 /online/* /home/?c=4 /home/?c=2 /home/?c=16 /home/?c=20 /home/?c=8 /home/?c=37 /home/?c=6 /home/?c=32 /home/?c=10 /home/?c=14 /home/?c=22 /login/logout/ /login/multi_login/


{{Navigation box}}
{{Navigation box}}

Revision as of 19:07, 20 May 2018

ZetaBoards
Zetaboards.png
URL http://www.zetaboards.com/
Status Online!
Archiving status Not saved yet
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

ZetaBoards (formerly InvisionFree) is a forum host that offers both paid and free-with-ads forums to anyone. It claims to have been "used by millions of people looking for a place to gather, discuss and share." Initial checks show that over 75,000 boards are listed on the "Featured Board Index" page on their site, which indicates the scale of this project. ZetaBoards is owned by Zathyus Networks, Inc., which also owns zIFBoards.

Although there has been no indication that ZetaBoards will cease to exist, it has experienced multiple periods of downtime recently, the latest lasting a few days (2016-11-21 to 2016-11-23) for many of its boards. Combined with the declining popularity of forums as users move to other social platforms, and with ad-blockers reducing revenue for ad-supported websites such as ZetaBoards, this makes it worth preemptively archiving the forums to prevent future data loss.

Current status

Currently a list of board URLs is being discovered, using the numerical URL format described below.

Once this is completed, these boards will be sorted into private and public boards, before being archived.


Board URLs

ZetaBoards URLs have two formats:

The standard one is SERVER.zetaboards.com/BOARD where SERVER is s1 - s15 (and maybe also w1 - w15?) and BOARD is the text name of the board.

The other format is zetaboards.com/directory/?p=jump&v=BOARDNUM&s=SERVERNUM where BOARDNUM is the numerical ID of the board and SERVERNUM is the number of the server. At this point in time, it is unclear how the distinction is made between s servers and w servers; however, setting SERVERNUM to 0 will usually redirect you to the appropriate server. This URL redirects to the standard URL detailed above.

This information is only applicable to boards that are listed on the Zetaboards Forum Directory, which is a directory that board owners can apply to be part of. It is therefore unsuitable for scraping board URLs.

Instead, board URLs should be discovered using the following format:

http://BOARD.SERVER.zetaboards.com/ where SERVER is 1 - 15 and BOARD is the numerical ID of the board. It is likely (but not confirmed) that the board IDs are unique across all the servers, rather than being unique to an individual server.
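The discovery format above could be enumerated roughly as follows. This is only a sketch: the function name `candidate_board_urls` is hypothetical, the server range 1 to 15 is taken from the text, and no claim is made about which board IDs actually exist or how the site responds to unknown ones.

```python
def candidate_board_urls(board_ids, servers=range(1, 16)):
    """Yield every BOARD/SERVER combination in the numerical URL format.

    board_ids: iterable of numerical board IDs to try (which IDs are
    valid is unknown; the caller chooses the range).
    servers: server numbers, 1 - 15 as described in the text.
    """
    for board in board_ids:
        for server in servers:
            yield f"http://{board}.{server}.zetaboards.com/"
```

If board IDs really are unique across all servers (likely but unconfirmed), at most one of the 15 URLs generated for a given ID should resolve to a board.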

Board Types

ZetaBoards provides two types of boards - public and private. Public boards can be accessed by anyone, whereas private boards can only be read by a logged-in member, which makes them harder to archive. ZetaBoards has an optional captcha on the signup page and uses Cloudflare anti-spam protection, which increases the difficulty.

URL Format

The URL formats are as below:

All boards have an /index/ page

They then have many /forum/FORUMID/ pages, which are the sub-boards of the board

Within each /forum/FORUMID/ you have /topic/TOPICID/ pages, one for each thread

Topics can span multiple pages, which are selected by a number after the topic ID, e.g. /topic/10193784/1 is page 1 of topic 10193784
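A small sketch of building and parsing topic page paths in this format. The helper names `topic_page_url` and `parse_topic_path` are hypothetical, and the regular expression only covers the two shapes mentioned here (/topic/TOPICID/ and /topic/TOPICID/PAGE); other variants may exist.

```python
import re

# Matches /topic/TOPICID/ with an optional trailing page number.
TOPIC_RE = re.compile(r"^/topic/(\d+)/(?:(\d+)/?)?$")


def topic_page_url(board_url, topic_id, page):
    """Build the URL for one page of a topic."""
    return f"{board_url.rstrip('/')}/topic/{topic_id}/{page}"


def parse_topic_path(path):
    """Return (topic_id, page) for a topic path, or None if it is not one.

    page is None when the path carries no page number.
    """
    match = TOPIC_RE.match(path)
    if match is None:
        return None
    topic_id, page = match.groups()
    return int(topic_id), (int(page) if page is not None else None)
```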

Posts are shown on the /topic/ pages; however, each post also has an alternative view, /single/?p=POSTID&t=TOPICID. It's almost certainly unnecessary to archive these, as they simply duplicate the information on /topic/

The /members/ page lists all members. Like /topic/ and /forum/, it can span multiple pages, which are numbered in the same way.

User profiles are /profile/PROFILEID/ pages.

PMs are handled by the /comm/ and /msg/ pages. These do not need to be archived.

Searching is handled by /search/. A search URL could look like http://BOARDURL/search/?c=1&q=SEARCHQUERY&type=post&sort=desc&forum%5B%5D=-1&s_m=3&s_d=8&s_y=2014&e_m=5&e_d=20&e_y=2018 which searches for posts in all forums across all dates, sorted in descending order. It is probably not necessary to archive these.
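To see what the example search URL carries, its query string can be split with the standard library. The helper name `search_params` is hypothetical, and reading s_m/s_d/s_y and e_m/e_d/e_y as start and end month/day/year is a guess from the parameter names, not something confirmed by ZetaBoards.

```python
from urllib.parse import parse_qs, urlsplit

EXAMPLE = (
    "http://BOARDURL/search/?c=1&q=SEARCHQUERY&type=post&sort=desc"
    "&forum%5B%5D=-1&s_m=3&s_d=8&s_y=2014&e_m=5&e_d=20&e_y=2018"
)


def search_params(url):
    """Return the query parameters of a search URL as a flat dict.

    Only the first value of each parameter is kept, which is enough
    for the single-valued example above.
    """
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


params = search_params(EXAMPLE)
# forum%5B%5D decodes to "forum[]"; -1 appears to mean "all forums".
print(params["c"], params["type"], params["forum[]"])
```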

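However, searching is also used for:

* listing active topics: http://BOARDURL/search/?c=5
* showing posts by certain member: http://BOARDURL/search/?c=2&mid=USERID
* showing topics by certain member: http://BOARDURL/search/?c=4&mid=USERID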

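The search-based listing views (active topics with c=5, posts by a member with c=2 and mid=USERID, topics by a member with c=4 and mid=USERID) could be generated as in this sketch. The function names are hypothetical, and only the parameters named here are used; the site may accept or require others.

```python
from urllib.parse import urlencode


def _search_url(board_url, **params):
    """Join a board URL with /search/ and the given query parameters."""
    return f"{board_url.rstrip('/')}/search/?{urlencode(params)}"


def active_topics_url(board_url):
    return _search_url(board_url, c=5)


def member_posts_url(board_url, member_id):
    return _search_url(board_url, c=2, mid=member_id)


def member_topics_url(board_url, member_id):
    return _search_url(board_url, c=4, mid=member_id)
```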
There are probably more types of pages, but these are all I could see upon quick inspection.

Summary: a simple grab of the entire board should be sufficient, ignoring /single/* /comm/* /msg/* /search/?c=1* /stats/marktopics/ /stats/mark/ /online/* /home/?c=4 /home/?c=2 /home/?c=16 /home/?c=20 /home/?c=8 /home/?c=37 /home/?c=6 /home/?c=32 /home/?c=10 /home/?c=14 /home/?c=22 /login/logout/ /login/multi_login/
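The ignore list in the summary could be expressed as a URL filter along these lines. This is a sketch, not the project's actual grab configuration: the name `should_skip` is hypothetical, the entries without a trailing * are treated as exact paths, and the c parameter is compared exactly (so /home/?c=4 does not also exclude a hypothetical /home/?c=40).

```python
from urllib.parse import parse_qs, urlsplit

# Entries written with a trailing * in the summary: skip everything below.
SKIP_PATH_PREFIXES = ("/single/", "/comm/", "/msg/", "/online/")
# Entries without a wildcard: skip only this exact path.
SKIP_EXACT_PATHS = {
    "/stats/marktopics/", "/stats/mark/",
    "/login/logout/", "/login/multi_login/",
}
# Paths that are skipped only for particular values of the c parameter.
SKIP_C_VALUES = {
    "/search/": {"1"},
    "/home/": {"4", "2", "16", "20", "8", "37", "6", "32", "10", "14", "22"},
}


def should_skip(url):
    """Return True if the URL falls under the summary's ignore list."""
    parts = urlsplit(url)
    if parts.path.startswith(SKIP_PATH_PREFIXES):
        return True
    if parts.path in SKIP_EXACT_PATHS:
        return True
    c_values = parse_qs(parts.query).get("c", [])
    return bool(c_values) and c_values[0] in SKIP_C_VALUES.get(parts.path, set())
```

Matching on the parsed c value rather than on a raw string prefix keeps the member and active-topic listings under /search/ (c=2, c=4, c=5) in scope while dropping free-text searches (c=1).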