Difference between revisions of "Wallhaven"

From Archiveteam
m (→‎Overview: typos fixed: sneak peak → sneak peek)
(update info about alpha and stats)
Changes in this revision:

  • Overview: added the note that the alpha phase is ending soon, with the forum link and quote.
  • Stats: wallpaper count updated from around 21k to around 700k.

Revision as of 13:47, 2 March 2018

Wallhaven (Alpha Phase)
wallpaper repository
URL: http://alpha.wallhaven.cc
Status: Online!
Archiving status: In progress...
Archiving type: Unknown
IRC channel: #archiveteam (on hackint)

wallhaven.cc is a store of wallpapers and other high-resolution media typically scraped from chans' /hr, /wg, and /w boards.

It seems to be a replacement for the wallbase.cc project.

Overview

The site is currently in its alpha phase. Content uploaded to alpha.wallhaven.cc will likely be deleted once that phase is over.

The notice on the page reads:

Alpha Notice: We are expecting to start fresh at the end of the alpha phase. The alpha is only intended as a sneak peek and a quick and dirty bug test.

We should archive the content on alpha.wallhaven.cc.

The alpha phase is ending soon, according to this forum post: https://alpha.wallhaven.cc/forums/post/16193#post-16193

Yeah, we're very close to ending alpha now. There will be a few "big" changes that will be breaking for a lot of 3rd parties, so we're going to try and make sure to have a decent plan to allow a bit of time to react before we just break everything for them.

Work thus far

Some page analysis has been done so far.

Site Specifics

The structure is very similar to wallbase.cc, and scraping is very easy. Some URLs have changed slightly.

Stats:

  • Around 700k wallpapers so far.
  • Around 2.5k new wallpapers are uploaded per day.

Data:

  • Categories: alpha.wallhaven.cc/tags/id
  • Tags: alpha.wallhaven.cc/tag/id
  • Wallpapers: alpha.wallhaven.cc/wallpaper/id
  • Users: alpha.wallhaven.cc/user/id
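
The page patterns above can be turned into a small URL builder. This is only a sketch: the https scheme, the dictionary keys, the function names, and the idea that wallpaper IDs can be walked as a plain integer range are all assumptions, not something the site documents.

```python
# Assumption: https is used here; the infobox lists the site as http://alpha.wallhaven.cc.
BASE = "https://alpha.wallhaven.cc"

# Path segments are taken from the list above.
# The keys ("category", "tag", ...) are hypothetical names chosen for this sketch.
PATH_SEGMENTS = {
    "category": "tags",
    "tag": "tag",
    "wallpaper": "wallpaper",
    "user": "user",
}


def page_url(kind, ident):
    """Build the page URL for one item of the given kind."""
    return f"{BASE}/{PATH_SEGMENTS[kind]}/{ident}"


def wallpaper_page_urls(first_id, last_id):
    """Build wallpaper page URLs for an inclusive ID range.

    Assumption: wallpaper IDs are integers that can be enumerated.
    """
    return [page_url("wallpaper", i) for i in range(first_id, last_id + 1)]
```

For example, `page_url("tag", 5)` gives `https://alpha.wallhaven.cc/tag/5`, and `wallpaper_page_urls(1, 3)` gives the three wallpaper pages for IDs 1 to 3.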

Media:

  • Wallpapers: alpha.wallhaven.cc/wallpapers/full/wallhaven-ID(.jpg/.png)
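
Since the full-size file can end in either .jpg or .png, a scraper probably has to try both. A minimal sketch, assuming https and using hypothetical helper names; the `exists` callable is left to the caller (for instance a HEAD request) so that nothing here touches the network:

```python
# Assumption: https; the path follows the media pattern listed above.
MEDIA_BASE = "https://alpha.wallhaven.cc/wallpapers/full"


def media_url_candidates(wallpaper_id, extensions=(".jpg", ".png")):
    """Return the possible full-size file URLs for one wallpaper ID."""
    return [f"{MEDIA_BASE}/wallhaven-{wallpaper_id}{ext}" for ext in extensions]


def first_existing(urls, exists):
    """Return the first URL for which exists(url) is true, or None.

    `exists` is a caller-supplied callable (hypothetical), e.g. a HEAD check.
    """
    for url in urls:
        if exists(url):
            return url
    return None
```

If the extension can be read from the wallpaper page itself, that would avoid the extra request; this page does not say whether it can.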

Other notes:

  • Tags can have aliases. This seems to be new. It's kinda cool, I think.
  • Either the domain implements rate limiting, or the infrastructure is a lot slower than the wallbase.cc infrastructure.
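
Whichever explanation is right, a scraper should probably retry slowly rather than hammer the site. The following is a sketch of retrying with exponential backoff; the function names and all delay values are assumptions made for illustration and were not measured against the site.

```python
import time


def backoff_delays(base=1.0, factor=2.0, max_delay=60.0, retries=5):
    """Return the waits (in seconds) before each retry, capped at max_delay."""
    delays = []
    delay = base
    for _ in range(retries):
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays


def fetch_politely(url, fetch, sleep=time.sleep, **backoff_options):
    """Call fetch(url), retrying with growing pauses when it raises OSError.

    `fetch` is supplied by the caller. urllib's URLError and HTTPError, as
    well as socket timeouts, are subclasses of OSError, so they trigger a retry.
    """
    last_error = None
    # First attempt happens immediately; later attempts wait first.
    for delay in [0.0] + backoff_delays(**backoff_options):
        if delay:
            sleep(delay)
        try:
            return fetch(url)
        except OSError as error:
            last_error = error
    raise last_error
```

A caller could pass something like `lambda u: urllib.request.urlopen(u, timeout=30).read()` as `fetch`. With the default options the pauses are 1, 2, 4, 8 and 16 seconds before the last error is re-raised.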