Difference between revisions of "My Opera"

From Archiveteam
(archiving status and screenshot)
(added info)
Line 20:
The shutdown affects blogs/comments, files, email, and their forums ('The most important existing threads will be moved...').


== Archiving status ==
== Archiving and contributing  ==


[[User:Mithrandir]] noticed that the Wayback Machine has limited coverage and is making an effort to crawl some parts of the site.
=== Phase 0: Initial crawl ===
More information about the site structure will be added soon.
* Grab a seed list of users from the location pages. (In progress by [[User:Mithrandir]])
* Grab a list of links to all forum topics and all pages. (In progress by [[User:Mithrandir]])
 
== Site structure notes ==
 
* [http://my.opera.com/Tamil/albums/showpic.dml?album=196828&picture=20891221 As of 2009], there were around 16 million users.
* LOTS of old data, abandoned accounts, etc.
* Forum topics go all the way back to [http://my.opera.com/community/forums/topic.dml?id=1 Post #1] dated 7 September, 2001.
** It looks like there are nearly 2 million topics.
* There's a [http://files.myopera.com subdomain] that houses user-uploaded data.
** Initial Bing-crawled list [https://dl.dropboxusercontent.com/u/94483242/sorted.txt.7z here] (5,572 URLs)
** Tons of webpages, PDFs, images, archives, etc.
** Some of this data is linked from blog posts, so we should crawl blogs for these links as well.
 
* Each user can have:
** An about page (http://my.opera.com/USERNAME/about)
** A blog (http://my.opera.com/USERNAME/blog)
** A photo album (http://my.opera.com/USERNAME/albums)
** Friends (http://my.opera.com/USERNAME/friends)
** Favorite users/blog posts/photos (http://my.opera.com/USERNAME/favorites)
** Site featured wallpaper (http://my.opera.com/community/opera/wallpapers/)
** 2GB of space for files, except executables.
** Posts on the Opera forums (easier just to crawl all the topics instead)
** Groups (listed on the about page)
*** Group members are located at http://my.opera.com/GROUPNAME/members
*** Groups are basically the same as users.
** Recent visitors box on the user page.
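
The per-user URL patterns above can be turned into a crawl queue mechanically. A minimal sketch, assuming usernames can be dropped into the path after percent-encoding (the function names are made up for illustration, not part of any existing Archive Team tool):

```python
from typing import List
from urllib.parse import quote

BASE = "http://my.opera.com"
# Page types taken from the list above; each lives at BASE/USERNAME/<section>.
USER_SECTIONS = ("about", "blog", "albums", "friends", "favorites")


def user_urls(username: str) -> List[str]:
    """Return the per-user page URLs to queue for one username."""
    name = quote(username, safe="")  # assumption: names are percent-encoded in the path
    return [f"{BASE}/{name}/{section}" for section in USER_SECTIONS]


def group_members_url(groupname: str) -> str:
    """Groups behave like users, with an extra members page."""
    return f"{BASE}/{quote(groupname, safe='')}/members"


if __name__ == "__main__":
    for url in user_urls("USERNAME"):
        print(url)
```

Since groups are basically the same as users, a group name can be passed to user_urls() as well.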
 
=== Username discovery ===
 
* This is a little tricky, as the only pages with a large list of usernames are the [http://my.opera.com/community/members/location location] pages.
** Each country list is paginated, showing at most 72 users per page.
** For large countries, going past page ~325 returns many 503 errors.
** Assuming an average of 100 retrievable pages per country and 246 countries, that's 72*100*246 = 1,771,200 users.
** This excludes users who haven't set their location, and there seem to be a lot of them.
** This should be good for a seed list.
* Forum posts, blog comments, friend lists, and groups seem to be the best ways to get the most usernames. (Maybe throw in the recent visitors list too.)
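
The seed-list estimate above, worked through as a quick check. The 100-page average is the guess stated in the list, not a measured value, and the 16 million total is the 2009 figure from the site structure notes:

```python
USERS_PER_PAGE = 72            # maximum users shown per location page
AVG_PAGES_PER_COUNTRY = 100    # assumed average of retrievable pages
COUNTRIES = 246
PAGE_LIMIT = 325               # approximate page where 503 errors start
TOTAL_USERS_2009 = 16_000_000  # "around 16 million users" as of 2009

# Expected size of the seed list from the location pages.
estimate = USERS_PER_PAGE * AVG_PAGES_PER_COUNTRY * COUNTRIES
# Most users retrievable for any single country before the 503s start.
max_per_country = USERS_PER_PAGE * PAGE_LIMIT
# Share of the 2009 user count that the seed list would cover.
share = estimate / TOTAL_USERS_2009

print(f"{estimate:,} users in the seed list")      # 1,771,200
print(f"{max_per_country:,} users per country at most")  # 23,400
print(f"{share:.0%} of the 2009 user count")       # 11%
```

So even in the good case the location pages give roughly a ninth of the accounts, which is why the other sources are needed.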


== References ==

Revision as of 05:34, 13 November 2013

My Opera
My Opera 2013-10-31 20-11-38.png
URL my.opera.com, files.myopera.com
Status Closing
Archiving status In progress...
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

My Opera is a social media website for the Opera browser. Originally started as just a support forum, it later expanded to include blogging, image/file hosting, and email.

On October 31, 2013, Opera announced they would shut My Opera down on March 1, 2014.

Shutdown notice

... The explosion of these sites and the amount of resources we need to maintain our own service has changed our outlook on My Opera. We had a good run for many years, but we believe your content could have a better home elsewhere, so we have made the decision to shut down My Opera as of March 1, 2014. [1]

The shutdown affects blogs/comments, files, email, and their forums ('The most important existing threads will be moved...').

Archiving and contributing

Phase 0: Initial crawl

  • Grab a seed list of users from the location pages. (In progress by User:Mithrandir)
  • Grab a list of links to all forum topics and all pages. (In progress by User:Mithrandir)

Site structure notes

  • As of 2009, there were around 16 million users.
  • LOTS of old data, abandoned accounts, etc.
  • Forum topics go all the way back to Post #1 dated 7 September, 2001.
    • It looks like there are nearly 2 million topics.
  • There's a subdomain that houses user-uploaded data.
    • Initial Bing-crawled list here (5,572 URLs)
    • Tons of webpages, PDFs, images, archives, etc.
    • Some of this data is linked from blog posts, so we should crawl blogs for these links as well.
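
Pulling file links out of crawled blog pages could look something like this. It is a rough sketch using only the Python standard library; the class and function names are hypothetical, and the example paths in the usage section are invented, since the real layout of files.myopera.com is not documented here:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

FILES_HOST = "files.myopera.com"


class FileLinkParser(HTMLParser):
    """Collect href/src values that point at the file-hosting subdomain."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Relative URLs have no hostname and are skipped.
                if urlparse(value).hostname == FILES_HOST:
                    self.links.add(value)


def file_links(html):
    """Return the sorted, de-duplicated file URLs found in one page."""
    parser = FileLinkParser()
    parser.feed(html)
    return sorted(parser.links)


if __name__ == "__main__":
    # Made-up paths, for illustration only.
    page = (
        '<a href="http://files.myopera.com/user/files/doc.pdf">doc</a>'
        '<img src="http://files.myopera.com/user/albums/1/p.jpg">'
        '<a href="http://my.opera.com/user/blog">blog</a>'
    )
    for link in file_links(page):
        print(link)
```

The output of this over all blog pages could be merged with the Bing-crawled list to get a fuller set of file URLs.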

Username discovery

  • This is a little tricky, as the only pages with a large list of usernames are the location pages.
    • Each country list is paginated, showing at most 72 users per page.
    • For large countries, going past page ~325 returns many 503 errors.
    • Assuming an average of 100 retrievable pages per country and 246 countries, that's 72*100*246 = 1,771,200 users.
    • This excludes users who haven't set their location, and there seem to be a lot of them.
    • This should be good for a seed list.
  • Forum posts, blog comments, friend lists, and groups seem to be the best ways to get the most usernames. (Maybe throw in the recent visitors list too.)
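
Harvesting usernames from those other sources (forum posts, blog comments, friend lists) might be done by scanning crawled HTML for profile links. A sketch only: it assumes profile links have the form http://my.opera.com/USERNAME/..., and that "community" is the one site-wide path to skip, since that is the only non-user path seen in the URLs on this page. The real site may have more reserved paths.

```python
import re

# Assumption: the first path segment of a my.opera.com link is a username
# (or a group name, since groups look the same as users).
LINK_RE = re.compile(r"https?://my\.opera\.com/([^/\s\"'<>?#]+)")

# Assumption: site-wide paths that are not users. Only "community" is known
# from the URLs on this page; extend this set as more turn up.
RESERVED = {"community"}


def usernames_in(html):
    """Return the set of candidate usernames linked from one page."""
    found = {match.group(1) for match in LINK_RE.finditer(html)}
    return {name for name in found if name.lower() not in RESERVED}


if __name__ == "__main__":
    sample = (
        '<a href="http://my.opera.com/Tamil/about">Tamil</a> '
        '<a href="http://my.opera.com/community/forums/topic.dml?id=1">topic</a>'
    )
    print(sorted(usernames_in(sample)))
```

Each new username found this way can be fed back into the queue, so the seed list from the location pages grows as the crawl goes on.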

References