Difference between revisions of "Viddy"

(Updated status on API scraping)
(status update, extension)

Revision as of 19:35, 1 December 2014

Viddy
URL: http://www.viddy.com
Project status: Closing
Archiving status: In progress...
Project source: viddy-grab, viddy-items
Project tracker: viddy
IRC channel: #viddiot (on EFnet)
Project lead: Unknown

Viddy is a video hosting service similar to Instagram. The service has its own URL shortener, viddy.it, which makes the impact of the closure even worse.

Viddy announced that it will shut down on December 15th, 2014, after joining Fullscreen.

Shutdown

Message on viddy.com website:

"Viddy is powering down, but you’ll hear from us again soon…
We’ve been building something new and exciting since joining Fullscreen in January, and as we focus on the future, Viddy will be pulled from the App Store on Tuesday, November 4th, 2014, and will officially shut down Monday, December 15th, 2014."

Site Structure

  • www.viddy.com/JustinBieber/v/soundcheck-LDS3p9
  • www.viddy.com/media/af2745dd-d085-4b3a-9017-085f72f6334a
  • www.viddy.it/Syv46W

API Scraping

By scraping the public API, computerfreak is compiling a list of user IDs (about 4 million retrieved so far) and media IDs (about 500k retrieved so far). The reference at http://developer.viddy.com/docs describes the dedicated API, but it mostly applies to the public API as well: http://viddy.com/v1/... The lists are available at http://archiveteam.nickproductions.com/viddy/ (user IDs in the viddy_user* files, and the media data (UserID, MediaID, comment_count, like_count, view_count) in the viddy_medias* files).

The user IDs were obtained by running a breadth-first search (BFS) over users' followers (the script was stopped after about 2 days of running), and the media IDs were then obtained by fetching the media listing for each user. Obviously this is not every single user or every single video, but it is certainly a good starting point. As of Nov. 29, computerfreak is continuously running a script to discover new user IDs, pulling off the new IDs and fetching the associated media info.
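The follower crawl described above can be sketched roughly as follows. This is only a minimal illustration of a breadth-first search over a follower graph, not computerfreak's actual script: the function name discover_user_ids, the max_users cap, and the get_followers callable are all assumptions made for this example. A real run would pass a function that queries the public API; no specific endpoint is assumed here.

```python
from collections import deque
from typing import Callable, Iterable, Set


def discover_user_ids(
    seed_ids: Iterable[str],
    get_followers: Callable[[str], Iterable[str]],
    max_users: int = 1_000_000,
) -> Set[str]:
    """Breadth-first search over the follower graph, starting from seed_ids.

    get_followers is a hypothetical callable that returns the follower IDs
    of one user; how it talks to the site is left to the caller.
    """
    seen: Set[str] = set()
    queue = deque()
    for uid in seed_ids:
        if uid not in seen:
            seen.add(uid)
            queue.append(uid)

    # Visit users in the order they were discovered (FIFO queue = BFS).
    while queue and len(seen) < max_users:
        current = queue.popleft()
        for follower in get_followers(current):
            if follower not in seen:
                seen.add(follower)
                queue.append(follower)
                if len(seen) >= max_users:
                    break
    return seen


if __name__ == "__main__":
    # Tiny in-memory follower graph standing in for the real API.
    graph = {"a": ["b", "c"], "b": ["c", "d"], "c": [], "d": ["a"]}
    print(sorted(discover_user_ids(["a"], lambda u: graph.get(u, []))))
```

A crawl like this only reaches users connected to the seeds through follower links, which is why the resulting list is a starting point rather than a complete list.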

How can I help?

Content discovery

Viddy item discovery is being done through the URLTeam project, which is scraping the www.viddy.it shortener. If you want to take part, join that project; details are on its page. (You'll get tasks for other shorteners as well, since they are distributed randomly. You should start multiple threads.)

Content grab

Running a Warrior

You can start up a Warrior and select Viddy there. (If you don't really care what you are archiving, select ArchiveTeam's Choice instead, as at some points ArchiveTeam may prioritize another project.)

Running the script manually

If you use Linux and you're a bit familiar with it, you can try running the script directly.

The instructions can be found at github.com/ArchiveTeam/viddy-grab.

Some additional information
Don't forget to replace YOURNICKHERE with your nickname.

The number after --concurrent determines how many threads run at the same time. You can increase this number if your resources (RAM, CPU, bandwidth) are sufficient. However, if you constantly see messages about rate limiting, there is no need to increase the concurrency.

If you want to stop the script, please do so gracefully if possible. To do so, create an empty file named STOP in the folder of the script (terminal command: touch STOP). The script finishes the current item(s) and only then stops. (If you kill the script immediately, the items in progress are broken and will need to be reassigned to another user.) Before starting the script again, don't forget to remove the STOP file.

If you see "Project code is out of date", kill the script, go to its folder (cd viddy-grab) and run git pull https://github.com/ArchiveTeam/viddy-grab. After the update has finished, relaunch the script.
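To illustrate why the STOP file described above is safer than killing the script, here is a minimal sketch of a worker loop that checks for the file only between items. It is an illustration of the idea, not the actual code of the grab script: the names run_until_stopped and process_item are hypothetical, and the real script may handle concurrency and item tracking differently.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def run_until_stopped(
    items: Iterable[str],
    process_item: Callable[[str], None],
    stop_file: Path = Path("STOP"),
) -> List[str]:
    """Process items one by one, checking for the stop file between items.

    An item that has already started is always finished, so nothing is
    left half-done when the loop exits.
    """
    finished: List[str] = []
    for item in items:
        # The check happens before an item is started, never in the middle.
        if stop_file.exists():
            break
        process_item(item)
        finished.append(item)
    return finished
```

Because the check runs before each new item, a STOP file left over from an earlier run would make the loop exit immediately, which is why the file has to be removed before restarting.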

Donating to the Internet Archive

Content downloaded by ArchiveTeam will be uploaded to the Internet Archive, where it will be stored and, hopefully, remain available forever. However, storing it costs thousands of dollars in the long run. So, if you can afford it, please consider donating to the Internet Archive, so that this piece of history can be kept for us all. http://archive.org/donate

Download Your Data

"We’ve made it easy for you to download your content before this date [December 15th 2014]! Please click the button [...] and log in with your username / password to get started."[1]

The link: http://www.viddy.com/download

References