Difference between revisions of "MSN TV"

From Archiveteam
(Add link to IA item)
 
(16 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{Infobox project
{{Infobox project
| title = MSN TV
| image = MSN TV.png
| image = MSN TV.png
| description = Get in your comfort zone
| description = Get in your comfort zone
| URL = http://www.msntv.com/
| URL = http://www.msntv.com/
| project_status = {{closing}}
| project_status = {{offline}}
| source = None
| archiving_status = {{saved}}
| archiving_status = {{notsavedyet}}
| data = {{IA item|webtv_fire_grab}} (partial)
| irc = #archiveteam
| tracker = None
}}
}}
'''MSN TV''', formerly known as '''WebTV''', is a bundled thin client and online service which is closing on September 30, 2013. Microsoft offers a bunch of migration options for MSN TV users, though it does not appear that an easy transition route will be offered for home pages. Many of these date back to the late 1990s, and all will be lost.
'''MSN TV''', formerly known as '''WebTV''', is a bundled thin client and online service which is closing on September 30, 2013. Microsoft offers a bunch of migration options for MSN TV users, though it does not appear that an easy transition route will be offered for home pages. Many of these date back to the early 2000s, and all will be lost.


==Closure notice==
==Closure notice==
Line 16: Line 13:
:The MSN TV service will be closing on September 30, 2013. If you are an MSN TV subscriber, [http://www.msntv.com/ClosureFAQ.asp click here] to read the Closure FAQ for details on what you should do before the service ends. You will also receive an email and letter providing more information.
:The MSN TV service will be closing on September 30, 2013. If you are an MSN TV subscriber, [http://www.msntv.com/ClosureFAQ.asp click here] to read the Closure FAQ for details on what you should do before the service ends. You will also receive an email and letter providing more information.


==Initial exploration==
==Exploration==
URLs of MSN TV web pages take the form of either <code>community.webtv.net</code>, or <code>community-X.webtv.net</code> where X is a number from 1 to 4.
URLs of MSN TV web pages take the form of either <code>community.webtv.net</code>, or <code>community-X.webtv.net</code> where X is a number from 1 to 4.
Sometimes random junk gets added at the start of the path part of the URL, in the form of <code>/@HH!XX!XX!XXXXXXXXXXXX/</code>. For example, http://community-2.webtv.net/@HH!17!BF!62DA2CCF370F/TvFoutreach/COUNTDOWNTO666/ is the same as http://community-2.webtv.net/TvFoutreach/COUNTDOWNTO666/. This will have to be sed'd out to remove duplicates.
Sometimes @Lookup will appear in the same place in the URL. This, too, will need to be sed'd out.
Pages are created with predictable filenames. So if we have:
<tt><nowiki>http://community-2.webtv.net/AdamWonder/SITEOFWONDERS/page8.html</nowiki></tt>
Then chances are very good that <tt>page2.html</tt>, <tt>page3.html</tt>, <tt>page4.html</tt>, and so on will exist in the same directory, along with an <tt>index.html</tt>. It is also worth checking for higher numbers. This <tt>pageX.html</tt> pattern seems to be a default name for MSN TV's page creator; a quick scan of the lists shows that there are ''no'' files on the community-* sites that end in ".html" which are not <tt>index.html</tt> or <tt>pageX.html</tt>.
'''The below lists are raw, incomplete and overlapping.''' When the time comes to crawl MSN TV, the below lists will need to be merged, de-duplicated, and further processed to find more URLs.


* [http://paste.archivingyoursh.it/ferusiveye.avrasm List of WebTV community sites, scraped from Wikimedia wikis]
* [http://paste.archivingyoursh.it/ferusiveye.avrasm List of WebTV community sites, scraped from Wikimedia wikis]
* [http://paste.archivingyoursh.it/vasusigayu.avrasm List of URLs from a 2009-ish dump of the Open Directory Project], some may no longer work
* [http://paste.archivingyoursh.it/vasusigayu.avrasm List of URLs from a 2009-ish dump of the Open Directory Project], some may no longer work
* [http://paste.archivingyoursh.it/dimidinaga.avrasm List of URLs scraped from a Twitter search]. Oddly, the [http://paste.archivingyoursh.it/gilekefope.avrasm original URLs before filtering] were in a weird format; for example, http://community-2.webtv.net/@HH!17!BF!62DA2CCF370F/TvFoutreach/COUNTDOWNTO666/ which is the same as http://community-2.webtv.net/TvFoutreach/COUNTDOWNTO666/
* [http://paste.archivingyoursh.it/dimidinaga.avrasm List of URLs scraped from a Twitter search]. [http://paste.archivingyoursh.it/gilekefope.avrasm Original URLs before filtering]
* [http://paste.archivingyoursh.it/ficequtape.avrasm List of URLs from Bing API searches] on site:webtv.net (12,000+). [http://paste.archivingyoursh.it/fuleciwequ.avrasm Older version of this list].
* [http://paste.archivingyoursh.it/fuwufadaji.avrasm List of URLs in the Common Crawl index]. (1746)
Other stuff:
* [http://paste.archivingyoursh.it/gipuxumema.avrasm *.webtv.net DNS names], might be for internal use, may or may not be anything public-facing here
* [http://paste.archivingyoursh.it/tobefifovo.dos List of URLs from URLTeam], thanks to soultcer. [http://paste.debian.net/20344/ Original unfiltered list]
 
== Archives ==
 
Archives of what could be grabbed are in [https://archive.org/details/webtv_fire_grab webtv_fire_grab].
 
{{navigation box}}
[[Category:Microsoft]]
[[Category:Web hosting]]

Latest revision as of 22:17, 24 November 2021

MSN TV
Get in your comfort zone
URL http://www.msntv.com/
Status Offline
Archiving status Saved!
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)
Data webtv_fire_grab (partial)

MSN TV, formerly known as WebTV, is a bundled thin client and online service which is closing on September 30, 2013. Microsoft offers a bunch of migration options for MSN TV users, though it does not appear that an easy transition route will be offered for home pages. Many of these date back to the early 2000s, and all will be lost.

Closure notice

Service Closure Announcement
The MSN TV service will be closing on September 30, 2013. If you are an MSN TV subscriber, click here to read the Closure FAQ for details on what you should do before the service ends. You will also receive an email and letter providing more information.

Exploration

URLs of MSN TV web pages take the form of either community.webtv.net, or community-X.webtv.net where X is a number from 1 to 4.

Sometimes random junk gets added at the start of the path part of the URL, in the form of /@HH!XX!XX!XXXXXXXXXXXX/. For example, http://community-2.webtv.net/@HH!17!BF!62DA2CCF370F/TvFoutreach/COUNTDOWNTO666/ is the same as http://community-2.webtv.net/TvFoutreach/COUNTDOWNTO666/. This will have to be sed'd out to remove duplicates.

Sometimes @Lookup will appear in the same place in the URL. This, too, will need to be sed'd out.
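
The clean-up described above could be sketched in Python rather than sed. This is only an illustration: the function name strip_junk is made up here, and the pattern assumes that the junk is always the first path segment and is either the literal @Lookup or something starting with @HH! (the exact character set of the XX fields is not documented, so the pattern accepts anything up to the next slash).

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: the junk is the first path segment and is either "@Lookup"
# or begins with "@HH!". Anything up to the next "/" is treated as junk.
_JUNK_SEGMENT = re.compile(r"^/(?:@HH![^/]*|@Lookup)(?=/|$)")


def strip_junk(url: str) -> str:
    """Return url with a leading @HH!... or @Lookup path segment removed."""
    parts = urlsplit(url)
    # Remove at most one junk segment; fall back to "/" if nothing is left.
    path = _JUNK_SEGMENT.sub("", parts.path, count=1) or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(strip_junk(
        "http://community-2.webtv.net/@HH!17!BF!62DA2CCF370F/"
        "TvFoutreach/COUNTDOWNTO666/"
    ))
```

URLs without a junk segment pass through unchanged, so the function can be applied to every entry in a list before de-duplicating.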

Pages are created with predictable filenames. So if we have:

http://community-2.webtv.net/AdamWonder/SITEOFWONDERS/page8.html

Then chances are very good that page2.html, page3.html, page4.html, and so on will exist in the same directory, along with an index.html. It is also worth checking for higher numbers. This pageX.html pattern seems to be a default name for MSN TV's page creator; a quick scan of the lists shows that there are no files on the community-* sites that end in ".html" which are not index.html or pageX.html.
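
A rough sketch of how candidate URLs might be generated from one known page follows. The helper name candidate_pages and the default upper bound of 20 are assumptions made for illustration; numbering starts at 2 only because that mirrors the examples above, and nothing here checks whether the pages actually exist.

```python
from typing import List


def candidate_pages(url: str, max_page: int = 20) -> List[str]:
    """Guess sibling pages in the same directory as a known page URL.

    max_page is an arbitrary upper bound (an assumption); raise it to
    probe for higher page numbers.
    """
    # Everything up to and including the last "/" is the directory.
    directory = url.rsplit("/", 1)[0] + "/"
    names = ["index.html"]
    names += [f"page{n}.html" for n in range(2, max_page + 1)]
    return [directory + name for name in names]


if __name__ == "__main__":
    known = "http://community-2.webtv.net/AdamWonder/SITEOFWONDERS/page8.html"
    for candidate in candidate_pages(known, max_page=10):
        print(candidate)
```

The output is just a list of guesses to feed to a crawler; any that return errors can simply be discarded.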

The below lists are raw, incomplete and overlapping. When the time comes to crawl MSN TV, the below lists will need to be merged, de-duplicated, and further processed to find more URLs.
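
Merging and de-duplicating could look something like the following minimal sketch. The function name merge_url_lists is hypothetical, and it assumes each list is a plain sequence of lines with one URL per line; it only removes exact duplicates, so any URL clean-up has to happen before the lists are passed in.

```python
from typing import Iterable, List


def merge_url_lists(*lists: Iterable[str]) -> List[str]:
    """Merge several URL lists, dropping blanks and exact duplicates.

    The first occurrence of each URL is kept, in the order encountered.
    """
    seen = set()
    merged = []
    for urls in lists:
        for line in urls:
            url = line.strip()
            if not url or url in seen:
                continue
            seen.add(url)
            merged.append(url)
    return merged


if __name__ == "__main__":
    list_a = [
        "http://community-2.webtv.net/TvFoutreach/COUNTDOWNTO666/\n",
        "\n",
    ]
    list_b = [
        "http://community-2.webtv.net/TvFoutreach/COUNTDOWNTO666/",
        "http://community-2.webtv.net/AdamWonder/SITEOFWONDERS/page8.html",
    ]
    for url in merge_url_lists(list_a, list_b):
        print(url)
```

Open file objects can be passed directly as arguments, since iterating over a text file yields its lines.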

Other stuff:

Archives

Archives of what could be grabbed are in webtv_fire_grab.