Difference between revisions of "Ispygames"

From Archiveteam
(All the check domains I found so far)
Line 337:
  == Gamespy Domains ==
- * http://planetcnc.gamespy.com
+ * http://planetcnc.gamespy.com - grabbed, checking for completeness
  * http://planetthesims.gamespy.com
  * http://norad.gamespy.com
- * http://planetfrontlines.gamespy.com
+ * http://planetfrontlines.gamespy.com - grabbed, checking for completeness
- * http://planetcivilization.gamespy.com
+ * http://planetcivilization.gamespy.com - grabbed, checking for completeness
- * http://planethalflife.gamespy.com
+ * http://planethalflife.gamespy.com - grabbed, checking for completeness
  * http://bf2142portalservices.gamespy.com
  * http://arena.gamespy.com
- * http://planettransformers.gamespy.com
+ * http://planettransformers.gamespy.com - grabbed, checking for completeness
  * http://www.gamespy.com
- * http://planetcoh.gamespy.com
+ * http://planetcoh.gamespy.com - grabbed, checking for completeness
- * http://planetbattlefield.gamespy.com
+ * http://planetbattlefield.gamespy.com - grabbed, checking for completeness
  * http://planetcrysis.gamespy.com
- * http://planetresidentevil.gamespy.com
+ * http://planetresidentevil.gamespy.com - grabbed, checking for completeness
  * http://planetxmen.gamespy.com
  * http://bugsubmit.gamespy.com
- * http://planetquake.gamespy.com
+ * http://planetquake.gamespy.com - grabbed, checking for completeness
- * http://planetgrandtheftauto.gamespy.com
+ * http://planetgrandtheftauto.gamespy.com - grabbed, checking for completeness
  * http://planetthemovies.gamespy.com
  * http://forumplanet.gamespy.com

Line 362:
  * http://pc.gamespy.com
  * http://planettonyhawk.gamespy.com
- * http://planetunreal.gamespy.com
+ * http://planetunreal.gamespy.com - grabbed, checking for completeness
  * http://planetmedalofhonor.gamespy.com
  * http://xbox360.gamespy.com
  * http://sslvpn.gamespy.com
- * http://planetfallout.gamespy.com
+ * http://planetfallout.gamespy.com - grabbed, checking for completeness
- * http://planetageofempires.gamespy.com
+ * http://planetageofempires.gamespy.com - grabbed, checking for completeness
- * http://planetgearsofwar.gamespy.com
+ * http://planetgearsofwar.gamespy.com - grabbed, checking for completeness
  * http://lanoirepc.d2gstore.gamespy.com
- * http://planetcallofduty.gamespy.com
+ * http://planetcallofduty.gamespy.com - grabbed, checking for completeness
  * http://ds.gamespy.com
- * http://classicgaming.gamespy.com
+ * http://classicgaming.gamespy.com - grabbed, checking for completeness
- * http://planetdoom.gamespy.com
+ * http://planetdoom.gamespy.com - grabbed, checking for completeness
- * http://planetwwe.gamespy.com
+ * http://planetwwe.gamespy.com - grabbed, checking for completeness
  * http://psp.gamespy.com
  * http://forums.gamespy.com

Revision as of 23:44, 10 March 2013

The News

IGN hit with layoffs, 1UP, UGO and GameSpy shutting down
1UP, UGO and GameSpy to be shut down

The Problems

  • Once you start digging around these sites, you find a mess of inconsistent URL schemes, with content scattered everywhere.
  • Some files are hosted on MediaFire.
  • Based on tests, the larger and older a site is, the more a wget crawl misses because of its URL scheme.

What we know

  • We already have a list of almost all the domains involved.
  • A clean list, with duplicates and bad domains removed, is already being processed and will be posted here when complete.
  • Most of the sites are not that big, but a few are huge.
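
One possible way to do the de-duplication step, as a rough sketch only: the helper name clean_domains is made up for illustration, and "bad" here is assumed to mean just "does not look like a hostname". It does not check whether a domain still resolves.

```shell
#!/bin/sh
# Hypothetical helper (not from the original page): read a raw domain list on
# stdin, one entry per line, and print a normalized, de-duplicated list.
clean_domains() {
    # 1. drop leading wiki bullets and whitespace
    # 2. drop an optional scheme such as http://
    # 3. drop everything after the hostname (paths, status notes)
    sed -e 's/^[*[:space:]]*//' \
        -e 's|^[a-zA-Z]*://||' \
        -e 's|[/[:space:]].*$||' \
    | tr 'A-Z' 'a-z' \
    | grep -E '^[a-z0-9.-]+\.[a-z]+$' \
    | sort -u
    # tr lowercases, grep keeps only strings shaped like hostnames (assumed
    # meaning of "bad domains"), and sort -u removes duplicates.
}
```

Usage would be something like `clean_domains < raw_domains.txt > domains.txt`, where both file names are placeholders.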

The plan

  • Save the sites and related content.
  • Back up the Twitter feeds for any associated accounts. All My Tweets just takes a username and returns the maximum number of tweets possible.


wget test command

# Test grab of a single host into a WARC; change SAVE_HOST to try another site.
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
SAVE_HOST="planetdoom.gamespy.com"

# --span-hosts lets the crawl leave SAVE_HOST, but --domains limits it to the listed media hosts.
wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
--tries 10 --wait 2 --warc-header "operator: Archive Team" --warc-cdx \
--warc-file="$SAVE_HOST" -U "$USER_AGENT" \
--span-hosts --domains="$SAVE_HOST,pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com" \
"$SAVE_HOST"
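
To run the same test command over a whole list of hosts, something along these lines might work. This is a sketch only: grab_host, grab_all, DRY_RUN and MEDIA_HOSTS are names invented here, and reusing the same media hosts for every site is an assumption that has only been tried for planetdoom above.

```shell
#!/bin/sh
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
# Assumption: the media hosts from the planetdoom test apply to every site.
MEDIA_HOSTS="pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com"

# Hypothetical wrapper: grab_host HOST runs the test command for one host.
# With DRY_RUN=1 it prints the command (for display only, arguments are
# not re-quoted) instead of running it.
grab_host() {
    host="$1"
    set -- wget -e robots=off --mirror --page-requisites \
        --waitretry 5 --timeout 60 --tries 10 --wait 2 \
        --warc-header "operator: Archive Team" --warc-cdx \
        --warc-file="$host" -U "$USER_AGENT" \
        --span-hosts --domains="$host,$MEDIA_HOSTS" \
        "$host"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@" </dev/null
    fi
}

# Hypothetical driver: grab_all FILE, where FILE holds one host per line.
grab_all() {
    while IFS= read -r host; do
        if [ -n "$host" ]; then
            grab_host "$host"
        fi
    done < "$1"
}
```

Setting `DRY_RUN=1` and calling `grab_all domains.txt` (file name is a placeholder) prints one wget command per host, which is a cheap way to check the list before starting a real crawl. Each host gets its own WARC file named after it, as in the single-host test.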

IGN domains


Gamespy Domains