
Geocities was a once very popular web hosting service founded in 1994 and purchased by Yahoo in 1999. In April 2009, Yahoo announced that it would be closing Geocities "later this year". Apparently, Yahoo will offer a means to export user data sometime this summer.

The Archive Team Geocities Project

Archive Team has initiated a project to mirror Geocities before it goes away. Frequently asked questions about this project are answered here.

Press review

Ars Technica: Started in 1994, Geocities was like the Facebook to Angelfire's MySpace—competing webpage services that allowed over-enthused HTML newbies to create artfully horrific webpages to represent themselves in the early days of the Internet. As anyone who has surfed through GeoCities over the years will tell you, an Internet without GeoCities is like a world of celluloid without Keanu Reeves flicks. The absence of GeoCities won't create a cultural void. Few will miss its passing. It's loaded mostly with hobbyist tribute pages, authored by penny-pinching cybersurfers who put up with primitive tools and gaudy ads in exchange for free hosting. Many of the pages were created years ago, and abandoned like bunny rabbits after Easter Sunday, Ugg boots after winter, and anything Reeves did after the first Matrix movie.
TechCrunch: One of the pioneers of web-hosting sites, GeoCities gave users personal publishing tools and created “neighborhoods” within its web platform for users to be able to create pages, add a picture, text, a guest book and a website counter. Long before MySpace, Geocities was known as a place where teenagers, college students, and eventually others could impose their own garish taste upon the rest of the world.
PC World: Of the 12 remaining GeoCities users, only one was available for comment. "Holy crap!" said the user, a red-faced fellow named Strong Bad. "The scroll buttons and animated GIFs on that site were unbeatable."
The Register: A group of web preservationists called the Archive Team is trying to save most of Geocities for the ages before Yahoo! erases the beloved old-school web-hosting service from the face of the internet.

Archive Team mentions

Slashdot: jamie found this note from Jason Scott, who organizes the Archive Team. They are busy downloading as much of Geocities as they can before it vanishes from the Net after Yahoo pulled the plug.
Jason appeared on the April 29, 2009 edition of Future Tense to discuss why Geocities should be rescued.

Saving Geocities


Geocities Homestead Neighborhoods and Suburbs, although not updated since 2007, gives an overview of how Geocities historically organized its Neighborhoods.

Although directory listings aren't available for the top level of a user's account, you might be able to obtain an Apache-style directory listing for its subdirectories. For example, by stripping the page filename off a page's URL, we can obtain an index for the subdirectory. The benefit is that the listing may reveal files that are not linked internally or externally, which crawlers would otherwise never find. Unfortunately, it seems many users do not organize their content into subdirectories, preferring to dump all files directly into the user directory. Others may have been good webmasters and provided a directory index page, which overrides the directory listing.
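The filename-stripping step can be sketched in Python as follows. This is a minimal illustration, not the tooling Archive Team actually uses: the function name `directory_candidates` and the account name `exampleuser` are hypothetical, and the sketch assumes the first path segment is the account name (so it does not handle other URL layouts). It only builds candidate URLs; whether a listing is actually served has to be checked by fetching each one.

```python
from urllib.parse import urlsplit, urlunsplit


def directory_candidates(page_url):
    """Return subdirectory URLs worth probing for a listing, deepest first.

    Assumption: the first path segment is the account name, whose own
    top-level listing is not available, so it is never returned.
    """
    parts = urlsplit(page_url)
    segments = [s for s in parts.path.split("/") if s]

    # Strip the page filename unless the URL already points at a directory.
    if not parts.path.endswith("/"):
        segments = segments[:-1]

    candidates = []
    # Walk upward, stopping before the account's top-level directory.
    while len(segments) > 1:
        path = "/" + "/".join(segments) + "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
        segments = segments[:-1]
    return candidates


if __name__ == "__main__":
    # Hypothetical page URL, used only for illustration.
    for url in directory_candidates(
        "http://www.geocities.com/exampleuser/photos/2001/trip.html"
    ):
        print(url)
```

A page that sits directly in the user directory yields no candidates, which matches the observation above that accounts without subdirectories offer nothing to list.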

The Size of Geocities Accounts

We're tracking how much a given account could store. So far, we know this (some reports are contradictory; we're looking for press releases):

  • 1997: 2 MB limit for Geocities. [1]
  • 1998: 15 MB limit for small business service. [2]
  • 1999: Geocities has 12 terabytes of storage. [3]
  • 2001: 15 MB for Geocities, 25 MB for $8.95 a month. [4]
  • 2002: 15 MB limit for Geocities.
  • 2002: 25 MB for the newly introduced "Geocities Plus".
  • 2003: 25 MB for Geocities Plus (as of June).
  • 2005: 75 MB for Geocities Plus (as of January).
  • 2005: 25 MB for Geocities Plus (as of April).

Yahoo's Site Explorer shows 23 million HTML pages in Yahoo's index as of April 29, 2009.


Users involved

  • User:Jscott, Joey paulprote and many others are downloading the main stuff.
  • User:Soult is downloading at [6]
  • User:Bbot is mirroring downloaded content.
  • User:Scumola is crawling Geocities, building URL lists.