Difference between revisions of "Google Code"
Mithrandir (talk | contribs) (→External links: add article on Google Code) |
Revision as of 21:03, 12 March 2015
Google Code
URL | Google Code
Status | Closing
Archiving status | Upcoming...
Archiving type | Unknown
IRC channel | #googlecodeblue (on hackint)
Google Code (AKA Project Hosting) is a software repository owned by Google. It hosts only open source software, and every project must carry an open source license.[1]
Google Code lets people commit their code to a Subversion (SVN), Git, or Mercurial repository. Each project has a downloads section for uploading software packages (with a 4 GB quota that can be increased upon request), a wiki for documenting the project, and an issue tracker for tracking bugs in the project's software.
Vital signs
Closing on 25th January, 2016[2].
Archiving
Archiving source code repositories is rather easy (and incremental). Just clone the Git or Mercurial (hg) repository, or check out the SVN repository. For SVN, make sure that you check out all branches, not just trunk.
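The cloning step above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and the URL patterns (`https://code.google.com/p/PROJECT/` for Git and Mercurial, `http://PROJECT.googlecode.com/svn/` for SVN) are assumptions about how Google Code exposes repositories, so verify them against a real project first.

```python
import subprocess


def clone_command(project, vcs, dest):
    """Build the command line that fetches one project's repository.

    The URL patterns below are assumptions, not taken from this page.
    """
    if vcs == "git":
        # --mirror copies every ref (all branches and tags), not just the default branch.
        return ["git", "clone", "--mirror", f"https://code.google.com/p/{project}/", dest]
    if vcs == "hg":
        # --noupdate skips creating a working copy; the full history is still cloned.
        return ["hg", "clone", "--noupdate", f"https://code.google.com/p/{project}/", dest]
    if vcs == "svn":
        # Check out the repository root rather than /svn/trunk so that
        # branches/ and tags/ come along too.
        return ["svn", "checkout", f"http://{project}.googlecode.com/svn/", dest]
    raise ValueError(f"unknown VCS: {vcs!r}")


def archive_repository(project, vcs, dest):
    """Run the fetch command and raise if the tool exits with an error."""
    subprocess.run(clone_command(project, vcs, dest), check=True)
```

For example, `archive_repository("someproject", "svn", "someproject-svn")` would check out the whole SVN tree of a (hypothetical) project named `someproject`.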
Archiving the bug trackers and the other project content (wikis, downloads) will be a bit harder.
A tool to export a repository to GitHub is available[3].
URL lists
Some seeds for site discovery:
- TODO: Scrape Google Search
- TODO: Scrape Google Code Search
- Fetch results for each label, for example: label:javascript
- Google Code search results can be fetched in batches of 100: just append "&num=100" to the end of the URL.
- TODO: Scrape Bing
- TODO: Scrape Twitter
- TODO: Scrape the Open Directory Project
- TODO: Scrape the Common Crawl Index
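- URLs from ArchiveTeam IRC logs: http://paste.archivingyoursh.it/raw/himupisime
- List scraped from MediaWiki wikis: http://paste.archivingyoursh.it/raw/pehobejoxi
- List from FlossMole's data: http://paste.archivingyoursh.it/raw/yulugedasa (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)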
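The per-label search idea in the list above (one query per label, 100 results per request) might be turned into a URL generator along these lines. Treat it as a sketch under assumptions: the search endpoint in `SEARCH_BASE` and the `start` offset parameter are guesses that are not confirmed anywhere on this page; only `label:` queries and `num=100` come from the notes above.

```python
from urllib.parse import urlencode

# Assumed search endpoint; check it against the live site before scraping.
SEARCH_BASE = "https://code.google.com/hosting/search"


def label_search_urls(label, pages, per_page=100):
    """Return one search URL per result page for a given project label.

    The "start" offset parameter is an assumption about how paging works.
    """
    urls = []
    for page in range(pages):
        query = urlencode({
            "q": f"label:{label}",      # e.g. label:javascript
            "num": per_page,            # 100 results per request
            "start": page * per_page,   # assumed paging offset
        })
        urls.append(f"{SEARCH_BASE}?{query}")
    return urls
```

Calling `label_search_urls("javascript", 3)` yields three URLs that can be fed to any fetcher; the project links would then need to be extracted from each result page.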
Tools
- FlossMole provides a set of tools for spidering projects from Google Code