Difference between revisions of "Google Code"
Line 27:
Some seeds for site discovery:
* Underway: Scrape Google Code Search
** Enumerate a list of labels, then fetch results for each label.
*** [http://paste.archivingyoursh.it/codaqupuce.avrasm '''Phase 1'''].
** Google Code search results can be grabbed in packs of 100, just add "&num=100" to the end of the URL.
** [http://paste.archivingyoursh.it/raw/fajesufise.vhdl '''Phase 1''']. Quick grep says 114,262 projects, plus 71,972 labels for further searching.
Line 34:
* [http://paste.archivingyoursh.it/raw/pehobejoxi List scraped from MediaWiki wikis]
* [http://paste.archivingyoursh.it/raw/yulugedasa List from FlossMole's data] (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)
* [http://paste.archivingyoursh.it/raw/jepivocine Links from Open Directory Project]
* TODO: Scrape Google Search
* TODO: Scrape Bing
Revision as of 00:06, 13 March 2015
Google Code | |
URL | Google Code |
Status | Closing |
Archiving status | Upcoming... |
Archiving type | Unknown |
IRC channel | #googlecodeblue (on hackint) |
Google Code (AKA Project Hosting) is a software repository owned by Google. It hosts only open source software released under an open source license.[1]
Google Code lets people commit their code to a Subversion (SVN), Git or Mercurial repository. Each project has a downloads section for uploading software packages (with a 4 GB quota, which can be increased upon request), a wiki for documenting the project, and an issue tracker for bugs in the project's software.
Vital signs
Closing on 25th January, 2016[2].
Archiving
Archiving source code repositories is fairly easy and can be done incrementally: clone the Git or Mercurial (hg) repository, or check out the SVN repository. For SVN, make sure that you check out all branches, not just trunk.
Archiving the issue trackers, wikis and downloads will be a bit harder.
A tool to export a repository to GitHub is available[3].
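The repository step described above could be scripted roughly as follows. This is only a sketch: the URL patterns (`https://code.google.com/p/<project>/` for Git and Mercurial, `http://<project>.googlecode.com/svn/` for SVN) are assumptions not stated on this page, and `fetch_command` is a hypothetical helper name. It only builds the command lines, so they can be inspected before anything is run.

```python
from typing import List


def fetch_command(project: str, vcs: str, dest: str) -> List[str]:
    """Build the argv for fetching one project's repository.

    The URL patterns below are assumptions for illustration only.
    """
    if vcs == "git":
        # --mirror copies every branch and tag, not just the default branch.
        return ["git", "clone", "--mirror",
                f"https://code.google.com/p/{project}/", dest]
    if vcs == "hg":
        # A Mercurial clone carries all named branches with it.
        return ["hg", "clone", f"https://code.google.com/p/{project}/", dest]
    if vcs == "svn":
        # Check out the repository root rather than /trunk, so that
        # branches/ and tags/ are fetched as well.
        return ["svn", "checkout",
                f"http://{project}.googlecode.com/svn/", dest]
    raise ValueError(f"unknown version control system: {vcs!r}")


if __name__ == "__main__":
    # Print the commands instead of running them.
    for vcs in ("git", "hg", "svn"):
        print(" ".join(fetch_command("example-project", vcs, "out/" + vcs)))
```

To actually run a command, pass the list to `subprocess.run(..., check=True)`. Note that an SVN checkout only retrieves the latest revision of each path; keeping the full history would probably need a tool such as `svnsync` or `svnrdump` instead.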
URL lists
Some seeds for site discovery:
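- Underway: Scrape Google Code Search
  - Enumerate a list of labels, then fetch results for each label.
    - Phase 1 (http://paste.archivingyoursh.it/codaqupuce.avrasm).
  - Google Code search results can be grabbed in packs of 100 by adding "&num=100" to the end of the URL.
  - Phase 1 (http://paste.archivingyoursh.it/raw/fajesufise.vhdl). Quick grep says 114,262 projects, plus 71,972 labels for further searching.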
- URLs from ArchiveTeam IRC logs
- List scraped from MediaWiki wikis
- List from FlossMole's data (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)
- Links from Open Directory Project
- TODO: Scrape Google Search
- TODO: Scrape Bing
- TODO: Scrape Twitter
- TODO: Scrape the Common Crawl Index
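The label-by-label search scrape with the `&num=100` trick could look something like the sketch below. The helper names (`with_page_size`, `label_search_urls`), the `q=label:<name>` query form and the `example.invalid` base URL are all assumptions made for illustration; only the `num=100` parameter comes from this page.

```python
from typing import Iterable, Iterator
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page_size(url: str, num: int = 100) -> str:
    """Return url with its 'num' query parameter set to num."""
    parts = urlsplit(url)
    # Drop any existing 'num' so the parameter is not duplicated.
    query = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "num"]
    query.append(("num", str(num)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def label_search_urls(search_base: str, labels: Iterable[str]) -> Iterator[str]:
    """Yield one 100-result search URL per label.

    The 'q=label:<name>' query form is a guess, not taken from this page.
    """
    for label in labels:
        query = urlencode({"q": "label:" + label})
        yield with_page_size(f"{search_base}?{query}")


if __name__ == "__main__":
    # example.invalid is a placeholder, not the real search endpoint.
    for url in label_search_urls("http://example.invalid/search", ["python", "java"]):
        print(url)
```

Setting the parameter through `urllib.parse` rather than by string concatenation avoids ending up with two `num` values when the URL already has one.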
Tools
- FlossMole provides a set of tools to spider projects from Google Code