Difference between revisions of "Google Code"

From Archiveteam
m (→‎Archives: wikilink)
(24 intermediate revisions by 16 users not shown)
Line 4: Line 4:
| description =  
| description =  
| URL = {{url|1=http://code.google.com|2=Google Code}}
| URL = {{url|1=http://code.google.com|2=Google Code}}
| project_status = {{closing}}
| project_status = {{closed}}
| archiving_status = {{upcoming}}
| archiving_status = {{inprogress}} ([https://archive.org/search.php?query=googlecode&sort=-publicdate WARC files being uploaded])
| tracker = [http://tracker.archiveteam.org/googlecode googlecode]
| source = [https://github.com/ArchiveTeam/googlecode-grab googlecode-grab]
| irc = googlecodeblue
| irc = googlecodeblue
}}
}}


'''Google Code''' (AKA Project Hosting) is a software repository that is owned by [[Google]]. It hosts only open source software paired with an open source license.<ref>[https://code.google.com/p/support/wiki/FAQ#Hosting_Your_Open_Source_Project_on_Google_Code FAQ - support - Project Hosting on Google Code FAQ - User support for Google Project Hosting - Google Project Hosting]</ref>
'''Google Code''' (AKA Project Hosting) was a software repository owned by [[Google]]. It hosted only open source software paired with an open source license.<ref>[https://code.google.com/p/support/wiki/FAQ#Hosting_Your_Open_Source_Project_on_Google_Code FAQ - support - Project Hosting on Google Code FAQ - User support for Google Project Hosting - Google Project Hosting]</ref>


Google Code allows people to commit their code into either a Subversion (SVN), Git or Mercurial repository. It has a downloads section for people to upload their software packages (with a quota limit of 4GB, can be increased upon request) and also a wiki for projects to document their work at. There is also an issue tracker to track bugs in the project's software.
Google Code allowed people to commit their code into either a Subversion (SVN), Git or Mercurial repository. It had a downloads section for people to upload their software packages (with a quota limit of 4GB, could be increased upon request) and also a wiki for projects to document their work at. There was also an issue tracker to track bugs in the project's software.
 
Google Code officially shut down on January 25, 2016, but they left a public archive.


== Vital signs ==
== Vital signs ==


Closing on 25th January, 2016<ref>[http://google-opensource.blogspot.com/ncr/2015/03/farewell-to-google-code.html Bidding farewell to Google Code]</ref>.
The site went read-only on 24th August, 2015, and was closed on 25th January, 2016.<ref>[http://google-opensource.blogspot.com/ncr/2015/03/farewell-to-google-code.html Bidding farewell to Google Code]</ref> They left a public archive, though.


== Archiving ==
== Archiving ==
Archiving source code repositories is rather easy (and incremental). Just clone the git/hg repository, or checkout SVN repo. For SVN, make sure that you checkout all branches, not just trunk.
Archiving source code repositories is rather easy (and incremental). Just clone the git/hg repository, or checkout SVN repo. For SVN, make sure that you checkout all branches, not just trunk. Ideally for svn one would use "svnrdump dump REPO" to dump not only the latest revision of the repository, but the complete history.


Archiving bugtrackers and the other stuff will be a bit harder.
Archiving bugtrackers and the other stuff will be a bit harder.


A tool to export a repository to GitHub is available<ref>[http://code.google.com/export-to-github Export to GitHub - Google Code]</ref>.
A tool to export a repository to GitHub is available<ref>[http://code.google.com/export-to-github Export to GitHub - Google Code]</ref>. If the repository has been [https://github.com/search?utf8=%E2%9C%93&q=%22Automatically+exported+from+code.google.com%22&type=Repositories&ref=searchresults migrated to GitHub], the project is no longer available for access.
 
ArchiveTeam started to save Google Code on December 18, 2015, as a [[Warrior]] project.
 
After the closure, they left a public archive, but that is missing some of the original information.<ref>http://archive.fart.website/bin/irclogger_log/archiveteam?date=2016-03-12,Sat&sel=10#l6</ref> Although the original content got hidden from the public, ArchiveTeam got access and went on saving it, so that the Wayback Machine can receive a full copy.<ref>http://archive.fart.website/bin/irclogger_log/archiveteam?date=2016-02-26,Fri&sel=97#l93</ref><ref>http://archive.fart.website/bin/irclogger_log/archiveteam?date=2016-03-05,Sat&sel=474#l470</ref>


=== URL lists ===
=== URL lists ===
Line 28: Line 36:
* Underway: Scrape Google Code Search
* Underway: Scrape Google Code Search
** Enumerate a list of labels, then fetch results for each label.
** Enumerate a list of labels, then fetch results for each label.
*** [http://paste.archivingyoursh.it/luwobimana.avrasm '''Phase 2.1''']. (needs mojibaked labels filtered)
*** [http://paste.archivingyoursh.it/govetoviko.avrasm '''Phase 2.5'''].
** Google Code search results can be grabbed in packs of 100, just add "&num=100" to the end of the URL.
** Google Code search results can be grabbed in packs of 100, just add "&num=100" to the end of the URL.
** [http://paste.archivingyoursh.it/raw/fajesufise.vhdl '''Phase 1''']. Quick grep says 114,262 projects, plus 71,972 labels for further searching.
** [http://paste.archivingyoursh.it/raw/fajesufise.vhdl '''Phase 1''']. Quick grep says 114,262 projects, plus 71,972 labels for further searching.
** [http://filebin.ca/1uke8goEY4Ls/projects 463,061 projects] (18M text file)
* [http://paste.archivingyoursh.it/raw/himupisime URLs from ArchiveTeam IRC logs]
* [http://paste.archivingyoursh.it/raw/himupisime URLs from ArchiveTeam IRC logs]
* [http://paste.archivingyoursh.it/raw/pehobejoxi List scraped from MediaWiki wikis]
* [http://paste.archivingyoursh.it/raw/pehobejoxi List scraped from MediaWiki wikis]
* [http://paste.archivingyoursh.it/raw/yulugedasa List from FlossMole's data] (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)
* [http://paste.archivingyoursh.it/raw/yulugedasa List from FlossMole's data] (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)
* [http://paste.archivingyoursh.it/raw/jepivocine Links from Open Directory Project]
* [http://paste.archivingyoursh.it/raw/jepivocine Links from Open Directory Project]
* [https://archive.org/download/code-urls-2015may03/code-urls-2015may03.txt Links from Kyan]
* TODO: Scrape Google Search
* TODO: Scrape Google Search
* TODO: Scrape Bing
* TODO: Scrape Bing
* TODO: Scrape Twitter
* TODO: Scrape Twitter
* TODO: Scrape the Common Crawl Index
* TODO: Scrape the Common Crawl Index
* TODO: Scrape URLTeam dumps
* TODO: ask chris dibona for a complete list of projects


===Tools ===
===Tools ===
* FlossMole provides [https://code.google.com/p/flossmole/source/browse/#svn%2FFLOSSmoleGoogleCode%2Fsrc a set of tools] to spider projects from GC
* FlossMole provides [https://code.google.com/p/flossmole/source/browse/#svn%2FFLOSSmoleGoogleCode%2Fsrc a set of tools] to spider projects from GC
== Archives ==
Google Code archives are (being) uploaded to https://archive.org/details/archiveteam_googlecode, in [[WARC]] format.
"The Google Code Archive (https://code.google.com/archive/) contains the data found on the Google Code Project Hosting Service, which will be turned down in early 2016. This archive contains over 1.4 million projects, 1.5 million downloads, and 12.6 million issues."


== References ==
== References ==

Revision as of 19:52, 27 February 2017

Google Code
URL Google Code (http://code.google.com)
Status Offline
Archiving status In progress... (WARC files being uploaded)
Archiving type Unknown
Project source googlecode-grab
Project tracker googlecode
IRC channel #googlecodeblue (on hackint)

Google Code (AKA Project Hosting) was a software repository owned by Google. It hosted only open source software paired with an open source license.[1]

Google Code allowed people to commit their code to a Subversion (SVN), Git, or Mercurial repository. It had a downloads section where projects could upload their software packages (with a quota of 4 GB, which could be increased upon request) and a wiki where projects could document their work. There was also an issue tracker for tracking bugs in a project's software.

Google Code officially shut down on January 25, 2016, but Google left a public archive.

Vital signs

The site went read-only on 24th August, 2015, and was closed on 25th January, 2016.[2] Google left a public archive, though.

Archiving

Archiving source code repositories is rather easy (and incremental): clone the Git or Mercurial repository, or check out the SVN repository. For SVN, make sure that you check out all branches, not just trunk. Ideally, for SVN one would use "svnrdump dump REPO" to dump the complete history of the repository, not only the latest revision.
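
The approach above can be sketched as a small Python helper that picks the right full-history command for each version control system. This is only an illustration, not the code used by the googlecode-grab project: the function names `archive_command` and `archive` are hypothetical, and the repository URL and destination path are supplied by the caller. It assumes the `git`, `hg`, and `svnrdump` command-line tools are installed.

```python
import subprocess
from pathlib import Path


def archive_command(vcs, repo_url, dest):
    """Return the argv for a full-history copy of one repository.

    Hypothetical helper: vcs is "git", "hg", or "svn"; repo_url and dest
    are chosen by the caller.
    """
    if vcs == "git":
        # --mirror copies every ref (all branches and tags), not just the default branch.
        return ["git", "clone", "--mirror", repo_url, str(dest)]
    if vcs == "hg":
        # --noupdate skips creating a working copy; the history is still cloned.
        return ["hg", "clone", "--noupdate", repo_url, str(dest)]
    if vcs == "svn":
        # svnrdump writes the dump stream to stdout, so dest is handled by the caller.
        return ["svnrdump", "dump", repo_url]
    raise ValueError(f"unsupported VCS: {vcs}")


def archive(vcs, repo_url, dest):
    """Run the archive command; for SVN, redirect the dump stream into dest."""
    dest = Path(dest)
    cmd = archive_command(vcs, repo_url, dest)
    if vcs == "svn":
        with open(dest, "wb") as out:
            subprocess.run(cmd, stdout=out, check=True)
    else:
        subprocess.run(cmd, check=True)
```

Dumping SVN with svnrdump rather than checking out a working copy keeps every revision, which matches the recommendation above; the Git and Mercurial clones already carry full history.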

Archiving issue trackers and the other project content, such as wikis and downloads, will be a bit harder.

A tool to export a repository to GitHub is available.[3] Once a repository has been migrated to GitHub, the project is no longer accessible on Google Code.

ArchiveTeam started to save Google Code on December 18, 2015, as a Warrior project.

After the closure, Google left a public archive, but it is missing some of the original information.[4] Although the original content was hidden from the public, ArchiveTeam obtained access and continued saving it, so that the Wayback Machine can receive a full copy.[5][6]

URL lists

Some seeds for site discovery:

Tools

Archives

Google Code archives are (being) uploaded to https://archive.org/details/archiveteam_googlecode, in WARC format.

"The Google Code Archive (https://code.google.com/archive/) contains the data found on the Google Code Project Hosting Service, which will be turned down in early 2016. This archive contains over 1.4 million projects, 1.5 million downloads, and 12.6 million issues."

References

External links