Difference between revisions of "Gna!"

(gna_tickets has apparently been marked as spam)
(21 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Infobox project
{{Infobox project
| title = Gna!
| title = Gna!
| URL = http://www.gna.org
| URL = https://gna.org/
| project_status = {{closing}}
| project_status = {{offline}}
| archiving_status = {{inprogress}}
| archiving_status = {{partiallysaved}}
| irc = gnarm
| irc = gnarm
| image = Gna.org screenshot 20170225.png
| image = Gna.org screenshot 20170225.png
}}
}}


'''Gna!''' is a centralized location where software developers can develop, distribute and maintain free (GPL-compatible) software. It is an instance of the [http://savannah.nongnu.org/p/savane-cleanup Savane] code-hosting platform<ref>https://gna.org/cookbook/?func=detailitem&item_id=105</ref>.
'''Gna!''' was a centralized location where software developers could develop, distribute and maintain free (GPL-compatible) software. It was an instance of the [http://savannah.nongnu.org/p/savane-cleanup Savane] code-hosting platform<ref>https://gna.org/cookbook/?func=detailitem&item_id=105</ref>. It hosted popular free software projects such as Battle for Wesnoth<ref>{{url|https://gna.org/projects/wesnoth/}}</ref> and Freeciv<ref>{{url|https://gna.org/projects/freeciv/}}</ref> ([https://gna.org/search/?type_of_search=soft&words=%%%&max_rows=2000 full list]).


== Hosted data ==
It shut down due to lack of admin effort; the pending shutdown was first announced in Nov 2016, and it finally shut down on May 24 2017.
 
As of 2017-02 it claimed to have 1458 hosted projects. (Many are probably abandoned and will not be saved by their project admins before shutdown.)
 
* '''Code hosting''' using CVS, Subversion, and Arch
** All subversion repos available via anonymous rsync: rsync://svn.gna.org/svn/ (ref: bottom of every project's svn page e.g. [https://gna.org/svn/?group=admin])
** Ditto CVS, it looks like: rsync://svn.gna.org/cvs/
** Arch/tla [https://gna.org/cookbook/?func=detailitem&item_id=101]: rsync://download.gna.org/arch/
** There's also a ViewVC web front-end to browse code.
* '''Ticket tracking'''
** Up to 4 trackers per project: 'bugs', 'patch', 'task', 'support'
** Project admins (only) can set up XML export of their own ticket text/metadata ("Export" item on tracker admin menu).
*** Only option for third parties looks like web scraping.
** There's no supported interface for grabbing issue attachments (such as patches) even for project admins though.
*** Attached files are allocated global increasing integer IDs, e.g. [https://gna.org/bugs/download.php?file_id=29845 file #29845]. It's probably possible to scrape all public files by varying 'bugs' and the ID.
** Individual tickets can be private. (Maybe files too?)
* '''File hosting''' at http://download.gna.org/
** Anonymous rsync available at rsync://download.gna.org/download/
* '''Project websites''' on home.gna.org
** e.g. http://home.gna.org/freeciv/
** These are managed via Subversion [https://gna.org/cookbook/?func=detailitem&item_id=107], so grabbing svn by rsync as above should also save website data + history
* '''Mailing lists''' using [[Mailman]]
** Which means public archives are available in mbox format (albeit with email addresses mangled). e.g. [https://mail.gna.org/public/freeciv-announce/]
** Some mailing lists are private.
* '''Project metadata''': groups, users, news, help topics etc. In a database and probably only available via web scraping.
* '''Usage stats''' at http://stats.gna.org/
 
== Shutdown Notice ==
 
* A notice of pending shutdown / request for takeover was first announced in Nov 2016<ref>https://mail.gna.org/public/project/2016-11/msg00001.html</ref> suggesting a time frame of six months
* A news item about shutdown was posted to the front page 2017-01-31 linking to the above.
* Exact shutdown timeline is currently (2017-02) unclear.
 
== rsync grab sign-up ==


This gets code and file hosting but not other stuff. <180 Gibyte, all in.
== Where's the stuff then? ==


Please choose --bwlimit wisely (5M?)
Here are links to all the known saved material:


{| border=1 cellspacing=1
{| class="wikitable"
! What
! What
! Size
! [https://archive.org/details/@m_kram mkram], 2017-02
! No files
! [https://archive.org/details/@zeryl Zeryl], 2017-05
! Who/when
! [https://archive.org/details/@j_t_n JTN], 2017-05
! Uploaded to
|-
|-
| rsync://svn.gna.org/svn/
| Code hosting, project files, project websites
| ~41 Gibyte
| {{IA id|Gna_code_hosting}} as of 2017-02-25 ([[Gna!/code and downloads#Upload schedule|more details]])
| ~1m
| ?
| PurpleSym 2017-02-25
| -
|  
|-
|-
| rsync://svn.gna.org/cvs/
| Mailing list archives
| ~7.5 Gibyte
| -
| ~200k
| {{IA id|mail.gna.org_2017-05-04}} (ZIP)<br>{{IA id|gna.org_2017-05-08_html_mailman_archive}} (WARC)
| mkram 2017-02-25
| -
|  
|-
|-
| rsync://download.gna.org/arch/
| Ticket trackers
| ~318 Mibyte
| -
| ~71k
| {{IA id|gna_tickets}}<br><font color="red">Inaccessible because [https://catalogd.archive.org/log/672398126 marked as spam]!</font>
| mkram 2017-02-25 (except admindir)
| -
|
|-
|-
| rsync://download.gna.org/download/
| Ticket attachments
| ~116 Gibyte
| -
| ~130k
| ?
| mkram 2017-02-25
| FIXME: as of 2017-05-24; not uploaded yet
|
|-
| rsync://download.gna.org/www/
| ~6.4 Gibyte
| ~177k
| mkram 2017-02-25 (except "some authentication folder and .bashhistory")
|
|}
|}


== Detailed quantities ==
Zeryl may also have arranged that some of their material (including some not shown here?) was ingested into the [[Wayback Machine]].


From 2017-02-25 (just counting, files not transferred):
On IRC, Zeryl said at one point "I've got: bugs, rsync items, support tickets, tasks, tickets, the warc of the mail site, mailman mboxes, mailman by thread, mailman by date. 213gb, heh". JTN doesn't know where their copy of 'rsync items' went.


=== Code hosting ===
== Hosted data ==


SVN:
As of 2017-02 it claimed to have 1458 hosted projects. (Many are probably abandoned and will not be saved by their project admins before shutdown.)


$ rsync --bwlimit 100K --stats --dry-run -avHS rsync://svn.gna.org/svn/ stoat/
Here's a breakdown of the kinds of data stored and what various people can do to grab the data:
...
* '''<font color=red>Third party</font>''' describes what random anonymous Internet people (e.g., Archive Team) can do
Number of files: 979,943 (reg: 961,966, dir: 17,976, link: 1)
** '''<font color=green>Done</font>''' shows bits that we have already rescued
Number of created files: 979,943 (reg: 961,966, dir: 17,976, link: 1)
** '''<span style="background-color: yellow">Not done</span>''' shows bits that aren't known to have been saved
Number of deleted files: 0
* '''<font color=blue>Members</font>''' describes things that only members of the relevant project can do (if better)
Number of regular files transferred: 961,966
Total file size: 43,448,178,925 bytes
Total transferred file size: 43,448,178,924 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 12,891,196
File list generation time: 1.371 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 2,961,672
Total bytes received: 18,378,594


CVS:
Data stored:
* '''Code hosting''' using CVS, Subversion, and Arch  ('''<font color=green>done</font>''' 2017-02-25, not updated since; see [[Gna!/code and downloads|subpage]])
** '''<font color=red>Third parties</font>''' can grab all code with full history:
*** All subversion repos available via (insecure) anonymous rsync: rsync://svn.gna.org/svn/ (ref: bottom of every project's svn page e.g. [https://gna.org/svn/?group=admin]). (In [http://svnbook.red-bean.com/en/1.7/svn.reposadmin.planning.html#svn.reposadmin.basics.backends.fsfs FSFS] format, which is supposed to be portable.)
**** '''<font color=blue>Gna members</font>''' can get the same data with integrity protection over SSH (for any svn repository), but must use [http://svnbook.red-bean.com/en/1.7/svn.reposadmin.maint.html#svn.reposadmin.maint.migrate.svnrdump svnrdump]; this supposedly creates a faithful copy of the important stuff, but is lightly munged.
*** Ditto CVS, it looks like: rsync://svn.gna.org/cvs/
**** '''<font color=blue>Gna members</font>''' can get the same data over SSH (for any project), but must use CVS commands. Don't know if there's a standard tool for reconstructing server-side repo state.
*** Arch/tla [https://gna.org/cookbook/?func=detailitem&item_id=101]: rsync://download.gna.org/arch/
**** '''<font color=blue>Gna members</font>''' can get the same data securely over sftp (for any project)
** There's also a ViewVC web front-end to browse SVN/CVS code. (No point grabbing this if you've got the above)
* '''Ticket tracking''' ('''<font color=green>done</font>''' 2017-05-04 by [[User:Zeryl04|Zeryl04]]: {{IA id|gna_tickets}})
** Up to 4 trackers per project: 'bugs', 'patch', 'task', 'support'
** '''<font color=blue>Gna members</font>''' (only) can set up XML export of their own ticket text/metadata ("Export" item on tracker admin menu).
*** Only option for '''<font color=red>third parties</font>''' looks like web scraping. (Someone pointed [[ArchiveBot]] at it but [http://archive.fart.website/archivebot/viewer/?q=gna.org it doesn't seem to have grabbed much])
*** Exported XML is published to an unauthenticated URL of the form <nowiki>https://</nowiki>gna.org/export/''project''/''user''/''number''.xml . ''number'' might be global; a recent export had number 66. In principle this namespace could be mined by third parties although it's a rather large search space (1458 projects * 9116 users * 66 numbers) and would only catch recent or periodic exports, since they are cleared out quickly.
** There's no supported interface for grabbing issue attachments (such as patches) even for project admins though.
*** '''<font color=red>Third parties</font>''' can scrape attachments by relying on their increasing integer IDs, e.g. [https://gna.org/bugs/download.php?file_id=29845 file #29845]. It looks like you don't have to get the 'bugs' bit correct, so it's possible to scrape all public files by varying the ID. ('''<font color=green>done</font>''' to 2017-05-24 by [[User:JTN|JTN]], not uploaded anywhere yet)
** Individual tickets can be private. (Maybe files too?) But the XML export includes private tickets (yes, to an unauthenticated URL).
* '''File hosting''' at http://download.gna.org/  ('''<font color=green>done</font>''' 2017-02-25, not updated since; see [[Gna!/code_and_downloads|subpage]])
** '''<font color=red>Third parties</font>''' can do (insecure) anonymous rsync from rsync://download.gna.org/download/
** '''<font color=blue>Gna members</font>''' can get the same data (for any project) securely with rsync-over-SSH (<tt>rsync -avz ''user''@download.gna.org:/var/ftp/ dest/</tt>), or with sftp
* '''Project websites''' on home.gna.org ('''<font color=green>done</font>''' 2017-02-25, not updated since; see [[Gna!/code_and_downloads|subpage]])
** e.g. http://home.gna.org/freeciv/
** These are managed via Subversion [https://gna.org/cookbook/?func=detailitem&item_id=107], so grabbing svn by rsync as above should also save website data + history
* '''Mailing lists''' using [[Mailman]] ('''<font color=green>done</font>''' 2017-05-04 by [[User:Zeryl04|Zeryl04]] using [https://github.com/zeryl/gan_mailbox this code]; got public HTML+mbox, uploaded as {{IA id|mail.gna.org_2017-05-04|zip}}; also 2017-05-08 {{IA id|gna.org_2017-05-08_html_mailman_archive|WARC}}. [http://archive.fart.website/archivebot/viewer/domain/mail.gna.org ArchiveBot also has something], not sure what.)
** Which means public archives are available to '''<font color=red>third parties</font>''' in mbox format (albeit with email addresses mangled). e.g. [https://mail.gna.org/public/freeciv-announce/]
*** Note, the most recent mbox link on inactive lists (e.g., [https://mail.gna.org/public/freeciv-warclient-commits/]) is broken; replace "2014-09.partial.mbox.gz" with "2014-09.mbox.gz" to fix it
*** It may be worth grabbing the HTML archives too, as they contain some info not available in the mboxes, e.g. "X-From-R13" in HTML comments contains reversibly obfuscated From address
** Some mailing lists are private. Even '''<font color=blue>project admins</font>''' can't see the archives at the moment ([https://gna.org/support/?3421 sr 3421]).
* '''Project metadata''': groups, users, news, help topics etc. In a database and probably only available via web scraping. '''<span style="background-color: yellow">Not done</span>'''
* '''Usage stats''' at http://stats.gna.org/ '''<span style="background-color: yellow">Not done</span>'''


$ rsync --bwlimit 100K --stats --dry-run -avHS rsync://svn.gna.org/cvs/ stoat/
Gna admins have so far not responded to requests for help from at least some project members wishing to migrate or rescue their data, presumably because of the same lack of admin effort that is causing the site to shut down. They haven't been approached about Archive Team style bulk backup (or at least [[User:JTN|JTN]] has not done so).
...
Number of files: 191,592 (reg: 169,489, dir: 22,101, link: 2)
Number of created files: 191,592 (reg: 169,489, dir: 22,101, link: 2)
Number of deleted files: 0
Number of regular files transferred: 169,489
Total file size: 8,032,418,393 bytes
Total transferred file size: 8,032,418,367 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 1,912,404
File list generation time: 0.338 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 600,574
Total bytes received: 5,028,112


Arch:
== Shutdown Notice ==
 
$ rsync --bwlimit 100K --stats --dry-run -avHS rsync://download.gna.org/arch/ stoat/
...
Number of files: 91,447 (reg: 74,294, dir: 17,152, link: 1)
Number of created files: 91,447 (reg: 74,294, dir: 17,152, link: 1)
Number of deleted files: 0
Number of regular files transferred: 74,294
Total file size: 332,910,762 bytes
Total transferred file size: 332,910,749 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 1,771,746
File list generation time: 0.058 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 294,204
Total bytes received: 3,546,901
 
=== File hosting ===


Download area:
* A notice of pending shutdown / request for takeover was first announced in Nov 2016<ref>{{url|https://mail.gna.org/public/project/2016-11/msg00001.html}}</ref> suggesting a time frame of six months
* A news item<ref>https://gna.org/forum/forum.php?forum_id=2545</ref> about shutdown was posted to the front page 2017-01-31 linking to the above. A reply to that on 4 Feb suggests shutdown will happen "within 3 months, or when the hardware dies".
* This suggests shutdown by around the beginning of May 2017.
* As of early May 2017, it was still up, although its SSL certificate had been allowed to lapse.


$ rsync --bwlimit 100K --stats --dry-run -avHS rsync://download.gna.org/download/ stoat/
== Shutdown ==
...
Number of files: 129,722 (reg: 117,260, dir: 8,461, link: 4,001)
Number of created files: 129,722 (reg: 117,260, dir: 8,461, link: 4,001)
Number of deleted files: 0
Number of regular files transferred: 107,011
Total file size: 124,814,979,354 bytes
Total transferred file size: 118,372,817,722 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 3,965,140
File list generation time: 0.378 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 1,302,870
Total bytes received: 5,977,965


www (including svn working files):
gna.org stopped responding on 24 May 2017. This was unannounced, but later that day a Gna admin confirmed on #gna IRC that the shutdown was deliberate:


  $ rsync --bwlimit 100K --stats --dry-run -avHS rsync://download.gna.org/www/ stoat/
  20:53 < jtn> gna.org has stopped responding. I guess this is it. Thanks for
...
              everything.
  Number of files: 176,823 (reg: 150,915, dir: 25,887, link: 21)
  21:50 < zerodeux> yes, it's been shut down for good
  Number of created files: 176,823 (reg: 150,915, dir: 25,887, link: 21)
  21:50 < zerodeux> some traces now left on
Number of deleted files: 0
      http://www.archiveteam.org/index.php?title=Gna!
Number of regular files transferred: 150,915
Total file size: 6,840,609,373 bytes
Total transferred file size: 6,840,609,074 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 6,450,464
File list generation time: 0.083 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 559,763
Total bytes received: 9,045,749


== References ==
== References ==


<references />
<references />

Revision as of 15:36, 2 July 2017

Gna!
Gna.org screenshot 20170225.png
URL https://gna.org/
Status Offline
Archiving status Partially saved
Archiving type Unknown
IRC channel #gnarm (on hackint)

Gna! was a centralized location where software developers could develop, distribute and maintain free (GPL-compatible) software. It was an instance of the Savane code-hosting platform[1]. It hosted popular free software projects such as Battle for Wesnoth[2] and Freeciv[3] (full list).

It shut down due to lack of admin effort; the pending shutdown was first announced in Nov 2016, and it finally shut down on May 24 2017.

Where's the stuff then?

Here are links to all the known saved material:

{| class="wikitable"
! What
! mkram, 2017-02
! Zeryl, 2017-05
! JTN, 2017-05
|-
| Code hosting, project files, project websites
| Gna_code_hosting as of 2017-02-25 (more details)
| ?
| -
|-
| Mailing list archives
| -
| mail.gna.org_2017-05-04 (ZIP)<br>gna.org_2017-05-08_html_mailman_archive (WARC)
| -
|-
| Ticket trackers
| -
| gna_tickets<br>Inaccessible because marked as spam!
| -
|-
| Ticket attachments
| -
| ?
| FIXME: as of 2017-05-24; not uploaded yet
|}

Zeryl may also have arranged that some of their material (including some not shown here?) was ingested into the Wayback Machine.

On IRC, Zeryl said at one point "I've got: bugs, rsync items, support tickets, tasks, tickets, the warc of the mail site, mailman mboxes, mailman by thread, mailman by date. 213gb, heh". JTN doesn't know where their copy of 'rsync items' went.

Hosted data

As of 2017-02 it claimed to have 1458 hosted projects. (Many are probably abandoned and will not be saved by their project admins before shutdown.)

Here's a breakdown of the kinds of data stored and what various people can do to grab the data:

  • Third party describes what random anonymous Internet people (e.g., Archive Team) can do
    • Done shows bits that we have already rescued
    • Not done shows bits that aren't known to have been saved
  • Members describes things that only members of the relevant project can do (if better)

Data stored:

  • Code hosting using CVS, Subversion, and Arch (done 2017-02-25, not updated since; see subpage)
    • Third parties can grab all code with full history:
      • All subversion repos available via (insecure) anonymous rsync: rsync://svn.gna.org/svn/ (ref: bottom of every project's svn page e.g. [1]). (In FSFS format, which is supposed to be portable.)
        • Gna members can get the same data with integrity protection over SSH (for any svn repository), but must use svnrdump; this supposedly creates a faithful copy of the important stuff, but is lightly munged.
      • Ditto CVS, it looks like: rsync://svn.gna.org/cvs/
        • Gna members can get the same data over SSH (for any project), but must use CVS commands. Don't know if there's a standard tool for reconstructing server-side repo state.
      • Arch/tla [2]: rsync://download.gna.org/arch/
        • Gna members can get the same data securely over sftp (for any project)
    • There's also a ViewVC web front-end to browse SVN/CVS code. (No point grabbing this if you've got the above)
  • Ticket tracking (done 2017-05-04 by Zeryl04: gna_tickets)
    • Up to 4 trackers per project: 'bugs', 'patch', 'task', 'support'
    • Gna members (only) can set up XML export of their own ticket text/metadata ("Export" item on tracker admin menu).
      • Only option for third parties looks like web scraping. (Someone pointed ArchiveBot at it but it doesn't seem to have grabbed much)
      • Exported XML is published to an unauthenticated URL of the form https://gna.org/export/project/user/number.xml . number might be global; a recent export had number 66. In principle this namespace could be mined by third parties although it's a rather large search space (1458 projects * 9116 users * 66 numbers) and would only catch recent or periodic exports, since they are cleared out quickly.
    • There's no supported interface for grabbing issue attachments (such as patches) even for project admins though.
      • Third parties can scrape attachments by relying on their increasing integer IDs, e.g. file #29845. It looks like you don't have to get the 'bugs' bit correct, so it's possible to scrape all public files by varying the ID. (done to 2017-05-24 by JTN, not uploaded anywhere yet)
    • Individual tickets can be private. (Maybe files too?) But the XML export includes private tickets (yes, to an unauthenticated URL).
  • File hosting at http://download.gna.org/ (done 2017-02-25, not updated since; see subpage)
    • Third parties can do (insecure) anonymous rsync from rsync://download.gna.org/download/
    • Gna members can get the same data (for any project) securely with rsync-over-SSH (rsync -avz user@download.gna.org:/var/ftp/ dest/), or with sftp
  • Project websites on home.gna.org (done 2017-02-25, not updated since; see subpage)
  • Mailing lists using Mailman (done 2017-05-04 by Zeryl04 using this code; got public HTML+mbox, uploaded as mail.gna.org_2017-05-04; also 2017-05-08 gna.org_2017-05-08_html_mailman_archive. ArchiveBot also has something, not sure what.)
    • Which means public archives are available to third parties in mbox format (albeit with email addresses mangled). e.g. [4]
      • Note, the most recent mbox link on inactive lists (e.g., [5]) is broken; replace "2014-09.partial.mbox.gz" with "2014-09.mbox.gz" to fix it
      • It may be worth grabbing the HTML archives too, as they contain some info not available in the mboxes, e.g. "X-From-R13" in HTML comments contains reversibly obfuscated From address
    • Some mailing lists are private. Even project admins can't see the archives at the moment (sr 3421).
  • Project metadata: groups, users, news, help topics etc. In a database and probably only available via web scraping. Not done
  • Usage stats at http://stats.gna.org/ Not done
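
To put numbers on the ticket-tracker notes above, here is a rough sketch (not the code anyone actually used for the grab). It works out how big the export-URL search space is from the figures quoted, and generates attachment download URLs by counting up the integer ID. The function names are made up for illustration, and the default tracker of 'bugs' just follows the example link; nothing here fetches anything, since the site is gone.

```python
from typing import Iterator

# Figures quoted in the list above.
PROJECTS = 1458
USERS = 9116
EXPORT_NUMBERS = 66


def export_search_space(projects: int = PROJECTS, users: int = USERS,
                        numbers: int = EXPORT_NUMBERS) -> int:
    """Count candidate export URLs if every project/user/number combination is tried."""
    return projects * users * numbers


def attachment_urls(first_id: int, last_id: int,
                    tracker: str = "bugs") -> Iterator[str]:
    """Yield download URLs for attachment IDs first_id..last_id inclusive.

    The tracker name apparently does not have to match the attachment,
    so a single fixed value is used for every ID.
    """
    for file_id in range(first_id, last_id + 1):
        yield f"https://gna.org/{tracker}/download.php?file_id={file_id}"


if __name__ == "__main__":
    print(f"export URL candidates: {export_search_space():,}")
    for url in attachment_urls(29843, 29845):
        print(url)
```

The attachment ID space is tiny next to the export URL space, which is why scraping attachments by ID was practical and mining the export namespace wasn't really.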

Gna admins have so far not responded to requests for help from at least some project members wishing to migrate or rescue their data, presumably because of the same lack of admin effort that is causing the site to shut down. They haven't been approached about Archive Team style bulk backup (or at least JTN has not done so).
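
For the mailing-list notes in the list above, a small sketch of the two fix-ups mentioned: repairing the broken ".partial.mbox.gz" link, and undoing the "X-From-R13" obfuscation. This assumes "R13" means ROT13, which the name suggests but which hasn't been checked against the real archives. Both function names and the sample address are hypothetical.

```python
import codecs

PARTIAL_SUFFIX = ".partial.mbox.gz"
FIXED_SUFFIX = ".mbox.gz"


def fix_partial_mbox_link(url: str) -> str:
    """Rewrite a broken '...partial.mbox.gz' link to the working '.mbox.gz' form."""
    if url.endswith(PARTIAL_SUFFIX):
        return url[: -len(PARTIAL_SUFFIX)] + FIXED_SUFFIX
    return url


def decode_r13(value: str) -> str:
    """Decode an X-From-R13 value, ASSUMING it is plain ROT13.

    ROT13 only rotates ASCII letters, so '@' and '.' pass through unchanged.
    """
    return codecs.decode(value, "rot_13")


if __name__ == "__main__":
    print(fix_partial_mbox_link("2014-09.partial.mbox.gz"))
    print(decode_r13("hfre@rknzcyr.bet"))
```

If the obfuscation turns out to be something other than ROT13, only decode_r13 needs changing; the link fix is a plain suffix swap either way.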

Shutdown Notice

  • A notice of pending shutdown / request for takeover was first announced in Nov 2016[4] suggesting a time frame of six months
  • A news item[5] about shutdown was posted to the front page 2017-01-31 linking to the above. A reply to that on 4 Feb suggests shutdown will happen "within 3 months, or when the hardware dies".
  • This suggests shutdown by around the beginning of May 2017.
  • As of early May 2017, it was still up, although its SSL certificate had been allowed to lapse.

Shutdown

gna.org stopped responding on 24 May 2017. This was unannounced, but later that day a Gna admin confirmed on #gna IRC that the shutdown was deliberate:

20:53 < jtn> gna.org has stopped responding. I guess this is it. Thanks for 
             everything.
21:50 < zerodeux> yes, it's been shut down for good
21:50 < zerodeux> some traces now left on
      http://www.archiveteam.org/index.php?title=Gna!

References