Difference between revisions of "User:Bzc6p"

From Archiveteam
(what about my dreams)
(added HU websites navigation box)
(64 intermediate revisions by the same user not shown)
== What I'm trying to look smart with ==
<DIV style="background-color:yellow; padding: 5px; width:80%; margin-bottom:20px; margin-left:auto; margin-right:auto; text-align:center"><P style="font-size:x-large">'''<span class="plainlinks">[https://archiveteam.hu archiveteam.hu]</span> has launched!'''</P><P>Information in Hungarian about ArchiveTeam's activities and the fate of Hungarian websites!</P><P style="font-size:x-large">'''The <span class="plainlinks">[https://archiveteam.hu/rszi RSZI]</span> Hungarian web archive has launched!'''</P><P>'''Nearly 2.5 million images''' from 3 image hosting services, '''unavailable even from the Wayback Machine''', are accessible again!</P><P style="font-size:x-large">'''The <span class="plainlinks">[https://archiveteam.hu/lecsu Lecsű]</span> video archiving service has launched!'''</P><P>Help save valuable YouTube videos from disappearing!</P></DIV>
::''The era of the web when a single'' <code>wget -r -p</code> ''command could mirror a website in its entirety is far behind us. Nowadays each and every website has its own soul, its own hideous JavaScript-linked content, not to mention the various file formats and ways of embedding content. Thus, if one is serious about web archiving, each and every website must be explored carefully, often painstakingly, and only rarely can this be automated.''


== What I probably shouldn't have archived ==
{{Hungarian websites}}
<center><div style="text-align:center; width:90%; border: 2px solid red; font-size:120%;">
<font color="red">'''Általam feltöltött tartalom eltávolításával kapcsolatos kéréseket a <code>vichra</code><code>timot</code><code>@euro</code><code>mail.</code><code>hu</code> címre kell küldeni.'''</font>


'''Requests for removal of content uploaded by me should be sent to <code>vichra</code><code>timot</code><code>@euro</code><code>mail.</code><code>hu</code>.'''
'''bzc6p''' is a [http://en.wikipedia.org/wiki/Hungary Hungarian] amateur archivist who joined the efforts of ArchiveTeam. "Specialized" in watching and saving [[Template:Hungarian websites|Hungarian websites]].
</div></center>


== Who I am ==
<div style="margin-top:12pt; font-size:120%;">
[http://en.wikipedia.org/wiki/Hungary Hungarian] amateur who joined the efforts of ArchiveTeam. "Specialized" in watching and saving Hungarian websites.
'''Contact: <code>vichratimot (at) euromail (dot) hu</code>'''
</div>


== What I've done ==
I haven't been doing much that is spectacular recently, but I still run my long-running projects in my free time, of which I now have much less. You can still reach me on my [[User_talk:bzc6p|talk page]] or via email if necessary.
[[File:Keep_calm_and_hate_javascript.png|thumb|This user is on a tin can connected to a windmill,<ref>http://archive.fart.website/bin/irclogger_log/archiveteam?date=2015-07-05,Sun&sel=163#l159</ref> likes simplicity, likes archiving websites, therefore '''hates Javascript''' being used to just show components of websites.]]
*What I've saved or am saving:
** stuff listed with my name on [[Small projects]]
** tiny projects:
*** [[Demotivalo.net]] – saved in 2014 – status: endangered – archives: see [[Demotivalo.net|there]]
*** [http://netszar.com netszar.com] – saved in June 2015 – status: endangered – archive: [https://archive.org/details/netszar_com_2015_06]
*** Kajászószentpéter – saved in June 2015 – closed on 2015-07-07 – archives: [https://archive.org/details/kajaszoszentpeter_hu_2015_06], [https://archive.org/details/kajaszoszentpeter_photos], [https://archive.org/details/kajaszoszentpeter_videos]
*[[Template:Hungarian websites|What I'm watching]]
*[[:Category:Hungarian websites|What I've written about]]


Also saved [http://wikiapiary.com/wiki/Special:Contributions/bzc6p a few wikis] in the beginning.
See [https://archive.org/details/@bzc6p what I'm archiving].


I sometimes also take part in [[Warrior]] projects, in those rare cases when the tracker limit is not already saturated.


== What I recommend ==
== My projects ==
*Chfoo's [http://github.com/chfoo/wpull Wpull] for saving to [[WARC]]
Websites that I've archived, that I'm archiving, or whose archival I've taken part in organizing, in reverse chronological order within each category. If the website has an entry on this wiki, consult that page for the archives. If not, a link to the archives should be found on the appropriate line.
* Ikreymer's [http://webrecorder.io webrecorder.io] when things are too difficult
** wget still lacks some handy features wpull already has
** No, I don't prefer [[ArchiveBot]]: most websites can't be saved automatically, and one can't really fine-tune a specific ArchiveBot job. It can be useful and powerful, but it's still <s>quite dull</s> under development. '''Use with caution.'''
*Alard's [https://github.com/alard/warc-proxy warc-proxy] or Ikreymer's [https://github.com/ikreymer/webarchiveplayer webarchiveplayer] for testing WARCs
*Kngenie's [https://github.com/kngenie/ias3upload ias3upload] for uploading to [[Internet Archive|IA]]
**No, I tried but didn't like the [https://pypi.python.org/pypi/internetarchive internetarchive] tool.


== My dreams ==
=== Large websites ===
=== Short term (may be realized one day) ===
* [[network.hu]]
I'd like to create a small Hungarian Internet Archive, collecting and presenting archives of websites saved by me or others, just as the [[Internet Archive]] and the Wayback Machine do. (Don't think of a large scale; I'd start with a few second-hand 1 TB hard disks connected to a home server.) I'd also restore dead websites at their original location (domain) if possible, so that rotten links could come back to life.
* [[TVN.hu]]
* [[Indafotó]]
* [[myVIP]]
* [[kepfeltoltes.hu]]
* [[kephost.com]]
* Selected [[YouTube]] videos, as part of [[News+C/hu|News+C]] project


This could also act as a mirror of some websites uploaded to IA.
=== Medium-sized websites ===
* [[videok.hu]]
* [[ingyenweb.hu]]
* [[kepfeltoltes.eu]] ({{blue|continuous}})
* [[Szuperinfó]]
* [https://444.hu 444.hu], as part of [[News+C/hu]] ({{blue|continuous}})
* [https://hvg.hu hvg.hu], as part of [[News+C/hu]] ({{blue|continuous}})
* [https://kuruc.info kuruc.info], as part of [[News+C/hu]] ({{blue|continuous}})
* [[pics.coldline.hu]]
* [http://nol.hu nol.hu] ([https://archive.org/details/nol_hu_2017 archive])
* [[Wikispot]]
* [[PSharing]]
* [[keptarad.hu]]
* [[kepkezelo.com]] ({{blue|continuous}})
* [[noob.hu]]
* [[GTF Képhost]] ({{blue|continuous}})


=== Long term (unlikely ever to be realized) ===
=== Small websites ===
Recording the complete programming (0–24) of the main Hungarian radio and television channels, and giving the public restricted access to these archives. (There is [http://nava.hu/what-is-nava/ NAVA], but it doesn't record everything and is often a bit difficult to access.)


== What I've found in Hungarian about ArchiveTeam ==
* [http://dagalyfurdo.hu Dagály Fürdő] ([http://archive.org/details/dagalyfurdo_hu_20161010 archive])
*Sándor Berta: ''[http://sg.hu/cikkek/67175/archivaljak-a-geocities-tartalmakat Archiválják a GeoCities-tartalmakat]'' (''They archive GeoCities' contents''). sg.hu, 2009-05-04.
* [http://wirtschaftsblatt.at WirtschaftsBlatt] ([http://archive.org/details/wirtschaftsblatt_at_articles archive])
*Ádám Szedlák: ''[http://www.origo.hu/techbazis/internet/20090513-geocities-freeweb-archivalokra-varnak-az-ingyenes-tarhelyek.html Megmentik az őshonlapokat]'' (''They are saving the ancient websites''). origo.hu, 2009-05-13. (About [[Geocities]].)
* [http://balassiintezet.hu Balassi Intézet] ([http://archive.org/details/balassiintezet_hu_20160829 archive])
*Ádám Szedlák: ''[http://www.origo.hu/techbazis/internet/20091102-geocities-uj-otthont-kaptak-az-oshonlapok.html Új otthont kaptak az őshonlapok]'' (''The ancient websites got a new home''). origo.hu, 2009-11-02. (About Geocities.)
* [http://hi.co hi.co] ([http://archive.org/details/hi_co_20160829 archive])
*''[http://index.hu/tech/2013/10/21/lementik_a_legnagyobb_torrentkeresot/ Lementik a legnagyobb torrentkeresőt]'' (''They download the biggest torrent search site''). index.hu, 2013-10-21. (About saving [[IsoHunt]].)
* [http://ketezer.hu 2000] ([http://archive.org/details/ketezer_hu_20160825 archive])
*''[http://hvg.hu/tudomany/20131030_Meglepo_hir_a_filmletoltoknek_feleled_a_m Mit szóltok filmletöltők? Két héttel a bezárása után ismét működik a népszerű torrentoldal]'' (''What do you say, movie leechers? Two weeks after its closure popular torrent site runs again''). hvg.hu, 2013-10-30. (About IsoHunt restoration.)
* [http://mozaiktv.rs Mozaik TV] ([http://archive.org/details/mozaiktv_rs_20160827 archive])
*Dániel Dojcsák: ''[http://www.hwsw.hu/hirek/51400/blip-videomegoszto-online-tartalom-premium-torles-arhivum.html Elpusztulhat a nem profitképes online tartalom]'' (''Non-profitable online content may vanish''). hwsw.hu, 2013-12-03. (Mentions ArchiveTeam saving [[Blip.tv|Blip]] videos.)
* [http://precedensnyelvstudio.hu Precedens Nyelvstúdió] ([http://archive.org/details/precedensnyelvstudio_hu_20160827 archive])
*Péter Szűcs: ''[http://itcafe.hu/hir/az_internet_nem_felejt.html Az internet nem felejt]'' (''The internet doesn't forget''). itcafe.hu, 2015-03-05. (About ArchiveTeam's activity in general.)
* [http://alomauto.eu Álomautó Múzeum] ([http://archive.org/details/alomauto_eu_20160827 archive])
* [[nolblog.hu]]
* [[TVN.hu#tudjatok.hu|tudjatok.hu]]
* [http://kszz.profitarhely.hu Kecskeméti Szimfonikus Zenekar] ([https://archive.org/details/kszz_profitarhely_hu_20160128 archive])
* [http://melecafe.com Mele Café] ([https://archive.org/details/melecafe_com_20160128 archive])
* [http://cafealibi.hu Café Alibi] ([https://archive.org/details/cafealibi_hu_20160126 archive])
* [http://efmk.hu Kecskeméti Kulturális és Konferencia Központ] ([https://archive.org/details/efmk_hu_20160107 archive])
* [http://szeplakierzsebet.hu Széplaki Erzsébet] ([https://archive.org/details/szeplakierzsebet_hu_20160111 archive])
* [http://freddyfitness.hu Freddy Fitness] ([https://archive.org/details/freddyfitness_hu_20160106 archive])
* [http://legalja.hu legalja.hu] ([https://archive.org/details/legalja_hu_20151227 archive])
* Astra Insurance: [http://www.astrasig.ro Romania] ([http://archive.org/details/astrasig_ro_20151128 archive]), [http://www.astrabiztosito.hu Hungary] ([http://archive.org/details/astrabiztosito_hu_20151126 archive])
* [http://kajaszoszentpeter.hu Kajászószentpéter] (archives: [https://archive.org/details/kajaszoszentpeter_hu_2015_06 website], [https://archive.org/details/kajaszoszentpeter_photos photos], [https://archive.org/details/kajaszoszentpeter_videos videos])
* [http://netszar.com netszar.com] ([https://archive.org/details/netszar_com_2015_06 archive])
* Hungarian [[Volán]] websites
* [[Demotivalo.net]]
* [http://wikiapiary.com/wiki/Special:Contributions/bzc6p A few wikis] in the beginning.
 
=== Non-web stuff ===
I'm also archiving some Hungarian TV and radio programmes, magazines and shop flyers.
 
== To-do list ==
Hungarian websites that should be saved in the near future. I don't reserve them as ''my'' projects, as I fear I won't have time for them soon.
 
# [[blogger.hu]]
 
== Philosophy ==
My experience with my few website archiving endeavours so far suggests that very few websites today can be mirrored completely by automated means, without human control and intervention. Thus, making a quality archive of even a small website takes some attention, often additional work, or several supplemental runs of archiving tools.
 
These archiving tools ([https://www.gnu.org/software/wget/manual/wget.html wget], [http://github.com/chfoo/wpull wpull], [[ArchiveBot]] etc.) are very important and useful, but in most cases they cannot make complete archives on their own. My philosophy is that once we set out to archive a website, we should make the archive as complete and as high-quality as possible, so we cannot rely solely on these tools. Of course, constrained by time and resources, we must compromise. Otherwise, however, the above applies. At least for me. This is how I archive.
 
== Recommended tools ==
 
=== Saving to [[WARC]] ===
* Chfoo's [http://github.com/chfoo/wpull Wpull]: a good alternative to wget, still being developed, with good archiving support
* [https://en.wikipedia.org/wiki/wget wget]: faster, but lacks some handy features wpull already has, and is pretty much in its final state
* Internet Archive's [https://github.com/internetarchive/warcprox warcprox]: provides a proxy to your web browser, so you can easily create WARCs as you browse, if it's just a few pages
* Ikreymer's [http://webrecorder.io webrecorder.io]: concept similar to warcprox, but you don't need to install anything; the WARC is generated remotely (you can also install it, but it needs Docker)
 
=== Replaying WARCs ===
* Alard's [https://github.com/alard/warc-proxy warc-proxy]: uses a proxy and provides accurate replay, but doesn't support HTTPS, and development seems to have stopped
* Ikreymer's [https://github.com/ikreymer/webarchiveplayer webarchiveplayer]: doesn't use a proxy, works similarly to the Wayback Machine, but because of that, some URLs are not rewritten in the files, and may not play back properly
 
=== Uploading to [[Internet Archive|IA]] ===
* Kngenie's [https://github.com/kngenie/ias3upload ias3upload]: just uploading, and needs a metadata CSV-file beforehand, but otherwise works fine
* IA-developed [https://pypi.python.org/pypi/internetarchive internetarchive]: more versatile tool (upload, download, search etc.)
* Direct use of the [http://archive.org/help/abouts3.txt Internet Archive S3 API] with the [https://en.wikipedia.org/wiki/cURL curl] program. The above uploading tools are based on this interface.
 
== Further plans ==
I hope one day I can re-host Hungarian websites that are dead now but have been archived. Or, at least, create a Wayback Machine for Hungarian websites, which would also serve as a mirror of the corresponding Internet Archive items.
 
As for the [[URL Team]] project, given that the discovered URLs have not been saved in WARC format (yet) but in a format difficult to access and read, a shorturl-resolver service for already gone URL shorteners would be useful. It would be kind of a Wayback Machine for URL shorteners. It wouldn't even be difficult to set up, based on URL Team databases.
 
I would also be glad to record the programming of Hungarian radio and television channels 24/7, but that would require a vast amount of resources. Until then, or instead, I'm collecting some recordings of notable Hungarian TV and radio programmes and moments from [[YouTube]] (and of course, I'm uploading them to the Archive).


== References ==
== Hungarian articles about Archive Team ==
<references/>
Below I've collected the online Hungarian news articles about Archive Team that I've been able to find. The list is in reverse chronological order.


*I've proudly discovered that Archive Team got its own [https://webarchivum.oszk.hu/mediawiki/index.php?title=Archive_Team article] (among ''Organizations'') on the [https://webarchivum.oszk.hu/mediawiki/index.php?title=MIA_WIKI knowledge base] of the Hungarian Internet Archive, that is, the Web Archiving Department of National Széchényi Library, the national library of Hungary! (Date: 2017-07-25).
*Péter Szűcs: ''[http://itcafe.hu/hir/az_internet_nem_felejt.html Az internet nem felejt]'' (''The internet doesn't forget''). itcafe.hu, 2015-03-05. (About ArchiveTeam's activity in general.)
*Dániel Dojcsák: ''[http://www.hwsw.hu/hirek/51400/blip-videomegoszto-online-tartalom-premium-torles-arhivum.html Elpusztulhat a nem profitképes online tartalom]'' (''Non-profitable online content may vanish''). hwsw.hu, 2013-12-03. (Mentions ArchiveTeam saving [[Blip.tv|Blip]] videos.)
*''[http://hvg.hu/tudomany/20131030_Meglepo_hir_a_filmletoltoknek_feleled_a_m Mit szóltok filmletöltők? Két héttel a bezárása után ismét működik a népszerű torrentoldal]'' (''What do you say, movie leechers? Two weeks after its closure popular torrent site runs again''). hvg.hu, 2013-10-30. (About IsoHunt restoration.)
*''[http://index.hu/tech/2013/10/21/lementik_a_legnagyobb_torrentkeresot/ Lementik a legnagyobb torrentkeresőt]'' (''They download the biggest torrent search site''). index.hu, 2013-10-21. (About saving [[IsoHunt]].)
*Ádám Szedlák: ''[http://www.origo.hu/techbazis/internet/20091102-geocities-uj-otthont-kaptak-az-oshonlapok.html Új otthont kaptak az őshonlapok]'' (''The ancient websites got a new home''). origo.hu, 2009-11-02. (About Geocities.)
*Ádám Szedlák: ''[http://www.origo.hu/techbazis/internet/20090513-geocities-freeweb-archivalokra-varnak-az-ingyenes-tarhelyek.html Megmentik az őshonlapokat]'' (''They are saving the ancient websites''). origo.hu, 2009-05-13. (About [[Geocities]].)
*Sándor Berta: ''[http://sg.hu/cikkek/67175/archivaljak-a-geocities-tartalmakat Archiválják a GeoCities-tartalmakat]'' (''They archive GeoCities' contents''). sg.hu, 2009-05-04.
{{DISPLAYTITLE:User&#58;bzc6p}}

Revision as of 12:28, 9 October 2021

archiveteam.hu has launched!

Information in Hungarian about ArchiveTeam's activities and the fate of Hungarian websites!

The RSZI Hungarian web archive has launched!

Nearly 2.5 million images from 3 image hosting services, unavailable even from the Wayback Machine, are accessible again!

The Lecsű video archiving service has launched!

Help save valuable YouTube videos from disappearing!

     Hungarian websites     
Red entries indicate websites which don't have an article on this wiki yet. Struck-through entries indicate websites that have already been shut down.
Archives & Digital Libraries mek.oszk.hu  · epa.oszk.hu  · dka.oszk.hu  · webarchivum.oszk.hu  · NAVA  · Fortepan  · fentrol.hu
Blogging Blog.hu  · Blogter  · Freeblog  · Blogger.hu  · reblog.hu  · xfree.hu  · cafeblog.hu
Social networks iWiW  · myVIP  · hotdog.hu  · Baratikor.com  · network.hu  · Mommo  · privi.hu
Webhosting Extra  · tar.hu  · ATW  · Ingyenweb  · Freeweb  · Ultraweb  · x3.hu  · ini.hu  · ininet.hu  · G-Portál  · uCoz  · eOldal  · ewk  · 5mp.eu  · mindenkilapja  · Webnode
Forums, message boards* Index  · SG  · Nők Lapja Cafe  · Hoxa
Video hosting Indavideó  · Videa  · videoplayer.hu  · xfree.hu  · videok.hu
Image hosting Kepfeltoltes.hu  · Fotoalbum.hu  · Indafotó  · Kephost.com  · pics.coldline.hu  · kep.tar.hu  · noob.hu  · PSharing (a.k.a. ivPicture)  · Kephost.hu  · kepfeltoltes.eu  · kepkuldes.com  · xfree.hu  · GTF Képhost  · fotozz.hu  · Kepkezelo.com  · keptarad.hu  · darkweb.hu  · fos.hu
Questions and Answers gyakorikerdesek.hu  · tudjatok.hu
File sharing data.hu  · toldacuccot.hu  · hellshare.hu  · addat.hu  · fileposta.hu
Document sharing doksi.hu  · Docplayer
Fun Demotiváló  · keptelenseg.hu  · csubakka.hu  · nemkutya.com  · legalja.hu  · szanalmas.hu  · trollfesz.cc  · gumicsizma.hu
Trash napiszar.com  · napiszar.hu  · netszar.com  · napiszar.org
Other News+C  · moly.hu  · gyertyalang.hu  · Volán websites  · Szuperinfó


bzc6p is a Hungarian amateur archivist who joined the efforts of ArchiveTeam. "Specialized" in watching and saving Hungarian websites.

Contact: vichratimot (at) euromail (dot) hu

I haven't been doing much that is spectacular recently, but I still run my long-running projects in my free time, of which I now have much less. You can still reach me on my talk page or via email if necessary.

See what I'm archiving.


My projects

Websites that I've archived, that I'm archiving, or whose archival I've taken part in organizing, in reverse chronological order within each category. If the website has an entry on this wiki, consult that page for the archives. If not, a link to the archives should be found on the appropriate line.

Large websites

Medium-sized websites

Small websites

Non-web stuff

I'm also archiving some Hungarian TV and radio programmes, magazines and shop flyers.

To-do list

Hungarian websites that should be saved in the near future. I don't reserve them as my projects, as I fear I won't have time for them soon.

  1. blogger.hu

Philosophy

My experience with my few website archiving endeavours so far suggests that very few websites today can be mirrored completely by automated means, without human control and intervention. Thus, making a quality archive of even a small website takes some attention, often additional work, or several supplemental runs of archiving tools.

These archiving tools (wget, wpull, ArchiveBot etc.) are very important and useful, but in most cases they cannot make complete archives on their own. My philosophy is that once we set out to archive a website, we should make the archive as complete and as high-quality as possible, so we cannot rely solely on these tools. Of course, constrained by time and resources, we must compromise. Otherwise, however, the above applies. At least for me. This is how I archive.

Recommended tools

Saving to WARC

  • Chfoo's Wpull: a good alternative to wget, still being developed, with good archiving support
  • wget: faster, but lacks some handy features wpull already has, and is pretty much in its final state
  • Internet Archive's warcprox: provides a proxy to your web browser, so you can easily create WARCs as you browse, if it's just a few pages
  • Ikreymer's webrecorder.io: concept similar to warcprox, but you don't need to install anything; the WARC is generated remotely (you can also install it, but it needs Docker)
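
As a rough illustration of the simplest case in the list above, here is a minimal sketch that assembles a recursive wget crawl which also writes a WARC. The function name, the example URL and the one-second delay are assumptions for illustration only; the flags are standard wget options, and as noted elsewhere on this page, a single run like this is rarely enough for a complete archive.

```python
import shlex


def build_wget_warc_command(start_url, warc_name, wait_seconds=1):
    """Return an argv list for a recursive wget crawl that also writes a WARC.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "wget",
        "--recursive",                # long form of -r
        "--page-requisites",          # long form of -p: images, CSS, scripts
        f"--wait={wait_seconds}",     # pause between requests (assumed value)
        f"--warc-file={warc_name}",   # wget writes <warc_name>.warc.gz
        "--warc-cdx",                 # also write a CDX index next to the WARC
        start_url,
    ]


if __name__ == "__main__":
    # example.com and the WARC name are placeholders, not a real project.
    cmd = build_wget_warc_command("http://example.com/", "example_com")
    print(shlex.join(cmd))
```

The printed command can be reviewed and adjusted by hand before it is run, which fits the idea that each website needs individual attention.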

Replaying WARCs

  • Alard's warc-proxy: uses a proxy and provides accurate replay, but doesn't support HTTPS, and development seems to have stopped
  • Ikreymer's webarchiveplayer: doesn't use a proxy, works similarly to the Wayback Machine, but because of that, some URLs are not rewritten in the files, and may not play back properly

Uploading to IA

  • Kngenie's ias3upload: just uploading, and needs a metadata CSV-file beforehand, but otherwise works fine
  • IA-developed internetarchive: more versatile tool (upload, download, search etc.)
  • Direct use of the Internet Archive S3 API with the curl program. The above uploading tools are based on this interface.
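
To make the last option above a bit more concrete, here is a minimal sketch that builds a curl upload command for the Internet Archive S3 API. It assumes the commonly documented endpoint <code>s3.us.archive.org</code>, the <code>authorization: LOW key:secret</code> header, <code>x-amz-auto-make-bucket</code> and <code>x-archive-meta-*</code> metadata headers; the function name, item identifier and keys are placeholders, and the linked S3 help page should be checked before relying on any of this.

```python
from urllib.parse import quote

# Assumed endpoint of the Internet Archive S3-like API.
IA_S3_ENDPOINT = "https://s3.us.archive.org"


def build_ia_upload_command(identifier, local_path, remote_name,
                            access_key, secret_key, metadata=None):
    """Return a curl argv list that uploads one file into an IA item.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    headers = {
        "authorization": f"LOW {access_key}:{secret_key}",
        "x-amz-auto-make-bucket": "1",   # create the item if it is missing
    }
    # Item metadata travels as x-archive-meta-<field> headers.
    for key, value in (metadata or {}).items():
        headers[f"x-archive-meta-{key}"] = value

    url = f"{IA_S3_ENDPOINT}/{quote(identifier)}/{quote(remote_name)}"

    argv = ["curl", "--location"]
    for name, value in headers.items():
        argv += ["--header", f"{name}:{value}"]
    argv += ["--upload-file", local_path, url]
    return argv


if __name__ == "__main__":
    # All values below are placeholders.
    print(build_ia_upload_command(
        "example_item_2021", "site.warc.gz", "site.warc.gz",
        "ACCESSKEY", "SECRET", {"mediatype": "web"},
    ))
```

Keeping the command as a list rather than a single string avoids shell quoting problems when titles or file names contain spaces.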

Further plans

I hope one day I can re-host Hungarian websites that are dead now but have been archived. Or, at least, create a Wayback Machine for Hungarian websites, which would also serve as a mirror of the corresponding Internet Archive items.

As for the URL Team project, given that the discovered URLs have not been saved in WARC format (yet) but in a format difficult to access and read, a shorturl-resolver service for already gone URL shorteners would be useful. It would be kind of a Wayback Machine for URL shorteners. It wouldn't even be difficult to set up, based on URL Team databases.
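
The resolver idea in the previous paragraph could be sketched roughly as follows. This is only an illustration: it assumes the mappings are available as text lines of the form <code>shortcode|long URL</code>, which may not match every URL Team dump, and the function names are hypothetical.

```python
def load_mappings(lines):
    """Build a shortcode -> long URL table from 'code|url' lines.

    The 'code|url' layout is an assumption about the dump format.
    Lines without a separator are skipped.
    """
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or "|" not in line:
            continue
        # Split only on the first '|', since long URLs may contain one too.
        code, long_url = line.split("|", 1)
        table[code] = long_url
    return table


def resolve(table, code):
    """Return (status, headers) the way a redirecting web service would."""
    long_url = table.get(code)
    if long_url is None:
        return 404, {}
    return 301, {"Location": long_url}


if __name__ == "__main__":
    # Placeholder data, not taken from a real shortener.
    table = load_mappings(["abc|http://example.com/some/long/page\n"])
    print(resolve(table, "abc"))
    print(resolve(table, "missing"))
```

A real service would keep one such table per defunct shortener and answer HTTP requests with the returned status and <code>Location</code> header.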

I would also be glad to record the programming of Hungarian radio and television channels 24/7, but that would require a vast amount of resources. Until then, or instead, I'm collecting some recordings of notable Hungarian TV and radio programmes and moments from YouTube (and of course, I'm uploading them to the Archive).

Hungarian articles about Archive Team

Below I've collected the online Hungarian news articles about Archive Team that I've been able to find. The list is in reverse chronological order.