|Archiving status||In progress...|
|Project source||telegram-grab, telegram-items|
|IRC channel||(on hackint)|
|Data||archiveteam_telegram|
Telegram is a mostly open-source chat system (web and app-based), originally intended as a drop-in replacement for WhatsApp. Features like group chats and channels mean it also acts as a social network.
Channels on Telegram are essentially public streams of posts, similar to Twitter or RSS.
How to help if you have lists of URLs
This project requires lists of URLs for content on the target website. If you have a source of URLs, please:
- Use the PCRE regular expression given in the example command below.
- Note that this regex is intentionally broad to cover many different URL formats. Please do not try to use a more narrow pattern as it may miss valid URLs. We can always filter or transform the results as needed later.
- Enable case-insensitive matching (e.g. the -i option of grep) so that URLs with capitalised domains or similar are not missed.
- If you use grep, remember to include the -a (--text on GNU grep) option so that it continues searching for matches when it encounters binary data.
- Example command (GNU grep):
grep -Pahoi '(telegram\.me|//([^/]*\.)?t\.me)(:\d+)?/(?!share/)\S+' FILENAME FILENAME...
- If the output exceeds a few megabytes, please compress it, preferably using
- Upload the file to https://transfer.archivete.am/.
- Share the resulting URL in the project IRC channel.
- If you would like to keep the list non-public instead, e.g. for privacy reasons or for not wanting to be publicly associated with it, please get in touch with a channel op (e.g. arkiver or JustAnotherArchivist). Note that the items generated from your list would still be processed publicly, of course, but they would be mixed with everything else.
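If grep with PCRE support is not available, the same extraction can be sketched in Python. This is a minimal sketch, not part of the project's tooling: the function names `extract_urls` and `main` are hypothetical, and it assumes the pattern from the example command behaves the same under Python's `re` module (it uses only constructs that `re` supports). Files are read as bytes, so binary data does not stop the search, and matching is case-insensitive.

```python
import re
import sys
from typing import List

# The pattern from the grep example above, compiled for bytes so that
# binary input is searched as well (comparable to grep -a), and
# case-insensitive (comparable to grep -i).
TELEGRAM_URL = re.compile(
    rb'(telegram\.me|//([^/]*\.)?t\.me)(:\d+)?/(?!share/)\S+',
    re.IGNORECASE,
)


def extract_urls(data: bytes) -> List[bytes]:
    """Return every match in data, line by line and in order (like grep -o)."""
    matches = []
    for line in data.split(b"\n"):
        matches.extend(m.group(0) for m in TELEGRAM_URL.finditer(line))
    return matches


def main(paths: List[str]) -> None:
    """Print one match per line for each file given (hypothetical helper)."""
    for path in paths:
        with open(path, "rb") as fh:
            for url in extract_urls(fh.read()):
                sys.stdout.buffer.write(url + b"\n")


if __name__ == "__main__":
    main(sys.argv[1:])
```

As with the grep command, a match starts at the `//` of the URL (or at `telegram.me`), so the scheme is not part of the output. The sketch reads each file fully into memory, which may not suit very large inputs.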
See also Category:Projects requiring URL lists for other ArchiveTeam projects that necessitate URL lists.
Web data only
These tools archive Telegram channels through their web view. Only public channels can be saved this way, and certain attachments (including PDFs and some very long videos) are not available.
ArchiveTeam's telegram project archives as WARC, supporting all available web data (and including outlinks). Suggestions are welcome! A bot in the project IRC channel accepts archival requests for eventual upload into the Wayback Machine; it understands the following commands:
- !help prints a help message listing available commands.
- !a archives Telegram data once.
  - !a <url> archives the given telegram.me or t.me URL (autodetecting the page type).
  - !a channel:<channel name> archives the given channel.
  - !a post:<channel name>:<post id> archives the given post, including any discussions.
- !p archives a Telegram channel once immediately, then queues it for periodic re-archiving. (This is preferred for active, significant channels.)
  - !p <channel> <interval> <time-to-live> archives the given channel (as channel:<name>) once every interval seconds for the next time-to-live seconds (optional, default 1 year).
  - <channel> gets info for, renews, or deletes a periodic item, respectively.
- !pdump dumps info for all periodic items.
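Because the interval and time-to-live arguments are given in seconds, it can be convenient to compute them rather than type them by hand. The following is a rough sketch that only builds the command text to paste into the IRC channel. The helper names `archive_command` and `periodic_command` are hypothetical, `example_channel` is a placeholder, and the command syntax is taken from the list above rather than from the bot's source.

```python
from datetime import timedelta
from typing import Optional


def archive_command(channel: str, post_id: Optional[int] = None) -> str:
    """Build a one-off '!a' command for a channel or for a single post."""
    if post_id is None:
        return f"!a channel:{channel}"
    return f"!a post:{channel}:{post_id}"


def periodic_command(channel: str, interval: timedelta,
                     time_to_live: Optional[timedelta] = None) -> str:
    """Build '!p <channel> <interval> <time-to-live>' with durations in seconds.

    The time-to-live is optional; when omitted it is left out of the
    command so that the bot applies its own default.
    """
    parts = ["!p", channel, str(int(interval.total_seconds()))]
    if time_to_live is not None:
        parts.append(str(int(time_to_live.total_seconds())))
    return " ".join(parts)


if __name__ == "__main__":
    print(archive_command("example_channel"))
    print(periodic_command("example_channel", timedelta(hours=6), timedelta(days=30)))
```

The sketch does no validation of channel names or intervals; whether the bot accepts a given value is up to the bot.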
snscrape exports textual channel data as JSONL (media support is pending).
- Chromebot was once used, but has been shut down due to data integrity issues. (Old jobs can be found via this wiki.)
- The official Telegram Desktop app exports as HTML or JSON.
- telegram-export (no longer maintained)
- telegram-history-dump (no longer maintained); introduction
- Pidgin sync
See also Bellingcat's guide.
See also this etherpad of third-party channel lists.
Hong Kong pro-democracy movement / 2019–2020 Hong Kong protests
- NEXTA Live - Belarus independent news
- ArchiveBot/Telegram (some sites related to Telegram)