Difference between revisions of "Twitter"

From Archiveteam
(add example for twarc)
 
(33 intermediate revisions by 18 users not shown)
Line 4: Line 4:
| description =  
| description =  
| URL = https://twitter.com/
| URL = https://twitter.com/
| project_status = {{online}}
| project_status = {{endangered}}
| archiving_status = {{notsavedyet}}
| archiving_status = {{notsavedyet}}
}}
}}
Line 25: Line 25:
* [https://github.com/JustAnotherArchivist/snscrape snscrape]
* [https://github.com/JustAnotherArchivist/snscrape snscrape]


* Tweets can be archived easily through socialbot (an IRC bot that utilizes snscrape, [[ArchiveBot]], and [[ChromeBot]]), individual pages can be archived through ChromeBot. Alternatively, run snscrape locally, upload the list of URLs to https://transfer.notkiska.pw/, and feed them into ArchiveBot using the <code>!ao < <i>file</i></code> command.
* Tweets can be archived easily through socialbot (an IRC bot that utilizes snscrape, [[ArchiveBot]]. Alternatively, run snscrape locally, upload the list of URLs to https://transfer.archivete.am/, and feed them into ArchiveBot using the <code>!ao < <i>file</i></code> command.


* [https://github.com/sferik/t t by sferik] is a command-line interface for Twitter using the API via an application you create on your account. Not only does it allow easy CSV/JSON export of your own data, but it allows you to scrape others tweets. API limits apply but this tool is <b>very</b> powerful
* [https://github.com/sferik/t t by sferik] is a command-line interface for Twitter using the API via an application you create on your account. Not only does it allow easy CSV/JSON export of your own data, but it allows you to scrape others tweets. API limits apply but this tool is <b>very</b> powerful
Line 34: Line 34:
* [https://github.com/sixohsix/twitter The Python Twitter API by sixohsix] has some pretty easy to use scripts for archiving Twitter accounts to a TXT file for people who aren't as technically inclined. It can only save the last 3K or so tweets due to inbuilt Twitter limits, though. (Note: the "-o" flag is pretty much required to archive accounts.)
* [https://github.com/sixohsix/twitter The Python Twitter API by sixohsix] has some pretty easy to use scripts for archiving Twitter accounts to a TXT file for people who aren't as technically inclined. It can only save the last 3K or so tweets due to inbuilt Twitter limits, though. (Note: the "-o" flag is pretty much required to archive accounts.)


* [https://github.com/DocNow/twarc twarc] - for example, you can archive a user with <code>snscrape -f "{id}" twitter-user LibertyLandMC > list.txt</code>, followed by <code>twarc hydrate list.txt</code>.
* [https://github.com/DocNow/twarc twarc] - for example, you can archive a user with <code>snscrape -f "{id}" twitter-user USER | twarc hydrate > data.txt</code>.


* [https://gist.github.com/Asparagirl/e3ee274e4df49230875c880255819d95 Here's a Gist with a step-by-step guide] to getting a long list of a user's tweet status URL's, using a Python program called Tweep.
* [https://gist.github.com/Asparagirl/e3ee274e4df49230875c880255819d95 Here's a Gist with a step-by-step guide] to getting a long list of a user's tweet status URL's, using a Python program called Tweep.


* [https://web.archive.org/web/20140105210020/http://www.tweetscan.com:80/data.php Tweetscan Data] (died sometime after 2014) downloaded your Twitter archive from 12/2007 onward in CSV format (requires Twitter account login/password)
* [https://web.archive.org/web/20140105210020/http://www.tweetscan.com:80/data.php Tweetscan Data] (died sometime after 2014) downloaded your Twitter archive from 12/2007 onward in CSV format (requires Twitter account login/password)
* [https://github.com/mikf/gallery-dl gallery-dl]'s Twitter extractor still seems to work as of 2023-07-17


=== Scraping ===
=== Scraping ===
Line 45: Line 47:


== Vital Signs ==  
== Vital Signs ==  
[[File:Duck Hunt (World)-0--twitter-5.png|frame|right|Many changes to Twitter since Musk's acquisition render this service's longevity (and archivability) rather questionable.]]
In 2017, a subsidiary forum, "translate.twitter.com", was discontinued.<ref>https://archive.fo/https://translate.twitter.com/forum%2A</ref>
In 2022, Elon Musk bought Twitter.<ref>{{URL|https://www.hindustantimes.com/world-news/end-of-free-twitter-elon-musk-hints-he-may-charge-commercial-govt-users-101651624937957.html}}</ref> This may lock up accessible tweets behind paywall, login wall or identity verification wall.
The main site is still online, although Elon Musk fired half of the Twitter staff in November 2022<ref>{{URL|https://www.theguardian.com/technology/2022/nov/04/twitter-layoffs-elon-musk-revenue-drop}}</ref> so it's possible this could ultimately put the site in jeopardy. He also reportedly warned employees that the site is at risk of going bankrupt in 2023<ref>{{URL|https://finance.yahoo.com/news/musk-says-twitter-could-face-bankruptcy-next-year-report-221322212.html}}</ref>.
As of November 17, 2022, many former Twitter employees are saying that many vital Twitter services are now being maintained by a single-digit number of technicians. This could ultimately result in Twitter losing its services.<!--citation needed-->
On December 9, 2022, Elon Musk announced that accounts with no Tweets that have no activity for an extended period of time would be deleted to make room for new accounts. As an alternative interpretation of the statement, this could refer to accounts that have not Tweeted in a long time.<ref>{{URL|https://twitter.com/elonmusk/status/1601124219009409024}}</ref>
On February 2, 2023, Twitter announced free API access will be discontinued and replaced with a paid subscription tier. <ref>{{URL|https://twitter.com/TwitterDev/status/1621026986784337922}}</ref>
During March 2023, Twemoji's maintainers were fired. Its usage in Twitter's mobile apps has since been dropped, with the native emojis now being used. A fork of Twemoji has been created by the ex-maintainers and is currently used by [[Discord]].<ref>{{URL|https://twitter.com/lexikiq/status/1641186153742123009}}</ref>
During April 2023, Twitter Inc. was folded into Elon Musk's new company X Corp.<ref>{{URL|https://www.independent.co.uk/tech/twitter-no-longer-exists-x-corp-b2318916.html}}</ref> Twitter began rebranding into "X", to align with Elon's visions of an "everything app". This rebrand was initiated in July 2023, with the Twitter branding and logo being phased out.<ref>{{URL|https://edition.cnn.com/2023/07/24/tech/twitter-rebrands-x-elon-musk-hnk-intl/index.html}}</ref>
In mid-April 2023, access to Twitter's search function was locked behind a login prompt, requiring users to have an account in order to search for tweets. This makes it more difficult to scrape tweets. <ref>{{URL|https://www.neowin.net/news/the-login-prompt-for-twitter-search-has-returned-just-months-after-elon-musk-wanted-it-gone/}}</ref><ref>{{URL|https://mashable.com/article/twitter-search-registered}}</ref>
On 8 May 2023, Elon Musk announced that Twitter is "purging accounts that have had no activity at all for several years".<ref>{{URL|https://twitter.com/elonmusk/status/1655608985058267139}}</ref> He later clarified that "the accounts will be archived".<ref>{{URL|https://twitter.com/elonmusk/status/1655720120440823809}}</ref> Twitter's "inactive account policy" defines inactive accounts as those that have not been logged in for at least 30 days, and says that accounts "may be permanently removed due to prolonged inactivity".<ref>{{URL|https://help.twitter.com/en/rules-and-policies/inactive-twitter-accounts}}</ref> (Before May 2023, the time needed for an account to become "inactive" was 6 months.)<ref>https://web.archive.org/web/20230419143120/https://help.twitter.com/en/rules-and-policies/inactive-twitter-accounts</ref>


Very stable, probably not going anywhere too soon without warning.
On 30 June 2023, Twitter made it impossible to view tweets without logging into an account.<ref>{{URL|https://www.engadget.com/twitter-isnt-showing-tweets-unless-youre-logged-in-165254006.html}}</ref> This was likely done in response to decreasing user activity levels.<ref>{{URL|https://www.zdnet.com/article/twitter-seeing-record-user-engagement-the-data-tells-a-different-story/}}</ref> Elon Musk later wrote on Twitter: "This will be unlocked shortly. Per my earlier post, drastic & immediate action was necessary due to EXTREME levels of data scraping."<ref>{{URL|https://twitter.com/elonmusk/status/1674942336583757825|Elon Musk on Twitter}} (via [https://news.ycombinator.com/item?id=36540957 Hacker News])</ref> Logged-in users are also subject to a per-day post rate limit  of 10000 (for Twitter Blue subscribers), 1000 (for normal accounts), or 500 (for new, not-subscribed accounts).<ref>{{URL|https://twitter.com/elonmusk/status/1675260424109928449}}</ref> On 5 July, Twitter silently removed the login requirement to view individual tweets, although user feeds and tweet threads still require logging in.<ref>{{URL|https://techcrunch.com/2023/07/05/twitter-silently-removes-login-requirement-for-viewing-tweets/}}</ref>


== Also see ==
== Also see ==
Line 54: Line 76:
== Library of Congress ==
== Library of Congress ==


The U.S. Library of Congress announced in April 2010, via its official Twitter account that it will be acquiring the entire archive of Twitter messages back through March 2006.[http://www.readwriteweb.com/archives/twitters_entire_archive_headed_to_the_library_of_c.php] As of 2016-02-23, this archive is still not available, and when/if it does become accessible it will likely be restricted to researchers, rather than the general public.[https://www.politico.com/story/2015/07/library-of-congress-twitter-archive-119698.html] In January 2017, it was announced that the Library of Congress will no longer archive all tweets, just ones from major news stories.[https://www.npr.org/sections/thetwo-way/2017/12/26/573609499/library-of-congress-will-no-longer-archive-every-tweet]
The U.S. Library of Congress announced in April 2010, via its official Twitter account that it will be acquiring the entire archive of Twitter messages back through March 2006.<ref>{{URL|http://www.readwriteweb.com/archives/twitters_entire_archive_headed_to_the_library_of_c.php}}</ref> As of 2016-02-23, this archive is still not available, and when/if it does become accessible it will likely be restricted to researchers, rather than the general public.<ref>{{URL|https://www.politico.com/story/2015/07/library-of-congress-twitter-archive-119698.html}}</ref> In January 2017, it was announced that the Library of Congress will no longer archive all tweets, just ones from major news stories.<ref>{{URL|https://www.npr.org/sections/thetwo-way/2017/12/26/573609499/library-of-congress-will-no-longer-archive-every-tweet}}</ref>


== Lists ==
== Lists ==
Line 62: Line 84:


== External links ==
== External links ==
* https://twitter.com/
* {{URL|https://twitter.com/}}
* {{URL|https://github.com/MartinKBeck/TwitterScraper}}


{{Navigation box}}
{{Navigation box}}

Latest revision as of 21:45, 3 September 2023

Twitter
URL https://twitter.com/
Status Endangered
Archiving status Not saved yet
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

Twitter is a microblogging service. Each "entry" is limited to 280 (formerly 140) characters, which makes it remarkably easy to track the tiniest details of your life. The site has become very popular as a result.

The site has become so popular, in fact, that many people are deserting or cutting back on their weblogs and using Twitter for what their weblogging used to fulfill. With that comes rampant centralization, and with centralization comes greater risk. Back up your tweets!

Archives

There are currently a few archives (but only partially):

The Twitter search API seemingly returns only the latest 7 days' worth of tweets.

Backup Tools

  • Tweets can be archived easily through socialbot (an IRC bot that utilizes snscrape and ArchiveBot). Alternatively, run snscrape locally, upload the list of URLs to https://transfer.archivete.am/, and feed them into ArchiveBot using the !ao < file command.
  • t by sferik is a command-line interface for Twitter using the API via an application you create on your account. Not only does it allow easy CSV/JSON export of your own data, but it also allows you to scrape others' tweets. API limits apply, but this tool is very powerful.

Twitter automatically resizes uploaded images. To get an image in its original resolution, append :orig to the URL, e.g.:

   https://pbs.twimg.com/media/CBAoaU1UwAIUPIc.jpg:orig
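
The :orig rule above can be applied to a whole list of image URLs with a few lines of Python. This is only a minimal sketch: the function name original_image_url is hypothetical, and the other size suffixes it strips (:large, :medium, :small, :thumb) are an assumption about what such URLs may carry, not something stated on this page.

```python
from urllib.parse import urlsplit

# Assumed set of size suffixes; only ":orig" is documented on this page.
SIZE_SUFFIXES = (":orig", ":large", ":medium", ":small", ":thumb")


def original_image_url(url):
    """Return a pbs.twimg.com image URL that ends in ':orig'.

    URLs on any other host are returned unchanged.
    """
    if urlsplit(url).hostname != "pbs.twimg.com":
        return url
    # Drop an existing size suffix so it is not doubled up.
    for suffix in SIZE_SUFFIXES:
        if url.endswith(suffix):
            url = url[: -len(suffix)]
            break
    return url + ":orig"


print(original_image_url("https://pbs.twimg.com/media/CBAoaU1UwAIUPIc.jpg"))
```

URLs that already end in :orig pass through unchanged, so the function is safe to run twice over the same list.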
  • The Python Twitter API by sixohsix has some pretty easy-to-use scripts for archiving Twitter accounts to a TXT file, aimed at people who aren't as technically inclined. It can only save the last 3K or so tweets due to inbuilt Twitter limits, though. (Note: the "-o" flag is pretty much required to archive accounts.)
  • twarc - for example, you can archive a user with snscrape -f "{id}" twitter-user USER | twarc hydrate > data.txt.
  • Tweetscan Data (died sometime after 2014) downloaded your Twitter archive from 12/2007 onward in CSV format (requires Twitter account login/password)
  • gallery-dl's Twitter extractor still seems to work as of 2023-07-17
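
The two workflows above use different list formats: ArchiveBot is fed a list of URLs, while the twarc example works on bare tweet IDs (the output of snscrape -f "{id}"). If you only have the ID list, something like the following sketch could turn it into status URLs before uploading. The function name status_urls and the sample IDs' pairing with USER are illustrative assumptions; the URL pattern is the one used by the tweet links cited on this page.

```python
def status_urls(user, id_lines):
    """Build tweet status URLs from lines containing bare tweet IDs.

    Blank lines, non-numeric lines, and repeated IDs are skipped.
    """
    seen = set()
    urls = []
    for line in id_lines:
        tweet_id = line.strip()
        if not (tweet_id.isascii() and tweet_id.isdigit()):
            continue
        if tweet_id in seen:
            continue
        seen.add(tweet_id)
        urls.append(f"https://twitter.com/{user}/status/{tweet_id}")
    return urls


# "USER" is a placeholder account name, as in the twarc example above.
sample = ["1601124219009409024\n", "\n", "1601124219009409024\n", "1655608985058267139\n"]
for url in status_urls("USER", sample):
    print(url)
```

The resulting lines can be written to a text file and uploaded in the same way as a URL list produced directly by snscrape.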

Scraping

See Site exploration for details.

Vital Signs

Many changes to Twitter since Musk's acquisition render this service's longevity (and archivability) rather questionable.

In 2017, a subsidiary forum, "translate.twitter.com", was discontinued.[1]

In 2022, Elon Musk bought Twitter.[2] This may lock currently accessible tweets behind a paywall, login wall, or identity verification wall.

The main site is still online, although Elon Musk fired half of the Twitter staff in November 2022,[3] which could ultimately put the site in jeopardy. He also reportedly warned employees that the site was at risk of going bankrupt in 2023.[4]

As of November 17, 2022, many former Twitter employees were saying that many vital Twitter services were being maintained by a single-digit number of technicians. This could ultimately result in those services failing.

On December 9, 2022, Elon Musk announced that accounts with no Tweets and no activity for an extended period of time would be deleted to make room for new accounts. Alternatively, the statement could be read as referring to accounts that have not Tweeted in a long time.[5]

On February 2, 2023, Twitter announced that free API access would be discontinued and replaced with a paid subscription tier.[6]

During March 2023, Twemoji's maintainers were fired. Its usage in Twitter's mobile apps has since been dropped, with the native emojis now being used. A fork of Twemoji has been created by the ex-maintainers and is currently used by Discord.[7]

In April 2023, Twitter Inc. was folded into Elon Musk's new company, X Corp.[8] In July 2023, Twitter began rebranding as "X" to align with Musk's vision of an "everything app", with the Twitter branding and logo being phased out.[9]

In mid-April 2023, access to Twitter's search function was locked behind a login prompt, requiring users to have an account in order to search for tweets. This makes it more difficult to scrape tweets.[10][11]

On 8 May 2023, Elon Musk announced that Twitter is "purging accounts that have had no activity at all for several years".[12] He later clarified that "the accounts will be archived".[13] Twitter's "inactive account policy" defines inactive accounts as those that have not been logged into for at least 30 days, and says that accounts "may be permanently removed due to prolonged inactivity".[14] (Before May 2023, the time needed for an account to become "inactive" was 6 months.)[15]

On 30 June 2023, Twitter made it impossible to view tweets without logging into an account.[16] This was likely done in response to decreasing user activity levels.[17] Elon Musk later wrote on Twitter: "This will be unlocked shortly. Per my earlier post, drastic & immediate action was necessary due to EXTREME levels of data scraping."[18] Logged-in users are also subject to a daily limit on the number of posts they can read: 10000 (for Twitter Blue subscribers), 1000 (for normal accounts), or 500 (for new, non-subscribed accounts).[19] On 5 July, Twitter silently removed the login requirement to view individual tweets, although user feeds and tweet threads still require logging in.[20]

Also see

Library of Congress

The U.S. Library of Congress announced in April 2010, via its official Twitter account, that it would acquire the entire archive of Twitter messages going back to March 2006.[21] As of 2016-02-23, this archive was still not available, and if it does become accessible, it will likely be restricted to researchers rather than the general public.[22] In December 2017, it was announced that the Library of Congress would no longer archive all tweets, just ones from major news stories.[23]

Lists

External links

  1. https://archive.fo/https://translate.twitter.com/forum%2A
  2. https://www.hindustantimes.com/world-news/end-of-free-twitter-elon-musk-hints-he-may-charge-commercial-govt-users-101651624937957.html
  3. https://www.theguardian.com/technology/2022/nov/04/twitter-layoffs-elon-musk-revenue-drop
  4. https://finance.yahoo.com/news/musk-says-twitter-could-face-bankruptcy-next-year-report-221322212.html
  5. https://twitter.com/elonmusk/status/1601124219009409024
  6. https://twitter.com/TwitterDev/status/1621026986784337922
  7. https://twitter.com/lexikiq/status/1641186153742123009
  8. https://www.independent.co.uk/tech/twitter-no-longer-exists-x-corp-b2318916.html
  9. https://edition.cnn.com/2023/07/24/tech/twitter-rebrands-x-elon-musk-hnk-intl/index.html
  10. https://www.neowin.net/news/the-login-prompt-for-twitter-search-has-returned-just-months-after-elon-musk-wanted-it-gone/
  11. https://mashable.com/article/twitter-search-registered
  12. https://twitter.com/elonmusk/status/1655608985058267139
  13. https://twitter.com/elonmusk/status/1655720120440823809
  14. https://help.twitter.com/en/rules-and-policies/inactive-twitter-accounts
  15. https://web.archive.org/web/20230419143120/https://help.twitter.com/en/rules-and-policies/inactive-twitter-accounts
  16. https://www.engadget.com/twitter-isnt-showing-tweets-unless-youre-logged-in-165254006.html
  17. https://www.zdnet.com/article/twitter-seeing-record-user-engagement-the-data-tells-a-different-story/
  18. Elon Musk on Twitter, https://twitter.com/elonmusk/status/1674942336583757825 (via Hacker News, https://news.ycombinator.com/item?id=36540957)
  19. https://twitter.com/elonmusk/status/1675260424109928449
  20. https://techcrunch.com/2023/07/05/twitter-silently-removes-login-requirement-for-viewing-tweets/
  21. http://www.readwriteweb.com/archives/twitters_entire_archive_headed_to_the_library_of_c.php
  22. https://www.politico.com/story/2015/07/library-of-congress-twitter-archive-119698.html
  23. https://www.npr.org/sections/thetwo-way/2017/12/26/573609499/library-of-congress-will-no-longer-archive-every-tweet