Difference between revisions of "Talk:Twitter"

From Archiveteam
Revision as of 01:54, 10 May 2019

Maybe we should ask the US Government for that Twitter backup? --BlueMaxima 15:33, 24 January 2011 (UTC)

I don't believe they care enough. --ATrescue (talk) 01:50, 30 April 2019 (UTC)

Archiving tweet metadata as well.

Do the current methods of saving tweets (e.g. snscrape, see https://github.com/JustAnotherArchivist/snscrape on GitHub) also include metadata such as tweet source tags?[1][2]

snscrape

The snscrape method collects the last 3200 tweets of the archival target user (a limit imposed by the API) into a URL list, which is uploaded to https://transfer.sh/ or https://transfer.notkiska.pw/ and fed into ArchiveBot using !a < http://example.com/URL-list.txt. It also archives tweet replies, because the URL list contains them as well.
ArchiveBot also saves it into the Wayback Machine, which is great.
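
The list-building step described above could be sketched roughly as follows. This is a minimal illustration, not snscrape's own code: it assumes the scraper output is already available as one URL per line, and the helper names build_url_list and archivebot_command are hypothetical.

```python
from typing import Iterable, List

# Assumption: only twitter.com URLs belong in the list.
TWITTER_PREFIXES = ("https://twitter.com/", "http://twitter.com/")


def build_url_list(lines: Iterable[str]) -> List[str]:
    """Return unique Twitter URLs in first-seen order, skipping blanks and other lines."""
    seen = set()
    urls = []
    for line in lines:
        url = line.strip()
        if not url.startswith(TWITTER_PREFIXES):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def archivebot_command(list_url: str) -> str:
    """Format the ArchiveBot command quoted above for an uploaded URL list."""
    return f"!a < {list_url}"


# Placeholder input standing in for scraper output (hypothetical tweet URLs).
scraped = [
    "https://twitter.com/username/status/1",
    "",
    "https://twitter.com/username/status/2",
    "https://twitter.com/username/status/1",
]
url_list = build_url_list(scraped)
print("\n".join(url_list))
print(archivebot_command("http://example.com/URL-list.txt"))
```

The resulting text would still have to be uploaded by hand to one of the file hosts mentioned above before the command is given to ArchiveBot.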

chromebot

Another archival method is chromebot: a https://twitter.com/username, or alternatively chromebot: a https://twitter.com/search?q=from:username (omit “from:” to also include tweet replies and tweets mentioning the user in the thread).

But chromebot's infinite scroll might not reach as many tweets as snscrape, and even fewer from the targeted account when other tweets are in the search results as well.

In case of controversies, chromebot: a https://twitter.com/search?q=to:username should be used as well.
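
For convenience, the chromebot commands listed above could be generated from a username with something like the sketch below. The function name chromebot_commands and its flags are made up for illustration, and the username check (1 to 15 letters, digits or underscores) is an assumption about what Twitter accepts.

```python
import re
from typing import List

# Assumption: Twitter usernames are 1-15 characters of letters, digits, underscore.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,15}")


def chromebot_commands(
    username: str,
    include_mentions: bool = False,
    controversy: bool = False,
) -> List[str]:
    """Build the chromebot commands described above for one account."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"unexpected username: {username!r}")
    # Dropping "from:" widens the search to replies and mentions.
    query = username if include_mentions else f"from:{username}"
    commands = [
        f"chromebot: a https://twitter.com/{username}",
        f"chromebot: a https://twitter.com/search?q={query}",
    ]
    if controversy:
        # The "to:" search is the one suggested above for controversies.
        commands.append(f"chromebot: a https://twitter.com/search?q=to:{username}")
    return commands


for command in chromebot_commands("username", controversy=True):
    print(command)
```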

Tweet Metadata

Because both chromebot and ArchiveBot rely on Twitter's web interface, the amount of tweet metadata captured might be very limited.
Here is some documentation on Twitter's tweet metadata API:

There should be a way to mass-grab this metadata as well. --ATrescue (talk) 01:49, 30 April 2019 (UTC)

snscrape grabs the entire history, not just the last 3200 tweets. But it doesn't include retweets. --JustAnotherArchivist (talk) 01:54, 10 May 2019 (UTC)
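
To illustrate the kind of metadata asked about at the top of this section (tweet source tags), here is a rough sketch of reading the source label from a tweet object. It assumes the tweet is available as JSON in the shape returned by Twitter's API, where the source field holds an HTML anchor; the sample tweet and the helper name source_label are made up for illustration, and none of the tools discussed here are claimed to do this.

```python
from html.parser import HTMLParser
from typing import Optional


class _TextCollector(HTMLParser):
    """Collect the text content of an HTML fragment, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def source_label(tweet: dict) -> Optional[str]:
    """Return the human-readable source tag of a tweet object, if present."""
    raw = tweet.get("source")
    if not raw:
        return None
    collector = _TextCollector()
    collector.feed(raw)
    collector.close()
    label = "".join(collector.parts).strip()
    return label or None


# Hypothetical tweet object; only the fields used here are shown.
sample = {
    "id_str": "1",
    "source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
}
print(source_label(sample))
```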

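References

1. https://www.youtube.com/watch?v=cZIso9uqNls
2. https://pctechmag.com/2019/02/via-twitter-for-iphone/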