Bluesky


Bluesky is a social network based on the AT Protocol, offering functionality comparable to Twitter. Its user base has been expanding, partly due to the current state of Twitter.

Although it is intended to be federated, in practice, it remains largely centralized on the bsky.social server. Users can use a domain they own as their username; however, this alone does not make it federated.

It is an open network: apart from private messages, activity such as blocks and username changes is public.

Downloading users

Bluesky offers exports of user data in the form of CAR files.

A basic command-line tool for downloading these CAR files is available at https://github.com/bluesky-social/cookbook/, though a better tool that can download them faster should be developed.

The CAR files contain references to blobs: images, videos, and other media that users have uploaded. These also need to be downloaded; helpfully, the tool does that as well.

Bluesky documents how to download and process CAR files at https://docs.bsky.app/blog/repo-export.
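As a rough illustration of what such a download involves, the sketch below builds the request URLs for a repository and its blobs. It assumes the public com.atproto.sync XRPC methods (getRepo, listBlobs, getBlob) and that the account is served from the host in PDS_HOST, which may not hold for every account. The function names and the PDS_HOST variable are hypothetical helpers, not part of any Bluesky tool.

```shell
#!/bin/sh
# Sketch only: build XRPC URLs for fetching a repository and its blobs.
# PDS_HOST is an assumption; an account may live on a different server.
PDS_HOST="${PDS_HOST:-https://bsky.social}"

# URL of the CAR export for a DID (com.atproto.sync.getRepo).
repo_url() {
    printf '%s/xrpc/com.atproto.sync.getRepo?did=%s\n' "$PDS_HOST" "$1"
}

# URL that lists the blob CIDs for a DID (com.atproto.sync.listBlobs).
list_blobs_url() {
    printf '%s/xrpc/com.atproto.sync.listBlobs?did=%s\n' "$PDS_HOST" "$1"
}

# URL of a single blob, given a DID and a CID (com.atproto.sync.getBlob).
blob_url() {
    printf '%s/xrpc/com.atproto.sync.getBlob?did=%s&cid=%s\n' "$PDS_HOST" "$1" "$2"
}

# Hypothetical helper: download the CAR file for a DID into <did>.car.
fetch_repo() {
    curl --fail --silent --show-error --location \
        -o "$1.car" "$(repo_url "$1")"
}
```

With these defined, fetch_repo did:plc:... would save one repository, and the blob URLs could be handed to any downloader that supports parallel requests.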

Bluesky also has a "goat" tool for interacting with these repositories, available on GitHub at https://github.com/bluesky-social/indigo/pull/1108.

To export a user's public data, install goat and run these commands, setting TARGET_USER to the account you want to export.

export TARGET_USER=example.com
mkdir -p "$TARGET_USER" && cd "$TARGET_USER"
goat repo export "$TARGET_USER"
goat blob export "$TARGET_USER"

The blob export can also be run as goat blob export --concurrent 8 $TARGET_USER, but this option comes from pull request 1108 on the indigo repository, which had not been merged as of 2025-06-26. To be polite to the server and your network, you can also add a delay between downloads with --delay 10s; other durations such as 30s, 1m, or 1d work too.
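To export several accounts in one run, the two goat commands can be wrapped in a loop. The following is a minimal sketch that uses only the goat repo export and goat blob export commands shown on this page. The export_users function and its pause argument are hypothetical, and the pause uses plain sleep rather than any goat option.

```shell
#!/bin/sh
# Sketch only: export several accounts one after another.
# export_users is a hypothetical helper, not part of goat.
# Usage: export_users PAUSE_SECONDS HANDLE...
export_users() {
    pause="$1"
    shift
    for user in "$@"; do
        # One directory per account, as in the single-user commands.
        mkdir -p "$user" || return 1
        (
            cd "$user" || exit 1
            goat repo export "$user" && goat blob export "$user"
        ) || echo "export failed for $user" >&2
        # Wait between accounts to stay polite to the server.
        sleep "$pause"
    done
}
```

For example, export_users 10 example.com another.example would export both accounts into their own directories, pausing 10 seconds after each.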

Note that this does not save replies to the user's posts or other related data. A better tool is needed for that.

This article is a stub. You can help by expanding it.