Setting Up Your Own PDS Is Frighteningly Easy

This is really easy to do!

A PDS, or Personal Data Server, is kind of the core thing behind atproto. The idea is that it's a server that stores all your atproto data and cryptographic keys. Most people just use Bluesky's PDS right now, but this is probably the most commonly self-hosted part of the ecosystem: you can store all your data for any atproto-compliant app on your own server which, in theory, keeps it from getting walled off in an adversarial scenario.

I'm not quite ready to risk moving my own data off of Bluesky's PDS, but I figured today I'd take the first step towards doing so and set up a PDS on a Hetzner server. I'm following along with the official atproto guide.

The annoying part of this is that the official PDS installer currently only supports Caddy as its web server, while my existing server runs on nginx. Rather than adding yet another thing to figure out, I just spun up a new server - I can spare the $4 for the time being. I then set up forwarding on my bront.rodeo domain for atproto.bront.rodeo.

Once the server is set up with the relevant open firewalls and DNS records, you can set up your PDS, which I was shocked to discover is literally just running one bash script.

curl https://raw.githubusercontent.com/bluesky-social/pds/main/installer.sh >installer.sh
sudo bash installer.sh

I used atproto.bront.rodeo as my PDS hostname and a personal email as the admin email, then set up a new user with the handle brent.atproto.bront.rodeo. The account was created successfully, and I checked https://atproto.bront.rodeo/xrpc/_health to make sure everything was working.
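The same health check can be scripted instead of opened in a browser. This is a minimal sketch, not part of the official guide: `healthUrl` and `checkPdsHealth` are hypothetical helper names, and the `fetchImpl` parameter is only there so the request can be stubbed out. It assumes a runtime with a global `fetch` (a browser, or Node 18+) and that the endpoint answers with a small JSON body.

```javascript
// Build the health-check URL for a PDS hostname (hypothetical helper).
function healthUrl(pdsHostname) {
    return `https://${pdsHostname}/xrpc/_health`;
}

// Request the health endpoint and return its parsed JSON body.
// fetchImpl defaults to the global fetch; pass a stub to run this offline.
async function checkPdsHealth(pdsHostname, fetchImpl = fetch) {
    const response = await fetchImpl(healthUrl(pdsHostname));
    if (!response.ok) {
        throw new Error(`Health check failed with HTTP ${response.status}`);
    }
    return response.json();
}

// Usage (performs a real network request):
// checkPdsHealth('atproto.bront.rodeo').then(body => console.log(body));
```

Any non-2xx status throws, so a misconfigured firewall or DNS record shows up as an error rather than an empty result.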

Success! That was easier than I expected, honestly. Let's try creating an account on Bluesky with this PDS.

Go to Bluesky and make sure you're signing in with a custom provider (click the little pencil icon by Bluesky's provider name):

And sign in with the password generated by the installer. This just works! I was able to log in to an empty Bluesky account.

Naturally the first suggested post is from the Krassensteins. Bluesky gonna bluesky, I guess. I set up a simple profile and made my first post:

This was astonishingly easy to do, and feels like a flashpoint moment for me on understanding the appeal of atproto. It literally took me ten minutes to set up a self-hosted server that I can use for oauth and cross-application storage. That's insane! So insane that I feel like I must be missing something, so I did a quick API check to see what all I can get directly from my server. I went to https://atproto.bront.rodeo/xrpc/com.atproto.sync.listRepos and found my did:
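For reference, that lookup can also be done from a script. This is a rough sketch under a few assumptions: `listRepoDids` is a hypothetical helper name, `fetchImpl` is only there so the call can be stubbed, and the response is assumed to carry a `repos` array whose entries each have a `did` field, which is what the `com.atproto.sync.listRepos` endpoint returned for me in the browser.

```javascript
// List the DIDs of every repo hosted on a PDS (hypothetical helper).
// pdsHost is the base URL, e.g. 'https://atproto.bront.rodeo'.
async function listRepoDids(pdsHost, fetchImpl = fetch) {
    const url = new URL('/xrpc/com.atproto.sync.listRepos', pdsHost);
    const response = await fetchImpl(url);
    if (!response.ok) {
        throw new Error(`listRepos failed with HTTP ${response.status}`);
    }
    const data = await response.json();
    // Assumes the response shape { repos: [{ did, ... }, ...] }.
    return (data.repos || []).map(repo => repo.did);
}

// Usage (performs a real network request):
// listRepoDids('https://atproto.bront.rodeo').then(dids => console.log(dids));
```

On a fresh single-user PDS this should return exactly one DID, the one for the account created during installation.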

Going back to yesterday's API script for some more complex queries, I cut out some of the unnecessary bits from the original author's vibe-coded mess and hit describeRepo with my did.

const PDS_HOST = 'https://atproto.bront.rodeo';
const PUBLIC_API_HOST = 'https://atproto.bront.rodeo';
const PLC_DIRECTORY_HOST = 'https://plc.directory';

async function fetchATProtoAPI(endpoint, method = 'GET', params = {}, body = null, accessJwt = null, usePublicHost = false, returnRawResponse = false) {
    let baseUrl;

    if (endpoint.startsWith('plc/')) {
        baseUrl = PLC_DIRECTORY_HOST;
        endpoint = endpoint.substring(4);
    } else {
        baseUrl = usePublicHost ? `${PUBLIC_API_HOST}/xrpc` : `${PDS_HOST}/xrpc`;
    }

    const url = new URL(endpoint.startsWith('http') ? endpoint : `${baseUrl}/${endpoint}`);

    if (method === 'GET' && !endpoint.startsWith('http')) {
        Object.keys(params).forEach(key => url.searchParams.append(key, params[key]));
    }

    const options = {
        method: method,
        headers: {}
    };

    if (body && method !== 'GET') {
        options.headers['Content-Type'] = 'application/json';
        options.body = JSON.stringify(body);
    }

    try {
        const response = await fetch(url, options);
        if (!response.ok) {
            let errorData;
            try {
                errorData = await response.json();
            } catch (e) {
                errorData = { message: response.statusText };
            }
            const errorMsg = `API Error (${response.status}) for ${url.pathname} on ${url.hostname}: ${errorData.error || errorData.message || 'Unknown error'}`;
            console.error(errorMsg, errorData); // Log the full error object too
            throw new Error(errorMsg);
        }
        return await response.json();
    } catch (err) {
        console.error(`Fetch API error for ${url.toString()}:`, err);
        throw err;
    }
}

async function describeRepo(repoDid, pdsHostOverride = null) {
    return fetchATProtoAPI('com.atproto.repo.describeRepo', 'GET', { repo: repoDid });
}
// Top-level await requires an ES module context (or wrap these calls in an async function).
const did_description = await describeRepo("did:plc:crwugtsporw4ixhqqyul4km6");
console.log(did_description);


Which returns the relevant info for my user:

Then I ran a listRecords call to make sure my collections are actually being stored on my PDS:

async function listRecords(repoDid, collectionNsid, limit = 50, cursor = null) {
    const params = { repo: repoDid, collection: collectionNsid, limit };
    if (cursor) {
        params.cursor = cursor;
    }
    return fetchATProtoAPI('com.atproto.repo.listRecords', 'GET', params);
}

const posts = await listRecords("did:plc:crwugtsporw4ixhqqyul4km6", "app.bsky.feed.post", 10, null);
// forEach rather than map: we only want the logging side effect, not a new array.
posts.records.forEach(r => {
    console.log("URI:", r.uri);
    console.log("Value:", r.value);
});
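The call above only fetches one page of ten records. If a collection grows past a single page, the `cursor` parameter can be used to walk through it. This is a sketch rather than anything from the original script: `listAllRecords` is a hypothetical helper, and it assumes each response has a `records` array plus an optional `cursor` for the next page. The page-fetching function is passed in as `listPage`, so the `listRecords` function defined above can be supplied directly.

```javascript
// Collect every record in a collection by following cursors (hypothetical helper).
// listPage must have the same signature as listRecords above:
// (repoDid, collectionNsid, limit, cursor) => Promise<{ records, cursor? }>
async function listAllRecords(listPage, repoDid, collectionNsid, pageSize = 50) {
    const all = [];
    let cursor = null;
    while (true) {
        const page = await listPage(repoDid, collectionNsid, pageSize, cursor);
        const records = page.records || [];
        all.push(...records);
        // Stop when there is no next cursor, or a page comes back empty.
        if (!page.cursor || records.length === 0) {
            break;
        }
        cursor = page.cursor;
    }
    return all;
}

// Usage (performs real network requests via listRecords):
// const everyPost = await listAllRecords(listRecords, "did:plc:crwugtsporw4ixhqqyul4km6", "app.bsky.feed.post");
```

The empty-page check is a defensive choice so the loop cannot spin forever if a server keeps returning a cursor with no records.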

And, if I'm not misunderstanding something, I am able to access my Bluesky posts directly from my PDS's open API, which again, I set up in about ten minutes. This is astonishing! So astonishing that I'm still suspicious of it and am waiting for the other shoe to drop.

Assuming it doesn't, it appears that what we have is a built-in open API on a self-hosted server, with no need for manual setup of API tokens or account registration for client apps that write back to it automatically. I'm starting to understand PDSes as first-class citizens in this ecosystem a lot better: this is easy as hell to set up and really does just let you own your data. Imagine a world where you can oauth through a self-hosted server and know exactly where all the data for all your accounts lives, where different ecosystems know how to talk to each other by default. That rocks!