The network can be reduced to three primary roles: data servers, the aggregation...

Vinnl · on March 13, 2024

Why would anyone run their own aggregator? (By comparison, if you run a search engine, you can show contextual ads to recoup your investment and then some.)

Sorry about going off-topic, I realise it's only tangentially about labelling.

pfraze · on March 13, 2024

We'll let you know when we figure out why we're doing it.

Vinnl · on March 13, 2024

I guess I should have asked about anyone else :) I know why you would - you're planning to sell services around Bluesky [0], and thus need Bluesky itself to be working.

But if it's already working (because you're running an aggregator), there doesn't seem much reason for anyone else to run one? In other words, isn't there a significant risk that there will be fewer aggregators than there are search engines, i.e. just a single one?

[0] https://bsky.social/about/blog/7-05-2023-business-plan

BryantD · on March 13, 2024

I think this is a really good question. Let me offer one possible answer:

It might not be necessary or useful to have multiple aggregators right now. However, I do feel better knowing that if Bluesky the company goes under or changes to a degree where I'm not happy with their decisions, it's possible for someone to stand up a second aggregator.

For that matter, if someone's a free speech absolutist and if they care enough about it to spend the money, they could stand up an aggregator right now with more permissive standards.

jakebsky · on March 13, 2024

Even at scale, running a Relay should be well within the means of a motivated individual or org that is willing to spend hundreds of dollars per month. Right now it'd cost just tens of dollars per month to run a whole-network Relay. I believe some people are already doing this.

Running an "AppView" (an atproto application aggregator/indexer/API server) is generally an order of magnitude more expensive and complicated. But still not beyond the reach of a user coop, non-profit, or small startup.

So these services should all be well within the capabilities of at least multiple companies operating in the atproto ecosystem as it scales.

And in many cases it should make good sense for these companies to do this since it will improve their performance by colocating their services and enable them to do things like schedule their own maintenance windows, etc.

Vinnl · on March 13, 2024

Thanks! "It costs thousands of dollars a month, which is feasible enough that people will find a way" sounds pretty reasonable.

bobajeff · on March 13, 2024

Would it be possible to do a p2p aggregator (like YaCy, but for atproto)?

pfraze · on March 13, 2024

It might be worth trying, but essentially what you're trying to do is cost/load sharing on the aggregation system. You could do that by computing indexes and sharing them around to reduce some of the required compute, and I suspect we'll be doing things like that (for example, having the precomputed follow graph index as a separate dataset). However, if you're trying to replace the full operational system, I think the only kind of load sharing that could work would require federated queries, which I consider a pretty unproven concept.
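
To make the "precomputed index shared as a dataset" idea a bit more concrete, here is a minimal sketch in Python. It assumes follow records can be reduced to plain (follower, subject) pairs; the function names, the JSON format, and the `did:example:` identifiers are all hypothetical and are not part of any atproto API.

```python
import json
from collections import defaultdict


def build_follow_index(follows):
    """Build a follower -> sorted list of followed accounts mapping.

    `follows` is an iterable of (follower, subject) pairs. This input shape
    is an assumption for illustration, not an atproto record format.
    """
    index = defaultdict(set)
    for follower, subject in follows:
        index[follower].add(subject)  # sets drop duplicate follow events
    return {follower: sorted(subjects) for follower, subjects in index.items()}


def export_index(index):
    """Serialize the index deterministically so it can be shared as a dataset."""
    return json.dumps(index, sort_keys=True)


def load_index(blob):
    """Load a shared index instead of recomputing it from raw follow records."""
    return json.loads(blob)


# Hypothetical follow events; the identifiers are placeholders.
follows = [
    ("did:example:alice", "did:example:bob"),
    ("did:example:alice", "did:example:carol"),
    ("did:example:bob", "did:example:alice"),
    ("did:example:alice", "did:example:bob"),  # duplicate event
]

index = build_follow_index(follows)   # done once by whoever pays the compute
blob = export_index(index)            # published as a separate dataset
shared = load_index(blob)             # consumed by another operator
```

The point of the sketch is only that the consumer skips the aggregation step. It does not address keeping the shared index fresh or verifying it, which a real system would need.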