Yes, I read it, and that is what I meant by "even if you tune all the settings and have statistics of remote tables".
I have tried all the options; if you have any remote join involving more than a few rows, you'll have a bad time.
I'm not even talking about changing the shard key; even growing the cluster isn't easily done, and much less with "millisecond latency". Unless you use Citus, which technically isn't a pure PostgreSQL solution.
Have you read this page carefully, especially the "Remote Execution Options" section?
https://www.postgresql.org/docs/current/postgres-fdw.html
If not, give it a try.
This one might help too:
https://www.percona.com/blog/2021/06/02/postgres_fdw-enhance...
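For reference, the tuning I mean looks roughly like this. It is only a sketch: the server and table names (`shard_1`, `remote_orders`) are hypothetical, while the options themselves come from the postgres_fdw "Remote Execution Options" documentation:

```sql
-- Ask the remote server's planner for row estimates instead of guessing locally
-- (use_remote_estimate defaults to false).
ALTER SERVER shard_1 OPTIONS (ADD use_remote_estimate 'true');

-- Pull more rows per round trip (default is 100).
ALTER SERVER shard_1 OPTIONS (ADD fetch_size '10000');

-- ANALYZE works on foreign tables too: it samples the remote data and
-- stores local statistics, which is what lets the planner cost remote joins.
ANALYZE remote_orders;
```

With `use_remote_estimate` on, the local planner issues EXPLAIN to the remote side during planning, which costs extra round trips but usually produces far better join plans.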
Now, about re-sharding:
I know this is not as carefree as in NewSQL databases like CockroachDB, but it's not that difficult either.
Choosing a shard key ultimately comes down to breaking down your ER diagram and inspecting the incoming write flow to the tables that are bound to grow.
In any case, you don't have to let your users suffer downtime. At least I have not. Even if you chose the wrong shard key, you can gracefully transfer or redistribute the data records by creating new tables and swapping them with the old ones via a rename. The window of apparent downtime here is a few milliseconds, and this is not something that will happen frequently.
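The rename swap can be sketched like this. The table names are hypothetical, and the backfill step is simplified: in practice you would batch the copy and catch up on writes that arrive during it, rather than doing it all inside one transaction:

```sql
-- 1. Build the redistributed copy while the old table keeps serving traffic.
CREATE TABLE orders_new (LIKE orders INCLUDING ALL);
INSERT INTO orders_new SELECT * FROM orders;
-- (real deployments backfill in batches and replay writes that land meanwhile)

-- 2. Atomic swap. Only this transaction takes an exclusive lock on the
--    tables, which is the "few milliseconds" of apparent downtime.
BEGIN;
ALTER TABLE orders RENAME TO orders_old;
ALTER TABLE orders_new RENAME TO orders;
COMMIT;
```

Because both renames commit atomically, queries either see the old table or the new one, never a missing table; `orders_old` can be dropped once you've verified the cutover.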