Read replica support for Postgres and MySQL datastores #1878

josephschorr · 2024-04-25T20:16:15Z

No description provided.

benny-yamagata · 2024-05-02T02:58:07Z

Just to clarify, this update will allow spice to use multiple reader nodes for postgres and mysql? So if I have 3 readers I would be able to pass in the entries for all of them to be used?

josephschorr · 2024-05-02T03:50:23Z

@benny-yamagata Yes, the replica URI parameter is a list of URIs, and the system will round-robin between them.
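
For readers following along, a minimal sketch of what round-robin selection over a list of replica URIs could look like. This is not the code from this PR: `replicaPicker`, `newReplicaPicker`, and `pick` are hypothetical names, and the sketch only assumes that the configured URIs arrive as a slice of strings.

```go
package main

import "sync/atomic"

// replicaPicker cycles through a fixed list of replica URIs.
// The type and its methods are hypothetical names for this sketch.
type replicaPicker struct {
	uris []string
	next atomic.Uint64 // count of picks handed out so far
}

func newReplicaPicker(uris []string) *replicaPicker {
	return &replicaPicker{uris: uris}
}

// pick returns the next URI in round-robin order. It is safe for
// concurrent use and reports false when no replicas are configured.
func (p *replicaPicker) pick() (string, bool) {
	if len(p.uris) == 0 {
		return "", false
	}
	n := p.next.Add(1) - 1
	return p.uris[n%uint64(len(p.uris))], true
}
```

With three URIs configured, successive calls to `pick` return the first, second, third, and then the first again. An atomic counter keeps the selection cheap under concurrent requests, at the cost of ignoring per-replica load.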

vroldanbet

The implementation assumes that individual hosts will be listed, but most of the time folks put replicas behind a load balancer so they can scale read traffic when needed. I don't think the implementation will work for that more common use case, because the datastore snapshot reader does not use a single transaction.

vroldanbet · 2024-05-02T10:54:13Z

internal/datastore/mysql/datastore.go

+	index uint32,
+	options ...Option,
+) (datastore.ReadOnlyDatastore, error) {
+	ds, err := newMySQLDatastore(ctx, url, int(index), options...)


I searched online for a URI query parameter we could add to enforce read-only access, but didn't find any. Users should make sure the credentials provided for the replica have read-only permissions.

Not really necessary; the datastore itself ensures it is read-only.

Does it install the read-only datastore proxy on the replica?

internal/datastore/mysql/datastore_test.go

internal/datastore/postgres/postgres.go

vroldanbet · 2024-05-02T11:53:13Z

internal/datastore/proxy/replicated.go

+	return rr.chosenReader.LookupNamespacesWithNames(ctx, nsNames)
+}
+
+func (rr *replicatedReader) chooseSource(ctx context.Context) error {


I think returning the chosen reader in chooseSource would better encapsulate this logic.

internal/datastore/proxy/replicated.go

josephschorr · 2024-05-02T15:05:00Z

> I don't think the implementation will work for that more common use case because the datastore snapshot reader does not use a single transaction.

It uses a single connection, which means it should stay connected to the same replica.

internal/datastore/postgres/postgres.go

ecordell · 2024-05-02T17:00:03Z

> It uses a single connection, which means it should stay connected to the same replica

I may have just missed it, but it looked like you were using a pgxpool.Conn for the read replicas, which cycles actual connections out from under itself.

josephschorr · 2024-05-02T17:00:48Z

> It uses a single connection, which means it should stay connected to the same replica

> I possibly just missed it, but it looked like you were using a pgxpool.Conn for the read replicas which cycles actual connections out from under itself.

Yeah, I traced it and it does use the pool. We'll have to do something else.

vroldanbet

👍🏻 on the overall strategy, this should work. My main concerns are:

- exposing strict mode via the CLI; I don't see a reason we should do that
- the cached checked-revision proxy may race
- missing tests on strictReaderQueryFuncs, plus clarification on how "in progress" surfaces in a replica
- clarification on Watch API behaviour: do we have a test? I understand it requires no extra work; it would just lag behind the primary
- inconsistent refactor of common datastore flags
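
To make the race concern concrete, here is a minimal sketch of a revision-availability cache guarded by a mutex. It is not the proxy in `cachedcheckrev.go`: `cachedChecker`, `newCachedChecker`, and `available` are hypothetical names, revisions are modeled as plain `uint64` values, and the sketch assumes that a replica which has a given revision also has every earlier one.

```go
package main

import "sync"

// checkFunc asks the replica whether it has the given revision.
// It is a hypothetical stand-in for a CheckRevision-style call.
type checkFunc func(rev uint64) (bool, error)

// cachedChecker remembers the highest revision known to be on the
// replica so that checks for older revisions skip the round trip.
type cachedChecker struct {
	mu      sync.Mutex
	known   bool
	highest uint64
	check   checkFunc
}

func newCachedChecker(check checkFunc) *cachedChecker {
	return &cachedChecker{check: check}
}

// available reports whether rev can be served by the replica.
func (c *cachedChecker) available(rev uint64) (bool, error) {
	c.mu.Lock()
	hit := c.known && rev <= c.highest
	c.mu.Unlock()
	if hit {
		return true, nil
	}

	// The lock is not held during the remote call, so concurrent
	// callers may both check; that costs a query but stays correct.
	ok, err := c.check(rev)
	if err != nil || !ok {
		return false, err
	}

	c.mu.Lock()
	// Re-compare under the lock: another caller may have stored a
	// higher revision while this check was in flight.
	if !c.known || rev > c.highest {
		c.known, c.highest = true, rev
	}
	c.mu.Unlock()
	return true, nil
}
```

Only positive results are cached here, because a replica that lacks a revision now may catch up a moment later.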

vroldanbet · 2024-06-26T06:40:41Z

pkg/cmd/datastore/datastore.go

 	var legacyConnPool ConnPoolConfig
 	RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn", DefaultReadConnPool(), &legacyConnPool)
 	deprecateUnifiedConnFlags(flagSet)
 	RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn-pool-read", &legacyConnPool, &opts.ReadConnPool)
 	RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn-pool-write", DefaultWriteConnPool(), &opts.WriteConnPool)
+	RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-read-replica-conn-pool", DefaultReadConnPool(), &opts.ReadReplicaConnPool)


Naming pattern differs from existing conn pools. Should be datastore-conn-pool-read-replica.

I wanted all the read replica flags to start with datastore-read-replica to be explicit

I can tell you that as someone usually tinkering with these flags, having to remember 2 different naming patterns is not going to make my life easier

It's a tradeoff, but I still think having all the read replica flags share a common prefix makes more sense. Perhaps we should ask?

What is the tradeoff?

When would you ever want to configure read replica pool to be not the same as the read pool?

Always: you can't use the read pool with replicas

Do we think we could use a heuristic inside SpiceDB to come up with these values rather than even exposing them at all?

Incredibly unlikely... We need to know that a replica is a replica

> Always: you can't use the read pool with replicas

I meant configuring them differently, not sharing the same pool.
For example, if I'm using RDS with a read replica, the read replica is likely the same size as my primary, with the same max connections.

> Incredibly unlikely... We need to know that a replica is a replica

I'm not advocating altering the system to be unaware of read replicas. I'm advocating for the system being smart enough to discover the right size of connection pools, because users are very likely to pick non-ideal values themselves.

> For example, if I'm using a RDS with a read replica, the read replica is likely the same size as my primary likely with the same max connections.

Which still means the read pool for non-replicas will want to be configured differently, as we also have a write pool for the primary.

> I'm advocating for the system being smart enough to discover the right size of connection pools because users are very likely to pick non-ideal values themselves

And... how would it do that?

Do you think a simplified version of adaptive concurrency limits (e.g. similar to TCP congestion control) would work? Basically, use connection errors to adaptively discover the right number of connections for your pool. Something like this could even make the system robust to changes in the replicas or the size of the database.

Perhaps, but it is IMO outside of the scope of this change (as it would apply to all pools, not just replicas), and would require a significant amount of testing.
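
For context on the adaptive idea raised above, a minimal sketch of additive-increase/multiplicative-decrease (AIMD), the scheme TCP congestion control is known for, applied to a pool-size target. Nothing like this exists in the PR: `aimdLimit`, `newAIMDLimit`, `onSuccess`, and `onConnError` are hypothetical names, and treating every connection error as a signal to back off is an assumption of the sketch.

```go
package main

// aimdLimit tracks a target pool size using additive increase and
// multiplicative decrease. It is not safe for concurrent use; a real
// pool would guard it with a mutex.
type aimdLimit struct {
	limit, lo, hi int
}

// newAIMDLimit starts at the lower bound and never leaves [lo, hi].
func newAIMDLimit(lo, hi int) *aimdLimit {
	return &aimdLimit{limit: lo, lo: lo, hi: hi}
}

// onSuccess grows the limit by one connection, up to hi.
func (a *aimdLimit) onSuccess() {
	if a.limit < a.hi {
		a.limit++
	}
}

// onConnError halves the limit, never dropping below lo.
func (a *aimdLimit) onConnError() {
	a.limit /= 2
	if a.limit < a.lo {
		a.limit = a.lo
	}
}
```

Starting from `newAIMDLimit(1, 8)`, four successes raise the target to 5 and a single connection error drops it to 2. As noted in the thread, wiring this into real pools would apply to every pool, not just replicas, and would need substantial testing.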

vroldanbet · 2024-06-26T06:49:42Z

pkg/cmd/datastore/datastore.go

+		mysql.RevisionQuantization(opts.RevisionQuantization),
+		mysql.MaxRevisionStalenessPercent(opts.MaxRevisionStalenessPercent),
+		mysql.TablePrefix(opts.TablePrefix),
+		mysql.WatchBufferLength(opts.WatchBufferLength),
+		mysql.WatchBufferWriteTimeout(opts.WatchBufferWriteTimeout),
+		mysql.WithEnablePrometheusStats(opts.EnableDatastoreMetrics),
+		mysql.MaxRetries(uint8(opts.MaxRetries)),
+		mysql.OverrideLockWaitTimeout(1),
+		mysql.CredentialsProviderName(opts.ReadReplicaCredentialsProviderName),


I don't see it done for MySQL

jschorr edit since I can't reply: this is done

internal/datastore/proxy/replicated.go

internal/datastore/postgres/strictreader.go

josephschorr · 2024-06-26T16:39:43Z

Updated

pkg/cmd/datastore/datastore.go

vroldanbet

Just a few more remarks but seems ready to go

internal/datastore/context.go

internal/datastore/postgres/postgres.go

internal/datastore/proxy/cachedcheckrev.go

vroldanbet · 2024-07-01T10:21:12Z

internal/datastore/proxy/replicated.go

+	replica datastore.ReadOnlyDatastore
+	primary datastore.Datastore
+
+	chosePrimary bool


Suggested change

-	chosePrimary bool
+	chosePrimaryInTest bool

This is accomplished by adding a datastore proxy that selects a read replica for each SnapshotReader, in a round-robin fashion, and ensures that the revision being requested is available on the replica. This extra check does add some latency to the overall operation, but it should be the safest approach to using read replicas.

Fixes authzed#1321
Fixes authzed#1320

josephschorr · 2024-07-01T16:40:11Z

Updated

This will allow replicas behind load balancers to be supported (just in Postgres for now)

vroldanbet

LGTM. Great work! 🚀

github-actions bot added labels area/CLI (affects the command line), area/datastore (affects the storage system), and area/tooling (affects the dev or user toolchain, e.g. tests, CI, build tools) Apr 25, 2024

josephschorr force-pushed the read-replica-support branch from 1d5612b to aa51f2a Compare April 25, 2024 20:25

josephschorr marked this pull request as ready for review April 26, 2024 18:22

josephschorr requested review from vroldanbet and a team as code owners April 26, 2024 18:22

vroldanbet requested changes May 2, 2024

ecordell reviewed May 2, 2024

josephschorr marked this pull request as draft May 2, 2024 17:00

josephschorr closed this Jun 17, 2024

github-actions bot locked and limited conversation to collaborators Jun 17, 2024

josephschorr reopened this Jun 21, 2024

josephschorr force-pushed the read-replica-support branch from aa51f2a to de54d54 Compare June 23, 2024 00:38

josephschorr marked this pull request as ready for review June 24, 2024 19:08

vroldanbet requested changes Jun 26, 2024

vroldanbet assigned josephschorr Jun 26, 2024

josephschorr force-pushed the read-replica-support branch from de54d54 to 5a5dd26 Compare June 26, 2024 16:39

josephschorr force-pushed the read-replica-support branch from 5a5dd26 to 3d4516e Compare June 26, 2024 19:02

jzelinskie reviewed Jun 28, 2024

josephschorr force-pushed the read-replica-support branch from 3d4516e to da79df9 Compare June 28, 2024 21:26

vroldanbet reviewed Jul 1, 2024

josephschorr added 2 commits July 1, 2024 12:36

Add caching on read replica CheckRevision calls to reduce the overhead of using read replicas

65113c4

josephschorr force-pushed the read-replica-support branch from da79df9 to d6873f8 Compare July 1, 2024 16:40

josephschorr force-pushed the read-replica-support branch from d6873f8 to c0bc83e Compare July 1, 2024 16:46

Add an additional mode to replica support that uses a strict read mode

0656701

This will allow replicas behind load balancers to be supported (just in Postgres for now)

josephschorr force-pushed the read-replica-support branch from c0bc83e to 0656701 Compare July 1, 2024 16:52

vroldanbet approved these changes Jul 1, 2024

josephschorr enabled auto-merge July 1, 2024 16:56

josephschorr added this pull request to the merge queue Jul 1, 2024

Merged via the queue into authzed:main with commit 9456ce0 Jul 1, 2024
22 checks passed

josephschorr deleted the read-replica-support branch July 1, 2024 17:15
