Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relationship integrity #1980

Merged
merged 1 commit into from
Sep 3, 2024

Conversation

josephschorr
Copy link
Member

@josephschorr josephschorr commented Jul 8, 2024

Fixes: #1953

@josephschorr josephschorr requested a review from a team as a code owner July 8, 2024 21:23
@josephschorr josephschorr marked this pull request as draft July 8, 2024 21:23
@github-actions github-actions bot added area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Jul 8, 2024
colRelation = "relation"
colUsersetNamespace = "userset_namespace"
colUsersetObjectID = "userset_object_id"
colUsersetRelation = "userset_relation"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe put a new line here to keep all the caveat stuff together

)

func init() {
err := CRDBMigrations.Register("add-integrity-columns", "add-relationship-counters-table", addIntegrityColumns, noAtomicMigration)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a must register? i swear it's been written at some point

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, we never added it. We could, but this seems fine

@@ -36,6 +36,9 @@ var (
colUsersetRelation,
colCaveatContextName,
colCaveatContext,
colIntegrityKeyID,
colIntegrityHash,
colTimestamp,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the timestamp if we have the key id?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, to ensure that if a key is expired, we can still accept relationships signed from before the expiration time

Comment on lines 105 to 147
var sb strings.Builder
sb.WriteString(tpl.ResourceAndRelation.Namespace)
sb.WriteString(":")
sb.WriteString(tpl.ResourceAndRelation.ObjectId)
sb.WriteString("#")
sb.WriteString(tpl.ResourceAndRelation.Relation)
sb.WriteString("@")
sb.WriteString(tpl.Subject.Namespace)
sb.WriteString(":")
sb.WriteString(tpl.Subject.ObjectId)
sb.WriteString("#")
sb.WriteString(tpl.Subject.Relation)

if tpl.Caveat != nil && tpl.Caveat.CaveatName != "" {
sb.WriteString(" with ")
sb.WriteString(tpl.Caveat.CaveatName)
sb.WriteString(":")
sb.WriteString(tpl.Caveat.Context.String())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this something we can put in the tuple package? maybe something similar already exists

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could put it into there

Comment on lines 147 to 210
func (r *relationshipIntegrityProxy) CheckRevision(ctx context.Context, revision datastore.Revision) error {
return r.ds.CheckRevision(ctx, revision)
}

func (r *relationshipIntegrityProxy) Close() error {
return r.ds.Close()
}

func (r *relationshipIntegrityProxy) Features(ctx context.Context) (*datastore.Features, error) {
return r.ds.Features(ctx)
}

func (r *relationshipIntegrityProxy) HeadRevision(ctx context.Context) (datastore.Revision, error) {
return r.ds.HeadRevision(ctx)
}

func (r *relationshipIntegrityProxy) OptimizedRevision(ctx context.Context) (datastore.Revision, error) {
return r.ds.OptimizedRevision(ctx)
}

func (r *relationshipIntegrityProxy) ReadyState(ctx context.Context) (datastore.ReadyState, error) {
return r.ds.ReadyState(ctx)
}

func (r *relationshipIntegrityProxy) RevisionFromString(serialized string) (datastore.Revision, error) {
return r.ds.RevisionFromString(serialized)
}

func (r *relationshipIntegrityProxy) Statistics(ctx context.Context) (datastore.Stats, error) {
return r.ds.Statistics(ctx)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like to make functions like these 1 single line so they all stack together without newlines between. it makes it more clear that they're just boilerplate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried but go is reformatting it back to this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely because the receiver is named relationshipIntegrityProxy and thus its making the lines too long. Shall I rename the struct?

currentKeyHMAC := hmacConfig{
keyID: currentKey.ID,
expiredAt: currentKey.ExpiredAt,
hmacPool: &sync.Pool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it depends, but for tiger I found that a buffered channel performed better than sync.Pool (select on retrieving from the pool to return a new instance if the pool is empty, similar to sync.Pool)

keyID: key.ID,
expiredAt: key.ExpiredAt,
hmacPool: &sync.Pool{
New: func() any { return hmac.New(alg, key.Bytes) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably worth benchmarking sha3 here, since you can use it as a MAC directly instead of through an HMAC

Copy link
Contributor

@vroldanbet vroldanbet Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also learned recently about https://github.com/google/highwayhash (and its Go native impl, https://github.com/minio/highwayhash), theoretically 1 order of magnitude faster than SHA256 thanks to SIMD. The go implementation says:

HighwayHash is not a general-purpose cryptographic hash function (such as Blake2b, SHA-3 or SHA-2) and should not be used if strong collision resistance is required.

But the reference C++ implementation from Google says:

Given a strong hash function and secret seed, it appears infeasible for attackers to generate hash collisions because s and/or R are unknown. However, they can still observe the timings of data structure operations for various m. With typical table sizes of 2^10 to 2^17 entries, attackers can detect some 'bin collisions' (inputs mapping to the same bin). Although this will be costly for the attacker, they can then send many instances of such inputs, so we need to limit the resulting work for our data structure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some prelim tests and highwayhash is faster by about ~50% while SHA3 seemed to be slower.

However, I'm concerned about the lack of strong collision resistance. We'll need to investigate the implications of that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated branch with the highwayhash impl for testing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario this attempts to protect from is unauthorized modification of the state of the database. This requires sophisticated attacks and computing running for days to find a collision that could reveal the seed. If someone has access to the database and the machinery to run such an attack, the hash is not going to get in the way honestly, they probably have enough incentives to find other ways. I don't think remote attacks, with all the level of indirections, can give a clear signal to an attacker.

This is a decision the customer could themselves make and we could make it configurable via a flag. If the customer considers only collision-resistant hashes acceptable and are willing to accept the tradeoff in computational cost, then they can choose to do so. Highwayhash is by no means a "weak" hash, but one that makes the attack cost be sufficiently high to be unfeasible to most attackers.

By contrast, 'strong' hashes such as SipHash or HighwayHash require infeasible attacker effort to find a hash collision (an expected 2^32 guesses of m per the birthday paradox) or recover the seed (2^63 requests). These security claims assume the seed is secret. It is reasonable to suppose s is initially unknown to attackers, e.g. generated on startup or even per-connection.

Consider the authors describe how complicated an attack is:

A timing attack by Wool/Bar-Yosef recovers 13-bit seeds by testing all 8K possibilities using millions of requests, which takes several days (even assuming unrealistic 150 us round-trip times). It appears infeasible to recover 64-bit seeds in this way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, switching back to sha2

Copy link
Contributor

@ecordell ecordell Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context here was that highwayhash isn't suitable for hmac (or at least we don't know that it is)

@github-actions github-actions bot added the area/dependencies Affects dependencies label Jul 11, 2024
@josephschorr josephschorr force-pushed the relationship-integrity branch 8 times, most recently from 9452735 to 3133123 Compare July 18, 2024 18:21
@josephschorr josephschorr marked this pull request as ready for review July 18, 2024 18:36
Copy link
Contributor

@ecordell ecordell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looking good, I left some comments throughout, but my main concerns are:

  1. I don't see where the integrity proxy gets configured and installed...?
  2. I don't see where/how spicedb loads signing keys. Is a commit missing?
  3. I have some concerns about these fields being included in every query and not just when integrity is enabled. Wouldn't it be better to not query for those fields at all? I think this will mess with covering indexes, since I don't think we want them included in indexes.
  4. I think we need to have something in place for key rotation and, maybe more importantly, evolution of the canonical form. Let's say we add an extra field for native TTLs to tuples and we migrate caveat TTLs to a native form - that will require changes in the canonical form and re-signing with the keys.

@@ -51,7 +51,7 @@ type crdbReadWriteTXN struct {

var (
upsertTupleSuffix = fmt.Sprintf(
"ON CONFLICT (%s,%s,%s,%s,%s,%s) DO UPDATE SET %s = now(), %s = excluded.%s, %s = excluded.%s WHERE (relation_tuple.%s <> excluded.%s OR relation_tuple.%s <> excluded.%s)",
"ON CONFLICT (%s,%s,%s,%s,%s,%s) DO UPDATE SET %s = now(), %s = excluded.%s, %s = excluded.%s, %s = excluded.%s, %s = excluded.%s WHERE (relation_tuple.%s <> excluded.%s OR relation_tuple.%s <> excluded.%s)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to have alternate forms of this query for use when relationship integrity is enabled. Not sure what the performance penalty is for checking additional fields.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a common "issue" with adding these columns, but it does mean having multiple copies, which means more testing


if details.After != nil && details.After.IntegrityKeyID != nil && details.After.IntegrityHashAsHex != nil && details.After.TimestampAsString != nil {
hexString := *details.After.IntegrityHashAsHex
hashBytes, err := hex.DecodeString(hexString[2:]) // drop the \x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming I'll get to a test that shows this is fine but I thought CRDB used \x prefix for just a single hex byte, and x'asdfasdf' for a whole string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not according to testing

}

timestampString := *details.After.TimestampAsString
parsedTime, err := time.Parse("2006-01-02T15:04:05.999999999", timestampString)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constant?


func addRelationshipIntegrityColumns(t *tables) string {
return fmt.Sprintf(`ALTER TABLE %s
ADD COLUMN integrity_key_id VARCHAR(255) DEFAULT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need 255 bytes for a key id?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but do you have a good limit for it?

func addRelationshipIntegrityColumns(t *tables) string {
return fmt.Sprintf(`ALTER TABLE %s
ADD COLUMN integrity_key_id VARCHAR(255) DEFAULT NULL,
ADD COLUMN integrity_hash TINYBLOB DEFAULT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why tinyblob and not varbinary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will switch; wasn't aware of it

}

hashWithoutByte := tpl.Integrity.Hash[1:]
if tpl.Integrity.Hash[0] != versionByte || len(hashWithoutByte) != hashLength {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the version byte on the hash and not a part of the key id?

return nil, nil
}

var sb strings.Builder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a bytes.Builder

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-08-27 at 12 23 56 PM

return r.ReadWriteTransaction.BulkLoad(ctx, wrapped)
}

type wrappedBulkLoadIterator struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name it something like hmacAddingBulkLoadIterator etc?

Copy link
Contributor

@ecordell ecordell Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

edit by @josephschorr: to what is this referring? GitHub is showing no context

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-08-27 at 12 23 36 PM

internal/datastore/spanner/watch.go Outdated Show resolved Hide resolved
@josephschorr
Copy link
Member Author

Redesigned and rebased as discussed

@github-actions github-actions bot added area/CLI Affects the command line area/api v1 Affects the v1 API and removed area/dependencies Affects dependencies labels Aug 13, 2024
@@ -447,6 +459,18 @@ func (cds *crdbDatastore) Features(ctx context.Context) (*datastore.Features, er
return features, err
}

func (cds *crdbDatastore) SupportsIntegrity() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this isn't just another feature returned by the features method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted it to match an interface in the proxy around casting; if you like, I can move it into Features and just have the proxy check that call instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a feature makes more sense - we do that for other stuff like Watch, etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You added offline features, but this is still there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed it. Removed

)
require.NoError(t, err)

wrapped, err := proxy.NewRelationshipIntegrityProxy(ds, defaultKeyForTesting, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just instantiate this inside NewCRDBDatastore so you can't forget if you set WithIntegrity(true)?

Couldn't the unwrappedTester just set WithIntegrity(false)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it exists outside of CRDB's implementation; its a generic proxy that currently works with CRDB and Memdb (for testing)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just seems confusing that you have to do both - if you forget to add the proxy in, you will think you have integrity enabled but there will be no validation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no, you'd have integrity data storage enabled, but it would fail due to checks ensuring the data is actually present. Having it be in its own proxy file makes testing easier

internal/datastore/memdb/memdb.go Show resolved Hide resolved
internal/datastore/postgres/common/bulk.go Show resolved Hide resolved
return r.ReadWriteTransaction.BulkLoad(ctx, wrapped)
}

type wrappedBulkLoadIterator struct {
Copy link
Contributor

@ecordell ecordell Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

edit by @josephschorr: to what is this referring? GitHub is showing no context

pkg/datastore/test/tuples.go Show resolved Hide resolved
return nil, nil
}

var sb strings.Builder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

return nil, err
}

key.lock.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like it would be a bottleneck for large reads. what about a pool of hashers instead? either a sync.Pool or even just a chan hasher

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a pool of them at first, but removed it because I didn't think we needed to lock it. I've added the pool back in

@josephschorr
Copy link
Member Author

Updated

@josephschorr josephschorr force-pushed the relationship-integrity branch 2 times, most recently from c7afa27 to b6ddc46 Compare August 27, 2024 18:51
@josephschorr
Copy link
Member Author

Updated

@josephschorr josephschorr force-pushed the relationship-integrity branch 2 times, most recently from c0bead0 to fa9bed1 Compare August 29, 2024 15:44
}

func (rwt *crdbReadWriteTXN) queryWriteTuple() sq.InsertBuilder {
if rwt.withIntegrity {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the late note - can we remove these ifs from every call? i.e. pick a different implementation of crdbReadWriteTXN on initialization instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a lot of copy-paste of code and reducing reuse a lot

ecordell
ecordell previously approved these changes Sep 3, 2024
Currently only supported in memdb and CRDB drivers
@josephschorr josephschorr added this pull request to the merge queue Sep 3, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 3, 2024
@josephschorr josephschorr added this pull request to the merge queue Sep 3, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 3, 2024
@josephschorr josephschorr added this pull request to the merge queue Sep 3, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 3, 2024
@josephschorr josephschorr added this pull request to the merge queue Sep 3, 2024
Merged via the queue into authzed:main with commit 6772a60 Sep 3, 2024
22 checks passed
@josephschorr josephschorr deleted the relationship-integrity branch September 3, 2024 18:57
@github-actions github-actions bot locked and limited conversation to collaborators Sep 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/api v1 Affects the v1 API area/CLI Affects the command line area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relationship Integrity
4 participants