-
Notifications
You must be signed in to change notification settings - Fork 246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduces mysql datastore #525
introduces mysql datastore #525
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial feedback
|
||
// instrumentedConnector wraps the default MySQL driver connector | ||
// to get metrics and tracing out of the process of creating a new connection | ||
type instrumentedConnector struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put the type def below the top-level definitions of the metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 6e50310
"github.com/rs/zerolog/log" | ||
) | ||
|
||
// instrumentedConnector wraps the default MySQL driver connector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of the process -> "for creating a new connection"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in de2ed55
defer func() { | ||
connectHistogram.Observe(time.Since(startTime).Seconds()) | ||
}() | ||
conn, err := d.conn.Connect(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline before this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 5c2d1b6
Subsystem: "datastore", | ||
Name: "mysql_connect_count_total", | ||
Help: "number of mysql connections opened.", | ||
}, []string{"error"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe had_error
? or rename to success
and invert the conditional below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
went for the latter approach! addressed in a51c097
if err != nil { | ||
return nil, fmt.Errorf("unable to register metric: %w", err) | ||
} | ||
err = prometheus.Register(connectCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline before this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 7a61d90
"fmt" | ||
) | ||
|
||
// namespace max size: https://buf.build/authzed/api/file/main/authzed/api/v0/core.proto#L29 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we read them from the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting to change the DML statements to interpolate the size from a constant? Such constants do not exist anywhere in the codebase, the numbers are hardcoded. Or are you suggesting that I should create a new constant in this file and interpolate the DML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we read them from the code?
I would like that but not that I think due to the way the protobuf generation works the validation is handled inline: https://github.com/authzed/spicedb/blob/main/pkg/proto/core/v1/core.pb.validate.go#L240-L249
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let's leave it as-is then
|
||
func createReverseQueryIndex(driver *MysqlDriver) string { | ||
return fmt.Sprintf("CREATE INDEX ix_relation_tuple_by_subject ON %s (userset_object_id, userset_namespace, userset_relation, namespace, relation)", | ||
driver.RelationTuple()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add commas after these args and newlines before the end parens
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 1ecaa4a
@@ -0,0 +1,16 @@ | |||
package migrations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe squash these migrations since this is a new driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 055f809
} | ||
} | ||
|
||
func TablePrefix(prefix string) Option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in vroldanbet@7161a83
|
||
// NewMySQLBuilder returns a TestDatastoreBuilder for the mysql driver | ||
// backed by a MySQL instance | ||
func NewMySQLBuilder(t testing.TB) TestDatastoreBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Driver is named MySql
; please choose one (probably MySQL
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 94047da
@josephschorr thanks for the review, all your comments should be addressed. I asked for clarification on a handful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
GitHub runs mysql schema migrations at scale using skeema (https://github.com/skeema/skeema) and gh-ost (https://github.com/github/gh-ost). The SpiceDB migration driver for MySQL is designed in a way it is compatible with migration workloads relying on those tools. The main difference is how the MySQL migration driver maintains the metadata needed to know which version is currently running. The PSQL migration driver follows a pattern similar to what's proposed by SQLAlchemy's Alembic, with a table that keeps track of the migration run. Because Skeema does not support data migrations, and seeding the database could be considered a data migration, we decided to encode the metadata into the database schema (DML) so we can expose the same information but only relying on schema migrations. As such the migration_version table is empty, and maintains the running version in a table column. This turns every migration into a pure DML operations, which Skeema understands well. The way we run these then is by executing zed migrate in development DB, and letting skeema detect the changes to the development database. These are then persisted, and deployed using our own internal migration automation buil on top of skeema and gh-ost. The other distiction with the PSQL migration driver is the support for table prefixes. This is necessary to align with our own internal conventions. Co-authored-by: Alex Slepak <ams11@github.com> Co-authored-by: Christopher Kirkland <chriskirkland@github.com> Co-authored-by: Hannah Gould <hagould@github.com> Co-authored-by: John Reed <johnpreed@github.com> Co-authored-by: Sean Bryant <sbryant@github.com>
some recent changes to telemetry seems to be failing with the 1.6.0 MySQL Driver version. 1.5.0 was an indirect dependency already in the main branch.
in its current state the MySQL datastore duplicates a significant portion of the PSQL datastore's logic. Given the fast pace of breaking changes in the internal implementation, we considered code should be upstreamed ASAP and start addressing duplication and gaps as more-approachable and smaller follow ups. The fundamental differences with the PSQL implementation are: - usage of Go's databases/sql API - queries need to be generated on Datastore initialization in order to handle customizable prefix functionality - transaction seed in runtime: the implementation tries hard to avoid data migrations. Instead of handling the seeding of "transaction zero" as part of "spicedb migrate", we introduced it in IsReady(). This will be revisited in a follow-up, to explore moving it to the datastore struct initialization. - SQL query changes: some queries were tweaked to make it compatible with Vitess - IsReady(): introduces changes to avoid downtime during migrations. The implementation relies on a new function "IsHeadCompatible()", which has a rather simplistic heuristic: compatible if is HEAD or the version it HEAD replaced. This is probably the most controversial part of the change given it introduces a new method as part of the API. - Stats cluster ID is not implemented - we didn't see an obvious need to introduce the complexity needed to handle it, but if controversial can be handled in a follow up - only supports a subset of the CLI options. Those will be handled as part of a follow up - does not emit metrics for garbage-collection - also to be handled in a follow up - exposes connection pool metrics via dlmiddlecote/sqlstats This commit does not wire the datastore into the spicedb CLI. Co-authored-by: Alex Slepak <ams11@github.com> Co-authored-by: Christopher Kirkland <chriskirkland@github.com> Co-authored-by: Hannah Gould <hagould@github.com> Co-authored-by: John Reed <johnpreed@github.com> Co-authored-by: Sean Bryant <sbryant@github.com>
this was caught in the mysql implementation by the linter. For some reason the linter didn't detect it in the PSQL implementation.
This should theoretically not be necessary, but while learning on how MySQL did this, we found some behaviour we couldn't quite decipher. - MySQL is governed by a global timezone parameter. - DATETIME is stored with timezone information. - sessions can also have their own timezone and override the server global - when we query the RelationTupleTransaction, the driver is able to obtain DATETIME honoring the timezone it was stored with. The parseTime flag in the go driver enables properly unmarshalling the query response into a time.Time - However, when performing a query without a table like SELECT NOW(), somewhere in the way the timezone wasn't being recognized. We spent some time debugging the driver but couldn't find the answer. With some tweaks to the TestTransactionTimestamps we managed to reproduce the problem, and fixed it by forcing the timezone in the query SELECT UTC_TIMESTAMP(). We ended up leaving it this way because we found it easier to convert all time.Time instances flowing through SpiceDB to UTC just to make it easier to visually inspect it while debugging, so the code was here for a different reason that the PSQL driver.
the initial ask was to move the log message. While inspecting further the method it became obvious there was no reason to scatter the seeding logic and that everything done in IsReady should be encapsulated in seedBaseTransaction().
if the database defaults to latin1, users running the datastore may end up in trouble when needing to store unicode characters. Having it upfront will likely spare some pain in the future.
027ddec
to
603bcb6
Compare
Depends on #519
What
This PR introduces a MySQL-based datastore implementation
How
This implementation duplicates a significant portion of the PSQL datastore's logic. Given
the fast pace of breaking changes in the internal implementation, we considered code should be upstreamed ASAP and start addressing duplication and gaps as more-approachable and smaller follow ups. Early attempts to refactor led to less than desirable engineering overhead keeping the conflicts under control.
This is not a production-grade implementation yet, and we expect tuning will be required. Some in-house load-tests revealed there is room for improvements.
The fundamental differences with the PSQL implementation are:
database/sql
APIspicedb migrate
, we introduced it inIsReady()
. This will be revisited in a follow-up, to explore moving it to the datastore struct initialization.IsReady()
: introduces changes to avoid downtime during migrations. The implementation relies on a new functionIsHeadCompatible()
, which has a rather simplistic heuristic: compatible if isHEAD
or the versionHEAD
replaced. This is probably the most controversial part of the change given it introduces a new method as part of the API.dlmiddlecote/sqlstats
Follow ups
IsReady()
to datastore initializationContributors
These series of PRs are the result of the work from various current and former hubbers which I've added as co-authors, in alphabetic order:
@ams11
@chriskirkland
@hagould
@johnpreed
@sbryant