Skip to content
This repository has been archived by the owner on Jun 9, 2023. It is now read-only.

Define database through schema rather than data #1910

Open
ojeytonwilliams opened this issue Nov 16, 2022 · 13 comments
Open

Define database through schema rather than data #1910

ojeytonwilliams opened this issue Nov 16, 2022 · 13 comments
Labels
API Discussion Ideas, feature requests, views on features. Anything which is a discussion.

Comments

@ojeytonwilliams
Copy link
Contributor

Discuss your ideas or share your views:

Since roles, permissions and rsvps are all represented by rows in the database, this creates some difficulties. Whenever we modify the init script this, effectively, requires a migration.The reason for this is that any live databases will need these changes to function correctly. These changes also should only be applied once and a migration is the safest way to ensure that happens.

However, these are not normal migrations - they have no effect on the db schema. This means, for one thing, that rolling back is not trivial since prisma cannot help you. You have to write additional queries to undo the migration. Also, each migration has to be created from scratch, rather than generated by prisma. All of which can be done, but are more time consuming and error prone than schema changes.

Why we're doing this

The plan is to allow owner users to create custom roles specific to their instances. These roles need to be persisted and the database was the obvious place to do that.

What we might do differently

One thought I had was to bake the permissions into the schema. i.e. rather than each permission being a row, it could be a column. Then a given role would still be still be data, but the permissions would not be. For example, instance roles could look like this

name chapter-join sponsor-manage
chapter_administrator true false
owner true true

This seems like it would at least reduce the complexity of the migration.

@ojeytonwilliams ojeytonwilliams added Discussion Ideas, feature requests, views on features. Anything which is a discussion. API labels Nov 16, 2022
@allella
Copy link
Contributor

allella commented Nov 16, 2022

Would the custom queries / migration be limited to defining just the default / first owner? If so, is that causing complexity, or is the thought that other roles need to be hard coded? I've been imagining the owner is like that in the Drupal CMS. "User 1" is the "owner" and all other users are added to the database with only user id 1 having a special meaning.

@ojeytonwilliams
Copy link
Contributor Author

Would the custom queries / migration be limited to defining just the default / first owner?

No, but that is something we need to address before too long. For now I've just been updating the db by hand after the owner signs up. Eventually we'll want some sort of initial signup process to create the first owner.

It's possible I'm overthinking this, but the problem is that the definition of a given instance role (owner, say) is stored as rows in three different tables (instance_roles, instance_role_permissions and instance_permissions). So, we end up having to write migrations like #1909, by hand, to modify the permissions data.

It's further complicated by the fact we prepare new databases using the init script. It fills the permission and role tables with the required data, so has to be run before we can add users. This has been working great in development, but I'm beginning to think it was a mistake. Reason being, the init script can be thought of as being yet another type of migration. The data it creates has to be in every database, after all. So, it's a migration, but it doesn't appear in our migration history.

With that in mind, it's probably best to create a migration that generates what the current init script generates. Then all the migrations are in one place, all databases (production, local, wherever) have the same schema + initial data and (as a consequence) all the tests run against something very similar to production.


To summarize, I see two potential areas for improvement

  1. Replace init script with a migration
  2. Reorganise the schema as outlined above (same deal for instance, chapter and event permissions)

The second one doesn't feel MVP worthy to me, simply because it's a ton of refactoring.

So, for now I'll get on with step 1. because it's blocking deployments and we can keep discussing this here. In the event we decide that step 1. was a mistake, I won't have lost too much time.

@allella
Copy link
Contributor

allella commented Nov 17, 2022

I don't understand the current challenges so much, but I understand the suggestion to move the rows into columns.

Making each row a column we'd be de-normalizing the schema, which typically gets projects painted into a corner down the road.

I took a quick look at how the Drupal CMS does their users, since I'm familiar with it and it's a mature and well thoughtout project that deals with broad use cases. They have a user__roles table and a users table to do what we are doing for roles and then a row in a config table that stores all the permissions mappings for that role.

I'm not sure if this helps, but I thought it would be a good sanity check to keep the conversation going.

@ojeytonwilliams
Copy link
Contributor Author

Thanks Jim, do you know where I can go to see their schema for myself?

It does sound like we're doing something broadly similar and that maybe it isn't as bad as I thought.

@allella
Copy link
Contributor

allella commented Nov 17, 2022 via email

@allella
Copy link
Contributor

allella commented Nov 17, 2022

Actually, Drupal 8 isn't the latest, they are on Drupal 9, but I don't figure it would have changed significantly.

@ojeytonwilliams
Copy link
Contributor Author

Drupal 7's table looks a lot like ours, while Drupal 8's does not seem to have permissions in there. I couldn't find the config table

@allella
Copy link
Contributor

allella commented Nov 17, 2022

The config table is a key/value with a serialized blob that gets parsed as needed. It's used sort of like an object in a non-relational (NoSQL-style) database.

I'm not sure that config table is the direction. It's used for all sorts of other Drupal configuration and I believe it's often persisted to YAML files so some of the database configuration can be overridden by config files, and so the configuration can be included in version control.

@ojeytonwilliams
Copy link
Contributor Author

Okay, cool. It sounds similar enough that we should definitely wait and see how it works in practice before de-normalizing.

@ojeytonwilliams
Copy link
Contributor Author

One thing about denormalisation before I forget. The role tables would still be in 2NF, even with the permissions as rows.

Since name is unique, it's fine for there to be a functional dependence name -> permission_column. Reason being, having a dependency on a proper subset of a key means you're not in 2NF, but each key is a set with one element so they have no proper subsets.

I believe it would also in 3NF because each permission column would functionally depend directly on id (since it's a PK) and name (since it's unique) and nothing else. My understanding is that if you just have transitive dependencies, that's a violation of 3NF, but if the dependency is direct, it's not.

I still think we shouldn't migrate just now, but I believe we'd be okay from a database normalization perspective.

@ojeytonwilliams
Copy link
Contributor Author

@moT01 have been going back and forth on this a bit. There are a couple of reasonably okay possibilities.

Firstly, a single permission table

id chapter-join sponsor-manage...
1 true false
2 true true

and three role tables (instance, chapter and event)

id name permission_id
1 owner 1
2 chapter_administrator 2
3 member 3
id name permission_id
1 administrator 4
2 member 5
id name permission_id
1 member 6

Secondly, multiple permission tables (instance, chapter and event)

id chapter-join sponsor-manage...
1 true false
2 true true

...and the same for chapter and event (but with different permissions)

id name instance_permission_id
1 owner 1
2 chapter_administrator 2
3 member 3
id name chapter_permission_id
1 administrator 1
2 member 2
id name event_permission_id
1 member 1

The relative strengths and weaknesses of these two approaches are basically the pros and cons of #1919 so we can discuss them there.

@gikf
Copy link
Member

gikf commented Jan 3, 2023

Another reason for this is the ability to easily view what permissions have certain role, and what permissions exist, without having live db running. With each new migration (manual or not) this is going to get harder.

@ojeytonwilliams
Copy link
Contributor Author

Agreed. I'm reasonably convinced it's the way to go, but it's a huge migration, so it's not something I want to undertake until the MVP is reasonably stable.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Discussion Ideas, feature requests, views on features. Anything which is a discussion.
Projects
None yet
Development

No branches or pull requests

3 participants