Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove resource hooks in favor of resource middleware #259

Open
smyrman opened this issue Sep 4, 2019 · 12 comments
Open

Remove resource hooks in favor of resource middleware #259

smyrman opened this issue Sep 4, 2019 · 12 comments
Labels
proposal A suggestion for change that has not been accepted

Comments

@smyrman
Copy link
Collaborator

smyrman commented Sep 4, 2019

Hooks have several problems and limitations compared to middleware:

  • We need two methods (before and after hooks)
  • We can't control the flow as detailed as we would want (Find hooks can modify the query, Get hooks can't)
  • We can't abort/change the flow without raising an error (E.g. call patch instead of delete/clear to implement soft delete).

With middleware we get more power:

  • We can do both pre and post operations within a single method.
  • We can control the flow, e.g. call patch instead of delete to implement soft delete.
  • We can always modify the query.

I want to suggest completely removing resource hooks in favor of middleware. This may involve making resource.Resource an interface that can be wrapped, or we might decide it's sufficient to wrap the storage layer and document that as the way to implement hooks. It could also involve changing the methods of a resource to be settable (except for wrapper methods such as MultiGet), setting the methods to the storage layer method on initalization, and wrapping them when Resource.Use is called.

@smyrman smyrman added the proposal A suggestion for change that has not been accepted label Sep 4, 2019
@rs
Copy link
Owner

rs commented Sep 4, 2019

The issue I see with middleware is that it makes it harder to use optional interfaces. We use those on storer for instance, IIRC.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 4, 2019

I think that's solvable with clousure defaults, however I think we can probably just drop the optional interfaces all-together and make things easier. To make writing middleware easier we can also drop the Get and Delete methods:

type Storer interface{
	Count(ctx context.Context, q *query.Query) (int, error) // merged from Counter interface
	Find(ctx context.Context, q *query.Query) (*ItemList, error)
	Clear(ctx context.Context, q *query.Query) (int, error)
	Insert(ctx context.Context, items []*Item) error
	Replace(ctx context.Context, item *Item, original *query.Query) error // original as query
}

Middleware we could now wrote:

  • Simple and safe permission checks that only alter the query (for all methods).
  • Soft delete can alter the flow without raising errors, e.g. call Patch on parent.

The resource could still provide convenience methods for Delete, Get and MultiGet by ID on top of Clear and Find.

I am imagining we still call a method on Resource to apply middleware. E.g.:

// Type aliases for functions.
type (
	CountFunc = func(ctx context.Context, q *query.Query) (int, error)
	ClearFunc = CountFunc
	FindFunc = func(ctx context.Context, q *query.Query) (*ItemList, error)
	InsertFunc = func(ctx context.Context, items []*Item) error
)

type CountMiddleware struct {
	CountFunc(parent Resource) CounterFunc
}

type FindMiddleware struct {
	FindFunc(parent Resource) FindFunc
}

// ...

type Resource struct {
	count CountFunc
 	find FindFunc
	clear ClearFunc
	insert InsertFunc
	// ...
}


// newResource creates a new resource with provided spec, handler and config.
func newResource(name string, s schema.Schema, h Storer, c Conf) Resource {
	return Resource{
		count: h.Count,
		find: h.Find,
		// ...
	}
}

func (r Resource) WithMiddleware(m interface{}) Resource {
	parent := r // ensure all middleware functions on a given m get's passed the same parent.
	if t, ok := m.(CountMiddleware); ok {
		r.count = t.CountFunc(parent)
	}
	if t, ok := m.(FindMiddleware); ok {
		r.find = t.FindFunc(parent)
	}
	// ...
	return r
}

func (r Resource) Get(id interface{}) (*Item, error) {
	// Call Find with limit 1, report 0 items found as error
}
func (r Resource) Delete(id interface{}) error {
	// Call Clear with limit 1, report 0 deleted as error
}

@rs
Copy link
Owner

rs commented Sep 5, 2019

Dropping MultiGet on storer would prevent some storage from implementing some interesting optimizations.

If you can't add a middleware for the Get method, how do you handle ACL for resource per id?

In your example you can not chain middleware right? Calling WithMiddleware more than once would override existing middleware. Also, I'm not a big fan of m interface{} argument.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 5, 2019

In your example you can not chain middleware right? Calling WithMiddleware more than once would override existing middleware. Also, I'm not a big fan of m interface{} argument.

No, it can be chained; r Resource is not a pointer reciever, and is thus copied each time it's passed to a new variable or function.

r := newResource("r", s, h, c)
r2 := r.WithMiddleware(b).WithMiddleare(c) // normally you would do r =, not r2 :=

Given b and c both implement a given middleware interface, and both of the returned functions call parent, calling the function defined on r2 would result in:

  1. function from middleware c is issued
  2. function from middleware b is issued
  3. the original method from h is issued

The original r remains unchanged if not explicitly overwritten (resources are now immutable).
This does mean they need to be bound to an index after they are fully defined.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 5, 2019

If you can't add a middleware for the Get method, how do you handle ACL for resource per id?

Through the middleware defined for the Find method; resources where we lack access won't be found.

m interface{} argument

This is the same as what we get for hooks today:

// Use attaches an event handler to the resource. This event handler must
// implement on of the resource.*EventHandler interface or this method returns
// an error.
func (r *Resource) Use(e interface{}) error {
	return r.hooks.use(e)
}

But agree it would be nice with a concrete interface, but then you can't allow middleware for multiple operations to be passed in the same function call.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 5, 2019

m interface{} argument

Maybe this is better:

type MiddlewareFunc func(r Resource) Resource

We don't even need to define this, it could just be by convention.

Example middleware:

func WithSoftDelete(r Resource) Resource {
	parent := r  // capture parent
	r.Clear = func(ctx context.Context, q *query.Query) (int, error) {
		deleteTime := time.Now()
		items, err := parent.Find(ctx, q)
		if err != nil {
			return 0, err
		}
		for i, item := range items.Items  {
			origQ, err := query.Predicate{
				query.Equal{"_etag", item.Etag},
				query.Equal{"id": item.ID},
			}
			item.Payload["deletedAt"] = deleteTime
			item.UpdateEtag() // recalculate E-tag based on item changes.
			err := parent.Replace(item, origQ)
			if err != nil {
				return i, err
			}
		}
		return len(items.Items), nil
	}
	return r
}

Ideally Replace would also be defined as a bulk operation somehow to allow doing this kind of things as a transaction internally in the storage layer.

Usage:

r := resource.New(schema, h) // pass in only schema and Storer?
r = WithSoftDelete(r)
r = WithPermissions(r)
idx.Bind("path", r, conf) // Pass in resource config during bind?
// ...
idx.Bind("path.sub", r2, conf)

When to pass in path and config (during initialization or bind) can be determined later; if middleware need access to it, it must be during New. If it's safe to put the information on the index only, it can happen during Bind.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 5, 2019

Dropping MultiGet on storer would prevent some storage from implementing some interesting optimizations.

If we could get a simpler / safer access control, and get to implement soft-delete, I would say it's likely worth the price.

Storage layers could still optimize fetching multiple items by ID, but would need to inspect the Find predicate to do so rather than implementing an additional interfaces. I don't know which storage engine you had in mind that can do a multi-get faster than a normal find with id :{$in: [...]}?

@rs
Copy link
Owner

rs commented Sep 5, 2019

That's my point, it is expensive to introspect a query to see if an optimization is possible. Same for ACL, how do you put an ACL on /obj/<id> without a hook on "get"?

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 6, 2019

Same for ACL, how do you put an ACL on /obj/ without a hook on "get"?

Get wraps Find (which already have the right middleware set).

  1. Calling Get(ctx, id) results in a Find(ctx, q.Query{Predicate: query.Equal{"id", id}, Window:query.Window{Limit: 1})
  2. Find issues the relevant permission middleware to AND in the relevant permission criteria.
  3. If Find returns 0 results, Get returns "not fond".

The same is true for Delete which would wrap a Clear call with all relevant middleware applied.

Our own permission syte, isn't actually ACL; permissions are stored on "protected fields" on the object, or on related objects that are fetched before the resource itself, but I don't get how this would be any different with an ACL permission system.

That's my point, it is expensive to introspect a query to see if an optimization is possible.

For the case in particular, it's something like this:

// AsIDQuery returns an IDQuery or nil.
AsIDQuery(q *query.Query) *IDQuery {
    if len(q.Predicate) != 1 {
        return nil
    }
    in, ok := q.Predicate[0].(query.In)
    if !ok || in.Field != "id" {
        return nil
    }
    return *IDQuery{IDs: in.Values, Window: q.Window}
}

I could agree it being complex, but I don't buy that it's going to be expensive (depending on weather Expressions are allowed as In values). As a compromise, one could also add a query.Expression or query.Predicate type specifically for matching against one or more IDs, but then other storage backends would need to know that it can be replaced with an in query.

Either way, the best thing about it complexity wise, is that the complexity is moved from the rest-layer end-user to the rest-layer storer backend implementation, and that's a good thing.

To reiterate on the last point from our own experience on writing hooks, it's really hard and limiting to work with for advanced use-cases. For the permission hooks in particular, we need to:

  1. Modify the query on Find / Clear / Count operations.
  2. On Get, we can't modify the query (we can't redirect to a Find call), and we need to wait until we have fetched the resource, and then match our permission query on the fetched item in code.
  3. On Replace, we need to introspect the original item through reflect.DeepEqual to check that the permissions (or any other protected field) hasn't changed (unless we have permissions to change it).

With the new proposed interface, our permission system in particular only need to alter queries. On replace for instance, we could alter the original query to validate that the relevant protected fields are either not-changed, or that we have the right permissions.

Writing a proper Soft-delete, as explained by #225, just isn't possible through hooks. Probably we could do_ almost_ everything we want as middleware directly on top of the Storage layer, but we have so far chosen to use the documented hooks instead.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 6, 2019

Same for ACL, how do you put an ACL on /obj/ without a hook on "get"?

OK I get what you mean now. Sorry for being slow. You refer to if, e.g. a user has get:resource permissions but not list:resource permissions for a given resource. I was thinking about filtering out specific items that the user has permission to, and not wether or not the operation as a whole is allowed for a particular user.

This is a good point, and need to be supported.

I want to remind you though, that the current implementation of GET resource/id/ actually calls Resource.Find with a filter on ID and a limit. The only part of rest-layer currently using Get, is field embedding (within the default MultiGet implementation).

On that note, since PATCH needs to fetch the original resource to perform it (and would not want to trigger ACL checks when doing so) I would say ACL to allow/disallow certain operations on a resource (either for a user or globally), should perhapsto be implemented at the rest layer (e.g. as HTTP middleware), and not as resource layer checks.

@rs
Copy link
Owner

rs commented Sep 6, 2019

Ok, if we go that route, why not simplifying it even more and just let people wrap storer to build middleware like it is down with HTTP? This way there is no complexity to be added at the rest-layer layer.

@smyrman
Copy link
Collaborator Author

smyrman commented Sep 6, 2019

Yeah, Probably thats fair.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal A suggestion for change that has not been accepted
Projects
None yet
Development

No branches or pull requests

2 participants