Version 2 RFC #125

Open
DaddyWarbucks opened this issue Nov 11, 2022 · 4 comments
@DaddyWarbucks

I have whipped up a new version of this package to add some features and solve some shortcomings of the current version. One of my main motivations in doing so is how well this package can work with the new version of feathers-dataloader. There is now a good reason to use this package on the server, because it allows sharing a dataloader across all batched services.

Problem

Server

  • The batch service only accepts ['create', 'users', ...] arguments in the calls property. This means that to use it serverside, the developer has to manually convert their calls to this syntax.
  • The batch service converts all errors to JSON. When using this serverside, it should use real/original errors.

Client

  • Lack of control over which services are batched, and when.
  • Batches cannot be deduped.

Solution

Server

1 - The batch service now accepts an array of promises and/or JSON config. This makes it much easier for the developer to use on the server.

// server.js

// Returns Promise.allSettled() like results with JSON errors. But now it takes service promises too.
const results = await app.service('batch').create({
  calls: [
    app.service('users').get(1),
    app.service('posts').find(),
    // Still supports JSON args
    ['get', 'users', ...]
  ]
});

2 - Two convenience methods, service.all() and service.allSettled(), have been added that re-parse errors and provide a shorter, more familiar syntax.

// server.js

// Returns Promise.all() like results, but throws with a real error
const results = await app.service('batch').all([
  app.service('users').get(1),
  app.service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
]);

// Returns Promise.allSettled() like results with real errors
const results = await app.service('batch').allSettled([
  app.service('users').get(1),
  app.service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
]);
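For illustration, the error re-parsing these helpers do could be sketched as follows. This is a sketch only: settledToAll is a hypothetical name, and a plain Error stands in for the rehydrated Feathers error class.

```javascript
// Hypothetical helper: convert Promise.allSettled()-style results
// (as returned by the batch service) into Promise.all() semantics,
// rethrowing the first real error instead of returning a JSON error.
const settledToAll = (settled) => {
  return settled.map((result) => {
    if (result.status === 'rejected') {
      // The real service would rehydrate the proper Feathers error
      // class here; a plain Error stands in for it in this sketch.
      throw new Error(result.reason.message);
    }
    return result.value;
  });
};
```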

Client

There are 3 new solutions here to offer the developer more control over the batching process. Two of them use a BatchManager. The third does not use the BatchManager and is a very explicit way to create batches easily.

1 - Manually create batches. This batchMethods plugin does not attempt to capture batches in a BatchManager; instead it simply adds two methods to the client side batch service to make it easier to use. This is super low commitment from the developer: they don't have to worry about excluding services, skipping batch calls, timeouts, etc. Just a convenient way to make batches. But I don't love this...the whole service callback thing is weird.

// client.js
const { batchMethods } = require('feathers-batch/client')

app.configure(batchMethods({ batchService: 'batch' }));

// `service` is a class that DUCKS like a regular service, but actually returns JSON args
// Returns Promise.all() like results, but throws with a real error
app.service('batch').all((service) => {
  return [
   service('users').get(1),
   service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
  ]
});

// `service` is a class that DUCKS like a regular service, but actually returns JSON args
// Returns Promise.allSettled() like results with real errors
app.service('batch').allSettled((service) => {
  return [
   service('users').get(1),
   service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
  ]
});

2 - Rather than a plugin, the developer can use the batchHook directly. This allows them to use different BatchManagers with different timeouts/configs across multiple services, or even multiple methods. This is the most powerful solution, but also the most tedious to set up, especially because many developers may not "set up" all of their remote services. This does not automatically add batchMethods; you would still need to use that plugin if you want those methods. This can be paired with batchClient because the hooks will override the batchClient's config, so it's nice to use both.

const { batchHook } = require('feathers-batch/client')

const batchHook1 = batchHook({ batchService: 'batch', timeout: 25 });
const batchHook2 = batchHook({ batchService: 'batch', timeout: 100 });

app.service('users').hooks({
   before: {
     get: [batchHook1],
     create: [batchHook2]
   }
});

3 - This is a rewrite of the batchClient to solve its main issue. Which was that it used an app.before.all hook. Instead the plugin now uses app.mixins to modify each service's methods to use the BatchManager. This automatically places the batching mechanism in the right place so that all clientside hooks work as expected. This does not automatically add batchMethods, you would still need to use that plugin if you want those methods. This can be paired with batchHook. The hooks will override this manager.

// client.js
const { batchClient } = require('feathers-batch/client')

app.configure(batchClient({ batchService: 'batch' }));

// Automatically batched, same as the old version
app.service('users').get(1);
app.service('posts').find();

// You can also manually skip batches now
app.service('users').get(1, { batch: false });
app.service('posts').find({ batch: false });
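For reference, the mixin approach could be sketched roughly like this. This is a sketch only: makeBatchMixin is a hypothetical name, the manager is a stub that just records calls, and only get is wrapped for brevity.

```javascript
// Sketch: wrap a service method so calls are routed through a
// manager instead of going out individually.
const makeBatchMixin = (manager) => (service) => {
  const originalGet = service.get.bind(service);
  service.get = (id, params = {}) => {
    // Allow opting out per call, mirroring { batch: false }
    if (params.batch === false) {
      return originalGet(id, params);
    }
    return manager.add(['get', id]);
  };
};

// Stub manager that just records the calls it would batch
const manager = {
  calls: [],
  add(call) {
    this.calls.push(call);
    return Promise.resolve(call);
  }
};

const service = { get: (id) => Promise.resolve({ id }) };
makeBatchMixin(manager)(service);
```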

  • Deduping - The client now dedupes requests to the server.
// client.js

// Automatically deduped
app.service('users').get(1);
app.service('users').get(1);

  • This new client rewrite also opens the door for socket users to gain value from it as well. Previously this was not advantageous for socket users because sockets are already fast, and this could actually cause a slowdown while you wait for the timeout. But now you can mix and match config to make this more advantageous by sharing loaders on the server. This is particularly helpful in hooks and resolvers as well.
// Disable batching unless a batchManager is explicitly passed.
const disableBatch = (ctx) => {
  if (!ctx.params.batchManager) {
    ctx.params.batch = false;
  }
  return ctx;
}

app.hooks({
  before: {
    all: [disableBatch]
  }
});

// Now I can use custom one off managers

const properties = {
  user: (result, context) => {
    const { batchManager } = context.params;
    return context.app.service('users').get(result.userId, { batchManager });
  },
  comments: (result, context) => {
    const { batchManager } = context.params;
    return context.app.service('comments').get(result.commentIds, { batchManager });
  }
}

const myResolveHook = context => {
  const batchManager = new BatchManager({ batchService: 'batch', timeout: 0.01 });
  const ctx = {
    ...context,
    params: {
       ...context.params,
       batchManager
    }
  }
  return resolve(properties, ctx);
}
@DaddyWarbucks

I've also added more powerful exclude abilities:

// Skip individual calls. Overrides all other configs
app.service('users', { excludeBatch: true });
app.service('users', { excludeBatch: context => true });

// Configure batchClient/batch
batchClient({ batchService: 'batch',  excludeBatch: ['users'] });
batchClient({ batchService: 'batch',  excludeBatch: context => true });

// Configure services to ignore batching
app.use('users', memory({ excludeBatch: true }));
app.use('users', memory({ excludeBatch: context => true }));

@DaddyWarbucks

DaddyWarbucks commented Nov 17, 2022

Some updates

I have removed all server changes. I was misguided in my attempt to make this work differently on the server. I have some ideas about a related feature that would work on the server and benefit loaders, but I don't think this is the right place for that idea. So service.all and service.allSettled have been removed from the server. There is still room for a concept like this (on the client and/or server), but I don't love the syntax...and the new updates allow for some other ways to do this too.

// `service` is a class that DUCKS like a regular service, but actually returns JSON args
// Returns Promise.all() like results, but throws with a real error
batchManager.all((service) => {
  return [
   service('users').get(1),
   service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
  ]
});

// `service` is a class that DUCKS like a regular service, but actually returns JSON args
// Returns Promise.allSettled() like results with real errors
batchManager.allSettled((service) => {
  return [
   service('users').get(1),
   service('posts').find(),
   // Still supports JSON args
   ['get', 'users', ...]
  ]
});

I have accomplished the following, which are all good things that could be shipped even w/o a major bump.

  • Dedupe requests. This uses the same stableStringify trick as feathers-dataloader to create a key to compare against. Works as expected and is a great addition. Right now there is no cacheParamsFn that allows you to manipulate which params are stringified because this library currently only sends the query. More on that later.
  • Better exclude option. exclude now takes an array or a (possibly async) function like (context) => boolean. It's called on every request, so you can now exclude calls explicitly.
  • dedupe option. dedupe takes a boolean or a (possibly async) function like (context) => boolean. It's called on every request, so you can now dedupe (or not) calls explicitly.
  • Configure via a hook. This allows you to configure the manager on each service and each method for more control.
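The stableStringify trick mentioned above can be sketched as JSON serialization with sorted object keys, so equivalent queries produce the same dedupe key. This is a simplified stand-in, not the actual feathers-dataloader implementation, and dedupeKey is a hypothetical name.

```javascript
// Simplified stable stringify: recursively sort object keys so that
// { a: 1, b: 2 } and { b: 2, a: 1 } produce the same string.
const stableStringify = (value) => {
  if (Array.isArray(value)) {
    return '[' + value.map(stableStringify).join(',') + ']';
  }
  if (value && typeof value === 'object') {
    const keys = Object.keys(value).sort();
    return '{' + keys
      .map((k) => JSON.stringify(k) + ':' + stableStringify(value[k]))
      .join(',') + '}';
  }
  return JSON.stringify(value);
};

// A dedupe key combines method, path, and the stringified args
const dedupeKey = (method, path, args) =>
  `${method}:${path}:${stableStringify(args)}`;
```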

Problems
The main hangup I have run into (...again) is that client side REST/Socket services do not support underscored methods. This would be super helpful in making this library work with minimal config. We want this to work without mangling client side hooks, which is not possible without manually adding batchHook to every service. So I am inclined to update the REST/Socket client services to allow this. But that is definitely a major bump.

A video of me rambling about this. I ran out of time again...but you get it.
https://www.loom.com/share/c7e6f2ce590341dbac79ca0d833d9de9

@daffl @marshallswain @fratzinger Any thoughts on this stuff?

@DaddyWarbucks

DaddyWarbucks commented Nov 17, 2022

And a fun side note. Technically the batchClient could be used on the server. Note this is different than my previous comments about "used on the server"; I have abandoned that idea. I am referring to using batchClient on the server. The current version could also have been used on the server, but I wouldn't have, given its lack of configuration options. With the exclude option being more powerful, you could more sensibly use this on the server.
This does then accomplish what I wanted for "use it on the server", the main goal of which was to take advantage of AsyncLocalStorage. See https://github.com/feathersjs-ecosystem/dataloader/blob/main/docs/guide.md#reusing-loaders for more details.

This would require a new property of dedupeFn, which would be similar to https://github.com/feathersjs-ecosystem/dataloader/blob/main/docs/guide.md#params-caching. This is currently not supported because the batch service really only accepts query in its payload. But, if we updated the service to accept more params and added some way for the user to customize the key...batching on the server.

// server.js
// ...express/koa setup, etc.

const exclude = (context) => {
  // Maybe you want it to only work on the server
  // Maybe you want it to only work on external requests...whatever
  if (context.params.provider) {
    return true;
  }
  // Maybe only work on get/find
  if (!['get', 'find'].includes(context.method)) {
    return true;
  }
  
  // Other basic stuff
  if (context.path === 'authentication') {
    return true;
  }
  
  return false;
}

const dedupeFn = (context) => {
  return {
    query: context.params.query,
    user: context.params.user
  }
}

app.configure(batchClient({
  timeout: 5,
  batchService: 'batch',
  exclude: exclude,
  dedupeFn: dedupeFn,
  dedupe: true
}))

@DaddyWarbucks

There should also be a maxBatchSize option. Otherwise a client (or clients) can load up many requests within the same batch, which is basically a DoS attack. This is OK for the client to attempt; we can't stop them from doing that. But the server should be updated to maintain some queue that limits the batch size and execution.
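A maxBatchSize guard on the server could be as simple as chunking incoming calls before execution. This is a sketch only; chunkCalls is a hypothetical helper, and the actual implementation might queue chunks rather than just splitting them.

```javascript
// Sketch: split an incoming calls array into chunks of at most
// maxBatchSize so the server never executes an unbounded batch at once.
const chunkCalls = (calls, maxBatchSize) => {
  const chunks = [];
  for (let i = 0; i < calls.length; i += maxBatchSize) {
    chunks.push(calls.slice(i, i + maxBatchSize));
  }
  return chunks;
};
```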
