Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1348] initial groupBy implementation #1364

Merged
merged 2 commits into from
Feb 27, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
37 changes: 37 additions & 0 deletions lib/groupBy.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
import doLimit from './internal/doLimit';
import groupByLimit from './groupByLimit';

/**
* Returns a new object, where each value corresponds to an array of items, from
* `coll`, that returned the corresponding key. That is, the keys of the object
* correspond to the values passed to the `iteratee` callback. Note: Since this
* function applies the `iteratee` to each item in parallel, there is no
* guarantee that the `iteratee` functions will complete in order and there is
* no guarantee that grouped items will appear in the same order as in `coll`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reconsidering about groupBy again I question whether its better implemented synchronously after all the asynchronous work has been resolved, as making the order fixed in an asynchronous processing flow would make things much more complicated

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, do you mean kinda like ./internal/filter.js where it asynchronously transforms the values, but synchronously filters them afterwards? I could look into that tomorrow.

*
* @name groupBy
* @static
* @memberOf module:Collections
* @method
* @category Collection
* @param {Array|Iterable|Object} coll - A collection to iterate over.
* @param {Function} iteratee - A function to apply to each item in `coll`.
* The iteratee is passed a `callback(err, key)` which must be called once it
* has completed with an error (which can be `null`) and the `key` to group the
* value under. Invoked with (value, callback).
* @param {Function} [callback] - A callback which is called when all `iteratee`
* functions have finished, or an error occurs. Result is an `Object` whoses
* properties are arrays of values which returned the corresponding key.
* @example
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a decent example.

*
* async.groupBy(['userId1', 'userId2', 'userId3'], function(userId, callback) {
* db.findById(userId, function(err, user) {
* if (err) return callback(err);
* return callback(null, user.age);
* });
* }, function(err, result) {
* // result is object containing the userIds grouped by age
* // e.g. { 30: ['userId1', 'userId3'], 42: ['userId2']};
* });
*/
export default doLimit(groupByLimit, Infinity);
44 changes: 44 additions & 0 deletions lib/groupByLimit.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
import noop from 'lodash/noop';
import doParallelLimit from './internal/doParallelLimit';

/**
* The same as [`groupBy`]{@link module:Collections.groupBy} but runs a maximum of `limit` async operations at a time.
*
* @name groupByLimit
* @static
* @memberOf module:Collections
* @method
* @see [async.groupBy]{@link module:Collections.groupBy}
* @category Collection
* @param {Array|Iterable|Object} coll - A collection to iterate over.
* @param {number} limit - The maximum number of async operations at a time.
* @param {Function} iteratee - A function to apply to each item in `coll`.
* The iteratee is passed a `callback(err, key)` which must be called once it
* has completed with an error (which can be `null`) and the `key` to group the
* value under. Invoked with (value, callback).
* @param {Function} [callback] - A callback which is called when all `iteratee`
* functions have finished, or an error occurs. Result is an `Object` whoses
* properties are arrays of values which returned the corresponding key.
*/
export default doParallelLimit(function(eachFn, coll, iteratee, callback) {
callback = callback || noop;
coll = coll || [];
var result = {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about having result = Object.create(null)? Then instead of hasOwnProperty.call(...) down below, you could just do key in result.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to avoid using Object.create(null) as the behaviour can not be emulated consistently in older environments. I prefer using standard objects + hasOwn (hasOwn tends to be faster anyway last time I checked)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind using Object.create(null) internally, but I'd also prefer to avoid using it for objects returned to the user. The user might use something like result.hasOwnProperty(anImportantKey) somewhere later in the code, which would result in a TypeError.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object.create(null) is ES5, which is our minimum required version as of 2.0. @hargasinski brings up a good point though, best not to use it because of that.

// from MDN, handle object having an `hasOwnProperty` prop
var hasOwnProperty = Object.prototype.hasOwnProperty;

eachFn(coll, function(val, _, callback) {
iteratee(val, function(err, key) {
if (err) return callback(err);

if (hasOwnProperty.call(result, key)) {
result[key].push(val);
} else {
result[key] = [val];
}
callback(null);
});
}, function(err) {
callback(err, result);
});
});
23 changes: 23 additions & 0 deletions lib/groupBySeries.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
import doLimit from './internal/doLimit';
import groupByLimit from './groupByLimit';

/**
* The same as [`groupBy`]{@link module:Collections.groupBy} but runs only a single async operation at a time.
*
* @name groupBySeries
* @static
* @memberOf module:Collections
* @method
* @see [async.groupBy]{@link module:Collections.groupBy}
* @category Collection
* @param {Array|Iterable|Object} coll - A collection to iterate over.
* @param {number} limit - The maximum number of async operations at a time.
* @param {Function} iteratee - A function to apply to each item in `coll`.
* The iteratee is passed a `callback(err, key)` which must be called once it
* has completed with an error (which can be `null`) and the `key` to group the
* value under. Invoked with (value, callback).
* @param {Function} [callback] - A callback which is called when all `iteratee`
* functions have finished, or an error occurs. Result is an `Object` whoses
* properties are arrays of values which returned the corresponding key.
*/
export default doLimit(groupByLimit, 1);
9 changes: 9 additions & 0 deletions lib/index.js
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,9 @@ import filter from './filter';
import filterLimit from './filterLimit';
import filterSeries from './filterSeries';
import forever from './forever';
import groupBy from './groupBy';
import groupByLimit from './groupByLimit';
import groupBySeries from './groupBySeries';
import log from './log';
import map from './map';
import mapLimit from './mapLimit';
Expand Down Expand Up @@ -128,6 +131,9 @@ export default {
filterLimit: filterLimit,
filterSeries: filterSeries,
forever: forever,
groupBy: groupBy,
groupByLimit: groupByLimit,
groupBySeries: groupBySeries,
log: log,
map: map,
mapLimit: mapLimit,
Expand Down Expand Up @@ -220,6 +226,9 @@ export {
filterLimit as filterLimit,
filterSeries as filterSeries,
forever as forever,
groupBy as groupBy,
groupByLimit as groupByLimit,
groupBySeries as groupBySeries,
log as log,
map as map,
mapLimit as mapLimit,
Expand Down