Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v4.0 #137

Closed
jonschlinkert opened this issue Aug 27, 2018 · 49 comments
Closed

v4.0 #137

jonschlinkert opened this issue Aug 27, 2018 · 49 comments

Comments

@jonschlinkert
Copy link
Member

jonschlinkert commented Aug 27, 2018

v4.0

Goals for 4.0

  • Targeted for first week of September, 2018
  • massive reduction in dependencies - no more snapdragon.
  • Load time - currently around 1.5ms
  • Fast first match - less than 5ms from a cold start
  • opt-in caching

v3.0 post-mortem

What we did well in 3.0

  • accuracy
  • parity with bash (ignoring cases where certain bash/shell behavior doesn't make sense in node.js)
  • in-memory performance

What needs to be improved

  • number dependencies
  • load time (first require call)
  • "first match"

Thoughts on the approach I used in 3.0, what I liked about it, and what I didn't like...

  • I used snapdragon for parsing and compiling the regexes in 3.0. This was a double-edged sword in that it helped me discover better strategies for creating the regular expressions, but on the other hand, it added bloat to micromatch that I don't want.
  • inherited parsers - micromatch 3.0 is built on top of other single-responsibility matching libraries: extglob, expand-brackets, braces, and nanomatch (which only does basic globs/wildcards). Each of those libraries attempts to follow the conventions and rules of the equivalent bash module. This was a fun idea, and it worked out pretty well. I was able to build specific parsing and compiling rules in each library, then use those libraries essentially as plugins in micromatch. In some ways, this patterns has also made it easier to fix bugs related to matching and syntax, since it's more clear when a specific rule is being broken in the equivalent bash module.
  • when v3.0 was published, initialization time was around 20ms, and there were far fewer total dependencies in the tree. We started upgrading snapdragon in a couple of the single-responsibility matching libs that micromatch depends on, then we quickly realized that a couple of the libs (one was braces I think) would take more time to upgrade than we anticipated. I had to decide whether or not it was worthwhile spending time to make this upgrades versus a complete overhaul. This is ultimately what made me strongly dislike the approach I took to "building up the matchers" in 3.0. It started out as a neat idea, and made it easier to get the accuracy I wanted, but it has the horrible side effect of multiplying dependencies when version mismatches happen.
@phated
Copy link
Member

phated commented Aug 27, 2018

@jonschlinkert great post! And I like the goals set forth for 4.0 ❤️

@jonschlinkert
Copy link
Member Author

thanks @phated!

@jimmywarting
Copy link

jimmywarting commented Sep 5, 2018

massive reduction in dependencies

You could reduce some more if you target node LTS versions meaning no more support for node < 6

for example: use Object.assign instead of extend-shallow

@jonschlinkert
Copy link
Member Author

You could reduce some more if you target node LTS versions meaning no more support for node < 6

The goal is for micromatch v4.0 to only have two dependencies: one for brace expansion, and one for parsing globs and creating regular expressions. The latter won't have any dependencies of its own. The brace expansion library (braces) has other dependencies, but we will tackle that to reduce deps next.

@paulmillr
Copy link
Member

Love it!

@paulmillr
Copy link
Member

@jonschlinkert I want to include v4.0 in readdirp. When do you want to release it?

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Sep 13, 2018 via email

@paulmillr
Copy link
Member

Yes sir. Please optimize first.

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Sep 13, 2018 via email

@SimenB
Copy link

SimenB commented Oct 16, 2018

@jonschlinkert any news here? 🙏

@jonschlinkert
Copy link
Member Author

@jonschlinkert any news here?

Yes! I've spent quite a bit of time on unit tests and optimization. I'll try to get something pushed up to a branch this weekend. Then we can all agree on whether it's ready to publish.

Thanks for the ping!

@SimenB
Copy link

SimenB commented Oct 17, 2018

Awesome news! Looking forward to land this in Jest 24 🙂

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Oct 22, 2018 via email

@SimenB
Copy link

SimenB commented Oct 22, 2018

Yeah, GitHub has had a lot of issues for the last 20 hours (still ongoing, even though it's better https://status.github.com/messages)

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Oct 22, 2018 via email

@paulmillr
Copy link
Member

@jonschlinkert sup

@paulmillr
Copy link
Member

@jonschlinkert ping.

@jonschlinkert
Copy link
Member Author

@paulmillr hey sorry I missed your last ping. I didn't forget, I've been porting tests. I'll just go ahead and push something up tonight for everyone to see.

@jonschlinkert
Copy link
Member Author

I just pushed up https://github.com/micromatch/picomatch, the matcher function that I'll be using in micromatch.

  • I started working on optimizations, but I'm not finished. i.e. it's fast but will get faster.
  • Micromatch will continue to support full brace expansion. Picomatch only supports brace matching, but not expansion.

Why another matching lib?

It was the fastest way for me to get micromatch upgraded, reduce deps, and get feedback on matching before we release the next version of micromatch.

@paulmillr
Copy link
Member

Nice! Thank you!

@SimenB
Copy link

SimenB commented Nov 27, 2018

Sorry to nag, but any updates here? Is it possible to release and land optimizations later, perhaps? 🙂 Really excited about the direction here!

@jonschlinkert
Copy link
Member Author

Sorry to nag, but any updates here? Is it possible to release and land optimizations later, perhaps? 🙂 Really excited about the direction here!

No worries, thanks for the reminder. Will do, I'll get something published ASAP.

@jonschlinkert
Copy link
Member Author

just wanted to mention that I haven't forgotten, I was stuck on a windows-related matching issue that I spent a great deal of time trying to figure out. Over the next few days, I'll finally have time to get this finished!

@SimenB
Copy link

SimenB commented Jan 3, 2019

Happy new years! 🎉 Any news on this?

@SimenB
Copy link

SimenB commented Jan 26, 2019

Jest 24 is out with micromatch v3 (so we upgraded from 2 to 3). From my understanding of the OP, we should be able to upgrade to v4 when it's released without it being a breaking change for consumers. Had to hack around the snapdragon dep for the browser builds though, so still looking forward to this 😀

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Jan 26, 2019 via email

@paulmillr
Copy link
Member

@jonschlinkert hey Jon, thanks for your work — do you think we can get the release ball rolling in February? We want to release chokidar v3 without node-gyp and radically decrease its bundle size before next nodejs v8.x release. That would be really useful!

@jonschlinkert
Copy link
Member Author

@paulmillr yes I think I can make that happen!

@SimenB
Copy link

SimenB commented Mar 4, 2019

Hey again! Sorry about the nagging, but the require time has been identified as a pretty severe performance regression in the latest version of Jest: jestjs/jest#8032. What's the current status of v4?

If you're working on edge cases, are they edge casey enough that you can maybe release an alpha? :)

Thanks for your hard work at this, I don't mean to sound unappreciative

@thymikee
Copy link

thymikee commented Mar 4, 2019

Is there anything we can help with to make it happen? 🙂

@jonschlinkert
Copy link
Member Author

Hey again! Sorry about the nagging

No worries! I don't mind, and I appreciate that you care and have been so patient. I've actually been working on this for the past several hours. I finally figured out how to get around this windows issue!! I'm going to keep working on this until it's released!

@SimenB
Copy link

SimenB commented Mar 4, 2019

Awesome news, thanks @jonschlinkert!

@jonschlinkert
Copy link
Member Author

@SimenB I'm still working on it. Spent the last couple of days fixing a bug. I'll be pushing up something today, then I should be able to publish a beta. FWIW I know it seems like I'm probably not doing anything on this, but there was a bug that I had a really hard time figuring out for some reason. I ended up having to refactor the entire logic for stars and globstars. I finally have it working!

@SimenB
Copy link

SimenB commented Mar 8, 2019

Super exciting @jonschlinkert! Will there be anything you'd consider breaking changes in this release?

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Mar 14, 2019

Alright, I really had every intention of pushing this up on Friday, and now I've spent the last several days on it. This ended up being such a pain to fix that I started from scratch to get rid of some code debt. I'm glad I did, it helped me figure out a great parsing strategy that made picomatch faster by 200-400%, without caching! All of the unit tests are now passing, and I'm updating the docs!

More importantly, the code is now much easier to maintain and understand, which means it will take me less time to fix bugs and make updates. I'll add a description of the parsing strategy to the readme.

Super exciting @jonschlinkert!

Thank you! I'm starting to think so too now that I refactored :)

Will there be anything you'd consider breaking changes in this release?

I'm going through commits and creating docs now. I'll have a better (more confident) answer for you once I get a little further. In the meantime, here is a preview of some of the new features that (IMHO) make this release worth the wait and really nice to use:

const micromatch = require('micromatch');
const paths = [];

// custom function to modify each string before it's tested
const format = str => str.replace(/\\/g, '/').replace(/^\.\//, '');

// custom function that is called on each successful match
const onMatch = ({ pattern, regex, input, value }) => {
  // pattern - the original glob
  // regex - the regex created from the glob
  // input - the unmodified input string to be matched
  // value - the "formatted" input string to be returned
  console.log({ pattern, regex, input, value });
};

// custom function that is called on each string that matches an ignore pattern
const onIgnore = ({ pattern, regex, input, value }) => {
  console.log({ pattern, regex, input, value });
};

const options = { onMatch, onIgnore, format, ignore: ['*/baz'] };
const matches = micromatch(['foo\\bar', 'foo\\baz', './abc/xyz'], '**', options);
console.log(matches);
//=> ['foo/bar',  'abc/xyz']

...and here are a few of the benchmarks with caching disabled!

image

@SimenB
Copy link

SimenB commented Mar 14, 2019

Woah, absolutely bonkers numbers on those benchmarks! Awesome job.

The format function looks really interesting as well, I wonder if that can help with the underlying issue in #142

@csvan
Copy link

csvan commented Mar 14, 2019

@jonschlinkert amazing work, those numbers look fantastic!

@jonschlinkert
Copy link
Member Author

I wonder if that can help with the underlying issue in #142

Yes! That's exactly the kind of use case that I was hoping to address with format!

amazing work, those numbers look fantastic!

Thank you! Much appreciated!

@paulmillr
Copy link
Member

@jonschlinkert what's the plan for v4 in removing braces? Are we ditching them?

@jonschlinkert
Copy link
Member Author

what's the plan for v4 in removing braces? Are we ditching them?

No. Braces are supported.

@paulmillr
Copy link
Member

@jonschlinkert I'm asking because braces is the module that brings the most complexity to chokidar and other libs. 3 megabytes — that's too much...

We seem to have a way of optimizing micromatch (v4.0), but nothing for braces.

@jonschlinkert
Copy link
Member Author

I'm asking because braces is the module that brings the most complexity to chokidar and other libs. 3 megabytes — that's too much...

I'm not sure how that was determined when micromatch v4.0 wasn't pushed up yet, but I'll look into it and see what we can do.

@paulmillr
Copy link
Member

I determined that because we are rewriting micromatch to use picomatch instead of nanomatch — but the braces module is staying the same, apparently.

@jonschlinkert
Copy link
Member Author

jonschlinkert commented Apr 1, 2019

but the braces module is staying the same, apparently.

It's being refactored. The main cause of the size is two things:

  1. People doing pull requests to bump versions for no good reason, which caused duplicate dependencies in the tree
  2. source maps, which will not be included in the next version of braces.

I won't be taking any non-critical pull requests to the current versions of braces or snapdragon if anyone wants me to get updates released. I can either stay focused on releasing streamlined, faster versions, or we can keep playing whack-a-mole.

@paulmillr
Copy link
Member

@jonschlinkert Please take a look here, i've reduced braces/snapdragon size twice in no time: here-be/snapdragon#23

@jonschlinkert
Copy link
Member Author

Please take a look here, i've reduced braces/snapdragon size twice in no time

The snapdragon API changed a lot between the current version and the version that braces uses. I appreciate the PR, and I'll merge it in, but the amount of time it would take to update braces to use the newer API wouldn't be worth it. I'd rather just refactor braces completely to not use snapdragon.

@jonschlinkert
Copy link
Member Author

@paulmillr micromatch/braces#24

It's fast as hell. we have that going for us, which is nice.

Also wanted to mention that we can't use the same parsing strategy with braces as with other globs. It's a lot more complicated. We need a real parser with an AST that allows us to handle nested expressions correctly. Assuming we want to continue avoiding the DDoS issues that have plagued minimatch

@paulmillr
Copy link
Member

@jonschlinkert that's pretty huge. Please release this before micromatch v4.

@jonschlinkert
Copy link
Member Author

Alright, it's published!

I'll be out of the office until this afternoon. If any regressions happen I can triage then. Any help from team members would be appreciated too!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants