Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to TraceMap for sourcemap's originalPositionFor API #1181

Merged
merged 3 commits into from Apr 30, 2022

Conversation

jridgewell
Copy link
Collaborator

@jridgewell jridgewell commented Apr 20, 2022

trace-mapping is faster, smaller, and lighter than the source-map package, and doesn't require WASM, manual memory management, or async/await to use.

Re: #1164

[trace-mapping](https://github.com/jridgewell/trace-mapping) is faster, smaller, and lighter than the `source-map` package, and doesn't require WASM, manual memory management, or async/await to use.
@@ -44,12 +44,11 @@
"use strict";

import MOZ_SourceMap from "source-map";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still use the SourceMapGenerator API to generate a new sourcemap for the minified output. TraceMap (and AnyMap) are only for consuming and tracing an already existing sourcemap.

@@ -1,2 +1,2 @@
new function(){console.log(3)};
//# sourceMappingURL=data:application/json;charset=utf-8;base64,eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbInN0ZGluIl0sIm5hbWVzIjpbImNvbnNvbGUiLCJsb2ciXSwibWFwcGluZ3MiOiJBQUErQyxJQUFuQyxXQUFjQSxRQUFRQyxJQUFJIiwic291cmNlc0NvbnRlbnQiOlsiY2xhc3MgRm9vIHsgY29uc3RydWN0b3IoKXtjb25zb2xlLmxvZygxKzIpO30gfSBuZXcgRm9vKCk7XG4iXX0=
//# sourceMappingURL=data:application/json;charset=utf-8;base64,eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbInN0ZGluIl0sIm5hbWVzIjpbImNvbnNvbGUiLCJsb2ciXSwibWFwcGluZ3MiOiJBQUErQyxJQUFyQyxXQUFnQkEsUUFBUUMsSUFBSSIsInNvdXJjZXNDb250ZW50IjpbImNsYXNzIEZvbyB7IGNvbnN0cnVjdG9yKCl7Y29uc29sZS5sb2coMSsyKTt9IH0gbmV3IEZvbygpO1xuIl19
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug in the SourceMapConsumer lookup that's now fixed. There are two segments which both occupy generated column 10. The first has a source column of 10, and the second has a source column of 12. SourceMapConsumer returned the second segment, where TraceMap returns the first. This is a difference between the binary search immediately returning, or finding the "lower bound" for conflicting matches.

lib/sourcemap.js Outdated Show resolved Hide resolved
Co-authored-by: Fábio Santos <fabiosantosart@gmail.com>
@fabiosantoscode
Copy link
Collaborator

I was testing this out earlier, but didn't realize I wasn't actually using the new changes :D


// a small wrapper around fitzgen's source-map library
async function SourceMap(options) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the only reason minify is async right now. I'm wondering if there should be a minifySync

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I don't need to wonder much, it was the one thing people asked for when Terser 5 was released.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like me to go through and remove all asyncs?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR :)

Also if the minify function becomes sync, it would break people who are calling .then directly. It should be a new function

@fabiosantoscode
Copy link
Collaborator

fabiosantoscode commented Apr 30, 2022

I'm taking a while to merge this PR because I've been doing some memory testing.

When minifying a very large file (a minified dist/bundle.min.js repeated a whole bunch of times until it's 10mb), this PR appears to allocate around 100mb more RAM than master.

The memory usage during formatting does not change, but it does change between before and after calling await SourceMap() so I think the library is allocating it.

There's also a minimal speed penalty for this large file (8.2s instead of 7.6s). (was wrong, this PR is faster than current)

@fabiosantoscode
Copy link
Collaborator

fabiosantoscode commented Apr 30, 2022

Here are some numbers for a file around half that size -- the speed is better relative to master, but the memory cost is still apparent.

nix-shellfabio@nixos ♥  ## CURRENT                 
nix-shellfabio@nixos ♥  node --expose-gc usemaps.js
--before creating SourceMap                        
heapTotal 439.53 MB                                
heapUsed 324.79 MB                                 
--before creating OutputStream and printing        
heapTotal 409.95 MB                                
heapUsed 318.78 MB                                 
--after OutputStream prints                        
heapTotal 522.59 MB                                
heapUsed 474.29 MB                                 
{                                                  
  parse: 2.1,                                      
  rename: 0,                                       
  compress: 8.308,                                 
  scope: 0.843,                                    
  mangle: 3.424,                                   
  properties: 0,                                   
  format: 4.883,                                   
  total: 19.558                                    
}                                                  
nix-shellfabio@nixos ♥  ## THIS PR                 
nix-shellfabio@nixos ♥  node --expose-gc usemaps.js
--before creating SourceMap                        
heapTotal 490.07 MB                                
heapUsed 324.92 MB                                 
--before creating OutputStream and printing        
heapTotal 464.88 MB                                
heapUsed 385.74 MB                                 
--after OutputStream prints                        
heapTotal 588.27 MB                                
heapUsed 541.35 MB                                 
{                                                  
  parse: 2.095,                                    
  rename: 0,                                       
  compress: 8.401,                                 
  scope: 0.8130000000000001,                       
  mangle: 2.902,                                   
  properties: 0,                                   
  format: 4.705,                                   
  total: 18.916                                    
}                                                  

And here are the file sizes BTW. They're minified by terser which was the easiest way I could think of to create a source map.

The run described above is for minifying kinda-huge.min.js with sourcemaps.

nix-shellfabio@nixos ♥  ls -alh memorygate/huge.min.js
-rw-r--r-- 1 fabio users 10M Apr 29 14:04 memorygate/huge.min.js
nix-shellfabio@nixos ♥  ls -alh memorygate/kinda-huge.min.js
-rw-r--r-- 1 fabio users 5.2M Apr 30 17:20 memorygate/kinda-huge.min.js

@fabiosantoscode
Copy link
Collaborator

Here's the data for the 10MB file, for completion:

nix-shellfabio@nixos ♥  ## MASTER (10mb file)      
nix-shellfabio@nixos ♥  node --expose-gc usemaps.js
--before creating SourceMap                        
heapTotal 993.62 MB                                
heapUsed 616.16 MB                                 
--before creating OutputStream and printing        
heapTotal 824.15 MB                                
heapUsed 604.93 MB                                 
--after OutputStream prints                        
heapTotal 961.59 MB                                
heapUsed 908.98 MB                                 
{                                                  
  parse: 3.902,                                    
  rename: 0,                                       
  compress: 15.171000000000001,                    
  scope: 1.586,                                    
  mangle: 5.524,                                   
  properties: 0,                                   
  format: 9.462,                                   
  total: 35.645                                    
}                                                  
nix-shellfabio@nixos ♥  ## THIS PR (10mb file)     
nix-shellfabio@nixos ♥  node --expose-gc usemaps.js
--before creating SourceMap                        
heapTotal 987.66 MB                                
heapUsed 616.2 MB                                  
--before creating OutputStream and printing        
heapTotal 914.62 MB                                
heapUsed 735.73 MB                                 
--after OutputStream prints                        
heapTotal 1090.3 MB                                
heapUsed 1039.85 MB                                
{                                                  
  parse: 3.814,                                    
  rename: 0,                                       
  compress: 15.038,                                
  scope: 1.564,                                    
  mangle: 5.423,                                   
  properties: 0,                                   
  format: 9.063,                                   
  total: 34.902                                    
}                                                  

I noticed that your PR is faster after all so I'm editing the original comment. Don't know how I missed that.

@jridgewell
Copy link
Collaborator Author

jridgewell commented Apr 30, 2022

It seems you're right. I had only ever memory profiled against the non-WASM 0.6.1. I'm going to add memory benchmarks to my readme, but this is the current result:

amp.js.map
trace-mapping decoded memory usage:   541240
trace-mapping encoded memory usage:  5735360
source-map-js memory usage:         10700144
source-map-0.6.1 memory usage:      17496800
source-map-0.8.0 memory usage:       2181104

***

babel.min.js.map
trace-mapping decoded memory usage:   296400
trace-mapping encoded memory usage: 35610160
source-map-js memory usage:         51294136
source-map-0.6.1 memory usage:      63780600
source-map-0.8.0 memory usage:       1433192

***

preact.js.map
trace-mapping decoded memory usage:   267736
trace-mapping encoded memory usage:   242032
source-map-js memory usage:           928824
source-map-0.6.1 memory usage:       1012256
source-map-0.8.0 memory usage:         88816

***

react.js.map
trace-mapping decoded memory usage:   141032
trace-mapping encoded memory usage:   679360
source-map-js memory usage:          2217168
source-map-0.6.1 memory usage:       2178512
source-map-0.8.0 memory usage:        117480

Because WASM has direct memory access, they can pack the integers extremely tightly, whereas I have to rely on V8's PACKED_SMI arrays. I've been experimenting with a new decoder form using Int32Array, which may improve both memory and runtime, but it means the translation from "decoded map" (SourceMapSegment[][], which I'm introducing to Babel) -> Int32 isn't free anymore. But the current state of "encoded map" (VLQ string, which is used by basically everything) -> decoded array already isn't free, so it may be an acceptable tradeoff.

@jridgewell
Copy link
Collaborator Author

jridgewell commented Apr 30, 2022

I'm a little suspicious this isn't properly detecting memory allocated in the C++ side. In my example above, amp.js.map has 45,120 segments. In order to actually allocate this in a "packed" form in WASM (or Int32Array), you'd need at least 902,400 bytes of memory. However, if I place memory trackers before/after creating the SourceMapConsumer (before actually parsing and allocating the mappings in WASM memory, which isn't done until originalPositionFor is called), I can see 2,009,744 bytes being used. After calling originalPositionFor, it only allocates an additional 32,280 bytes, which is impossible to store the mappings in.

@fabiosantoscode
Copy link
Collaborator

I'm doing a thing that makes this discussion a bit less necessary

--before creating OutputStream and printing
heapTotal 412.21 MB                        
heapUsed 318.82 MB
--after OutputStream prints
heapTotal 517.1 MB
heapUsed 299.21 MB

I figured that we can destroy the AST as we walk it during the output phase. It works pretty well!

@jridgewell
Copy link
Collaborator Author

Yup. Memory allocated by WASM is stored in external, not heapUsed. When summing both:

amp.js.map - 45120 segments
trace-mapping decoded memory usage:     529440
trace-mapping encoded memory usage:    5757912
source-map-js memory usage:           10627472
source-map-0.6.1 memory usage:        17576800
source-map-0.8.0 memory usage:         9589141

***

babel.min.js.map - 347793 segments
trace-mapping decoded memory usage:     197824
trace-mapping encoded memory usage:   35515320
source-map-js memory usage:           51247688
source-map-0.6.1 memory usage:        63522760
source-map-0.8.0 memory usage:        45275776

***

preact.js.map - 1992 segments
trace-mapping decoded memory usage:     267592
trace-mapping encoded memory usage:     242032
source-map-js memory usage:             939040
source-map-0.6.1 memory usage:          981688
source-map-0.8.0 memory usage:           90672

***

react.js.map - 5726 segments
trace-mapping decoded memory usage:      12720
trace-mapping encoded memory usage:     679360
source-map-js memory usage:            2256496
source-map-0.6.1 memory usage:         2185512
source-map-0.8.0 memory usage:          835480

So I think trace-mapping is on par or better for memory usage.

@fabiosantoscode
Copy link
Collaborator

Hey, that's awesome! I'll merge then :)

@fabiosantoscode fabiosantoscode merged commit 423b304 into terser:master Apr 30, 2022
@jridgewell jridgewell deleted the trace-mapping branch April 30, 2022 17:45
@fabiosantoscode
Copy link
Collaborator

This was incredible work @jridgewell, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants