
Performance: debugger source lookup becomes unusable #5367

Closed · reckart opened this issue Sep 8, 2022 · 30 comments · Fixed by #5372
@reckart (Contributor) commented Sep 8, 2022

This is a kind of follow-up to issue #5322.

I am currently using a bnd 6.4.0-SNAPSHOT from the p2 update site.

When I try to debug, the source lookup has become unbearably slow. Looking into what happens, I found the checksum calculation in the bnd plugins to be the culprit again. The screenshot below shows what is calling the aQute...Digestr.write() method, which is mainly ResourceImpl.equals().

[Screenshot 2022-09-08 at 09 41 25]

I really wonder... is my setup so extraordinary with a couple of hundred Maven modules and bundles? Am I doing something wrong?

@reckart (Contributor, Author) commented Sep 8, 2022

I did an analysis. I pulled the latest changes from git up to commit 5853f25 and added a log point in the deferred SHA256 calculation:

```java
private final static Logger logger = LoggerFactory.getLogger(ResourceBuilder.class);
private final static AtomicInteger hashCalcCount = new AtomicInteger(0);

public boolean addFile(File file, URI uri) throws Exception {
	// ...
	DeferredValue<String> sha256 = new DeferredComparableValue<>(String.class,
		asSupplierOrElse(() -> {
			long start = System.currentTimeMillis();
			int count = hashCalcCount.incrementAndGet();
			logger.info("[" + count + "] Calculating SHA256 on [" + file + "]...");
			String hash = SHA256.digest(file).asHex();
			logger.info("[" + count + "] Calculating SHA256 on [" + file + "] took "
				+ (System.currentTimeMillis() - start) + "ms");
			return hash;
		}, null),
		deferredHashCode);
	// ...
}
```

Then I built, installed the freshly built plugins into my Eclipse and opened a test.bndrun file in the bnd Runfile Editor.

The log point was triggered well over 40k times and the file had still not opened in the bnd runfile editor. I was only able to capture the first 23337 calls to the log point in a log file because I had to leave briefly, and when I came back the scroll buffer had already overflowed... anyway...

So from the 23337 times the log point was triggered, here are the top ten files for which a SHA256 digest was calculated. You can see the name of the file and how often the same hash was calculated over and over again while opening a single bnd runfile. For most files, calculating the hash only takes a couple of milliseconds, but I also have some large files for which it takes several seconds.

```
  40 .../git/uima-uimaj/uimaj-adapter-vinci/target/uima-adapter-vinci.jar
  40 .../git/uima-uimaj/uimaj-cpe/target/uima-cpe.jar
  40 .../git/uima-uimaj/uimaj-tools/target/uima-tools.jar
  41 .../.m2/repository/commons-io/commons-io/2.11.0/commons-io-2.11.0.jar
  41 .../git/uima-uimafit/uimafit-core/target/uimafit-core-3.4.0-SNAPSHOT.jar
  42 .../git/uima-uimaj/uimaj-core/target/uima-core.jar
  43 .../.m2/repository/junit/junit/4.13.2/junit-4.13.2.jar
  43 .../.m2/repository/org/slf4j/slf4j-api/1.7.36/slf4j-api-1.7.36.jar
  44 .../.m2/repository/org/apache/commons/commons-lang3/3.12.0/commons-lang3-3.12.0.jar
  44 .../.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar
```

@bjhargrave you wrote that even if the calculation is deferred now, it needs to be called at some point. OK, fine, but does it have to be called 40+ times on the same file just to open a runfile in the editor?

@reckart (Contributor, Author) commented Sep 8, 2022

I have built myself a locally modified version with a small cache for the SHA256 hashes. It expires hashes that are older than 30 minutes, or whose file modification timestamp on disk differs from the one recorded in the cache. It also records misses, file updates and hits. Here is an example:

```
Calculated SHA256 on [MY-ARTIFACT-1.0.0-SNAPSHOT-tests.jar] took 6ms -- 1011 - 83524 - 0
```

  • The cache contains 1011 items.
  • There were a total of 83524 cache hits.
  • No cached SHA256 hash was replaced because the file on disk had a newer timestamp.

(Those are the numbers for opening my bndrun file in the bnd runfile editor directly after starting Eclipse.)

Also, overall it feels quite a bit faster now... I still have to see if I notice any negative side effects of this cache.

@bjhargrave (Member) commented:

You are using m2e, which uses Bndtools m2e to handle the bndrun support. The bndrun support depends upon OSGi repositories which are made using FileSetRepository. A FileSetRepository is then a collection of ResourceImpls whose equals method depends upon the SHA256 value.

It is unclear to me from the information here how many times FileSetRepositories are being created. Each FSR will have distinct ResourceImpls, each of which needs to compute the SHA256 value.

I would need a small GitHub repo which demonstrates the problem to investigate further. Otherwise, I am just guessing at what the actual issue is and how to solve it.
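To illustrate the cost pattern described above, here is a minimal, hypothetical sketch (this is not bnd's actual ResourceImpl; the class name DigestResource is invented for illustration) of an equals() that is based on a content digest: when nothing caches the value, every comparison pays for a full SHA-256 computation.

```java
import java.security.MessageDigest;
import java.util.Arrays;

class DigestResource {
	static int digestCalls = 0; // counts how often the hash is actually computed
	private final byte[] content;

	DigestResource(byte[] content) {
		this.content = content;
	}

	byte[] sha256() {
		digestCalls++; // no caching: every call pays the full digest cost
		try {
			return MessageDigest.getInstance("SHA-256").digest(content);
		} catch (Exception e) {
			throw new IllegalStateException(e);
		}
	}

	@Override
	public boolean equals(Object o) {
		if (this == o)
			return true;
		if (!(o instanceof DigestResource))
			return false;
		// comparing two distinct resources digests BOTH sides
		return Arrays.equals(sha256(), ((DigestResource) o).sha256());
	}

	@Override
	public int hashCode() {
		return Arrays.hashCode(sha256());
	}
}
```

In bnd the DeferredValue wrapper computes the digest lazily and then keeps it per ResourceImpl, but since each FileSetRepository builds its own ResourceImpls, the same file can still end up being digested once per repository.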

@reckart (Contributor, Author) commented Sep 8, 2022

Well, you can assume that there are a couple of hundred Eclipse projects in the workspace. Each is an m2e module; each includes the bnd Maven plugins to generate the OSGi metadata for the main artifact and for the test artifact; each runs the bnd resolver Maven plugin to update its test.bndrun file; and each contains the bnd testing Maven plugins to run tests (there is one test.bndrun file in each module). So from what you say, I could assume a couple of hundred FileSetRepositories, I guess.

The m2e modules cluster into several groups that each inherit from a joint parent POM module. That parent POM usually contains a large dependency management section which manages all dependency versions for the entire reactor. The useDependencyManagement option is enabled in the bnd Maven plugin executions in all of the m2e modules.

@reckart (Contributor, Author) commented Sep 8, 2022

The set of dependencies of all of these projects is probably the ~1000 JAR files that end up in my cache (see comment above). So if two of the repositories refer to a ResourceImpl with the same path, the file should usually also have the same hash. Information about hashes calculated for files referenced from multiple repositories should therefore be centrally sharable, no?

@reckart (Contributor, Author) commented Sep 8, 2022

This is the makeshift static cache I hacked together for the moment. Not very well done, but apparently effective...

```java
private final static AtomicInteger missCount = new AtomicInteger(0);
private final static AtomicInteger hitCount = new AtomicInteger(0);
private final static AtomicInteger fupCount = new AtomicInteger(0);
private final static Map<String, CachedSha256> CACHE = new HashMap<>();

private static class CachedSha256 {
	final long timestamp;
	final long calculated;
	final String sha256;

	CachedSha256(long timestamp, String sha256) {
		this.timestamp = timestamp;
		this.calculated = System.currentTimeMillis();
		this.sha256 = sha256;
	}
}

	DeferredValue<String> sha256 = new DeferredComparableValue<>(String.class,
		asSupplierOrElse(() -> {
			String path = file.getPath();
			CachedSha256 cachedHash;
			synchronized (CACHE) {
				long now = System.currentTimeMillis();

				// Clean up every once in a while
				int count = missCount.incrementAndGet();
				if (count % 50000 == 0) {
					CACHE.entrySet().removeIf(e -> (now - e.getValue().calculated) > (30 * 60 * 1000));
				}

				cachedHash = CACHE.get(path);
				long timestamp = file.lastModified();
				if (cachedHash == null || timestamp > cachedHash.timestamp) {
					if (cachedHash != null && timestamp > cachedHash.timestamp) {
						fupCount.incrementAndGet();
					}

					String hash = SHA256.digest(file).asHex();
					cachedHash = new CachedSha256(timestamp, hash);
					CACHE.put(path, cachedHash);
				} else {
					hitCount.incrementAndGet();
				}
			}
			return cachedHash.sha256;
		}, null),
		deferredHashCode);
```

@reckart (Contributor, Author) commented Sep 8, 2022

Looking at the code, I notice a small bug: missCount is actually not counting misses but lookups. The real miss count would be missCount - hitCount...
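The corrected accounting can be sketched like this (a hypothetical helper, not part of the patch): count lookups and hits directly, and derive misses from their difference.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of cache statistics where misses are derived, not counted directly.
class CacheStats {
	private final AtomicInteger lookups = new AtomicInteger(0);
	private final AtomicInteger hits = new AtomicInteger(0);

	void recordLookup() {
		lookups.incrementAndGet(); // every cache access, hit or miss
	}

	void recordHit() {
		hits.incrementAndGet(); // only when a cached value was reused
	}

	int misses() {
		return lookups.get() - hits.get();
	}
}
```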

@bjhargrave (Member) commented:

I am not sure I see the value of the DeferredValue classes if we will have a cache of file-to-SHA values. Both together are overkill. I would want to rip out the DeferredValue change if we move to a SHA cache. Then ResourceBuilder.addFile would itself consult a new SHA cache class instead of directly using SHA256.digest.
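A minimal sketch of what such a standalone SHA cache class could look like (the name FileShaCache and its shape are my assumption, not bnd's actual API): entries are keyed by canonical path and invalidated when the file's lastModified timestamp changes, so addFile could consult it instead of calling SHA256.digest(file) directly.

```java
import java.io.File;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FileShaCache {
	private static final class Entry {
		final long lastModified;
		final String sha256;

		Entry(long lastModified, String sha256) {
			this.lastModified = lastModified;
			this.sha256 = sha256;
		}
	}

	private final Map<String, Entry> cache = new ConcurrentHashMap<>();

	String sha256(File file) throws Exception {
		String key = file.getCanonicalPath();
		long lastModified = file.lastModified();
		Entry e = cache.get(key);
		if (e == null || e.lastModified != lastModified) {
			// miss or stale entry: recompute and replace
			e = new Entry(lastModified, digest(file));
			cache.put(key, e);
		}
		return e.sha256;
	}

	private static String digest(File file) throws Exception {
		byte[] hash = MessageDigest.getInstance("SHA-256")
			.digest(Files.readAllBytes(file.toPath()));
		StringBuilder sb = new StringBuilder();
		for (byte b : hash)
			sb.append(String.format("%02x", b));
		return sb.toString();
	}
}
```

The timestamp check mirrors the invalidation rule of the makeshift cache above; a real implementation would also want a size or age bound so the cache cannot grow without limit.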

@reckart (Contributor, Author) commented Sep 8, 2022

I guess deferring the calculation until it is needed could still save effort in case the value is never needed at all. I did notice a performance improvement when you first put in the deferral, but at the time I still had fewer Eclipse projects converted to OSGi bundles. The more projects I migrated, the slower it appeared to become again.

@bjhargrave (Member) commented:

It should be easy for you to test the removal of using the DeferredValue class in the SHA value calculation while keeping your SHA cache.

@reckart (Contributor, Author) commented Sep 8, 2022

I think instead of removing it, I might add counters that count calls to addFile() and calls to the supplier of the deferred value. The difference would be the number of calls saved.

Btw. I tried writing log messages through an SLF4J logger in ResourceBuilder, but I was not able to see these messages in the Eclipse log, so I had to attach a debugger and use a logpoint. Can you tell me how to properly log messages from within ResourceBuilder so that I can actually see them somewhere while this code is running in Eclipse?

@bjhargrave (Member) commented:

If you start Eclipse from the command line, then you should be able to see the console output. We use logback. See

```
-runpath: \
	ch.qos.logback.classic,\
	ch.qos.logback.core,\
	org.apache.felix.logback,\
	slf4j.api
```

and

```
logback.configurationFile=${fileuri;${.}/logback.xml}
```

with a logback.xml along these lines:

```xml
<configuration scan="true">
	<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
		<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
			<pattern>%date [%thread] %-5level %logger{35} - %msg%n</pattern>
		</encoder>
	</appender>
	<logger name="aQute" level="INFO" />
	<logger name="bndtools" level="INFO" />
	<logger name="org.bndtools" level="INFO" />
	<root level="ERROR">
		<appender-ref ref="STDOUT" />
	</root>
</configuration>
```

@reckart (Contributor, Author) commented Sep 8, 2022

So here some statistics - again, opening one of my test.bndrun files after a fresh start of Eclipse:

  • cache size: 1003
  • times addFile was called: 52294
  • times an SHA256 for a file was requested: 51239
  • updates to cached SHA256 due to file having a new modification timestamp: 0
  • cache hits: 50235
  • cache misses: 1004 (probably an off-by-one bug here; the number of cache misses should be equal to the cache size)

That gives 1055 times that addFile was called but the SHA256 was never needed afterwards. Assuming the calculation of a SHA256 takes maybe ~5-10ms, that would sum up to 5-10 seconds of saved time. Not much, but also not nothing.

@reckart (Contributor, Author) commented Sep 8, 2022

... I tried running Eclipse from the terminal, and I do see some messages there, but not from the logger call I added to the cache code. So I used the debugger with the logpoint again ;)

@bjhargrave (Member) commented:

You will probably need to edit the logback.xml to set the desired log level for your logger name.

@bjhargrave (Member) commented:

1055 is only 2% of the times addFile was called. So that does not seem large enough to justify retaining the complexity of the DeferredValue code on top of the added complexity of a SHA cache.

@reckart (Contributor, Author) commented Sep 8, 2022

Still, 10 seconds of additional delay simply when opening a bndrun file in the editor is noticeable to the user... personally, I'd vote for keeping the deferral in.

@bjhargrave (Member) commented:

You don't know that it is a 10-second delay. You just pulled a number out of thin air :-)

@reckart (Contributor, Author) commented Sep 8, 2022

Well, if I look at my logpoint output, I see lots of times in the 5ms-10ms range for calculating the hash sums. As I said before, I have some files that take several seconds for the hash to be calculated, so the true average is probably higher than 10ms.
1000 saved calls to SHA256 (the 2%) times 10ms = 10 seconds. Not pulled out of thin air, but rather an informed estimate.

@bjhargrave (Member) commented:

I think that 2% of the total time spent computing SHA hash values is a more accurate estimate, since you don't know whether the files whose SHA was not needed have expensive SHA calculations or not.

@reckart (Contributor, Author) commented Sep 8, 2022

Here's a more accurate measurement. I took the first 1000 timings of SHA256 calculations from the run I logged earlier. They sum up to 17847 ms, which is ~18ms on average per file.

@rotty3000 (Contributor) commented:

FYI, here are some notes on logging in m2e/Bndtools on Eclipse: https://github.com/bndtools/bnd/wiki/Eclipse-Logging-with-Logback

@reckart (Contributor, Author) commented Sep 8, 2022

Here are the timings ;) A batch of 1000, times in ms. The longest in this batch was 1039ms, but I have also seen longer calculations in the full run.

timings.log.gz

@reckart (Contributor, Author) commented Sep 8, 2022

So if we assume 10ms per SHA256 calculation and we had 1004 cache misses in the run above, that would be ~10 seconds with the deferral in place. If we removed the deferral, we would pay for the saved calculations as well, adding up to ~20 seconds in total: twice the time compared to keeping the deferral.

@reckart (Contributor, Author) commented Sep 8, 2022

OK, what we do not know is whether the calls saved by the deferral would have been cache hits. In that case, the deferral would indeed be pointless.

@reckart (Contributor, Author) commented Sep 9, 2022

So, new numbers from my little real-world scenario here:

  • 1010 - SHA256 hashes for that many unique paths made it into the cache
  • 1139 - addFile was called for that many unique paths

That means the deferral saved the SHA256 calculation for roughly 130 files in this case. Assuming a conservative 10ms per file, removing the deferral would slow this scenario down by at least a second.

@bjhargrave bjhargrave self-assigned this Sep 16, 2022
bjhargrave added a commit to bjhargrave/bnd that referenced this issue Sep 19, 2022
The cache reduces the need to create new Resource objects, including
SHA-256 computation, for unchanged files.

Fixes bndtools#5367

Signed-off-by: BJ Hargrave <bj@hargrave.dev>
@bjhargrave (Member) commented:

@reckart Try out the resource cache in PR #5372 to see if it addresses this performance issue for you.

@reckart (Contributor, Author) commented Sep 21, 2022

@bjhargrave thanks :) I'm building and trying it.

@reckart (Contributor, Author) commented Sep 21, 2022

It seems like the cache might not be used on all access paths to the SHA256.

When I try opening a .bndrun file after starting Eclipse, it still takes very long with the PR code. In VisualVM, I cannot see the cache in the sampling trace either.

[Screenshot 2022-09-21 at 09 37 01]

[Screenshot 2022-09-21 at 09 37 39]

@reckart (Contributor, Author) commented Sep 21, 2022

When I attach a debugger with a log point in SHA256.digest(File), I can see that after a short time the method has already been called 9 times for the same file.

bjhargrave added a commit to bjhargrave/bnd that referenced this issue Sep 23, 2022
The cache reduces the need to process files to Resource objects,
including SHA-256 computation, for unchanged files.

Fixes bndtools#5367

Signed-off-by: BJ Hargrave <bj@hargrave.dev>