profile runtime and optimize performance #31
To profile using the tests:
and then add the following: defer profile.Start(profile.ProfilePath(".")).Stop() (via https://hackernoon.com/go-the-complete-guide-to-profiling-your-code-h51r3waz)
Here's cpu.pprof from running the tests on #30.
Leaving this open for future optimization work. The kubectl binary is a great test case: it's massive and thoroughly exercises most code paths when run with
An easier way to profile: use https://www.speedscope.app/ to view profiles, and generate them by importing _ "net/http/pprof" at the top of a package and running with:
Runtime is still dominated by scanning for moduledata, and also by the time it takes to parse types via ParseType_impl. We can probably reduce the moduledata scans a bit with some algorithmic adjustments to the recovery. We're probably close to scanning through memory as fast as possible: I was able to see it's using bytealg.Index, which is a hand-rolled AVX2 assembly routine, and we hit the fast path as long as our needle is < 64 bytes, which it always will be.
Unreasonable memory consumption leading to OOM in sample https://github.com/EdgeGuardP/EdgeGuard-Stealer/releases/tag/EdgeStealer4.0
The regex conversion from YARA is too slow; maybe we can convert it to rangel patterns.
@virusdefender is it specifically the conversion routine that is slow or more generally the memory scanning? Can you show some benchmarks that demonstrate the performance issue perhaps?
The pprof result in #31 (comment) shows the regex match is slow.
@virusdefender that pprof result is out of date; if you profile the application on the current master, you may find it's faster. If not, please provide some data suggesting where else we may be missing optimizations :)
GoReSym is pretty slow, and this makes it difficult to deploy at a large scale. Despite being written in Go and compiled to native code, it may take seconds or longer to process input files. For example, running the entire test suite takes many minutes.
Given that Go programs can execute in a split second and their runtime uses all the metadata that GoReSym parses, it should be possible for GoReSym to run in a split second too, at least in the common case. (I understand that in obfuscated binaries some scans may take a bit longer.)
We should profile GoReSym to identify the slow parts of the algorithm and optimize their implementation. Our goal should be for GoReSym to finish within 0.10s for the common case.