Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Homestead: ShanghaiLove_Homestead failing #2406

Closed
jochem-brouwer opened this issue Nov 4, 2022 · 18 comments
Closed

Homestead: ShanghaiLove_Homestead failing #2406

jochem-brouwer opened this issue Nov 4, 2022 · 18 comments

Comments

@jochem-brouwer
Copy link
Member

jochem-brouwer commented Nov 4, 2022

When running tests with test-all-hardforks label, CI now always fails since ShanghaiLove_Homestead makes node run out of memory.

When I go back into time, for instance: c70a158, which is #2244 (this did test all hardforks, including ShanghaiLove_Homestead which passed), it now locally also fails.

Starting to think this is not our problem, but the latest node versions might have a problem?

Note, to test;

cd packages/vm
npm run test:blockchain -- --fork=Homestead --test=ShanghaiLove_Homestead

The problem in this test is that in ShanghaiLove, the contract keeps DELEGATECALLing itself, this copies a ton of code. In TangerineWhistle this is solved because now the cost of DELEGATECALL goes from 40 to 700, so you cannot reach a super deep stack.

Node sample err;

<--- Last few GCs --->

[50229:0x592db50]    37417 ms: Scavenge 2026.8 (2079.0) -> 2020.8 (2080.0) MB, 5.1 / 0.0 ms  (average mu = 0.668, current mu = 0.382) allocation failure; 
[50229:0x592db50]    37433 ms: Scavenge 2027.6 (2080.0) -> 2021.6 (2080.7) MB, 4.5 / 0.0 ms  (average mu = 0.668, current mu = 0.382) allocation failure; 
[50229:0x592db50]    37449 ms: Scavenge 2028.5 (2080.7) -> 2022.4 (2096.7) MB, 6.3 / 0.0 ms  (average mu = 0.668, current mu = 0.382) allocation failure; 


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 0xb6dd00 node::Abort() [node]
 2: 0xa7db1c  [node]
 3: 0xd474e0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xd47887 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xf24c45  [node]
 6: 0xf3712d v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 7: 0xf1182e v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 8: 0xf12bf7 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 9: 0xef3dca v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
10: 0x12b736f v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
11: 0x16e8fb9  [node]
Aborted (core dumped)

Note; node --version: v18.9.0

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Nov 4, 2022

Checked on 18.0.0 and 16.0.0, both crash, the CI used a node version of node 16 so that's weird. 16.18 crashes as well.

@jochem-brouwer
Copy link
Member Author

Cannot reproduce at any code where the test does not fail. My ts-node is also at the right version per package.json.

@acolytec3
Copy link
Contributor

Can confirm I was able to reproduce too.

@acolytec3
Copy link
Contributor

I haven't solved it yet but this article was immensely helpful for getting me started with the nodejs debugging interface for Chrome Devtools - https://medium.com/@paul_irish/debugging-node-js-nightlies-with-chrome-devtools-7c4a1b95ae27#.pmqejrn8q

Now I just have to figure out why the statemanager is causing us to run out of memory :-)

@acolytec3
Copy link
Contributor

What's weird is if I run this test with node --inspect-brk --require ts-node/register ./test/tester --blockchain --stack-size=1500 --fork=Homestead --test=ShanghaiLove_Homestead and start the test from the Chrome Devtools debugger, it passes. 🤔

@jochem-brouwer
Copy link
Member Author

What could be possible is that this DoS vector DELEGATECALLs into the same contract each time, so we keep copying the code. However this does not really explain why we go to 2.4 Gb+ memory so it goes out-of-memory. It seems that when we delegatecall, the runCall part in EVM is taking a ton of time/memory, so that would need further investigation (I think also apart from this investigation why it takes so long + so much memory..?)

@acolytec3
Copy link
Contributor

So basically we're now susceptible to the same Shanghai DoS attack that geth was. The client finally made it to 2016! :-)

@holgerd77
Copy link
Member

Just to put things in perspective a bit, we have realized around our dealing with the Shanghai dDoS attacks here #1536 that we are susceptible to various different attack vectors regarding performance/out-of-memory for quite some time, so not sure if this really deserves a "very important" label.

But it would be for sure nice/good if we fix/improve on those things over time.

@holgerd77
Copy link
Member

(but definitely no "we need to sove it now" thing)

@jochem-brouwer
Copy link
Member Author

This PR causes the problem: #2285

@acolytec3
Copy link
Contributor

Interesting. I wonder what about the inner workings of those two libraries is different and causes this. If we decide to go this way, it shouldn't be too too hard to just revert to the older library. I'll have to look at the API but it should be pretty similar.

@ZLY201
Copy link
Contributor

ZLY201 commented Nov 7, 2022

I find the reason.

The space complexity of functional-red-black-tree saving copies is O(logn) but js-sdsl do it with O(n) because the operation new OrderedMap(this.map) save the whole tree copy.

functional-red-black-tree saves only one link in the tree by this.map = this.map.insert().

Sadly, js-sdsl doesn't currently support this particular way of saving memory because it's not a general feature.

@holgerd77
Copy link
Member

I find the reason.

The space complexity of functional-red-black-tree saving copies is O(logn) but js-sdsl do it with O(n) because the operation new OrderedMap(this.map) save the whole tree copy.

functional-red-black-tree saves only one link in the tree by this.map = this.map.insert().

Sadly, js-sdsl doesn't currently support this particular way of saving memory because it's not a general feature.

Thanks for digging into this so quickly, super great!!! 💯

We'll give this some time to play around with this a bit more and then see and decide how best to handle this (switching back to the old package would be unlucky, but we'll see. Maybe there are alternative ways here, adopting the caching mechism or something like that.).

@ZLY201
Copy link
Contributor

ZLY201 commented Nov 9, 2022

I think the following data structure might be helpful in this case, but it's not suitable to add to js-sdsl, maybe create a new awesome-js-sdsl repo is a good idea.

class OrderedMapWithRollback<K, V> extends OrderedMap<K, V> {
  private isRecording = false;
  private recordNow: ([K, V] | K)[] = [];
  private recordAll: ([K, V] | K)[][] = [];
  setElement(key: K, value: V) {
    this.recordNow.push(key);
    super.setElement(key, value);
  }
  eraseElementByKey(key: K) {
    const value = super.getElementByKey(key);
    if (value === undefined) return;
    this.recordNow.push([key, value]);
    super.eraseElementByKey(key);
  }
  startRecord() {
    if (this.isRecording) return;
    this.isRecording = true;
    this.recordNow = [];
  }
  endRecord() {
    if (!this.isRecording) return;
    this.isRecording = false;
    this.recordAll.push(this.recordNow);
  }
  rollback(depth = 1) {
    if (depth <= 0 || this.recordAll.length < depth) {
      throw new RangeError();
    }
    while (depth--) {
      const record = this.recordAll.pop()!;
      for (const op of record) {
        if (Array.isArray(op)) {
          super.setElement(op[0], op[1]);
        } else super.eraseElementByKey(op);
      }
    }
  }
}

@ZLY201
Copy link
Contributor

ZLY201 commented Nov 9, 2022

I created a PR. js-sdsl/js-sdsl#127

@holgerd77
Copy link
Member

I created a PR. js-sdsl/js-sdsl#127

Pretty cool! 💯 We'll try soon. 🙂

@holgerd77
Copy link
Member

Solved by #2630, will close.

@jochem-brouwer
Copy link
Member Author

Test is skipped but just checked, it passes. Will push to #2634

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants