nextflow run hello: ERROR ~ a fault occurred in an unsafe memory access operation #4942

Open
fbnrst opened this issue Apr 22, 2024 · 4 comments

Comments

@fbnrst

fbnrst commented Apr 22, 2024

Bug report

Expected behavior and actual behavior

Expected behavior: nextflow run hello should work.

Actual behavior:
About a week ago, I installed nextflow using mamba (i.e. conda) on our cluster, and it worked just fine. I had also tested the hello world example. Now I wanted to start a new pipeline and got an ERROR, and I get the same error when I try to run the hello world example; see below under Program output.

Steps to reproduce the problem

mamba create -n nextflow23.10 -c conda-forge -c bioconda -c defaults nextflow=23.10 openjdk=20 -y
mamba activate nextflow23.10
nextflow run hello

Recreating the environment with these commands temporarily solved the problem, but then the error came back. At one point it even worked again before the error returned.

Program output

$ nextflow run hello
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/nextflow-io/hello` [gigantic_wozniak] DSL2 - revision: 7588c46ffe [master]
ERROR ~ a fault occurred in an unsafe memory access operation

 -- Check '.nextflow.log' file for details

Environment

  • Nextflow version: 23.10.1
  • Java version: openjdk 20.0.2-internal 2023-07-18
  • Operating system: CentOS 7.4.1708
  • Bash version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)

Additional context

It is probably hard to tell what is going on with the limited information I can provide at the moment. If anyone has ideas about what I should try or what information I should provide, I would be very grateful.

nextflow.log

@pditommaso
Member

This is commonly related to a lack of temporary disk storage. There are a number of similar issues: https://github.com/nextflow-io/nextflow/issues?q=is%3Aissue+unsafe+memory+is%3Aclosed
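One quick way to rule this out is to print the free space for the locations a run is likely to write to. The following is a minimal sketch, assuming the relevant locations are the system temp directory, the launch directory, and the home directory; the exact set of directories Nextflow writes to is not confirmed here, and `free_space_report` is a hypothetical helper name.

```python
import os
import shutil
import tempfile


def free_space_report(paths):
    """Return {path: free bytes} for each existing path (hypothetical helper)."""
    report = {}
    for path in paths:
        if os.path.exists(path):
            # disk_usage reports on the filesystem that contains `path`
            report[path] = shutil.disk_usage(path).free
    return report


if __name__ == "__main__":
    # Assumed candidate locations; adjust to the directories used on your system.
    candidates = [tempfile.gettempdir(), os.getcwd(), os.path.expanduser("~")]
    for path, free in free_space_report(candidates).items():
        print(f"{path}: {free / 1024**3:.1f} GiB free")
```

Running it once from the home directory and once from the failing directory shows whether either location is actually short on space.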

@fbnrst
Author

fbnrst commented Apr 22, 2024

I do not see an issue with disk storage; everywhere I look, there seems to be plenty of space. However, I did realise the following: it seems to work when I run the pipeline in my home directory, but if I run it from a directory on our lustre filesystem, nextflow crashes:

$ cd ~/temp/
$ nextflow run hello
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/nextflow-io/hello` [scruffy_chandrasekhar] DSL2 - revision: 7588c46ffe [master]
executor >  local (4)
[e0/3d25bb] process > sayHello (4) [100%] 4 of 4 ✔
Ciao world!

Bonjour world!

Hello world!

Hola world!
$ cd /path/on/lustre
$ nextflow run hello
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/nextflow-io/hello` [evil_baekeland] DSL2 - revision: 7588c46ffe [master]
ERROR ~ a fault occurred in an unsafe memory access operation

 -- Check '.nextflow.log' file for details

Not yet sure how this might help, but at least I can now reproduce it more systematically.
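To make the comparison between the two directories more systematic, it can help to record which filesystem type each one sits on. Below is a rough, Linux-only sketch that reads `/proc/mounts` and picks the longest mount point containing a path. The function names are hypothetical, and mount points containing spaces (escaped in `/proc/mounts`) are not handled.

```python
import os


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def filesystem_type(path, entries):
    """Return the fs type of the longest mount point that contains `path`."""
    path = os.path.realpath(path)
    best = ("", None)
    for mount_point, fs_type in entries:
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # >= so that a later mount on the same mount point wins
        if inside and len(mount_point) >= len(best[0]):
            best = (mount_point, fs_type)
    return best[1]


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):  # Linux only
        with open("/proc/mounts") as handle:
            mounts = parse_mounts(handle.read())
        for directory in (os.getcwd(), os.path.expanduser("~")):
            print(directory, "->", filesystem_type(directory, mounts))
```

Run from the failing directory, this prints the filesystem type for both the launch directory and the home directory, which is useful context to attach to a report.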

@fbnrst
Author

fbnrst commented Apr 25, 2024

Just wanted to add that running things in my home directory is not an option, because there I do not have enough space.
I also talked to our admin and he cannot see why disk space should be an issue on our lustre file system. Any other ideas how I can track down this issue? I am kind of stuck and at the moment cannot work with nextflow at all.

@pditommaso changed the title from "nextflow run hell: ERROR ~ a fault occurred in an unsafe memory access operation" to "nextflow run hello: ERROR ~ a fault occurred in an unsafe memory access operation" on Apr 25, 2024
@bentsherman
Member

The error happens specifically with the LevelDB cache:

java.lang.InternalError: a fault occurred in an unsafe memory access operation
	at java.base/jdk.internal.misc.Unsafe.copyMemory0(Native Method)
	at java.base/jdk.internal.misc.Unsafe.copyMemory(Unsafe.java:806)
	at java.base/jdk.internal.misc.ScopedMemoryAccess.copyMemoryInternal(ScopedMemoryAccess.java:147)
	at java.base/jdk.internal.misc.ScopedMemoryAccess.copyMemory(ScopedMemoryAccess.java:129)
	at java.base/java.nio.ByteBuffer.putArray(ByteBuffer.java:1333)
	at java.base/java.nio.ByteBuffer.put(ByteBuffer.java:1192)
	at org.iq80.leveldb.util.Slice.getBytes(Slice.java:246)
	at org.iq80.leveldb.impl.MMapLogWriter.writeChunk(MMapLogWriter.java:208)
	at org.iq80.leveldb.impl.MMapLogWriter.addRecord(MMapLogWriter.java:186)
	at org.iq80.leveldb.impl.VersionSet.writeSnapshot(VersionSet.java:329)
	at org.iq80.leveldb.impl.VersionSet.logAndApply(VersionSet.java:284)
	at org.iq80.leveldb.impl.DbImpl.<init>(DbImpl.java:223)
	at org.iq80.leveldb.impl.Iq80DBFactory.open(Iq80DBFactory.java:83)
	at nextflow.cache.DefaultCacheStore.openDb(DefaultCacheStore.groovy:78)
	at nextflow.cache.DefaultCacheStore.open(DefaultCacheStore.groovy:106)
	at nextflow.cache.DefaultCacheStore.open(DefaultCacheStore.groovy)
	at nextflow.cache.CacheDB.open(CacheDB.groovy:59)
	at nextflow.Session.init(Session.groovy:420)
	at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:128)
	at nextflow.cli.CmdRun.run(CmdRun.groovy:372)
	at nextflow.cli.Launcher.run(Launcher.groovy:500)
	at nextflow.cli.Launcher.main(Launcher.groovy:672)

I guess LevelDB is memory-mapping the db file. Maybe this operation is not supported by Lustre, or by your particular implementation or configuration of Lustre. I would ask your sysadmin about this.
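A small probe can help test this hypothesis outside of Nextflow. The sketch below attempts a memory-mapped write to a temporary file in a given directory. It uses Python's `mmap` module rather than the JVM code path shown in the stack trace, so a passing result does not prove that the LevelDB writer will work; a failure on the Lustre path, however, would be a concrete finding to bring to a sysadmin. `mmap_write_works` is a hypothetical helper name.

```python
import mmap
import os
import sys
import tempfile


def mmap_write_works(directory, size=4096):
    """Try a memory-mapped write inside `directory`; return True on success.

    Hypothetical helper. A failing mapping may also kill the process with
    SIGBUS instead of raising, so a crash is itself a negative result.
    """
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError:
        return False  # cannot even create a file in this directory
    try:
        os.ftruncate(fd, size)  # the file needs a non-zero length to be mapped
        with mmap.mmap(fd, size, access=mmap.ACCESS_WRITE) as mapped:
            mapped[:5] = b"hello"
            mapped.flush()  # push the modified page back to the file
        with open(path, "rb") as check:
            return check.read(5) == b"hello"
    except (OSError, ValueError):
        return False
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Directory to probe: first argument, or the current directory by default.
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    status = "OK" if mmap_write_works(target) else "FAILED"
    print(f"mmap write in {target}: {status}")
```

Comparing the result for the home directory with the result for the directory on Lustre mirrors the two `nextflow run hello` runs reported earlier in the thread.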
