Read files to scan from stdin to use find for excluding of files, folders and mount points #42

Closed
beckerr-rzht opened this issue Dec 16, 2021 · 6 comments · May be fixed by #75

Comments

@beckerr-rzht

beckerr-rzht commented Dec 16, 2021

It would be great if the files to be scanned could be read from stdin.
This would open up a whole new set of possibilities together with find.

Example:

find / -xdev -type f | java -jar log4j-detector-2021.12.16.jar --stdin

This would scan all files in the local root filesystem, but omit /dev, /proc, etc. and all NFS mounts.

Using find, the following issues would be easy to solve: #11, #39 and #40.

@zhurkin

zhurkin commented Dec 17, 2021

On large volumes, find just freezes. It would be better to add an explicit exclusion option to the program.

@beckerr-rzht
Author

beckerr-rzht commented Dec 17, 2021

I have not run into such problems with find; I just want to scan files on local filesystems only.

For example, I'm currently using these find options:

find  / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) \
    -type f -print | java -jar log4j-detector-2021.12.17.jar --stdin
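
To decide which -fstype values to prune, it can help to see which filesystem types find reports on the machine in question. A minimal sketch, assuming GNU find (the %F format of -printf is a GNU extension); the starting directory and the depth of 1 are illustrative choices, not part of the command above:

```shell
# Assumption: GNU find. %F prints the filesystem type of each entry,
# which is the same value that -fstype matches against.
start_dir=/
fs_types=$(find "$start_dir" -maxdepth 1 -type d -printf '%F\n' 2>/dev/null | sort | uniq -c)

# One line per filesystem type, prefixed with the number of directories on it.
echo "$fs_types"
```

Any type in that list that should not be scanned can then be added as another -fstype test inside the -prune group.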

@beckerr-rzht
Author

The current precompiled version 2021.12.17 supporting --stdin is here:
https://github.com/beckerr-rzht/log4j-detector/raw/master/log4j-detector-2021.12.17.jar

@juergenhoetzel

You can build and execute command lines from standard input using xargs:

find  / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \)     -type f -name "*.jar"|xargs java -jar log4j-detector-2021.12.17.jar
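
One caveat with piping find into xargs: by default xargs splits its input on whitespace, so a path containing a space arrives as two arguments. A sketch of the NUL-delimited form using -print0 and xargs -0; the temporary directory and file name are made up for the demonstration, and printf stands in for the java command:

```shell
# Hypothetical demo directory with one file whose name contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/my app.jar"

# Default splitting: the single path is broken at the space.
split_count=$(find "$demo_dir" -type f -name "*.jar" | xargs printf '%s\n' | wc -l | tr -d ' ')

# NUL-delimited: the path is passed through as one argument.
safe_count=$(find "$demo_dir" -type f -name "*.jar" -print0 | xargs -0 printf '%s\n' | wc -l | tr -d ' ')

echo "arguments with whitespace splitting: $split_count"
echo "arguments with NUL delimiters: $safe_count"

rm -rf "$demo_dir"
```

Applied to the command above, that would mean ending the find expression with -print0 and invoking xargs with -0.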

@beckerr-rzht
Author

Note the following when using xargs:
xargs can be slower when many files are passed, because the java process may have to be started several times.

In the worst case, arguments and environment variables together may occupy at most 4096 bytes when using xargs. The environment of root is around 2000 bytes in size (depending on operating system and configuration).
A "medium" installation of Ubuntu Desktop has about 400000 files.

This would result in the following comparison:

  • with --stdin the java process is started exactly once.
  • without --stdin xargs starts the java process about 10000 times.

But this is of course only the worst case, which should occur rarely.
The actual limits on a given system are reported by xargs --show-limits.
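
The figure of about 10000 starts can be reproduced with a little arithmetic. A rough sketch: the average path length of 50 bytes is an assumption made here for illustration (it is not stated in the thread), while the other numbers are the ones quoted above.

```shell
# Numbers quoted in the comment.
arg_limit=4096      # worst-case limit for arguments plus environment, in bytes
env_size=2000       # approximate size of root's environment, in bytes
file_count=400000   # files on a "medium" Ubuntu Desktop installation

# Assumption (not from the thread): an average path is 50 bytes long,
# plus 1 byte for the terminator of each argument.
avg_path_len=50

budget=$((arg_limit - env_size))
files_per_run=$((budget / (avg_path_len + 1)))
java_starts=$(( (file_count + files_per_run - 1) / files_per_run ))   # round up

echo "files per java run: $files_per_run"
echo "java starts needed: $java_starts"

# The real argument limit of the current system, for comparison.
getconf ARG_MAX
```

On most current systems the value printed by getconf ARG_MAX is far larger than 4096, which is why the worst case is rare.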

But xargs has one advantage in any case:
The -P option allows several processes to run in parallel.
For example:

find / -xdev | xargs -rn100 -P8 java -jar log4j-detector-2021.12.17.jar

... will start up to 8 processes scanning in parallel. Here -r prevents the command from being started when there are no arguments, and -n100 means that at most 100 arguments are passed per invocation.

Provided you have enough CPU cores, this could speed up the detector scan.
However, in such cases the parallel tool should be preferred, because it is much more flexible.
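
The effect of -r, -n and -P can be checked without starting java at all. A small sketch in which echo stands in for the detector and seq generates 250 dummy arguments (both are stand-ins for illustration); it assumes GNU xargs, since -r is a GNU extension:

```shell
# 250 arguments in batches of at most 100: each echo invocation prints one line,
# and up to 8 invocations may run at the same time.
batch_count=$(seq 1 250 | xargs -rn100 -P8 echo | wc -l | tr -d ' ')
echo "invocations with 250 arguments: $batch_count"

# With empty input, -r means the command is not run at all.
empty_count=$(printf '' | xargs -rn100 -P8 echo | wc -l | tr -d ' ')
echo "invocations with empty input: $empty_count"
```

Counting invocations this way is a cheap check of how many java processes a given -n value would start for a known number of files.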

Regardless, I hope that my pull request #43 will be accepted.

@beckerr-rzht beckerr-rzht changed the title Read file to scan from stdin Read files to scan from stdin to use find for excluding of files, folders and mount points Dec 19, 2021
@juliusmusseau
Contributor

I did this in my own way. See v2021.12.20 which adds a new --stdin flag.
