Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarification for Unix.open_process_args_in #10057

Conversation

rbjorklin
Copy link

I tripped myself up on this as can be seen here. Hoping this small change will help others in the future.

I used the Conventional Commits style, I hope that's okay.

@rbjorklin rbjorklin force-pushed the clarify-documentation-for-Unix.open_process_args_in branch from cb6c964 to fab57a8 Compare November 28, 2020 07:30
@gasche
Copy link
Member

gasche commented Nov 28, 2020

Searching the web suggests that you are far from the only one to make this mistake. See for example this piece of Reason code

  let ic =
    Sys.win32
      // HACK: Not sure why this command doesn't work on Linux / macOS, and vice versa...
      ? Unix.open_process_args_in(
          "node",
          [|"node", "-e", "console.log(process.execPath)"|],
        )
      : Unix.open_process_in("node -e 'console.log(process.execPath)'");

It looks like Windows resolves the executable name passed to open_process_args*, adding to the confusion.

Could you maybe consider adding an example use in the documentation, which would make the recommendations visually obvious? Code blocks in ocamldoc use the {[...]} markup.

.

I wonder if we should consider changing open_process_args* to lookup the executable in the PATH (there is logic in the Unix module to implement this easily). My understanding of the function was that it was introduced to avoid the complexity of argument parsing by the shell, but there is not much in the original PR #1792 about resolution of the executable itself.

An alternative since OCaml 4.10 is to use Sys.command (Filename.quote_command cmd args). If I understand correctly, this lets us pass arguments as an unescaped list, but the cmd command will still be resolved through the shell.

@gasche
Copy link
Member

gasche commented Nov 28, 2020

(cc @xavierleroy @nojb)

@nojb
Copy link
Contributor

nojb commented Nov 28, 2020

I wonder if we should consider changing open_process_args* to lookup the executable in the PATH (there is logic in the Unix module to implement this easily). My understanding of the function was that it was introduced to avoid the complexity of argument parsing by the shell, but there is not much in the original PR #1792 about resolution of the executable itself.

Indeed, to summarize my recollection:

  • open_process{,_in,_out,_full}: builds a shell command line and executes through shell; as a side-effect, the program is looked up in PATH
  • open_process_args{,_in,_out,_full}: executes the program with the given arguments "verbatim". This uses execv on Unix so the program is not looked up in the PATH. However, on Windows (which requires an ad-hoc implementation), the program is looked up in the PATH, see
    exefile = caml_search_exe_in_path(wcmd);

So indeed it looks like there is a mismatch between Unix/Windows regarding the args functions. This may have been overlooked when the args functions where introduced because up until that point, only /bin/sh was used as "program" argument to the exec* function, which didn't need to be looked up in PATH.

So to summarize, looks like replacing execv by execvp in the Unix implementation of the args functions shwould be a plausible fix for the present issue.

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed documentation is correct. However, the convention on argument arrays (that the first element matches the command name and the subsequent elements are the arguments proper) is consistently used in the Unix library, e.g. in execv, execve, execvp, execvpe, create_process, create_process_env, open_process_args_in, open_process_args_out, open_process_args, and open_process_args_full. So, documenting it for open_process_args_in and not for any other function is problematic.

@xavierleroy
Copy link
Contributor

So to summarize, looks like replacing execv by execvp in the Unix implementation of the args functions shwould be a plausible fix for the present issue.

It is a plausible change, making these functions more usable, with a small backward incompatibility.

However it is not a "fix for the present issue", which also includes a misunderstanding of what an "argument array" is in the Unix world. This is a bit of information that should be put somewhere but not repeated 10 times for the 10 Unix functions that are affected.

@bobot
Copy link
Contributor

bobot commented Nov 30, 2020

And for the misunderstanding of what an "argument array" is in the Unix world, I think the proposed changed is misleading. The first argument doesn't have to match the name of the command ( the best explananation I found), it should be present but can be arbitrary, except if the command use it for modifying its behavior.

@frou
Copy link

frou commented Dec 13, 2020

I created a separate issue for the open_process_args* and PATH aspect. @gasche @nojb

@rbjorklin
Copy link
Author

Superseded by #10084.

@rbjorklin rbjorklin closed this Mar 21, 2021
@rbjorklin rbjorklin deleted the clarify-documentation-for-Unix.open_process_args_in branch March 21, 2021 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants