Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Option groups #486

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add Option groups #486

wants to merge 2 commits into from

Conversation

tbidne
Copy link

@tbidne tbidne commented May 15, 2024

Resolves #270.

This is an alternative to #231 by @larskuhtz, for whom I am grateful for blazing the trail.


First, thanks for the great library. This is my attempt to resolve #270, as this is a feature I really want (and it seems I am not alone) 🙂.

Also, this isn't nearly as bad as the diff numbers imply; about 80% of the changes are test boilerplate.

Background

These are notes on why solving this isn't trivial. Feel free to skip if you're already familiar with the difficulties:

Click to expand

Commands

Before I jump into the design / impl, I want to first look at why command groups work so well, and why it's harder to do the same for option groups. For commands, the relevant types are:

data CommandFields a = CommandFields
  { cmdCommands :: [(String, ParserInfo a)]
  , cmdGroup :: Maybe String }

command :: String -> ParserInfo a -> Mod CommandFields a
commandGroup :: String -> Mod CommandFields a
subparser :: Mod CommandFields a -> Parser a

-- usage
subparser
  ( command "cmd1" ...
      <> command "cmd2" ...
      <> commandGroup "Some group"
  )

We can add multiple commands with command, add a group with commandGroup, and combine all of this together using the Semigroup instance. It is easy to add the concept of a group here since the basic modifier -- CommandFields -- already collects multiple commands together. So all we have to do is add an optional label and render it appropriately.

Options

Now let's look at Option.

-- Option does not show up in the signatures for user-facing functions like
-- 'option', but it is used internally i.e. part of the Parser type.
data Option a = Option
  { optMain :: OptReader a
  , optProps :: OptProperties
  }

data OptionFields a = OptionFields
  { optNames :: [OptName]
  , optCompleter :: Completer
  , optNoArgError :: String -> ParseError }

-- Focusing on 'option', but the same applies to argument and flag since they
-- all go through the Option type.
option :: ReadM a -> Mod OptionFields a -> Parser a

Unlike CommandFields, OptionFields does not take in multiple options, so we cannot easily add an optional group label. Perhaps we can simply make one? Say,

data GroupedOptions a = GroupedOptions
  { options :: [Option a]
  , optGroup :: Maybe String
  }

Unfortunately, this won't work! The fundamental problem is different options usually have different types -- unlike different commands, which must have the same type -- so we cannot, say, group an Option Int and Option String together (at least not without existential types, which would probably be terrible).

Solutions

As far as I can tell, there are really only two possibilities for a user-facing groupOption :: String -> ....

1. Add an optional group to OptionFields (and ArgumentFields, FlagFields)

This is what the previous PR did:

class HasGroup f where
  setGroup :: String -> f a -> f a
  getGroup :: f a -> Maybe String

instance HasGroup OptionFields where ...

-- usage
parseFoo =
  optional
      ( strOption
          ( long "file-log-path"
              <> metavar "PATH"
              <> help "Log file path"
              <> setGroup "Some group" -- use OptionFields' HasGroup instance
          )
      )

The advantage is that this fits well with standard optparse usage, and it is probably the most obvious way to retrofit option groups into the existing framework.

The disadvantage is that we have to individually add setGroup "Some group" to every option we want in "Some Group". It's clunky, and opens up the possibility for typos creating multiple groups. This is also inconsistent with how command groups work, as you only have to specify the group once for the latter. Unfortunately our hand is forced due to options having different types.

I assume this is the primary reason the PR was declined, perhaps along with some concern over the necessary implementation changes.

2. Add an optional group to OptProperties

This is what this PR does. Because this is at a "higher level" than Mod, our user-facing function is:

parserOptionGroup :: String -> Parser a -> Parser a

-- usage
-- every parser that constitutes someParser will be grouped together
parserOptionGroup "Some group" someParser

The disadvantage is that isn't the usual Mod semigroup pattern.

The advantage is that we only have to specify the group in one place, and all "downstream" parsers will automatically belong to the same group.

The other advantage is that the code changes are quite small. Other than the rendering logic (which is nearly identical to the first PR, thanks @larskuhtz!), the only modification here is adding the new field to OptProperties, and providing a set function.

Examples

The tests show examples of what this looks like. For a "realer" example, I tried this out on a CLI app with many options. Here is the diff, and here is the original help page (brace yourself):

Click to expand
Shrun: A tool for running shell commands concurrently.

Usage: shrun [-c|--config PATH] [-i|--init STRING] 
             [-t|--timeout (NATURAL | STRING)] [--common-log-key-hide] 
             [--command-log-poll-interval NATURAL] 
             [--command-log-read-size BYTES] [--console-log-command] 
             [--console-log-command-name-trunc NATURAL] 
             [--console-log-line-trunc (NATURAL | detect)] 
             [--console-log-strip-control (all | smart | none)] 
             [--console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)]
             [-f|--file-log (default | PATH)] 
             [--file-log-command-name-trunc NATURAL] 
             [--file-log-delete-on-success] 
             [--file-log-line-trunc (NATURAL | detect)] 
             [--file-log-mode (append | write)] 
             [--file-log-size-mode (warn BYTES | delete BYTES | nothing)] 
             [--file-log-strip-control (all | smart | none)] 
             [--notify-action (final |command | all)] 
             [--notify-system (dbus | notify-send | apple-script)] 
             [--notify-timeout (never | NATURAL)] [--default-config] 
             [-v|--version] Commands...

  Shrun runs shell commands concurrently. In addition to providing basic timing
  and logging functionality, we also provide the ability to pass in a config
  file that can be used to define aliases for commands.

  In general, each option --foo has a --no-foo variant that disables cli and
  toml configuration for that field. For example, the --no-console-log-command
  option will instruct shrun to ignore both cli --console-log-command and toml
  console-log.command, ensuring the default behavior is used (i.e. no command
  logging).

  See github.com/tbidne/shrun#README for full documentation.

Available options:
  -c,--config PATH         Path to TOML config file. If this argument is not
                           given we automatically look in the XDG config
                           directory e.g. ~/.config/shrun/config.toml. The
                           --no-config option disables --config and the
                           automatic XDG lookup.

  --no-config              Disables --config.

  -i,--init STRING         If given, init is run before each command. That is,
                           'shrun --init ". ~/.bashrc" foo bar' is equivalent to
                           'shrun ". ~/.bashrc && foo" ". ~/.bashrc && bar"'.

  --no-init                Disables --init.

  -t,--timeout (NATURAL | STRING)
                           Non-negative integer setting a timeout. Can either be
                           a raw number (interpreted as seconds), or a "time
                           string" e.g. 1d2h3m4s or 2h3s. Defaults to no
                           timeout.

  --no-timeout             Disables --timeout.

  --common-log-key-hide    By default, we display the key name from the legend
                           over the actual command that was run, if the former
                           exists. This flag instead shows the literal command.
                           Commands without keys are unaffected.

  --no-common-log-key-hide Disables --common-log-key-hide.

  --command-log-poll-interval NATURAL
                           Non-negative integer used in conjunction with
                           --console-log-command and --file-log that determines
                           how quickly we poll commands for logs, in
                           microseconds. A value of 0 is interpreted as infinite
                           i.e. limited only by the CPU. Defaults to 10,000.
                           Note that lower values will increase CPU usage. In
                           particular, 0 will max out a CPU thread.

  --no-command-log-poll-interval
                           Disables --command-log-poll-interval.

  --command-log-read-size BYTES
                           The max number of bytes in a single read when
                           streaming command logs (--console-log-command and
                           --file-log). Logs larger than --command-log-read-size
                           will be read in a subsequent read, hence broken
                           across lines. The default is '1 kb'.

  --no-command-log-read-size
                           Disables --command-log-read-size.

  --console-log-command    The default behavior is to swallow logs for the
                           commands themselves. This flag gives each command a
                           console region in which its logs will be printed.
                           Only the latest log per region is show at a given
                           time.

  --no-console-log-command Disables --console-log-command.

  --console-log-command-name-trunc NATURAL
                           Non-negative integer that limits the length of
                           commands/key-names in the console logs. Defaults to
                           no truncation.

  --no-console-log-command-name-trunc
                           Disables --console-log-command-name-trunc.

  --console-log-line-trunc (NATURAL | detect)
                           Non-negative integer that limits the length of
                           console logs. Can also be the string literal
                           'detect', to detect the terminal size automatically.
                           Defaults to no truncation. Note that "log prefixes"
                           (e.g. labels like [Success], timestamps) are counted
                           towards the total length but are never truncated.

  --no-console-log-line-trunc
                           Disables --console-log-line-trunc.

  --console-log-strip-control (all | smart | none)
                           Control characters can wreak layout havoc, thus we
                           include this option. 'all' strips all such chars.
                           'none' does nothing i.e. all chars are left
                           untouched. The default 'smart' attempts to strip only
                           the control chars that affect layout (e.g. cursor
                           movements) and leaves others unaffected (e.g.
                           colors). This has the potential to be the 'prettiest'
                           though it is possible to miss some chars. This option
                           is experimental and subject to change.

  --no-console-log-strip-control
                           Disables --console-log-strip-control.

  --console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)
                           How to format the timer. Defaults to prose_compact
                           e.g. '2 hours, 3 seconds'.

  --no-console-log-timer-format
                           Disables --console-log-timer-format.

  -f,--file-log (default | PATH)
                           If a path is supplied, all logs will additionally be
                           written to the supplied file. Furthermore, command
                           logs will be written to the file irrespective of
                           --console-log-command. Console logging is unaffected.
                           This can be useful for investigating command
                           failures. If the string 'default' is given, then we
                           write to the XDG state directory e.g.
                           ~/.local/state/shrun/shrun.log.

  --no-file-log            Disables --file-log.

  --file-log-command-name-trunc NATURAL
                           Like --console-log-command-name-trunc, but for
                           --file-logs.

  --no-file-log-command-name-trunc
                           Disables --file-log-command-name-trunc.

  --file-log-delete-on-success
                           If --file-log is active, deletes the file on a
                           successful exit. Does not delete the file if shrun
                           exited via failure.

  --no-file-log-delete-on-success
                           Disables --file-log-delete-on-success.

  --file-log-line-trunc (NATURAL | detect)
                           Like --console-log-line-trunc, but for --file-log.

  --no-file-log-line-trunc Disables --file-log-line-trunc.

  --file-log-mode (append | write)
                           Mode in which to open the log file. Defaults to
                           write.

  --no-file-log-mode       Disables --file-log-mode.

  --file-log-size-mode (warn BYTES | delete BYTES | nothing)
                           Sets a threshold for the file log size, upon which we
                           either print a warning or delete the file, if it is
                           exceeded. The BYTES should include the value and
                           units e.g. warn 10 mb, warn 5 gigabytes, delete
                           20.5B. Defaults to warning at 50 mb.

  --no-file-log-size-mode  Disables --file-log-size-mode.

  --file-log-strip-control (all | smart | none)
                           --console-log-strip-control for file logs created
                           with --file-log. Defaults to all.

  --no-file-log-strip-control
                           Disables --file-log-strip-control.

  --notify-action (final |command | all)
                           Sends notifications for various actions. 'Final'
                           sends off a notification when Shrun itself finishes
                           whereas 'command' sends off one each time a command
                           finishes. 'All' implies 'final' and 'command'.

  --no-notify-action       Disables --notify-action.

  --notify-system (dbus | notify-send | apple-script)
                           The system used for sending notifications. 'dbus' and
                           'notify-send' available on linux, whereas
                           'apple-script' is available for osx.

  --no-notify-system       Disables --notify-system.

  --notify-timeout (never | NATURAL)
                           When to timeout success notifications. Defaults to 10
                           seconds.Note that the underlying notification system
                           may not support timeouts.

  --no-notify-timeout      Disables --notify-timeout.

  --default-config         Writes a default config.toml file to stdout.

  -h,--help                Show this help text

Version: 0.9

And here is the same with the new groups:

Click to expand
Shrun: A tool for running shell commands concurrently.

Usage: shrun [-c|--config PATH] [-i|--init STRING] 
             [-t|--timeout (NATURAL | STRING)] [--common-log-key-hide] 
             [--command-log-poll-interval NATURAL] 
             [--command-log-read-size BYTES] [--console-log-command] 
             [--console-log-command-name-trunc NATURAL] 
             [--console-log-line-trunc (NATURAL | detect)] 
             [--console-log-strip-control (all | smart | none)] 
             [--console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)]
             [-f|--file-log (default | PATH)] 
             [--file-log-command-name-trunc NATURAL] 
             [--file-log-delete-on-success] 
             [--file-log-line-trunc (NATURAL | detect)] 
             [--file-log-mode (append | write)] 
             [--file-log-size-mode (warn BYTES | delete BYTES | nothing)] 
             [--file-log-strip-control (all | smart | none)] 
             [--notify-action (final |command | all)] 
             [--notify-system (dbus | notify-send | apple-script)] 
             [--notify-timeout (never | NATURAL)] [--default-config] 
             [-v|--version] Commands...

  Shrun runs shell commands concurrently. In addition to providing basic timing
  and logging functionality, we also provide the ability to pass in a config
  file that can be used to define aliases for commands.

  In general, each option --foo has a --no-foo variant that disables cli and
  toml configuration for that field. For example, the --no-console-log-command
  option will instruct shrun to ignore both cli --console-log-command and toml
  console-log.command, ensuring the default behavior is used (i.e. no command
  logging).

  See github.com/tbidne/shrun#README for full documentation.

Available options:
  -c,--config PATH         Path to TOML config file. If this argument is not
                           given we automatically look in the XDG config
                           directory e.g. ~/.config/shrun/config.toml. The
                           --no-config option disables --config and the
                           automatic XDG lookup.

  --no-config              Disables --config.

  -i,--init STRING         If given, init is run before each command. That is,
                           'shrun --init ". ~/.bashrc" foo bar' is equivalent to
                           'shrun ". ~/.bashrc && foo" ". ~/.bashrc && bar"'.

  --no-init                Disables --init.

  -t,--timeout (NATURAL | STRING)
                           Non-negative integer setting a timeout. Can either be
                           a raw number (interpreted as seconds), or a "time
                           string" e.g. 1d2h3m4s or 2h3s. Defaults to no
                           timeout.

  --no-timeout             Disables --timeout.

  --default-config         Writes a default config.toml file to stdout.

  -h,--help                Show this help text

Command Logging:
  --command-log-poll-interval NATURAL
                           Non-negative integer used in conjunction with
                           --console-log-command and --file-log that determines
                           how quickly we poll commands for logs, in
                           microseconds. A value of 0 is interpreted as infinite
                           i.e. limited only by the CPU. Defaults to 10,000.
                           Note that lower values will increase CPU usage. In
                           particular, 0 will max out a CPU thread.

  --no-command-log-poll-interval
                           Disables --command-log-poll-interval.

  --command-log-read-size BYTES
                           The max number of bytes in a single read when
                           streaming command logs (--console-log-command and
                           --file-log). Logs larger than --command-log-read-size
                           will be read in a subsequent read, hence broken
                           across lines. The default is '1 kb'.

  --no-command-log-read-size
                           Disables --command-log-read-size.


Common Logging:
  --common-log-key-hide    By default, we display the key name from the legend
                           over the actual command that was run, if the former
                           exists. This flag instead shows the literal command.
                           Commands without keys are unaffected.

  --no-common-log-key-hide Disables --common-log-key-hide.


Console Logging:
  --console-log-command    The default behavior is to swallow logs for the
                           commands themselves. This flag gives each command a
                           console region in which its logs will be printed.
                           Only the latest log per region is show at a given
                           time.

  --no-console-log-command Disables --console-log-command.

  --console-log-command-name-trunc NATURAL
                           Non-negative integer that limits the length of
                           commands/key-names in the console logs. Defaults to
                           no truncation.

  --no-console-log-command-name-trunc
                           Disables --console-log-command-name-trunc.

  --console-log-line-trunc (NATURAL | detect)
                           Non-negative integer that limits the length of
                           console logs. Can also be the string literal
                           'detect', to detect the terminal size automatically.
                           Defaults to no truncation. Note that "log prefixes"
                           (e.g. labels like [Success], timestamps) are counted
                           towards the total length but are never truncated.

  --no-console-log-line-trunc
                           Disables --console-log-line-trunc.

  --console-log-strip-control (all | smart | none)
                           Control characters can wreak layout havoc, thus we
                           include this option. 'all' strips all such chars.
                           'none' does nothing i.e. all chars are left
                           untouched. The default 'smart' attempts to strip only
                           the control chars that affect layout (e.g. cursor
                           movements) and leaves others unaffected (e.g.
                           colors). This has the potential to be the 'prettiest'
                           though it is possible to miss some chars. This option
                           is experimental and subject to change.

  --no-console-log-strip-control
                           Disables --console-log-strip-control.

  --console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)
                           How to format the timer. Defaults to prose_compact
                           e.g. '2 hours, 3 seconds'.

  --no-console-log-timer-format
                           Disables --console-log-timer-format.


File Logging:
  -f,--file-log (default | PATH)
                           If a path is supplied, all logs will additionally be
                           written to the supplied file. Furthermore, command
                           logs will be written to the file irrespective of
                           --console-log-command. Console logging is unaffected.
                           This can be useful for investigating command
                           failures. If the string 'default' is given, then we
                           write to the XDG state directory e.g.
                           ~/.local/state/shrun/shrun.log.

  --no-file-log            Disables --file-log.

  --file-log-command-name-trunc NATURAL
                           Like --console-log-command-name-trunc, but for
                           --file-logs.

  --no-file-log-command-name-trunc
                           Disables --file-log-command-name-trunc.

  --file-log-delete-on-success
                           If --file-log is active, deletes the file on a
                           successful exit. Does not delete the file if shrun
                           exited via failure.

  --no-file-log-delete-on-success
                           Disables --file-log-delete-on-success.

  --file-log-line-trunc (NATURAL | detect)
                           Like --console-log-line-trunc, but for --file-log.

  --no-file-log-line-trunc Disables --file-log-line-trunc.

  --file-log-mode (append | write)
                           Mode in which to open the log file. Defaults to
                           write.

  --no-file-log-mode       Disables --file-log-mode.

  --file-log-size-mode (warn BYTES | delete BYTES | nothing)
                           Sets a threshold for the file log size, upon which we
                           either print a warning or delete the file, if it is
                           exceeded. The BYTES should include the value and
                           units e.g. warn 10 mb, warn 5 gigabytes, delete
                           20.5B. Defaults to warning at 50 mb.

  --no-file-log-size-mode  Disables --file-log-size-mode.

  --file-log-strip-control (all | smart | none)
                           --console-log-strip-control for file logs created
                           with --file-log. Defaults to all.

  --no-file-log-strip-control
                           Disables --file-log-strip-control.


Notifications:
  --notify-action (final |command | all)
                           Sends notifications for various actions. 'Final'
                           sends off a notification when Shrun itself finishes
                           whereas 'command' sends off one each time a command
                           finishes. 'All' implies 'final' and 'command'.

  --no-notify-action       Disables --notify-action.

  --notify-system (dbus | notify-send | apple-script)
                           The system used for sending notifications. 'dbus' and
                           'notify-send' available on linux, whereas
                           'apple-script' is available for osx.

  --no-notify-system       Disables --notify-system.

  --notify-timeout (never | NATURAL)
                           When to timeout success notifications. Defaults to 10
                           seconds.Note that the underlying notification system
                           may not support timeouts.

  --no-notify-timeout      Disables --notify-timeout.


Version: 0.9

Remarks

  • There is a question of what to do in the nested case e.g.

    parserOptionGroup "General group" $
      (,)
        <$> parserA
        <*> parserOptionGroup "B group" parserB

    I chose to always override the group, so in this case both parserA and parserB will be part of General Group. I judged this simpler, but we could choose instead to only overwrite the group when it is Nothing. This would render parserA in General Group and parserB in B group (flat; rendering is never nested).

    See the comment on optPropertiesGroup in Builder.hs.

  • Another question concerns duplicate groups e.g.

    a <- parserOptionGroup "Group A" parserA
    b <- parserOptionGroup "Group B" parserB
    c <- parserOptionGroup "Group A" parserC

    I chose to make groups unique i.e. both parserA and parserC will go in the same Group A.

    See the comment on sortGroupFst in Internal.hs.

  • I'm not very familiar with optparse's internals, so please do let me know if I've missed anything obvious. I'm not married to this design, but I would really like to make some progress here.

Thanks!

@tbidne tbidne force-pushed the parser-groups branch 2 times, most recently from 752442b to 94f61bc Compare May 15, 2024 10:55
@tbidne
Copy link
Author

tbidne commented May 15, 2024

Decided to see how this might help out the linked fourmolu issue: fourmolu/fourmolu#18

One-line change 🙂:

diff --git a/app/Main.hs b/app/Main.hs
index cd53a8a..2a8943f 100644
--- a/app/Main.hs
+++ b/app/Main.hs
@@ -469,7 +469,7 @@ sourceTypeParser =
     ]
 
 printerOptsParser :: Parser PrinterOptsPartial
-printerOptsParser = parsePrinterOptsCLI mkOption
+printerOptsParser = parserOptionGroup "Printer options" $ parsePrinterOptsCLI mkOption
   where
     mkOption name helpText placeholder =
       option (Just <$> eitherReader parsePrinterOptType) . mconcat $
Click to expand help page
Usage: fourmolu [-v|--version] [--manual-exts] [--print-defaults] 
                [-i | (-m|--mode MODE)] [-q|--quiet] [-o|--ghc-opt OPT] 
                [-f|--fixity FIXITY] [-r|--reexport REEXPORT] 
                [-p|--package PACKAGE] [-u|--unsafe] [-d|--debug] 
                [-c|--check-idempotence] [--color WHEN] [--start-line START] 
                [--end-line END] [--indentation INT] [--column-limit OPTION] 
                [--function-arrows OPTION] [--comma-style OPTION] 
                [--import-export-style OPTION] [--indent-wheres BOOL] 
                [--record-brace-space BOOL] [--newlines-between-decls INT] 
                [--haddock-style OPTION] [--haddock-style-module OPTION] 
                [--let-style OPTION] [--in-style OPTION] 
                [--single-constraint-parens OPTION] 
                [--single-deriving-parens OPTION] [--unicode OPTION] 
                [--respectful BOOL] [--no-cabal] [--stdin-input-file ARG] 
                [-t|--source-type TYPE] [FILE]

Available options:
  -h,--help                Show this help text
  -v,--version             Print version of the program
  --manual-exts            Display extensions that need to be enabled manually
  --print-defaults         Print default configuration options that can be used
                           in fourmolu.yaml
  -i                       A shortcut for --mode inplace
  -m,--mode MODE           Mode of operation: 'stdout' (the default), 'inplace',
                           or 'check'
  -q,--quiet               Make output quieter
  -o,--ghc-opt OPT         GHC options to enable (e.g. language extensions)
  -f,--fixity FIXITY       Fixity declaration to use (an override)
  -r,--reexport REEXPORT   Module re-export that Fourmolu should know about
  -p,--package PACKAGE     Explicitly specified dependency (for operator
                           fixity/precedence only)
  -u,--unsafe              Do formatting faster but without automatic detection
                           of defects
  -d,--debug               Output information useful for debugging
  -c,--check-idempotence   Fail if formatting is not idempotent
  --color WHEN             Colorize the output; WHEN can be 'never', 'always',
                           or 'auto' (the default)
  --start-line START       Start line of the region to format (starts from 1)
  --end-line END           End line of the region to format (inclusive)
  --no-cabal               Do not extract default-extensions and dependencies
                           from .cabal files
  --stdin-input-file ARG   Path which will be used to find the .cabal file when
                           using input from stdin
  -t,--source-type TYPE    Set the type of source; TYPE can be 'module', 'sig',
                           or 'auto' (the default)
  FILE                     Haskell source files to format or stdin (the default)

Printer options:
  --indentation INT        Number of spaces per indentation step (default: 4)
  --column-limit OPTION    Max line length for automatic line breaking (default:
                           none)
  --function-arrows OPTION Styling of arrows in type signatures (choices:
                           "trailing", "leading", or "leading-args") (default:
                           trailing)
  --comma-style OPTION     How to place commas in multi-line lists, records,
                           etc. (choices: "leading" or "trailing") (default:
                           leading)
  --import-export-style OPTION
                           Styling of import/export lists (choices: "leading",
                           "trailing", or "diff-friendly") (default:
                           diff-friendly)
  --indent-wheres BOOL     Whether to full-indent or half-indent 'where'
                           bindings past the preceding body (default: false)
  --record-brace-space BOOL
                           Whether to leave a space before an opening record
                           brace (default: false)
  --newlines-between-decls INT
                           Number of spaces between top-level declarations
                           (default: 1)
  --haddock-style OPTION   How to print Haddock comments (choices:
                           "single-line", "multi-line", or "multi-line-compact")
                           (default: multi-line)
  --haddock-style-module OPTION
                           How to print module docstring (default: same as
                           'haddock-style')
  --let-style OPTION       Styling of let blocks (choices: "auto", "inline",
                           "newline", or "mixed") (default: auto)
  --in-style OPTION        How to align the 'in' keyword with respect to the
                           'let' keyword (choices: "left-align", "right-align",
                           or "no-space") (default: right-align)
  --single-constraint-parens OPTION
                           Whether to put parentheses around a single constraint
                           (choices: "auto", "always", or "never") (default:
                           always)
  --single-deriving-parens OPTION
                           Whether to put parentheses around a single deriving
                           class (choices: "auto", "always", or "never")
                           (default: always)
  --unicode OPTION         Output Unicode syntax (choices: "detect", "always",
                           or "never") (default: never)
  --respectful BOOL        Give the programmer more choice on where to insert
                           blank lines (default: true)

CC @georgefst

@tbidne tbidne force-pushed the parser-groups branch 3 times, most recently from 06431d5 to 37ac35e Compare May 16, 2024 00:05
@HuwCampbell
Copy link
Collaborator

HuwCampbell commented May 16, 2024

Hi, thanks for talking the time to write this and justify your approach so well.

You've pretty much nailed the reasons as to why this is a tricky thing for this API. The way things stand we currently have no functions apart from the core type classes which act over the Parser type and modify it. So fmap and <*> work, but that's about it, everything else is a modifier when building a point Parser.

Now this needn't be a strict requirement, as, for example, Parsec comes with a <?> operator for adding documentation to parsers. But it's a nice principal to keep in mind as it means that the library stays composable.

That said, this PR looks like a pretty reasonable solution on first read through. While the function does need to do a tree traversal, which strikes me as a little odd, it seems to be a relatively simple way to achieve the behaviour the issue is asking for.

I'll have a proper look and think when I get some spare cycles.

With your choice of nested cases I would probably go the other way, with inner cases taking precedence, otherwise things may not compose as well, or one could actually nest the groups as well. But that might be overkill.

The other thing to note is that it will look different to command groups again, which would be a bit of an annoying inconsistency.

Cheers.

@tbidne
Copy link
Author

tbidne commented May 16, 2024

Hi, thanks for the quick response.

  1. Re: nested groups. Sure, having the inner group take precedence ("do not overwrite Just") is reasonable. I don't personally have any plans for nested groups, so either way is fine with me.

    Actually composing nested groups and rendering them as such is an interesting idea. Truth be told, I didn't pursue it since I wanted the simplest thing that worked, and the rendering logic for nesting could be tricky. But I can see if it's not too hard.

  2. I'm glad you brought up the inconsistency with command groups, since I was just about to comment about that. Right now there are (at least?) two things command groups do differently:

    1. Command groups are not alphabetically sorted.

    2. Duplicate command groups are not merged; they are all listed as created.

    I sorta like the current merging logic, but that's really just my aesthetic preferences, and it is probably a good idea to prefer consistency with command groups where possible. Alphabetically sorting is probably too prescriptive in any case, since I don't believe optparse does this anywhere else. I made a commit here exploring this. I can bring those changes to this PR if you agree that command group consistency is a good idea.

  3. My only other thought at the moment concerns Global options:. Specifically, do they require any special treatment? So far I haven't given it any thought at all i.e. they've inherited the same treatment as Available options:. My gut instinct is they should be treated the same, but I don't think I've actually rendered them for any of my apps before, so I wanted to explicitly ask.

@HuwCampbell
Copy link
Collaborator

HuwCampbell commented May 16, 2024

Using a nested group would just be a List instead of a Maybe, so it would be easy to explore.

It's been a while since I implemented command groups, but I thought they were actually merged as well, though I would be happy to remove that logic as I think it's redundant.

@tbidne
Copy link
Author

tbidne commented May 16, 2024

Edit: I think I figured out what you mean. Consecutive command groups are merged i.e.

hsubparser ( commandGroup "Group 1" ... )
  <|> hsubparser ( commandGroup "Group 1" ... )

This will indeed be merged. What I currently have here allows for merging all (including non-consecutive) identical groups regardless of the order. It would be the equivalent of the following, for command groups:

-- the two "Group 1"s will be merged
hsubparser ( commandGroup "Group 1" ... )
  <|> hsubparser ( commandGroup "Group 2" ... )
  <|> hsubparser ( commandGroup "Group 1" ... )

It does this by sorting first, then grouping. I like it, but that's probably too opinionated, and a departure from how optparse otherwise works.


Expand outdated comment

Here's an example of what happens to commands:

parseCommand =
      hsubparser
        ( command "list 2" (info (pure List) $ progDesc "Lists elements")
        )
        <|> hsubparser
        ( command "list" (info (pure List) $ progDesc "Lists elements")
            <> command "print" (info (pure Print) $ progDesc "Prints table")
            <> commandGroup "Info commands"
        )
        <|> hsubparser
          ( command "delete" (info (pure Delete) $ progDesc "Deletes elements")
          )
        <|> hsubparser
          ( command "query" (info (pure Query) $ progDesc "Runs a query")
              <> commandGroup "Query commands"
          )
Available commands:
  list 2                   Lists elements

Info commands
  list                     Lists elements
  print                    Prints table

Available commands:
  delete                   Deletes elements

Query commands
  query                    Runs a query

@tbidne

This comment was marked as outdated.

@tbidne
Copy link
Author

tbidne commented May 16, 2024

Apologies for the moving target, but I think the conversation has possibly become hard to follow due to separate threads (my fault!), so I've pushed some of the changes to this branch, to make for an easier review. In particular, the commit changes the following:

  • Duplicate groups are only merged when they are consecutive. If they are not consecutive, then they repeat. This is how command groups work.
  • No sorting.
  • Nested groups are concatenated and displayed as Group 1.Group 2, as discussed above. This seems quite reasonable to me (up to formatting changes).

Eventually I'd like to make my added tests more robust, so I'll probably redo those, but for now I'm happy with the way this works, so I'll wait for more feedback before changing anything.

- Nested groups no longer overwrite; we concatenate them instead.
- We no longer sort groups alphabetically; this means, in particular,
  that duplicate groups are only merged when they are consecutive.
  This behavior matches command groups.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option groups, à la command groups
2 participants