
Plugin and/or passed-callable mechanisms for customizing Inventory.suggest() scoring #207

Open
bskinn opened this issue Aug 29, 2021 · 7 comments

Comments

@bskinn
Owner

bskinn commented Aug 29, 2021

Related discussion here.


Need to research plugin system best practices. Given the narrow scope of the pluggable behavior, a fully featured plugin framework like pluggy may be overkill. OTOH, lots of packages depend on pluggy, so it might already be present in most installed environments anyway.

Usual hierarchical sourcing of callables:

  1. Direct argument(s) to Inventory.suggest (select on a per-call basis; API usage)
  2. Slots on Inventory (programmatic default definition, avoiding need for per-call passing; API usage)
  3. Mechanism for indicating alternative callable on CLI invocation (CLI usage only)
    a. Might be too unwieldy to be practical, but could be useful/convenient for initial stages of trialing an alternative scorer.
  4. Environment variables (user execution context configuration; API and CLI usage, key for CLI)
  5. Installed-plugin entry_points (operating environment configuration; API and CLI usage, key for CLI)
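That cascade could be sketched roughly as follows (illustrative only, not the actual sphobjinv API; `resolve_scorer` and the difflib-based fallback are hypothetical names):

```python
from difflib import SequenceMatcher

def default_scorer(pattern, candidate):
    """Hypothetical fallback scorer: difflib similarity scaled to 0-100."""
    return int(100 * SequenceMatcher(None, pattern, candidate).ratio())

def resolve_scorer(call_arg=None, instance_slot=None, context_scorer=None):
    """Walk the sourcing hierarchy; the first non-None callable wins.

    call_arg       -- (1) direct argument to Inventory.suggest()
    instance_slot  -- (2) scorer stored on the Inventory instance
    context_scorer -- (3)-(5) scorer resolved from CLI arg, env var,
                      or an installed entry point
    """
    for candidate in (call_arg, instance_slot, context_scorer):
        if candidate is not None:
            return candidate
    return default_scorer
```

The point of the sketch is just that precedence stays in one place, so each sourcing mechanism only has to produce a callable (or None).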

Any other plugin mechanisms?

  • Another option might be a decorator-based system, similar to Dash callbacks
    • There'd have to be some system for keeping track of the plugged scorers internally -- probably an ID string
    • Plugins would decorate their scoring functions at import time to register them
  • A package-level scorer registration function
    • Similarly -- plugins would decorate their scorers at import time to register

Once this is implemented, the docs will have to be updated, since fuzzywuzzy is specifically named as the scoring function in many places.


Other note:

  • Some sort of protocol, template, or namedtuple spec to define how scorers should return their results(?)
  • Perhaps start the implementation in a private suggest module, and then over time expose public abstract classes/interfaces/protocols for plugins to implement against
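One possible shape for that return spec, using a NamedTuple plus a typing.Protocol (names like `ScorerResult` are hypothetical, not anything sphobjinv currently defines):

```python
from typing import NamedTuple, Protocol

class ScorerResult(NamedTuple):
    """Hypothetical spec for what a scorer reports per candidate."""
    name: str    # the objects.inv name that was scored
    score: int   # 0-100, higher is a better match

class Scorer(Protocol):
    """Structural type a plugged-in scoring callable would satisfy."""
    def __call__(self, pattern: str, candidate: str) -> int: ...

def as_result(scorer: Scorer, pattern: str, candidate: str) -> ScorerResult:
    """Wrap a raw scorer call in the structured result type."""
    return ScorerResult(name=candidate, score=scorer(pattern, candidate))
```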
@bskinn bskinn added type: enhancement ✨ Something to add issue: maybe 🤔 Being considered, but not certain labels Aug 29, 2021
@bskinn bskinn added this to the v2.2 milestone Aug 29, 2021
@bskinn bskinn modified the milestones: v2.2, v3.0 Dec 12, 2021
@eirrgang

To some extent, the plug-in mechanism and the plug-in selection may be decoupled.

How would a command-line user select an alternative scoring mechanism? Unless either all detected scorers would be used, or some heuristic would select a scoring mechanism based on the query, then presumably a command line option would name an alternate scoring function, right?

Once the alternative is named by the user, you could scan the entry_points for the sphobjinv namespace for the named implementation. But you could also just use the convention of looking for toolname-pluginname, and use importlib to try to import a specifically-named thing from sphobjinv-<name>. These are not mutually exclusive, and could be combined with other ways of registering plugins, like environment variables or config files.

Special plugin names are easy and common. Sphinx seems to encourage sphinx-tool or sphinxcontrib-tool package names, but doesn't rely on the convention internally, as far as I can tell. pytest encourages the pytest-name convention, but doesn't use it for run-time discovery (it prefers entry points).

@bskinn
Owner Author

bskinn commented Apr 20, 2022

Yeah, I've never written a plugin system before, so I definitely will be consulting existing best practices. My current plan for the cascade of how a pluggable would be selected is in the numbered list in the original issue comment.

(1) and (2) would just involve passing/storing callable Python objects, simple enough.

For (3) and (4), I figure I would use the same entry-point syntax as setuptools: package.subpackage.subpackage:scorer, where scorer would need to be a suitable callable -- sphobjinv would just do from package.subpackage.subpackage import scorer as scoring_func, and then use scoring_func appropriately.
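Loading from such a spec string is a few lines with importlib (a sketch, with a hypothetical function name; real entry-point values can also be dotted attribute paths after the colon, which this ignores):

```python
from importlib import import_module

def load_scorer_from_spec(spec):
    """Load a callable from a setuptools-style 'package.module:attr' spec."""
    module_path, sep, attr = spec.partition(":")
    if not sep:
        raise ValueError(f"Spec must be 'module:callable', got {spec!r}")
    return getattr(import_module(module_path), attr)

# e.g., load_scorer_from_spec("difflib:SequenceMatcher") imports difflib
# and returns the SequenceMatcher class.
```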

For (5), I would want to use the stricter approach of entry_points -- I don't really like the design pattern of casting about the importable namespace for things that look like they have the right format, and then attempting an import of a specific magic function. Too many edge cases, and too much room for footguns. Also, if we use the flake8-style approach to entry_points, there's a lot of flexibility to have multiple scorers installed at the same time:

```ini
[options.entry_points]
sphobjinv.scorer =
    my_scorer_name = my_scorer:scorer
```

I guess that's one of the big questions to decide, though -- will suggest only use an alternative plugged-in scorer if explicitly instructed to do so, somehow? Or would there be some heuristic for automatically choosing an available pluggable scorer, if one or more are installed/visible?

@bskinn
Owner Author

bskinn commented Apr 20, 2022

Hehe, exactly what you said:

How would a command-line user select an alternative scoring mechanism? Unless either all detected scorers would be used, or some heuristic would select a scoring mechanism based on the query, then presumably a command line option would name an alternate scoring function, right?

@bskinn
Owner Author

bskinn commented Apr 20, 2022

The CLI arg for (3) could line up with the entry_points entry... e.g., sphobjinv suggest --scorer my_scorer_name ....

Then, the env variable for (4) would just be something like SPHOBJINV_SCORER=my_scorer_name.

Ahh, and (5) wouldn't be a mechanism for specifying the scorer to use, but just for provisioning a scorer to be available...!
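So the name-resolution half of (3) and (4) could collapse to a couple of lines (illustrative only; the function name is hypothetical, and SPHOBJINV_SCORER is the env var name proposed above):

```python
import os

def scorer_name_from_context(cli_arg=None, environ=None):
    """Resolve a scorer *name*: the CLI flag (3) wins over the
    SPHOBJINV_SCORER env var (4); None means 'use the default scorer'."""
    environ = os.environ if environ is None else environ
    return cli_arg if cli_arg else environ.get("SPHOBJINV_SCORER")
```

The resulting name would then be looked up among the entry points that (5) makes available.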

@bskinn
Owner Author

bskinn commented Oct 25, 2022

For plugin suggest functionality, need to provide hooks for setup functions, for both one-at-a-time and all-at-once scorers, and then also a closeout hook for at least the all-at-once mode. Not immediately obvious why you would need a closeout for the one-at-a-time mode... except maybe to dispose of resources. Probably best to expose it, just in case.

Should also expose a mechanism to let the plugin define the default suggest threshold for CLI display, because different metrics may tend to lie in different ranges of 0-100.
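Those hooks could be captured in a Protocol along these lines (a sketch under the assumptions above; all names, including `default_threshold`, are hypothetical):

```python
from typing import Iterable, List, Protocol

class ScorerPlugin(Protocol):
    """Hypothetical hook surface a scorer plugin would implement."""

    default_threshold: int  # default CLI display threshold for this metric

    def setup(self, pattern: str) -> None:
        """One-time setup before scoring (e.g., preprocess the pattern)."""

    def score_one(self, candidate: str) -> int:
        """One-at-a-time scoring of a single candidate (0-100)."""

    def score_all(self, candidates: Iterable[str]) -> List[int]:
        """All-at-once scoring of the full candidate list."""

    def close(self) -> None:
        """Closeout hook, e.g. to dispose of resources."""

class ExactMatchScorer:
    """Toy plugin satisfying the protocol."""

    default_threshold = 100

    def setup(self, pattern: str) -> None:
        self._pattern = pattern

    def score_one(self, candidate: str) -> int:
        return 100 if candidate == self._pattern else 0

    def score_all(self, candidates: Iterable[str]) -> List[int]:
        return [self.score_one(c) for c in candidates]

    def close(self) -> None:
        pass
```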

@bskinn
Owner Author

bskinn commented Dec 7, 2022

Will want an API and CLI for listing available scorers.

Will want API and CLI for picking a scorer based on an entry_point spec, as opposed to only exposing scorers via name/ID. This should just be a matter of exposing functionality created as part of the entry_points integration implementation here.

This should hopefully make it easier for anyone developing a scorer, especially in the early stages, because then they can just pass the entry point, which (I think) doesn't even require them to have set up packaging for their code... it could just be in a free module.

@bskinn
Owner Author

bskinn commented Jul 6, 2023

Sprinkling this in various issues: rapidfuzz as a fuzzywuzzy alternative.
