feat: Add Expected extensions output mode to Magika CLI #78

Open
nhonx wants to merge 1 commit into base: main

@nhonx nhonx commented Feb 19, 2024

Add an -e / --expected-exts mode to the Magika CLI, which outputs one or more expected file extensions for the input file. This is useful when the input file's extension is missing or incorrect.
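
To illustrate the idea, here is a minimal sketch of what an expected-extensions lookup could look like. It is not the code from this PR: the `EXPECTED_EXTS` table and both helper functions are hypothetical names with made-up sample data, and the detected content-type label is assumed to be already available.

```python
from pathlib import Path

# Hypothetical label -> extensions table for illustration only;
# this is NOT Magika's real data.
EXPECTED_EXTS = {
    "javascript": ["js"],
    "python": ["py", "pyi"],
    "unknown": [],
}


def expected_exts(label: str) -> list[str]:
    """Return the extensions associated with a detected content-type label."""
    return EXPECTED_EXTS.get(label, [])


def has_expected_ext(path: str, label: str) -> bool:
    """Return True if the file's current extension is one of the expected ones."""
    ext = Path(path).suffix.lstrip(".").lower()
    return ext in expected_exts(label)
```

A caller could print `expected_exts(label)` for each input file and use `has_expected_ext` to flag files whose extension does not match the detected type.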

@nhonx nhonx mentioned this pull request Feb 19, 2024
@nhonx nhonx changed the title Add Expected extensions output mode to Magika CLI feat:Add Expected extensions output mode to Magika CLI Feb 19, 2024
@nhonx nhonx changed the title feat:Add Expected extensions output mode to Magika CLI feat: Add Expected extensions output mode to Magika CLI Feb 19, 2024
@reyammer
Collaborator

Thank you for the PR! I'll need to give it some thought: we have received a similar request and need to think about how to integrate it for the long term. Leaving this open for now for visibility; will follow up later on. Thanks!

@nhonx
Author

nhonx commented Feb 22, 2024

@reyammer: I think we should treat this as a long-term feature. We also need to add more correct/expected extensions to https://github.com/google/magika/blob/main/python/magika/config/content_types_config.json
Many entries there have an empty extension list, and others list only some of the valid extensions. For example, Javascript should also include .ts and .jsx.
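
As a rough way to find the gaps, something like the sketch below could list the entries with no extensions. It assumes the config is a JSON object keyed by content-type label, with each entry holding an `extensions` list. That layout, the function name, and the sample data are assumptions for illustration and should be checked against the actual file.

```python
import json


def labels_without_extensions(config: dict) -> list[str]:
    """Return the labels whose entry has a missing or empty extension list."""
    return sorted(
        label for label, entry in config.items() if not entry.get("extensions")
    )


# Made-up sample in the assumed layout; not copied from the real config.
sample = json.loads(
    """
    {
        "javascript": {"extensions": ["js"]},
        "makefile": {"extensions": []},
        "dockerfile": {}
    }
    """
)

print(labels_without_extensions(sample))
```

For the real file, the `sample` value would be replaced by `json.load` on an open handle to content_types_config.json.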

@reyammer
Collaborator

Yes, indeed. We are already working on v2 and adding many more extensions / types. Your examples are spot on. Sorry for not following up on this; I haven't had time to think about it, and we are currently pushing to start a new training round. I will get to this once I have a moment, and I agree we should have a feature in this direction!

@nhonx
Author

nhonx commented Apr 1, 2024

Hi @reyammer, should I close this? I see that new code has been added to the main branch, which causes conflicts with this PR.

ia0 added a commit to ia0/magika that referenced this pull request Apr 17, 2024
ia0 added a commit that referenced this pull request Apr 18, 2024
- Uses the latest release candidate of the ORT library, which provides async inference.
- The CLI creates only one session and shares it between the different inference tasks.
- Implements the features extraction reference tests.
- Implements the %e format for #78.
2 participants