Provide parity for both escC and unescC to support more awkward filenames #220
Comments
Try the
Anything else will require Phil's attention, but he is currently away until mid-September.
Seems like
The expected result would have been more like:
The
PS: I'll look into applying https://exiftool.org/faq.html#Q21 to see if that helps.
This topic must be handled carefully because code injection from malicious file names is a real possibility. At the moment, ExifTool doesn't do more than necessary because this is the safest way to proceed. I would have to dedicate a good block of time to expanding this to cover all possible characters/escapes, and without a real-life use case I don't know if this would be a worthwhile way to spend my time. Your tests seem to be theoretical -- have you seen file names like this in the wild?
Rarely, but I do try to write software that handles the datatypes as they are (at least on Unix, filenames are defined as any sequence of bytes except
For some prior art, ImageMagick also interprets filenames but provides the
(I don't really mind if it can't print them nicely using C-escape encoding, but I would want the bytes that go in to be the same as the bytes that come out, even if that has to go through an encode/decode layer, e.g. to and from XML entities. As much as possible, anyway.)
I have been experimenting with awkward filenames, using `-@` and `#[CSTR]` to support them. When my filename contains `\b` and `\f`, for example, and I attempt to store it under the `xmp-dc:source` tag while creating a MIE file, ExifTool converts the backspace and formfeed into `.` dots, making it impossible to recover the original filename.

Looking at the codebase, I notice that there is support for a much wider range of typical C-style escapes when unescaping them, but few for escaping:
Is there a reason why `escC` couldn't be at parity with `unescC` (or potentially one derived from the other)?

I.e.
However, just adding more mappings to `escC` or editing `unescapeChar` doesn't seem to be enough, as my local tests show the same result, with unrecognised escapes being turned into `.`.
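For illustration only, the parity asked for above could look something like the following minimal sketch. It is written in Python rather than ExifTool's Perl, and the names `esc_c`, `unesc_c`, `_ESCAPES` and `_UNESCAPES` are hypothetical; they are not ExifTool's actual `escC`/`unescC`. The decode table is derived from the encode table, and every byte without a named escape falls back to `\xNN`, so any byte sequence should survive a round trip.

```python
import string

# Hypothetical sketch, not ExifTool code: byte value -> letter after the backslash.
_ESCAPES = {
    0x07: "a", 0x08: "b", 0x09: "t", 0x0A: "n", 0x0B: "v",
    0x0C: "f", 0x0D: "r", 0x1B: "e", 0x5C: "\\",
}
# The decode table is derived from the encode table, so the two cannot drift apart.
_UNESCAPES = {letter: byte for byte, letter in _ESCAPES.items()}


def esc_c(data: bytes) -> str:
    """Encode arbitrary bytes as printable ASCII using C-style escapes."""
    out = []
    for byte in data:
        if byte in _ESCAPES:
            out.append("\\" + _ESCAPES[byte])
        elif 0x20 <= byte < 0x7F:
            out.append(chr(byte))          # printable ASCII passes through
        else:
            out.append("\\x%02x" % byte)   # generic fallback for everything else
    return "".join(out)


def unesc_c(text: str) -> bytes:
    """Decode a string produced by esc_c back into the original bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))            # raises ValueError above 0xFF
            i += 1
            continue
        code = text[i + 1:i + 2]
        if code == "x":
            digits = text[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("bad \\x escape at offset %d" % i)
            out.append(int(digits, 16))
            i += 4
        elif code in _UNESCAPES:
            out.append(_UNESCAPES[code])
            i += 2
        else:
            # Reject unknown escapes instead of silently replacing them.
            raise ValueError("unknown escape at offset %d" % i)
    return bytes(out)


name = b"bad\x08name\x0c.jpg"
escaped = esc_c(name)
print(escaped)
print(unesc_c(escaped) == name)
```

Rejecting unknown escapes, rather than mapping them to `.`, is a deliberate choice in this sketch: a lossy substitution is exactly what makes the original filename unrecoverable.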