mimetypes.guess_type returns None for "somefile.txt" in only Azure durable function #1075

Open
tkumpumak opened this issue May 6, 2024 · 0 comments


Image is mcr.microsoft.com/azure-functions/python:4-python3.11

mimetypes.guess_type correctly returns 'text/plain' when tested on Windows or in a Linux Docker container. However, in Python code running as an Azure durable function it returns None for some reason. If the whole module is copied from CPython 3.11 into the image and imported from there, it works correctly.

The following test code, which also includes parts of mimetypes.guess_type, was run on Windows and in an Azure durable function. The printed results follow below.

I'm out of ideas on how to debug this further. What code could the Azure durable function be running? It does not seem to be a copy of mimetypes.py, yet the data inside mimetypes seems to be correct.

import logging
import mimetypes
import os
import posixpath
import urllib.parse

logger = logging.getLogger(__name__)

# Wrapper name is illustrative; the snippet ran inside a function body (it uses return).
def run_mimetype_test():
    doc_name = "file://test.txt"
    mime_types = mimetypes.MimeTypes()
    mime_type, encoding = mime_types.guess_type(doc_name)

    if mime_type is not None:
        logger.error("Test1: doc_name:" + doc_name + " mimetype: " + mime_type)
    else:
        logger.error("Test1: doc_name:" + doc_name + " mimetype: None")

    mime, _ = mimetypes.guess_type(doc_name, False)
    from mimetypes import _db  # initialized on mimetypes.guess_type

    if mime is not None:
        logger.error("Test2: doc_name:" + doc_name + " mimetype: " + mime)
    else:
        logger.error("Test2: doc_name:" + doc_name + " mimetype: None")

    logger.error("Available types: " + str(_db.types_map[True]))

    # From here on: a copy of parts of mimetypes.guess_type with extra logging.
    url = os.fspath(doc_name)
    scheme, url = urllib.parse._splittype(url)

    logger.error("url:" + url + " scheme: " + scheme)  # these are ok

    strict = False
    base, ext = posixpath.splitext(url)
    while (ext_lower := ext.lower()) in _db.suffix_map:
        base, ext = posixpath.splitext(base + _db.suffix_map[ext_lower])

    # encodings_map is case sensitive
    logger.error("ext:" + ext)
    if ext in _db.encodings_map:
        encoding = _db.encodings_map[ext]
        base, ext = posixpath.splitext(base)
    else:
        encoding = None
    ext = ext.lower()
    logger.error("ext:" + ext)
    types_map = _db.types_map[True]
    if ext in types_map:
        logger.error("Return 1, ext:" + ext + " mimetype: " + types_map[ext])
        return types_map[ext]
    elif strict:
        logger.error("Return 2, ext:" + ext + " mimetype: None")
        return None, encoding
    types_map = _db.types_map[False]
    if ext in types_map:
        logger.error("Return 3, ext:" + ext + " mimetype: " + types_map[ext])
        return types_map[ext]
    else:
        logger.error("Return 4, ext:" + ext + " mimetype: None")
        return None

Windows:
Test1: doc_name:file://test.txt mimetype: text/plain
Test2: doc_name:file://test.txt mimetype: text/plain
Available types: **SNIP long list, List contains: ** '.n3': 'text/n3', '.txt': 'text/plain', '.bat': 'text/plain',
ext:.txt
ext:.txt
Return 1, ext:.txt mimetype: text/plain

Azure durable function:
Test1: doc_name:file://test.txt mimetype: None
Test2: doc_name:file://test.txt mimetype: None
Available types: **SNIP long list, List contains: ** '.n3: text/n3, .txt: text/plain, .bat: application/x-msdos-program',
ext:.txt
ext:.txt
Return 1, ext:.txt mimetype: text/plain
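
To narrow down which code the function host is actually running, one option is to log a few facts about the loaded module in both environments and compare them. The following is a minimal sketch, not part of the original test: the helper name `describe_mimetypes_environment` and the keys of the returned dict are hypothetical. It records the interpreter version, the file `mimetypes` was imported from, whether the loaded `guess_type` source mentions `_splittype` or `urlparse`, and how `urllib.parse.urlparse` splits the test URL. Note that `urlparse` treats `test.txt` in `file://test.txt` as the host part and leaves the path empty; whether the deployed module relies on `urlparse` is something the report would show, not something this issue confirms.

```python
import inspect
import mimetypes
import os
import sys
import urllib.parse


def describe_mimetypes_environment(doc_name="file://test.txt"):
    """Collect facts about the mimetypes module that is actually loaded.

    Hypothetical helper for comparing two environments; not a stdlib API.
    """
    try:
        source = inspect.getsource(mimetypes.MimeTypes.guess_type)
    except (OSError, TypeError):
        source = ""  # source text not available (e.g. bytecode-only install)

    parts = urllib.parse.urlparse(doc_name)
    return {
        "python_version": sys.version,
        "module_file": getattr(mimetypes, "__file__", None),
        # Which URL-splitting helper the loaded guess_type source refers to.
        "uses_splittype": "_splittype" in source,
        "uses_urlparse": "urlparse" in source,
        # System mime.types files that exist and may be read at init time.
        "known_files": [f for f in mimetypes.knownfiles if os.path.isfile(f)],
        "url_scheme": parts.scheme,
        "url_netloc": parts.netloc,
        "url_path": parts.path,
        # Same extension, with and without the file:// prefix.
        "guess_url": mimetypes.guess_type(doc_name),
        "guess_plain": mimetypes.guess_type("test.txt"),
    }


if __name__ == "__main__":
    for key, value in describe_mimetypes_environment().items():
        print(f"{key}: {value}")
```

Logging this dict on Windows, in the local Docker container, and in the Azure durable function should show whether the three environments load the same `mimetypes.py` and the same Python patch version, and whether the result differs only for the `file://` form of the name.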
