[bug] Miniflare malforms utf8 characters #326
Comments
Hey! I did some digging on this because it looked interesting. Here is the response from Miniflare if you pull up developer tools -> Network: Notice the response was never impacted; "ń" is still there. Running a check on the character in a Node.js terminal:
"ń".charCodeAt(0)
// 324
https://unicode-table.com/en/0144/
With some digging, the answer is that HTML by default treats everything as utf-8!!! Wow, how old school HTML is. You have two options:
const content = `<!DOCTYPE html>
<html>
<head>
<title>ń</title>
</head>
<body>
<p>ń</p>
</body>
</html>`;

export default {
  fetch(req) {
    const BOM = [0xEF, 0xBB, 0xBF];
    // convert content to UTF-8 with BOM
    const contentUTF8 = new TextEncoder().encode(content);
    const contentWithBOM = new Uint8Array(BOM.length + contentUTF8.length);
    contentWithBOM.set(BOM);
    contentWithBOM.set(contentUTF8, BOM.length);
    return new Response(contentWithBOM, {
      status: 200,
      headers: {
        'content-type': 'text/html'
      }
    });
  }
}
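Besides prepending a BOM, a common way to tell the browser how to decode the body is to declare the charset in the `content-type` header. The following is a minimal sketch, assuming a runtime that provides the standard `Response` and `TextEncoder` globals (Workers, Miniflare, Node 18+). `htmlResponse` and `worker` are hypothetical names used for illustration, not part of Miniflare.

```javascript
// Hypothetical helper: wraps an HTML string in a Response that declares UTF-8.
function htmlResponse(html) {
  return new Response(html, {
    status: 200,
    headers: {
      // The charset parameter tells the browser not to guess the encoding.
      "content-type": "text/html; charset=utf-8",
    },
  });
}

// "\u0144" is "ń". It takes two bytes in UTF-8; without a declared charset a
// browser may decode those bytes with a legacy single-byte encoding instead.
const bytes = new TextEncoder().encode("\u0144");
console.log(Array.from(bytes, (b) => b.toString(16)).join(" ")); // c5 84

// Hypothetical worker object using the helper.
const worker = {
  fetch(req) {
    return htmlResponse("<!DOCTYPE html><title>\u0144</title><p>\u0144</p>");
  },
};
```

In a module worker, `worker` would be the default export.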
Just passing on this issue from @Hexstream's referenced issue, which I was debugging. It's a Pages site using Wrangler. When running We initially thought it was a Wrangler issue, but then narrowed it down to Miniflare, somewhere between the Wrangler server returning the response and it reaching the browser. As this issue occurs in Miniflare but not in production, I have a feeling there's something that can be done here to resolve it.
I understand the issue now. To clarify for this thread: when a browser sends a request, it first converts any URL characters outside the ASCII range like so: This is called percent encoding. However, from a quick look at Miniflare, it indeed does not decode the URI before checking it against the file location, but keeps the percent encoding. This is also a problem with other special characters. As stated before though, Miniflare is returning the data correctly; it's up to the user to ensure HTML containing non-ASCII characters is parsed correctly.
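To make the percent-encoding step concrete, here is a small sketch using the standard `URL` and `decodeURIComponent` APIs. The `files` map and the `lookup` helper are hypothetical stand-ins for a static file lookup; they are not Miniflare's actual code.

```javascript
// Hypothetical set of files on disk, keyed by their real (decoded) names.
// "\u0144" is "ń", so the key is "/ń.html".
const files = new Map([["/\u0144.html", "<p>hello</p>"]]);

// The URL parser percent-encodes non-ASCII path characters as UTF-8 bytes.
const url = new URL("http://localhost/\u0144.html");
console.log(url.pathname); // /%C5%84.html

// Hypothetical lookup: optionally decode the pathname before matching it
// against the file names.
function lookup(pathname, decode) {
  const key = decode ? decodeURIComponent(pathname) : pathname;
  return files.get(key);
}

console.log(lookup(url.pathname, false)); // undefined (encoded path matches no file)
console.log(lookup(url.pathname, true)); // <p>hello</p>
```

The first lookup fails because the still-encoded path is compared against decoded file names, which is the kind of mismatch described above.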
[edit: Mostly responding to @Skye-31, sorry for late reply...] My issue has everything to do with URLs and nothing to do with HTML. Basically, entering these URLs directly in the address bar,
(You can also enter the URLs in the address bar with
I seem to understand that Miniflare is not mapping from percent-encoded URLs to files correctly. [edit: #327 indeed looks like a likely fix!]
Hey! 👋 Apologies for the delayed response. I've recently returned from a long holiday and am just starting to catch up on issues now. I think this issue is caused by Miniflare not populating More specifically, As a temporary solution, it looks like there's a
import { getAssetFromKV, NotFoundError } from "@cloudflare/kv-asset-handler";
import manifestJSON from "__STATIC_CONTENT_MANIFEST";
const assetManifest = JSON.parse(manifestJSON);

export default {
  async fetch(request, env, ctx) {
    try {
      return await getAssetFromKV(
        { request, waitUntil: ctx.waitUntil.bind(ctx) },
        {
          ASSET_NAMESPACE: env.__STATIC_CONTENT,
          ASSET_MANIFEST: assetManifest,
          pathIsEncoded: globalThis.MINIFLARE, // ⬅️ This is the important bit
        }
      );
    } catch (e) {
      if (e instanceof NotFoundError) {
        return new Response(null, { status: 404 });
      }
      throw e;
    }
  },
};
Using the following code, and running
miniflare --modules worker.js
results in the file being malformed in a way it shouldn't be, as shown with the image.