Deno.readDir won't return all files #12998

Open
Sembiance opened this issue Dec 5, 2021 · 6 comments
Labels: bug (Something isn't working), runtime (Relates to code in the runtime crate)

Comments

Sembiance (Contributor) commented Dec 5, 2021

If a filename contains funny characters, Deno.readDir silently ignores that file.

First, unzip bug.tar.zip, which produces the file bug.tar.
Next, untar bug.tar, which produces the directory bug.
Then run deno run --allow-read readDirBug.js:

for await(const entry of Deno.readDir("bug")) {
	console.log(entry);
}

It shows only 2 entries; there should be 4.

NodeJS finds these files just fine:

import { readdir } from 'fs/promises';

const entries = await readdir("bug");
for (const entry of entries) {
	console.log(entry);
}

System info:

Linux 5.14.7
deno 1.16.4 (release, x86_64-unknown-linux-gnu)
v8 9.7.106.15
typescript 4.4.2

EDIT: Added attachment, bug.tar.zip

lucacasonato (Member) commented

Cause: https://github.com/denoland/deno/blob/d31378726e78490e88b8a9ec3001b86ea009d978/runtime/ops/fs.rs

lucacasonato added the bug and runtime labels Dec 6, 2021

andreubotella (Contributor) commented

You didn't link to a specific line – I'm assuming you mean

// Not all filenames can be encoded as UTF-8. Skip those for now.


I've done some work on puzzling out the web-exposed filenames that you get from file controls, at w3c/FileAPI#161. Of course, this gets a lot more complicated with file system access APIs. Is this something we want?

lucacasonato (Member) commented

> You didn't link to a specific line – I'm assuming you mean

Yeah. GitHub's "Copy permalink" doesn't copy line numbers, it seems 😔

I think the behavior we want is whatever Node does here. So either replace the invalid chars with replacement chars, or use WTF-16 encoding.
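
To make the "replacement chars" option concrete, here is a minimal sketch, not the actual Deno or Node implementation. It assumes a filename stored as Latin-1 bytes, so that "ó" is the single byte 0xF3, which is not valid UTF-8 in that position. The helper name `decodeLossy` and the sample name are hypothetical. `TextDecoder` is non-fatal by default, so the invalid byte decodes to U+FFFD.

```javascript
// Hypothetical helper: decode raw filename bytes, replacing invalid UTF-8.
function decodeLossy(bytes) {
  // fatal defaults to false, so malformed sequences become U+FFFD
  // instead of throwing a TypeError.
  return new TextDecoder("utf-8").decode(bytes);
}

const ascii = new TextEncoder();

// Assumed sample: "Informaci" + byte 0xF3 (Latin-1 "ó") + "n.DOC".
const latin1Name = Uint8Array.from([
  ...ascii.encode("Informaci"),
  0xf3,
  ...ascii.encode("n.DOC"),
]);

console.log(decodeLossy(latin1Name)); // "Informaci\uFFFDn.DOC"
```

The decoded string is printable, but the original byte 0xF3 can no longer be recovered from it.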

Sembiance (Contributor, Author) commented Dec 8, 2021

Looks like this is a long-standing issue: #627

Sounds like one possible solution would be a Deno equivalent of OsString that could be used Deno-wide with all file operations, including readDir. However, based on the discussion in that issue, while it would fix all filename handling issues, it would be a huge undertaking.
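
As a rough illustration of the OsString idea, the sketch below wraps the raw bytes of a filename and only decodes them for display. The class `RawFileName` and its methods are hypothetical and do not exist in Deno; this only shows the shape such a type might take, under the assumption that filenames arrive as byte arrays.

```javascript
// Hypothetical OsString-like wrapper. Nothing like this exists in Deno today.
class RawFileName {
  constructor(bytes) {
    // Keep an exact copy of the bytes the OS returned.
    this.bytes = Uint8Array.from(bytes);
  }

  // Lossy, for display only: invalid UTF-8 becomes U+FFFD.
  toDisplayString() {
    return new TextDecoder("utf-8").decode(this.bytes);
  }

  // Exact, byte-for-byte comparison.
  equals(other) {
    return this.bytes.length === other.bytes.length &&
      this.bytes.every((byte, i) => byte === other.bytes[i]);
  }
}

// The bytes of "a\xF3.txt": 0xF3 is not valid UTF-8 here.
const fileName = new RawFileName([0x61, 0xf3, 0x2e, 0x74, 0x78, 0x74]);
console.log(fileName.toDisplayString()); // "a\uFFFD.txt"
console.log(fileName.bytes[1] === 0xf3); // true, the original byte is intact
```

Every file API would then have to accept such a value in place of a plain string, which is where the "huge undertaking" comes from.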

shanmukhateja commented Dec 12, 2021

Hi @Sembiance @lucacasonato I would like to take a shot at this.

Do we have a public list of replacement chars to use? I'd like to try the "replace invalid chars" approach.

EDIT: I noticed that Nemo, my (Linux) file manager, shows the 0xFFFD replacement char in place of the invalid \xF3 byte in "Informaci\xF3n.DOC" and "Informaci\xF3n.DOC.info".

Do I replace all invalid chars with this char instead?

andreubotella (Contributor) commented Dec 12, 2021

Hi @shanmukhateja, and thank you for your interest, but unfortunately this isn't a simple issue of replacing characters from the JS side. The code that deals with interacting with the filesystem is Rust code, including the code that skips files with invalid filenames here. And changing it to not skip them would already be enough to replace those invalid bytes or code units with the replacement character.

But making this change would have further effects on the Deno ecosystem, because some of the files returned by Deno.readDir() would not be accessible with Deno.readTextFile() or other APIs, since you could have two files with different filenames that map to the same Deno-exposed filename. Fixing this would probably require a significant re-architecting of Deno's filesystem APIs.
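
The collision described above can be sketched in a few lines. This is an illustration under assumptions, not Deno's code: the two byte sequences are made-up Latin-1 filenames, and `buildRawName` is a hypothetical helper. Both names decode to the same string, and encoding that string back to UTF-8 does not yield either original name.

```javascript
const decoder = new TextDecoder("utf-8"); // non-fatal: invalid bytes become U+FFFD
const encoder = new TextEncoder();

// Hypothetical helper: ASCII prefix + one arbitrary byte + ASCII suffix.
function buildRawName(prefix, byte, suffix) {
  return Uint8Array.from([
    ...encoder.encode(prefix),
    byte,
    ...encoder.encode(suffix),
  ]);
}

// Two distinct on-disk names (assumed Latin-1: 0xF3 is "ó", 0xE9 is "é").
const fileA = buildRawName("Informaci", 0xf3, "n.DOC");
const fileB = buildRawName("Informaci", 0xe9, "n.DOC");

const exposedA = decoder.decode(fileA);
const exposedB = decoder.decode(fileB);
console.log(exposedA === exposedB); // true: both become "Informaci\uFFFDn.DOC"

// Going back from the exposed string does not recover the on-disk bytes:
// U+FFFD encodes as the three bytes EF BF BD.
const roundTripped = encoder.encode(exposedA);
console.log(roundTripped.length, fileA.length); // 17 15
```

A path built from the exposed string would therefore point at a name that matches neither file on disk.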
