Interpreting LoadObj return value -- strict OBJ specification #272

Open
SeanCurtis-TRI opened this issue Mar 30, 2020 · 3 comments

@SeanCurtis-TRI
Contributor

I've encountered an issue when using LoadObj to parse a file, specifically when the wrong file type is provided to the parser. In this case, LoadObj doesn't return an error; it simply leaves the vector of shape_t empty.

The reason for this, based on a coarse once-over of the code, is that the parsing takes an XML-like approach. It looks for specific tokens at the beginning of lines (e.g., f, v, vt, etc.). If it finds such a token and the remainder of the line is not consistent with what it expects to see, it returns false. However, if it finds no such token at the beginning of the line, the line is simply ignored. Thus, if I were to pass an STL file into LoadObj, the parsing would report "success" but give me empty geometry.

If we applied a stricter interpretation of the OBJ spec, a line that isn't strictly valid OBJ and isn't commented out should produce an error as being malformed for an OBJ file. This would give us greater power to distinguish a successful parse from a wrong-format input based on what LoadObj returns.
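To make the proposal concrete, here is a minimal, self-contained sketch of what a strict line check could look like. It does not use or reflect tinyobjloader's internals: `ClassifyObjLine`, `StrictCheck`, and the keyword set are hypothetical names chosen for illustration, and the keyword set is only a small subset of the OBJ spec.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper, not part of tinyobjloader: classifies one input line
// under a strict reading of the OBJ format.
enum class LineKind { kBlank, kComment, kKnownCommand, kMalformed };

LineKind ClassifyObjLine(const std::string& line) {
  // Illustrative subset of OBJ keywords; the full spec defines many more.
  static const std::set<std::string> kKnown = {
      "v", "vt", "vn", "f", "g", "o", "s", "usemtl", "mtllib"};
  std::istringstream iss(line);
  std::string token;
  if (!(iss >> token)) return LineKind::kBlank;  // empty or whitespace only
  if (token[0] == '#') return LineKind::kComment;
  return kKnown.count(token) ? LineKind::kKnownCommand : LineKind::kMalformed;
}

// Returns true only if every line is blank, a comment, or a known command.
// On failure, *bad_line holds the 1-based number of the first offending line.
bool StrictCheck(std::istream& in, int* bad_line) {
  assert(bad_line != nullptr);
  std::string line;
  int line_number = 0;
  while (std::getline(in, line)) {
    ++line_number;
    if (ClassifyObjLine(line) == LineKind::kMalformed) {
      *bad_line = line_number;
      return false;
    }
  }
  return true;
}
```

Under this sketch, an ASCII STL file fails on its first line, because `solid` is not in the keyword set, rather than yielding a "successful" parse with empty geometry.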

I'm willing to submit a PR for this, but I want to gauge the interest of other users first.

@syoyo
Collaborator

syoyo commented Mar 30, 2020

A strict parse mode would be nice to have!

> the line is simply ignored

.obj is a super flexible format, and it may be difficult to detect whether input data is invalid.

Currently, unknown tokens are parsed as unknown_parameter and the user can do whatever they want with them, so we cannot report an error in non-strict mode (the current implementation of tinyobjloader).

.obj has many commands (tokens):

http://www.martinreddy.net/gfx/3d/OBJ.spec

so writing a robust (strict) parser would require lots of work.

For now, we could:

  • Check whether the input file is ASCII or binary, and report an error if the file contains binary data.
  • Report a parse error in strict mode when an unsupported command (e.g. Bezier surface) is found in the .obj.

@paulmelnikow
Contributor

I'm getting stuck on this issue too.

> • Check if input file is ASCII or Binary and report an error if a file contains Binary data.

This makes sense to me, although I'd also really like to reject files containing JSON or XML.

Short of implementing the full spec, two options come to mind:

  1. Add generic handling for unsupported commands, which checks that the command is made of letters, and is followed by some other tokens that are valid for OBJ. This would reject non-OBJ-like things, such as XML and most every JSON file.
  2. Add a flag that emits warnings on unknown commands.
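Option 1 could be sketched roughly as below. This is an illustration under stated assumptions, not tinyobjloader code: `IsCommandWord`, `IsPlausibleArg`, and `LooksLikeObjLine` are hypothetical names, and the set of characters accepted in arguments is a guess. The command check also allows digits and underscores after the first letter, since the OBJ spec has keywords such as `curv2` and `shadow_obj`.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical: a command starts with a letter, followed by letters,
// digits, or underscores.
bool IsCommandWord(const std::string& s) {
  if (s.empty() || !std::isalpha(static_cast<unsigned char>(s[0]))) return false;
  for (char ch : s) {
    const unsigned char c = static_cast<unsigned char>(ch);
    if (!std::isalnum(c) && c != '_') return false;
  }
  return true;
}

// Hypothetical: arguments are numbers, index groups like 1/2/3, or simple
// names and paths. The accepted punctuation is an assumption.
bool IsPlausibleArg(const std::string& s) {
  static const std::string kPunct = "._-+/\\:";
  for (char ch : s) {
    const unsigned char c = static_cast<unsigned char>(ch);
    if (!std::isalnum(c) && kPunct.find(ch) == std::string::npos) return false;
  }
  return true;
}

// Returns true if the line is blank, a comment, or shaped like
// "<command> <arg> <arg> ...", whether or not the command is supported.
bool LooksLikeObjLine(const std::string& line) {
  std::istringstream iss(line);
  std::string tok;
  if (!(iss >> tok)) return true;  // blank line
  assert(!tok.empty());            // a successful extraction is never empty
  if (tok[0] == '#') return true;  // comment line
  if (!IsCommandWord(tok)) return false;
  while (iss >> tok) {
    if (tok[0] == '#') break;      // trailing comment: ignore the rest
    if (!IsPlausibleArg(tok)) return false;
  }
  return true;
}
```

A check this loose rejects lines such as `<mesh name="cube">` or `"vertices": [0, 1, 2],`, but it still accepts an ASCII STL line like `solid cube`, so it would complement a strict mode rather than replace one.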

@syoyo
Collaborator

syoyo commented Nov 18, 2020

@paulmelnikow

> Add generic handling for unsupported commands, which checks that the command is made of letters, and is followed by some other tokens that are valid for OBJ

I see. We can scan the head of a file (e.g. the first N bytes) and reject a file that contains invalid characters (e.g. { for JSON, < for XML, non-ASCII characters for binary data). This would require less work to implement.
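A head scan along these lines might look like the following sketch. The names `HeadVerdict` and `SniffHead` and the 512-byte default window are assumptions for illustration, not tinyobjloader API. One deliberate deviation: `{` and `<` are only tested as the first non-whitespace character, since they could legitimately appear inside an OBJ comment further down.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for a cheap "is this even OBJ-like?" pre-check.
enum class HeadVerdict { kPlausibleText, kLooksJson, kLooksXml, kLooksBinary };

// Scans at most max_bytes from the start of data. The window size is an
// arbitrary illustrative default.
HeadVerdict SniffHead(const std::string& data, std::size_t max_bytes = 512) {
  assert(max_bytes > 0);
  const std::size_t n = std::min(data.size(), max_bytes);
  bool seen_non_space = false;
  for (std::size_t i = 0; i < n; ++i) {
    const unsigned char c = static_cast<unsigned char>(data[i]);
    // Non-ASCII bytes, or control bytes other than tab/LF/CR, suggest binary.
    const bool is_text_control = (c == '\t' || c == '\n' || c == '\r');
    if (c >= 0x80 || (c < 0x20 && !is_text_control)) {
      return HeadVerdict::kLooksBinary;
    }
    // Only the first non-whitespace character is tested for JSON/XML.
    if (!seen_non_space && !std::isspace(c)) {
      seen_non_space = true;
      if (c == '{') return HeadVerdict::kLooksJson;
      if (c == '<') return HeadVerdict::kLooksXml;
    }
  }
  return HeadVerdict::kPlausibleText;
}
```

A known limitation of this sketch: an otherwise valid OBJ file with UTF-8 text in a comment or object name inside the scanned window would be flagged as binary, so a real implementation may want a more lenient rule there.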

> Add a flag that emits warnings on unknown commands.

👍

3 participants