missing lexer rules if tokenVocab defined #157

robstoll · 2014-05-13T19:59:41Z

Fix is due to a problem I encountered, see http://stackoverflow.com/questions/22285912/antlr-v3-token-file-array-and-null

The problem was: lexer rules for tokens defined in the tokenVocab were not generated, which resulted in NullPointerExceptions in the lexer.

This pull request fixes this issue (and I have fixed some broken links in the README.txt). The reasons why I want to use a tokenVocab in the parser already are explained in the StackOverflow thread.

sharwell · 2014-05-21T18:00:22Z

tool/src/main/java/org/antlr/tool/AssignTokenTypesBehavior.java

-																grammar.getTokenType(t.getText())==Label.INVALID )
+			  Character.isLowerCase(currentRuleName.charAt(0))) && 
+			  (hasTokenVocabAndIsParserOrCombined ||
+			  grammar.getTokenType(t.getText())==Label.INVALID ))


I'm not convinced this is completely correct.

Why do you change the behavior for parser grammars? A parser grammar would use a tokenVocab = LexerGrammarName, and that lexer grammar name could itself have a tokenVocab = CustomTokensFile. It appears that only combined grammars are impacted by the scenario you are describing.

Does the condition assume that the imported tokenVocab defines tokens which are referenced in the parser? What happens if the referenced tokenVocab file is empty, or otherwise does not define a literal which is referenced in a parser rule?
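
To make the question concrete, the visible part of the patched condition can be sketched as a standalone predicate. This is only an illustration under stated assumptions: the class and method names are hypothetical, the constant is a placeholder standing in for Label.INVALID, and the part of the condition that the diff cuts off is reduced to a single boolean.

```java
// Hypothetical stand-alone sketch; not the code in AssignTokenTypesBehavior.
final class LiteralConditionSketch {

    // Placeholder standing in for Label.INVALID; the real value is not assumed here.
    static final int INVALID = Integer.MIN_VALUE;

    // inParserRule summarizes the truncated first half of the condition
    // (the rule-name check that ends with Character.isLowerCase(...)).
    static boolean patched(boolean inParserRule,
                           boolean hasTokenVocabAndIsParserOrCombined,
                           int tokenType) {
        return inParserRule
                && (hasTokenVocabAndIsParserOrCombined || tokenType == INVALID);
    }

    public static void main(String[] args) {
        // With the flag set, the result no longer depends on the token type.
        System.out.println(patched(true, true, 394));
        System.out.println(patched(true, true, INVALID));
        // Without the flag, only literals lacking a token type qualify.
        System.out.println(patched(true, false, 394));
    }
}
```

With the flag set, the predicate holds whether or not the tokenVocab assigned a type to the literal, which is why the empty-file and undefined-literal cases need their own tests.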

robstoll · 2014-05-22

Thanks for your feedback.

I will change the condition to combined grammars only, my mistake.

I guess you are right that the condition might now be too relaxed. I assumed the use cases you described are covered by existing tests (and they all passed). I have only written a test for the use case where the tokens defined in the tokenVocab are referenced in the parser. (As a side note, should I open an issue for this bug?)

sharwell · 2014-05-21T18:01:25Z

Have you tried separating your grammar into separate lexer and parser grammars, and then using the tokenVocab option in your lexer grammar to import the customized token assignments?

…d normal parser before) - fixed token vocab parser, it got into an endless loop when there was an error at the end of the file - added tests for: -- empty token vocab file -- token vocab file with errors -- token vocab file which includes referenced tokens -- token vocab file which includes non-referenced tokens

robstoll · 2014-06-30T19:40:31Z

As mentioned in my previous comment, I have changed the condition to combined grammars only. Furthermore, I have added the requested tests and also fixed a bug in the token vocab parser along the way.
I also tested it with my own grammar and it worked fine :)

By the way, separating the grammar is not an option, since I do not want more maintenance than necessary (and it's just a matter of adding this feature and we are good to go).

robstoll · 2014-08-30T23:27:00Z

Are there any open questions or issues left?

parrt · 2014-09-04T17:32:28Z

we will look at this for the next release.

sharwell · 2014-09-29T17:51:11Z

I created an alternate implementation of this feature. Note that you don't need to include each token twice in the .tokens file. For example, the following is sufficient to cover the assignment for all of TypeArray, 'array', Null, and 'null':

TypeArray=394
Null=395
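
For readers unfamiliar with the format, here is a minimal sketch of how `Name=Type` lines like the two above could be read into a name-to-type map. It is a hypothetical helper written for illustration, not the tool's actual token vocab parser; skipping blank lines is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the ANTLR tool.
final class TokensFileSketch {

    // Parses "Name=Type" lines into an ordered name -> token type map.
    // Blank lines are skipped (an assumption of this sketch).
    static Map<String, Integer> parse(String content) {
        Map<String, Integer> types = new LinkedHashMap<>();
        for (String rawLine : content.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Split on the last '=' so a quoted literal containing '=' still works.
            int eq = line.lastIndexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected Name=Type: " + line);
            }
            String name = line.substring(0, eq).trim();
            int type = Integer.parseInt(line.substring(eq + 1).trim());
            types.put(name, type);
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(parse("TypeArray=394\nNull=395\n"));
    }
}
```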

robstoll · 2014-09-29T19:01:41Z

Sounds promising. I will try it out as soon as your pull request is merged.

robstoll · 2014-10-10T08:31:59Z

Please consider cherry-picking the commit: robstoll@20d859a

It would be nice to be able to add comments and blank lines to token files. I am sorry that I did not put that into a separate pull request. I could create one, though, if you like.

robstoll added 2 commits May 12, 2014 21:20

Fixed broken links in README.txt

2a3b4ac

Fixes missing lexer rule creation for tokens defined in tokenVocab

f0e84f3

sharwell reviewed May 21, 2014

robstoll mentioned this pull request Jul 3, 2014

Readme link is broken #158

Closed

- added the ability that one can add comments on an own line or inser…

20d859a

…t empty lines in a tokenVocab

sharwell added a commit to sharwell/antlr3 that referenced this pull request Sep 29, 2014

Add regression tests for antlr#157

41e71a7

sharwell added a commit to sharwell/antlr3 that referenced this pull request Sep 29, 2014

Support explicit token assignments in combined grammars (fixes antlr#157)

2331f6d

sharwell mentioned this pull request Sep 29, 2014

Support explicit token assignments in combined grammars #165

Open

robstoll closed this Sep 29, 2014

This was referenced Sep 29, 2014

Fixes #163 - tokens with backslashes #164

Closed

tokens with backslashes need to keep backslash during parsing #166

Closed
