Change inner loops to use int not YY_CHAR, removing need for separate NUL table #370

Open: nickd4 wants to merge 1 commit into base: master

@nickd4 (Contributor) commented Jun 25, 2018

I am interested in creating scanners with %option noecs nometa-ecs. This removes several lookups from the scanner inner loop, so it may be a good compromise between compressed and full tables.

Also, I'm studying the table format closely, in order to teach myself how flex works (this was originally why I turned off equivalence classes, i.e. to simplify the tables to make them more human-readable).

In the process I noticed several things about the generated code.

Firstly, the "jam" state, i.e. the last N entries in the "yy_nxt" and "yy_chk" tables, contained an unused entry: it is generated with 257 transitions to itself rather than only 256, and the 257th could never be accessed (according to the inner loop code I saw). I wanted to remove this, not because it is a significant space issue, but because I found it confusing and wanted to tighten things up.

Secondly, the code of the function yy_try_NUL_trans() was slightly unfortunate, as shown here:

/* yy_try_NUL_trans - try to make a transition on the NUL character
 *
 * synopsis
 *      next_state = yy_try_NUL_trans( current_state );
 */
static yy_state_type yy_try_NUL_trans(yy_state_type yy_current_state)
{
    int yy_is_jam;

    yy_current_state = yy_NUL_trans[yy_current_state];
    yy_is_jam = (yy_current_state == 0);

    return yy_is_jam ? 0 : yy_current_state;
}

We can see that the yy_is_jam variable is completely unnecessary in this particular combination. But looking at the code in "gen.c" which generates this routine, it is clear why it generates such code (since there are various options for the first and second block of code, interfaced by the yy_is_jam variable).
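To make the redundancy concrete, here is a minimal sketch that puts the shape of the quoted routine next to an equivalent form without yy_is_jam. The 4-entry yy_NUL_trans table and the two function names are hypothetical, invented only for illustration; this is not what "gen.c" emits and not the change made in this pull request.

```c
typedef int yy_state_type;

/* Hypothetical 4-state NUL transition table; 0 means "jam". */
static const yy_state_type yy_NUL_trans[4] = { 0, 2, 0, 1 };

/* Same shape as the generated routine quoted above. */
static yy_state_type try_NUL_trans_generated(yy_state_type yy_current_state)
{
    int yy_is_jam;

    yy_current_state = yy_NUL_trans[yy_current_state];
    yy_is_jam = (yy_current_state == 0);

    /* When yy_is_jam is true the state is already 0, so both arms agree. */
    return yy_is_jam ? 0 : yy_current_state;
}

/* Equivalent form: the table value is returned directly. */
static yy_state_type try_NUL_trans_simplified(yy_state_type yy_current_state)
{
    return yy_NUL_trans[yy_current_state];
}
```

For every state in the toy table the two functions return the same value, which is all the yy_is_jam test can ever contribute in this combination of options.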

The code in yy_get_previous_state() is also not ideal: it essentially combines the ordinary inner-loop code with the code from yy_try_NUL_trans() via an if/else that is executed on each character.

All of these things occur because of a decision made in "nfa.c" about whether to generate the NUL transition table, which it wouldn't normally do when equivalence classes are in use, but it does in this case, to accommodate the fact that there are 0x101 characters including the end-of-buffer character.

In my opinion a better way is to make the inner loop able to use 0x101 characters directly, so that the NUL transitions can be stored in the ordinary transition tables (in the 0x100 spot, which is not ideal but there are justifiable reasons for it; perhaps a future pull request could add options to remove the end-of-buffer character and associated optimizations, for applications where simplicity is better than speed).
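The idea can be sketched with toy, uncompressed tables. Everything named toy_* below is an assumption made for illustration: flex's real tables are compressed (yy_base, yy_def, yy_nxt, yy_chk) and the actual generated code differs. The sketch only shows why an int index is needed (an 8-bit character type cannot hold 0x100) and how the per-character if/else on a separate NUL table can become a single lookup.

```c
#include <stddef.h>

#define TOY_STATES   3
#define TOY_NUL_SLOT 0x100   /* extra column that holds the NUL transitions */
#define TOY_SYMBOLS  0x101   /* 0x00..0xFF plus the 0x100 slot */

/* "Before": 256-column table plus a separate NUL table. 0 means "jam". */
static int toy_nxt_256[TOY_STATES][256];
static int toy_NUL_trans[TOY_STATES];

/* "After": one table with 0x101 columns, indexed by an int. */
static int toy_nxt_257[TOY_STATES][TOY_SYMBOLS];

static void toy_init(void)
{
    toy_nxt_256[1]['a'] = 2;
    toy_nxt_256[2]['b'] = 1;
    toy_NUL_trans[1] = 1;            /* state 1 accepts NUL and stays put */

    toy_nxt_257[1]['a'] = 2;
    toy_nxt_257[2]['b'] = 1;
    toy_nxt_257[1][TOY_NUL_SLOT] = 1;
}

/* Two different tables, selected by an if/else on every character. */
static int toy_scan_before(const unsigned char *buf, size_t len)
{
    int state = 1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c)
            state = toy_nxt_256[state][c];
        else
            state = toy_NUL_trans[state];
    }
    return state;
}

/* One table; the symbol is an int so that it can take the value 0x100. */
static int toy_scan_after(const unsigned char *buf, size_t len)
{
    int state = 1;
    for (size_t i = 0; i < len; i++) {
        int sym = buf[i] ? buf[i] : TOY_NUL_SLOT;
        state = toy_nxt_257[state][sym];
    }
    return state;
}
```

Note that casting TOY_NUL_SLOT to unsigned char yields 0 on platforms with 8-bit chars, which is why a character-typed index cannot address the extra slot.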

Indeed the comments in "nfa.c" suggest the same possibility, although it wasn't implemented at the time.

So I went ahead and made the changes and it appears to work. I also checked the generated assembly code for the scanner inner loop before and after the change and it appears to be a slight improvement (whether or not equivalence classes are in use). I've listed the reason for this in a comment in "gen.c".

I've attached an example lex.yy.c before and after the change; you can "diff" them to see what has changed, but the main changes are in the yy_try_NUL_trans() and yy_get_previous_state() routines. I did this test on "scan.l" using the current release version of flex, not the latest development version, because I don't have the latest autotools handy. Please do re-test the pull request if it is accepted.

In the attached example files, we can see that the "jam" state is now at yy_chk[] location 30271 rather than 30174, reflecting that the states are now slightly larger, which costs about 100 words. On the other hand, the yy_NUL_trans[] array, which was previously 1113 words, is no longer required, a significant saving. The downside is that recovering from NULs in the input now requires a compressed table lookup, which is slower. If this were an issue, I'd suggest removing the end-of-buffer optimizations altogether.

lex.yy.c.zip

@westes (Owner) commented Apr 25, 2024

Could you rebase and resolve the conflicts? (As you can see, we're finally clearing out the backlog of PRs.)
