character } falls under {characters} rule on IBM z/OS #586

Open
alexgubanow opened this issue Sep 12, 2023 · 6 comments

@alexgubanow

I'm working on a port of Flex v2.6.4 to IBM z/OS.
During testing I found that } slips into the {characters} rule. The current workaround is to have rule code like:
if (yytext[0] == '}') { return '}'; } else { /* main logic */ }
I also get a warning:
scanner.l:320: warning, rule cannot be matched
The application is compiled with the EBCDIC charset, which differs from ASCII. But the problem is only observed with }, while the { character works fine.
Does anyone have an idea what is going on, where to look, or how to fix it?

Part of scanner.l:

{characters} {
        if(*((const char *)yytext) != '}')
        {
            /* characters logic */
        }
        else
        {
            return '}';
        }
    }

"[" {
        return '[';
    }

"]" {
        return ']';
    }

"{" {
        return '{';
    }

"}" {
        return '}';
    }
@Mightyjo
Contributor

I've been puzzling over this for a couple of hours.

I have three questions for you. I beg your pardon if they are very silly.

  1. Do you already have some lex that works on z/OS? Doesn't need to be Flex, just a lex that works in EBCDIC.
  2. Do you have a yacc or bison that works on z/OS?
  3. Did you define the {characters} pattern in the definitions section of your scanner? (If so, what is the definition?)

I have a guess: Your {characters} pattern isn't getting defined the way you expect. I see why you may need it. I'd try having Flex dump your scanner tables and see if your character classes look right. I suspect whichever class includes uppercase alphabetic characters also includes '}' and a bunch of undefined points between I and J.

I think the fix will be in src/parse.y. Near the beginning you'll find the CCL_EXPR macro that assumes isascii() returns true, which it won't. Near the end you'll find the ccl_expr rule that determines what the [:alpha:], etc. classes match. They depend on the CCL_EXPR macro, so they might not be working correctly. The range class definition is just above those and it's almost certainly wrong for EBCDIC, too.
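
The guess about '}' landing between I and J can be checked against the EBCDIC code page table. Below is a minimal sketch, assuming IBM-1047 code points; the helper name `in_naive_upper_range` is hypothetical and is not part of Flex. It models a character range that is expanded numerically from 'A' to 'Z', which may or may not be exactly how Flex builds the class.

```c
/* IBM-1047 code points, taken from the code page table (assumption:
 * the scanner is built for IBM-1047 or a code page with the same layout). */
enum {
    EBCDIC_LBRACE = 0xC0, /* '{' sits just below 'A' */
    EBCDIC_A      = 0xC1,
    EBCDIC_I      = 0xC9,
    EBCDIC_RBRACE = 0xD0, /* '}' sits in the gap between 'I' and 'J' */
    EBCDIC_J      = 0xD1,
    EBCDIC_Z      = 0xE9
};

/* Hypothetical helper: accepts everything a purely numeric expansion
 * of the range A..Z would accept on EBCDIC. */
static int in_naive_upper_range(int code_point)
{
    return code_point >= EBCDIC_A && code_point <= EBCDIC_Z;
}
```

With these values, `in_naive_upper_range(EBCDIC_RBRACE)` is true while `in_naive_upper_range(EBCDIC_LBRACE)` is false, which would be consistent with '}' being swallowed by the rule while '{' is not.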

@alexgubanow
Author

alexgubanow commented Sep 13, 2023

Wow, I did not expect any reaction to this ticket, and you may have already found the problem :)
This is great.

  1. It depends on the company; everyone has their own setup, as z/OS comes only with a shell and a few other Unix utilities. RocketSoftware has flex v2.5.4, ported by someone some years ago; nobody knows the history of where it came from :)
  2. Yes, we have bison v3.0.4, which came from the same person who did flex. I have also ported bison v3.3.2.
  3. Yes, it is defined as
    characters [a-zA-Z0-9]+[a-zA-Z0-9_]*
    I have tested with \w+, which does not work at all, and with {[a-zA-Z0-9]+[a-zA-Z0-9_]*} instead of {characters}, which shows the same behaviour.

To get flex v2.6.4 compiled, I used flex v2.5.4 and bison v3.0.4.
When flex v2.5.4 is used to compile the same scanner.l, everything works fine.

I have access to the sources from which flex v2.5.4 was built, but I have not found any suspicious changes or anything worth attention.
I can try to compare the official src/parse.y from flex v2.5.4 with what we have.

@Mightyjo
Contributor

Neat! I've seen mailing list chatter about a patch for EBCDIC during the 2.3 era but I couldn't find the sources.

The ranges like [a-z] are what's causing the problem. They are contiguous in ASCII but broken up into blocks of at most 9 characters in EBCDIC (a-i, j-r, s-z). If you split each range into those contiguous sequences, they should work.

Escape sequences like '\w' are defined similarly, but I forget which file they're in.
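
To illustrate the splitting, here is a small sketch of an alphanumeric test that only accepts the contiguous EBCDIC runs. The code points assume IBM-1047, and the names `in_run` and `is_ebcdic_alnum` are hypothetical. In a scanner definition the corresponding character class would presumably look like [a-ij-rs-zA-IJ-RS-Z0-9], though that has not been tested on z/OS here.

```c
/* True if c lies inside one contiguous run of code points. */
static int in_run(int c, int first, int last)
{
    return c >= first && c <= last;
}

/* Hypothetical helper: EBCDIC (IBM-1047, assumed) letters and digits,
 * listed run by run so the gaps between the runs are excluded. */
static int is_ebcdic_alnum(int c)
{
    return in_run(c, 0x81, 0x89)   /* a-i */
        || in_run(c, 0x91, 0x99)   /* j-r */
        || in_run(c, 0xA2, 0xA9)   /* s-z */
        || in_run(c, 0xC1, 0xC9)   /* A-I */
        || in_run(c, 0xD1, 0xD9)   /* J-R */
        || in_run(c, 0xE2, 0xE9)   /* S-Z */
        || in_run(c, 0xF0, 0xF9);  /* 0-9 */
}
```

Under these assumptions, 0xD0 ('}') and 0xE0 (backslash) are rejected because they fall in the gaps, while the letters on either side of each gap are still accepted.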

@alexgubanow
Author

Okay, we are getting closer.
Replacing [a-zA-Z0-9]+[a-zA-Z0-9_]* with:
[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]+[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*
solves the issue.

I have reviewed the CCL_EXPR macro in version 2.5.4. It is different, but copy-pasting the macro from our v2.5.4 into 2.6.4 changed nothing.

@Mightyjo
Contributor

Did the warning about rules that can't be matched also go away?

Reading the z/OS 2.5 docs tonight. You probably don't need to worry about the CCL_EXPR macro or the use of functions like isalpha(). It looks like the z/OS XL C/C++ library defines them in terms of the current locale (e.g. IBM-1047, ISO8859-1). Even isascii() is available with BSD semantics if you define the _XOPEN_SOURCE macro before including ctype.h.

I don't see isascii() in the z/OS Metal C library reference, but it would just test whether the argument fits in 7 bits. Something like:

int isascii(int c) {
  return ((c & 0xFFFFFF80) == 0);
}

Adjust for sizeof(int), inline, etc.
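
One way to sidestep the sizeof(int) adjustment is to build the mask from the low seven bits instead of spelling out the high bits. This is only a sketch; the name `is_7bit` is hypothetical and is chosen to avoid clashing with a library isascii().

```c
/* Hypothetical helper: true only for values 0..127, whatever the
 * width of int. ~0x7F sets every bit above the low seven, so any
 * negative value or any value of 128 and above yields non-zero. */
static int is_7bit(int c)
{
    return (c & ~0x7F) == 0;
}
```

A literal 7-bit test like this rejects EBCDIC letters: assuming IBM-1047, 'A' is 0xC1, so `is_7bit(0xC1)` is false, whereas the ASCII 'A' at 0x41 passes.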

@alexgubanow
Author

alexgubanow commented Sep 15, 2023

Yes, the warning went away.
The docs for z/OS are here: https://www-40.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R5Library?OpenDocument
In particular, you will be interested in the z/OS XL C/C++ Runtime Library Reference: https://www-40.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R5sc147314?OpenDocument
I compile with:
-Wc,xplink -D_XPLATFROM_SOURCE=1 -DI370 -D_UNIX03_SOURCE -D_UNIX03_THREADS -D_POSIX_THREADS
Also config.h has:

#define _ALL_SOURCE 1
#define _XOPEN_SOURCE 600

This means isascii() should behave as you would normally expect.

Metal C is only C; there is no library from IBM, so you have to create your own functions, even malloc, etc. There is something called Callable Services, but that is outside the scope of this issue.
