Add support for Python underscores in numeric literals #3038

Closed
marvintensuan opened this issue Aug 15, 2021 · 0 comments · Fixed by #3039

Information

  • Language: Python
  • Plugins: none

Description
Python supports underscores in numeric literals (PEP 515): digits can be grouped with single underscores to make long numbers easier to read.

>>> 2_021
2021
>>> 20_21
2021
>>> 0b_1111_1100_101
2021
>>> 0o_3745
2021
>>> 0x7_e5
2021
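
To make the rule concrete, here is a minimal sketch that asks Python itself which spellings it accepts. The helper name `is_numeric_literal` is hypothetical (it is not part of the issue or of Prism); it relies only on the standard-library `ast.literal_eval`.

```python
import ast

# The five spellings from the session above all denote the same integer.
values = [2_021, 20_21, 0b_1111_1100_101, 0o_3745, 0x7_e5]

def is_numeric_literal(source: str) -> bool:
    """True if Python accepts `source` and it evaluates to a number."""
    try:
        value = ast.literal_eval(source)
    except (SyntaxError, ValueError):
        # Malformed literals raise SyntaxError; bare names raise ValueError.
        return False
    return isinstance(value, (int, float, complex)) and not isinstance(value, bool)

candidates = ["2_021", "0x_7e5", "4_2j", "2__021", "2021_", "42_j"]
checks = {text: is_numeric_literal(text) for text in candidates}

for text, ok in checks.items():
    print(f"{text:<8} {'valid' if ok else 'invalid'}")
```

Doubled underscores, a trailing underscore, and an underscore directly before the `j` suffix are rejected by Python, so a highlighter ideally should not treat them as numbers either.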

Code snippet

prism-python.js uses the following regular expression to match numbers:

'number': /(?:\b(?=\d)|\B(?=\.))(?:0[bo])?(?:(?:\d|0x[\da-f])[\da-f]*(?:\.\d*)?|\.\d+)(?:e[+-]?\d+)?j?\b/i,

We can break this down as follows:

/(?:\b(?=\d)|\B(?=\.))
(?:0[bo])?
(?:
    (?:\d|0x[\da-f])
    [\da-f]*
    (?:\.\d*)?  |  \.\d+
)
(?:e[+-]?\d+)?
j?
\b/i
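
The breakdown above is close to what Python's `re.VERBOSE` mode accepts, so it can be checked mechanically. The sketch below is an approximation: Prism runs the pattern in JavaScript, and using Python's `re` module as a stand-in assumes the two engines agree on this pattern. The names `ONE_LINE`, `BROKEN_DOWN`, and `results` are illustrative only.

```python
import re

# Current pattern from prism-python.js, without the /.../i wrapper.
ONE_LINE = re.compile(
    r"(?:\b(?=\d)|\B(?=\.))(?:0[bo])?(?:(?:\d|0x[\da-f])[\da-f]*(?:\.\d*)?|\.\d+)(?:e[+-]?\d+)?j?\b",
    re.IGNORECASE,
)

# The same pattern split up; whitespace and # comments are ignored in VERBOSE mode.
BROKEN_DOWN = re.compile(
    r"""
    (?:\b(?=\d)|\B(?=\.))     # start at a digit, or at a leading dot
    (?:0[bo])?                # optional binary / octal prefix
    (?:
        (?:\d|0x[\da-f])      # first digit, or hex prefix plus first hex digit
        [\da-f]*              # remaining digits
        (?:\.\d*)?  |  \.\d+  # optional fraction, or a number starting with a dot
    )
    (?:e[+-]?\d+)?            # optional exponent
    j?                        # optional imaginary suffix
    \b
    """,
    re.IGNORECASE | re.VERBOSE,
)

samples = ["2021", "0b1000", "0o3745", "0x01af", "3.14", ".5", "1e10", "42j",
           "2_021", "0x7_e5"]

# Map each sample to (one-line result, broken-down result).
results = {
    s: (ONE_LINE.fullmatch(s) is not None, BROKEN_DOWN.fullmatch(s) is not None)
    for s in samples
}

for sample, (one_line, broken_down) in results.items():
    print(f"{sample:<8} one-line={one_line} broken-down={broken_down}")
```

Both forms agree on every sample, and the only samples neither form accepts as a whole token are the two that contain underscores.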

I would propose the following modifications:

/(?:\b(?=\d)|\B(?=\.))
(?:0[bo](_)?)?
(?:
    (?:\d|0x(_)?[\da-f])
    ([\da-f]|[\da-f]_)*
    (?:\.\d*)?  |  \.\d+  |  ((\d+_)*(\.)?(\d+)?)*
)
(?:e[+-]?\d+)?
([^_]j)?
\b/i

There are five modifications here:

  • (?:\.\d*)? | \.\d+ --> (?:\.\d*)? | \.\d+ | ((\d+_)*(\.)?(\d+)?)*: to recognize underscores in between digits.
  • [\da-f]* --> ([\da-f]|[\da-f]_)*: underscores in hexadecimals
  • (?:0[bo])? --> (?:0[bo](_)?)?: underscore after 0b and 0o, e.g. 0b_0001 and 0o_754
  • (?:\d|0x[\da-f]) --> (?:\d|0x(_)?[\da-f]): underscore after 0x, e.g. 0x_badface
  • j? --> ([^_]j)?: suppress an underscore directly before j, e.g. 4_2j ✔️ 42_j ❌

In one line, it should look like this:

/(?:\b(?=\d)|\B(?=\.))(?:0[bo](_)?)?(?:(?:\d|0x(_)?[\da-f])([\da-f]|[\da-f]_)*(?:\.\d*)?|\.\d+|((\d+_)*(\.)?(\d+)?)*)(?:e[+-]?\d+)?([^_]j)?\b/i
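
As a rough sanity check, the proposed pattern can be exercised outside Prism. The sketch below copies it verbatim into Python's `re` module, which is an assumption: Prism evaluates it in JavaScript, and the two engines are only presumed to behave alike here. `covers_whole_token` is a hypothetical helper that uses `fullmatch`, so a sample counts only if the pattern consumes all of it.

```python
import re

# Proposed pattern from the issue, copied verbatim without the /.../i wrapper.
PROPOSED = re.compile(
    r"(?:\b(?=\d)|\B(?=\.))(?:0[bo](_)?)?(?:(?:\d|0x(_)?[\da-f])([\da-f]|[\da-f]_)*(?:\.\d*)?|\.\d+|((\d+_)*(\.)?(\d+)?)*)(?:e[+-]?\d+)?([^_]j)?\b",
    re.IGNORECASE,
)

def covers_whole_token(token: str) -> bool:
    """True if the proposed pattern matches the entire token."""
    return PROPOSED.fullmatch(token) is not None

samples = [
    "2_000", "20_21", "0b_0011_1111_0100_1110", "0o_754",
    "0x_badface", "0xcafe_f00d", "4_2j", "42_j", "0x7_e5",
]
coverage = {token: covers_whole_token(token) for token in samples}

for token, ok in coverage.items():
    print(f"{token:<24} {'matched' if ok else 'not matched'}")
```

In this translation `42_j` is rejected as intended. `0x7_e5` is also left unmatched, apparently because an underscore directly after the first hex digit falls outside `([\da-f]|[\da-f]_)*`; that case may be worth checking in Prism itself.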

Test page

The following code is highlighted incorrectly:
100000 + 2_000
0b1000 + 0b_0011_1111_0100_1110
0x01af + 0xcafe_f00d
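
The behaviour on these three lines can be approximated by scanning them with the current pattern. This is only a sketch: it uses Python's `re` module rather than Prism's tokenizer (an assumption that both treat this pattern alike), and `number_tokens` is a hypothetical helper.

```python
import re

# Current pattern from prism-python.js, without the /.../i wrapper.
NUMBER = re.compile(
    r"(?:\b(?=\d)|\B(?=\.))(?:0[bo])?(?:(?:\d|0x[\da-f])[\da-f]*(?:\.\d*)?|\.\d+)(?:e[+-]?\d+)?j?\b",
    re.IGNORECASE,
)

TEST_PAGE = [
    "100000 + 2_000",
    "0b1000 + 0b_0011_1111_0100_1110",
    "0x01af + 0xcafe_f00d",
]

def number_tokens(line: str) -> list[str]:
    """Return every substring of `line` that the pattern would mark as a number."""
    return [match.group(0) for match in NUMBER.finditer(line)]

found = [number_tokens(line) for line in TEST_PAGE]

for line, tokens in zip(TEST_PAGE, found):
    print(f"{line!r}: {tokens}")
```

On each line only the left-hand operand is reported. The operands with underscores yield no number token at all, because `_` counts as a word character and the trailing `\b` never succeeds next to it.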

EDIT: added more modifications.
