Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(bash) Options (like -lt etc) just don't work and it looks like they never did. #2673

Closed
joshgoebel opened this issue Sep 5, 2020 · 7 comments
Labels
bug help welcome Could use help from community language

Comments

@joshgoebel
Copy link
Member

Hmm... Options (like -lt etc) just don't work and it looks like they never did. That's because \b (word boundary) doesn't treat hyphens as a part of a word. We have to replace \b with explicit definition of a boundary.
On the other hand, if we do, we'll get rather ugly elements like <span class="hljs-_">...</span>, which was supposed, as I understand, as relevance boosters. But do we really what it?

Originally posted by @egor-rogov in #2669 (comment)


Sample:

# test keyword usages with hyhens at their start
if [ $VAR -ne 0 ]; then echo 'hi'; fi
[ $FOO -eq 0 ] && [ ! -f $BAR ] || exit 2
@joshgoebel joshgoebel changed the title bug(bash) Options (like -lt etc) just don't work and it looks like they never did. That's because \b (word boundary) doesn't treat hyphens as a part of a word. We have to replace \b with explicit definition of a boundary. bug(bash) Options (like -lt etc) just don't work and it looks like they never did. Sep 5, 2020
@joshgoebel
Copy link
Member Author

joshgoebel commented Sep 5, 2020

On the other hand, if we do, we'll get rather ugly elements like ..., which was supposed, as I understand, as relevance boosters.

I don't mind that... it's a neat trick for relevance boosting with keywords. I'm also fine with formalizing/documenting it it so that _ class would be ignored and ONLY used for relevance and not generate markup.

That's because \b (word boundary) doesn't treat hyphens as a part of a word. We have to replace \b with explicit definition of a boundary.

Ah, which we have no GREAT way to do currently without look-behind I don't think?

@joshgoebel joshgoebel added bug help welcome Could use help from community language labels Sep 5, 2020
@joshgoebel
Copy link
Member Author

Potentially related: #2429

@joshgoebel
Copy link
Member Author

If the goal was just to fix this they could be be moved into a normal rule and removed from keywords... and the normal rule would match \s as boundary and have no className.

@sirosen
Copy link
Contributor

sirosen commented Sep 5, 2020

Just chiming in since I cared about a related issue: I'd say it's not at all obvious that highlighting these terms is desirable.

git branch -a, echo -ne, ssh -l, and a wide variety of other patterns can appear which contain these strings, but not in a context in which they ought to be treated specially.

I'm also a bit surprised by the set of options in that list? It seems like a smattering of options from test, but not even all of the common ones. I'd be happy to help revise the list of options and add some tests there if someone can explain to me what the "relevance boosting" of that group does. (i.e. Why should those be listed there at all? Deleting that section doesn't break any tests.)

@joshgoebel
Copy link
Member Author

joshgoebel commented Sep 5, 2020

I'd say it's not at all obvious that highlighting these terms is desirable.

They were never intended to be highlighted (sorry if I implied that before), hence _ which has no CSS styles.

explain to me what the "relevance boosting" of that group does.

It makes the language "more likely" to be detected as Bash (vs say C++) for the auto-detect system.

Deleting that section doesn't break any tests.

Because it's never worked properly evidentially so hence it currently effects nothing. :-)


and a wide variety of other patterns can appear which contain these strings, but not in a context in which they ought to be treated specially.

Indeed. Whether or not to delete this and just forget it ever existed is not a terrible question to be asking. :)

@joshgoebel
Copy link
Member Author

joshgoebel commented Sep 5, 2020

@egor-rogov I vote to remove [for now] since we don't have sufficient (deep) tests to guarantee even if we fixed this it would even be a real improvement (in detecting Bash) as it seems like there are tons of possibilities for false positives.

Thoughts?

@egor-rogov
Copy link
Collaborator

@joshgoebel +1 to remove.
And, as you've said, in order to boost relevance keywords could be moved into a regular rule with no className. So there's no need in special _ class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help welcome Could use help from community language
Projects
None yet
Development

No branches or pull requests

3 participants