Support for indentation based syntax #8

Open
seflless opened this issue Jul 9, 2011 · 6 comments
Comments

@seflless

seflless commented Jul 9, 2011

Hey Francisco. We met and had a brief conversation about elegantly supporting whitespace-based syntax à la Python/CoffeeScript.

Are you still thinking of doing that, or have you come to the conclusion that it's either not your preference or not architecturally sound?

@tolmasky
Owner

tolmasky commented Jul 9, 2011

I would certainly still like to do it, but I don't think it's possible in the way we spoke about that night. If I recall correctly, essentially what we wanted was something like:

SignificantWhitespaceReplacement = "{"
SignificantWhitespaceReplacementClosing = "}"

Such that in an initial pass, something like

if blah
    do_something

would be changed to:

if blah
{
    do_something
}

And then you could write your grammar using { and }. Unfortunately, there's no easy way to do this initial pass, because the pass has no context about the rest of the language. For example, it may very well wander into a string and replace whitespace there by mistake. This might be feasible in something like LALR, where you have a set of tokens that you may be able to navigate safely, but even then you are going to be doing significant language-aware calculations, because the whitespace may be a harmless delimiter (like the whitespace between the if keyword and the condition, which you wouldn't want to replace).
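To make the problem concrete, here is a minimal sketch of such a context-free initial pass, written in Python purely for illustration. The function name `naive_brace_pass` is hypothetical and nothing here is part of language.js. It reproduces the if blah example above, and then inserts a brace in the middle of a string literal that happens to span two lines.

```python
def naive_brace_pass(source: str) -> str:
    """Insert { and } from leading whitespace alone, with no language context."""
    out = []
    levels = [0]  # indentation widths of the blocks that are currently open
    for line in source.split("\n"):
        if not line.strip():
            out.append(line)  # blank lines never open or close a block
            continue
        width = len(line) - len(line.lstrip(" "))
        if width > levels[-1]:
            # Deeper than the enclosing block: open a new one.
            out.append(" " * levels[-1] + "{")
            levels.append(width)
        while width < levels[-1]:
            # Shallower: close every block we have left.
            levels.pop()
            out.append(" " * levels[-1] + "}")
        out.append(line)
    while len(levels) > 1:
        # Close whatever is still open at end of input.
        levels.pop()
        out.append(" " * levels[-1] + "}")
    return "\n".join(out)


# The example from the thread works as hoped.
print(naive_brace_pass("if blah\n    do_something"))

# A string literal spanning two lines is corrupted: the pass cannot tell
# that the indented second line is still inside the quotes.
print(naive_brace_pass('x = "a\n    b"\ny'))
```

The second call puts a { between the two halves of the string, which is the mis-replacement described above.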

As such, I am still very much interested in doing this, and doing it in a purely declarative way on top of that, but I simply haven't found a good way to do it yet. The best I've seen requires code predicates which I really dislike because they need to hold on to global state (the current "indentation level"). Any ideas?
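For readers who haven't seen the predicate approach, this is roughly what it looks like, sketched in Python under my own assumptions. The class `IndentState` and its method names are hypothetical, not an API of language.js or of any particular parser generator. The last lines show one way the shared state gets awkward: the predicates mutate the stack, so asking the same question twice (as a backtracking parser might) gives different answers.

```python
class IndentState:
    """Hypothetical mutable state that indentation predicates would share."""

    def __init__(self) -> None:
        self.stack = [0]  # the current "indentation level" and its ancestors

    def samedent(self, width: int) -> bool:
        """True if the line stays at the current indentation level."""
        return width == self.stack[-1]

    def indent(self, width: int) -> bool:
        """True if the line opens a deeper block; records the new level."""
        if width > self.stack[-1]:
            self.stack.append(width)
            return True
        return False

    def dedent(self, width: int) -> bool:
        """True if the line closes the current block; forgets that level."""
        if width < self.stack[-1]:
            self.stack.pop()
            return True
        return False


state = IndentState()
first_try = state.indent(4)   # succeeds and pushes 4 onto the stack
second_try = state.indent(4)  # same question after a "backtrack": now fails
print(first_try, second_try, state.stack)
```

A declarative grammar has no natural place to keep or roll back `state.stack`, which is the part that bothers me about this style.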

@seflless
Author

seflless commented Jul 9, 2011

I'll post some thoughts tomorrow; my answer was starting to get pretty long. I'll go over it then and look at how you are actually doing things already, to synchronize with your mental model. I saved my work in progress to a text file and will come back to it after porting part of my language's syntax to language.js.

@seflless
Author

I'm having a hard time figuring out exactly how to use the project. How do you build it, use the Language Visualizer, run the command line, etc.? When you have some time, could you write up some documentation? I started reading through the code, but it was taking longer to get my head around than it would with something to play with.

No rush, just when you get a chance. I'd like to take a crack at experimenting with some ideas. I'm starting to think it just has to be built into the runtime as a special set of characters that get fed into the productions matching code. Or just built in behaviour. But until I get my hands dirty, I'll never know.

@tolmasky
Owner

I'm just about to head out, but I can give you these quick steps and then expand on them later if it's not enough:

  1. I've made it such that you can just do:
    $ cd path/to/language
    $ npm link
    $ language -g yourgrammar.language > parser.js
  2. To use the language visualizer, you want to build a browser version, so:
    $ language -g yourgrammar.language --browser=Parser > parser.js

Then copy parser.js into LanguageVisualizer/ and run LanguageVisualizer/index.html in your browser

That should be it, hope that helps!

Thanks,

Francisco


@tolmasky tolmasky reopened this Jul 10, 2011
@tolmasky
Owner

BTW, did you mean to close this?

@seflless
Author

That's weird. I left a comment and hit Comment & Close, but I'm not seeing my comment. I was saying that I'm working on generating a parser that tracks indentation and then generates a special character that is fed into the production-matching logic. It's getting messy quickly.

I'm looking at Python as a test case. There are a bunch of edge cases where whitespace either matters or doesn't, depending on the surrounding code. Take if statements, for example.

Valid:

if 1==1:
    print "This be truth."

Invalid:

if x==10
:
    print "Ten it is"

Valid:

if ( 1==1
):
    print "This be harsh truth."
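One way to account for the difference between the invalid and the parenthesized case is to join physical lines into logical lines while a bracket is open, before any indentation handling. The sketch below is only my approximation of that idea. `logical_lines` is a hypothetical helper, and its bracket counting is deliberately naive: it does not skip brackets inside strings or comments.

```python
from typing import Iterator, List

OPENERS, CLOSERS = "([{", ")]}"


def logical_lines(source: str) -> Iterator[str]:
    """Join physical lines while any bracket is still open."""
    pending: List[str] = []
    depth = 0
    for line in source.splitlines():
        # Naive count: brackets inside string literals are not excluded.
        depth += sum(line.count(c) for c in OPENERS)
        depth -= sum(line.count(c) for c in CLOSERS)
        # Only the first physical line keeps its indentation.
        pending.append(line if not pending else line.strip())
        if depth <= 0:
            yield " ".join(pending)
            pending, depth = [], 0
    if pending:
        yield " ".join(pending)  # unterminated bracket: emit what we have


joined = list(logical_lines('if ( 1==1\n):\n    print "This be harsh truth."'))
unjoined = list(logical_lines('if x==10\n:\n    print "Ten it is"'))
print(joined)
print(unjoined)
```

The parenthesized condition collapses into a single if line, while the invalid example keeps its stray : on a line of its own, so a grammar would still reject it.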

And then there is detecting double indentation, which makes bracket insertion as a quick hack definitely not work.
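For reference, Python's own standard-library tokenizer can be used to see how it treats these cases. This small sketch uses the `tokenize` module; the helper name `layout_counts` and the two sample sources are mine, and print is written as a function call so the samples are also valid Python 3.

```python
import io
import tokenize
from collections import Counter


def layout_counts(source: str) -> Counter:
    """Count token kinds (INDENT, DEDENT, NEWLINE, NL, ...) in Python source."""
    counts: Counter = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        counts[tokenize.tok_name[tok.type]] += 1
    return counts


# Two nested blocks closed by one unindented line.
nested_case = "if a:\n    if b:\n        x\ny\n"
# A condition that spans two lines inside parentheses.
paren_case = 'if ( 1==1\n):\n    print("This be harsh truth.")\n'

nested = layout_counts(nested_case)
paren = layout_counts(paren_case)
print(nested["INDENT"], nested["DEDENT"])
print(paren["INDENT"], paren["DEDENT"], paren["NL"])
```

In the nested case the single line y produces two DEDENT tokens, one per block being closed, which is the double-indentation situation. In the parenthesized case the line break inside the brackets comes out as an NL token rather than a NEWLINE, so it does not end the statement.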

There are a lot more cases. But it was a good exploration of what supporting this would take, and I'll keep dabbling in it. I'm back on work work today.
