parser: simplify array expressions #9243

Octachron · 2020-01-13T14:57:02Z

Currently, there are 27 productions related to array indexing in the parser, corresponding to the cartesian product:

(3 basic rules: get + set + missing parentheses) * (3 parenthesis kinds) * (3 indexing families)

Most of those productions are closely related to each other. This PR proposes to replace those 27 productions with 2 main productions (one for get and the other for set) and 4 helper rules (2 for the first factor of the cartesian product above and one for each of the remaining two factors).
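To make the count concrete, here is a small sketch that enumerates the cartesian product described above. The constructor names (in particular the three family names) are hypothetical labels chosen for illustration; they are not taken from parser.mly.

```ocaml
(* Hypothetical labels for the three factors; they only serve to count
   combinations and do not exist in the parser. *)
type rule = Get | Set | Missing_paren
type paren = Paren | Brace | Bracket
type family = Builtin | Dotop | Qualified_dotop

let rules = [Get; Set; Missing_paren]
let parens = [Paren; Brace; Bracket]
let families = [Builtin; Dotop; Qualified_dotop]

(* One production per (rule, paren, family) triple. *)
let productions =
  List.concat_map (fun r ->
    List.concat_map (fun p ->
      List.map (fun f -> (r, p, f)) families)
      parens)
    rules

(* After the refactoring: 2 main productions plus 4 helper rules. *)
let after_refactoring = 2 + 4

let () =
  Printf.printf "before: %d productions, after: %d rules\n"
    (List.length productions) after_refactoring
```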

Along the way, the helper functions for building array expressions have been reorganized to make it clearer that constructing an array indexing function is mostly a question of finding the right name, and determining how to transform the index expression.
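As a rough illustration of that idea, the sketch below models an indexing family as a pair of functions, one choosing the name and one transforming the index. It uses a toy expression type instead of Parsetree, and every name in it (expr, family, mk_indexing, bigarray_like) is an assumption made for the example, not the code of this PR.

```ocaml
(* Toy stand-in for the parser's AST; hypothetical, not Parsetree. *)
type expr =
  | Var of string
  | Int of int
  | Array of expr list
  | Apply of string * expr list

type dim = One | Two | Three | Many

(* A family of indexing operators is described by two functions. *)
type family = {
  name : assign:bool -> dim -> string;   (* which function to call *)
  index : expr list -> dim * expr list;  (* how the indices are passed *)
}

(* Build [f arr idx...] for a get, or [f arr idx... v] for a set. *)
let mk_indexing family arr indices newval =
  let n, args = family.index indices in
  let f = family.name ~assign:(newval <> None) n in
  Apply (f, arr :: args @ Option.to_list newval)

(* Example family modelled on Bigarray: up to three indices are passed as
   separate arguments, more are packed into a single array argument. *)
let bigarray_like = {
  name = (fun ~assign n ->
    let m = match n with
      | One -> "Array1" | Two -> "Array2"
      | Three -> "Array3" | Many -> "Genarray" in
    "Bigarray." ^ m ^ (if assign then ".set" else ".get"));
  index = (function
    | [_] as l -> One, l
    | [_; _] as l -> Two, l
    | [_; _; _] as l -> Three, l
    | l -> Many, [Array l]);
}
```

With these definitions, mk_indexing bigarray_like (Var "a") [Var "x"; Var "y"] (Some (Int 1)) builds an application of "Bigarray.Array2.set" to a, x, y and 1.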

gasche · 2020-01-13T21:42:09Z

parsing/parser.mly

+    -> array_dim * (arg_label * expression) list;
+  name:
+    Lexing.position * Lexing.position -> 'dot -> bool -> paren_kind -> array_dim
+    -> Longident.t Location.loc


Some documentation of what those two functions are doing could help (we can sort of guess, but only sort of). Maybe just giving an example expressed with concrete syntax instead of AST fragments? If I understand correctly, index builds the argument to the indexing function, and name is the name of the indexing function. (This documentation suggests that maybe name is more fundamental and should be placed first in the definition?)

I have added some documentation and examples.

gasche · 2020-01-13T21:47:46Z

parsing/parser.mly

-                       [Nolabel, arr;
-                        Nolabel, ghexp(Pexp_array coords);
-                        Nolabel, newval]))
+let array_name loc _ assign paren_kind n =


The fact that the 'dot parameter is ignored here means that it only makes sense for one of your two instances. Maybe it should not be part of the record definition, and instead dotop_family should not be a single family, but families indexed over the choice of dot? (That said, I'm not sure how to factor the grammar with this different representation, as currently dot is chosen by array_{get,set}.)

gasche · 2020-01-13T21:49:25Z

parsing/parser.mly

@@ -2130,6 +2105,30 @@ let_pattern:
      { $1 }
 ;

+%inline array_set(dot, left, mid, right):
+  | simple_expr dot left mid right LESSMINUS expr
+      { $1, $2, $3, $4, Some $7 }


I would prefer to use named parameters instead of positions, especially for $7.

I think it is weird that there are four parameters that are always used in the same way in the grammar. Could they be packed into a single symbol that returns a tuple?

The difficulty is that the other function in the cartesian product would need to produce the symbol variants, rather than use their sub-symbols directly. There might be a better characterization, but it doesn't seem that straightforward.
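For reference, the named-parameter suggestion could look roughly like the Menhir fragment below. The binder names (array, d, p, i, v) are hypothetical; the rule is otherwise the one quoted in the diff above, with the value produced by right still ignored.

```
%inline array_set(dot, left, mid, right):
  | array=simple_expr d=dot p=left i=mid right LESSMINUS v=expr
      { array, d, p, i, Some v }
```

The semantic action no longer depends on the position of each symbol, so adding or removing a symbol in the production cannot silently shift $7 onto the wrong value.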

gasche · 2020-01-13T21:50:53Z

parsing/parser.mly

+      }
+;
+%inline all_array_expr(op, dot, mid):
+  | op(dot, LPAREN { Paren }, mid, RPAREN)       { $1 }


Another way to approach this could be

%inline parens(mid):
  | LPAREN v=mid RPAREN { (Paren, v) }
%inline all_array_expr(op, dot, mid):
  | op(dot, parens(mid)) { $1 }

gasche

I think the current state of the implementation is good, thanks!

gasche · 2020-01-17T19:31:21Z

Changes

@@ -139,6 +139,9 @@ Working version
 - #9211, #9215: fix Makefile dependencies of compilerlibs archives and dynlink
  (Gabriel Scherer, review by Vincent Laviron and David Allsopp)

+  -#9243, simplify parser rules for array indexing operations


A space is missing before the PR number.

Octachron · 2020-02-17T09:05:20Z

Note: the increase of the parser size is unexpected, I will investigate it before merging.

xavierleroy · 2020-07-17T12:22:08Z

Note: the increase of the parser size is unexpected, I will investigate it before merging.

Any news?

Octachron · 2020-09-09T13:16:42Z

I did not succeed in reducing the increase in the size of the generated parser with the initial code. However, simplifying the grammar by removing higher-order rules worked better.

damiendoligez

Looks good except for a typo in a comment and a small question.

damiendoligez · 2021-01-22T17:54:20Z

parsing/parser.mly

+
+    For instance, for builtin arrays, if Clflags.unsafe is set,
+    * [ a.[index] ]     =>  [String.unsafe_get]
+    * [ a.{x;y} <- 1 ]  =>  [ Bigarray.Array2.unsafe_set]


Suggested change

-    * [ a.{x;y} <- 1 ]  =>  [ Bigarray.Array2.unsafe_set]
+    * [ a.{x,y} <- 1 ]  =>  [ Bigarray.Array2.unsafe_set]
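The two documented examples can be reproduced with a simplified model of the naming function. This is only a sketch: it returns plain strings instead of Longident.t values, takes the unsafe flag as an argument instead of reading Clflags.unsafe, and glosses over special cases (for instance, whether every target module actually exposes an unsafe variant). The function name builtin_name is hypothetical.

```ocaml
type paren_kind = Paren | Brace | Bracket
type array_dim = One | Two | Three | Many

(* Simplified model: pick the module from the parenthesis kind and the
   number of indices, then the operation from [assign] and [unsafe]. *)
let builtin_name ~unsafe ~assign paren_kind n =
  let op = if assign then "set" else "get" in
  let op = if unsafe then "unsafe_" ^ op else op in
  let prefix = match paren_kind, n with
    | Paren, _ -> "Array"
    | Bracket, _ -> "String"
    | Brace, One -> "Bigarray.Array1"
    | Brace, Two -> "Bigarray.Array2"
    | Brace, Three -> "Bigarray.Array3"
    | Brace, Many -> "Bigarray.Genarray"
  in
  prefix ^ "." ^ op

let () =
  (* [ a.[index] ] with the unsafe flag set *)
  print_endline (builtin_name ~unsafe:true ~assign:false Bracket One);
  (* [ a.{x,y} <- 1 ] with the unsafe flag set *)
  print_endline (builtin_name ~unsafe:true ~assign:true Brace Two)
```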

damiendoligez · 2021-01-25T14:05:22Z

parsing/parser.mly

+    let assign = if assign then "<-" else "" in
+    let mid = match n with
+        | Many -> ";.."
+        | One | Two | Three -> "" in


This looks wrong. Surely the Two and Three cases should return ";.."?
If I understand correctly, the user_index function below means that they could simply be assert false.

This is the problem with impossible cases: I am not sure whether they ought to be "", ";" and ";;", or ";.." if they were possible. An assert false will indeed be simpler and clearer.
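A minimal sketch of the assert false variant discussed here, under the assumption stated in the review that the index helper for user-defined operators only ever yields One or Many. The function name user_op_name and the string-based representation are hypothetical simplifications of the code in the diff.

```ocaml
type paren_kind = Paren | Brace | Bracket
type array_dim = One | Two | Three | Many

let paren_to_strings = function
  | Paren -> "(", ")"
  | Bracket -> "[", "]"
  | Brace -> "{", "}"

(* Name of a user-defined indexing operator such as [.%()] or [.%{;..}<-].
   [ext] is the operator suffix, e.g. "%". *)
let user_op_name ext ~assign paren_kind n =
  let assign = if assign then "<-" else "" in
  let mid = match n with
    | Many -> ";.."
    | One -> ""
    (* Assumed unreachable: the index helper never returns Two or Three
       for user-defined operators. *)
    | Two | Three -> assert false
  in
  let left, right = paren_to_strings paren_kind in
  String.concat "" ["."; ext; left; mid; right; assign]

let () =
  print_endline (user_op_name "%" ~assign:false Paren One);
  print_endline (user_op_name "%" ~assign:true Brace Many)
```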

gasche

Approved again.

gasche · 2021-01-30T08:00:46Z

It's been 12 hours and the Travis jobs are still pending. Is it finally the time to move away from Travis completely? (cc @dra27 who knows about this stuff)

gasche · 2021-01-30T17:47:32Z

The tests finished 20 hours after the last push. Merging.

xavierleroy · 2021-01-30T17:54:39Z

The tests finished 20 hours after the last push.

I guess this is "continuous integration" for some value of "continuous"...

When I was a wee lad, I was told about batch processing, where you submit your computing job on punched cards or tape and get a listing with "Syntax error" on it the day after. Plus ça change, plus c'est la même chose, as they say in English.

gasche reviewed Jan 13, 2020

gasche approved these changes Jan 17, 2020

gasche reviewed Jan 17, 2020

Octachron force-pushed the simplify_parser_for_indexop branch from 2398838 to a841284 on January 27, 2020 09:13

Octachron force-pushed the simplify_parser_for_indexop branch from a841284 to c840be6 on September 9, 2020 13:11

damiendoligez approved these changes Jan 25, 2021

parser: refactorize array expressions

b271d3e

Octachron force-pushed the simplify_parser_for_indexop branch from c840be6 to b271d3e on January 29, 2021 16:35

gasche approved these changes Jan 29, 2021

gasche added the merge-me label Jan 29, 2021

gasche merged commit e30cca3 into ocaml:trunk Jan 30, 2021
