Stan highlighting needs fixing #3236

spinkney · 2021-11-27T15:02:54Z

Information

Language: Stan

Description
Lots of updates to Stan since 2.24 plus a few other things.

First the other things. I believe the Stan code needs an additional parameter to indicate the "blocks" of the language:

Code blocks should have a color dedicated to them
- functions, data, transformed data, parameters, transformed parameters, model, generated quantities

Changes since 2.28

array[] is now a valid keyword
complex is a valid keyword literal

The valid higher-order functions are:

      ## Algebraic equation solver
      "algebra_solver", "algebra_solver_newton",

      ## Ordinary differential equation
      "ode_rk45", "ode_rk45_tol", "ode_ckrk", "ode_ckrk_tol", "ode_adams",
      "ode_adams_tol", "ode_bdf", "ode_bdf_tol", "ode_adjoint_tol_ctl",

      ## 1D integrator
      "integrate_1d",

      ## Reduce-sum function
      "reduce_sum", "reduce_sum_static",

It's probably easier to just include all the functions. I've grabbed the list below from a recent pull request in the Rouge library.

Long list of functions

      # Integer-Valued Basic Functions

      ## Absolute functions
      "abs", "int_step",

      ## Bound functions
      "min", "max",

      ## Size functions
      "size",

      # Real-Valued Basic Functions

      ## Log probability function
      "target", "get_lp",

      ## Logical functions
      "step", "is_inf", "is_nan",

      ## Step-like functions
      "fabs", "fdim", "fmin", "fmax", "fmod", "floor", "ceil", "round",
      "trunc",

      ## Power and logarithm functions
      "sqrt", "cbrt", "square", "exp", "exp2", "log", "log2", "log10",
      "pow", "inv", "inv_sqrt", "inv_square",

      ## Trigonometric functions
      "hypot", "cos", "sin", "tan", "acos", "asin", "atan", "atan2",

      ## Hyperbolic trigonometric functions
      "cosh", "sinh", "tanh", "acosh", "asinh", "atanh",

      ## Link functions
      "logit", "inv_logit", "inv_cloglog",

      ## Probability-related functions
      "erf", "erfc", "Phi", "inv_Phi", "Phi_approx", "binary_log_loss",
      "owens_t",

      ## Combinatorial functions
      "beta", "inc_beta", "lbeta", "tgamma", "lgamma", "digamma",
      "trigamma", "lmgamma", "gamma_p", "gamma_q",
      "binomial_coefficient_log", "choose", "bessel_first_kind",
      "bessel_second_kind", "modified_bessel_first_kind",
      "log_modified_bessel_first_kind", "modified_bessel_second_kind",
      "falling_factorial", "lchoose", "log_falling_factorial",
      "rising_factorial", "log_rising_factorial",

      ## Composed functions
      "expm1", "fma", "multiply_log", "ldexp", "lmultiply", "log1p",
      "log1m", "log1p_exp", "log1m_exp", "log_diff_exp", "log_mix",
      "log_sum_exp", "log_inv_logit", "log_inv_logit_diff",
      "log1m_inv_logit",

      ## Special functions
      "lambert_w0", "lambert_wm1",

      ## Complex Conversion Functions
      "get_real", "get_imag",

      # Complex-Valued Basic Functions

      ## Complex Construction Functions
      "to_complex",

      # Array Operations

      ## Reductions
      "sum", "prod", "log_sum_exp", "mean", "variance", "sd", "distance",
      "squared_distance", "quantile",

      ## Array size and dimension function
      "dims", "num_elements",

      ## Array broadcasting
      "rep_array",

      ## Array concatenation
      "append_array",

      ## Sorting functions
      "sort_asc", "sort_desc", "sort_indices_asc", "sort_indices_desc",
      "rank",

      ## Reversing functions
      "reverse",

      # Matrix Operations

      ## Integer-valued matrix size functions
      "num_elements", "rows", "cols",

      ## Dot products and specialized products
      "dot_product", "columns_dot_product", "rows_dot_product", "dot_self",
      "columns_dot_self", "rows_dot_self", "tcrossprod", "crossprod",
      "quad_form", "quad_form_diag", "quad_form_sym", "trace_quad_form",
      "trace_gen_quad_form", "multiply_lower_tri_self_transpose",
      "diag_pre_multiply", "diag_post_multiply",

      ## Broadcast functions
      "rep_vector", "rep_row_vector", "rep_matrix",
      "symmetrize_from_lower_tri",

      ## Diagonal matrix functions
      "add_diag", "diagonal", "diag_matrix", "identity_matrix",

      ## Container construction functions
      "linspaced_array", "linspaced_int_array", "linspaced_vector",
      "linspaced_row_vector", "one_hot_int_array", "one_hot_array",
      "one_hot_vector", "one_hot_row_vector", "ones_int_array",
      "ones_array", "ones_vector", "ones_row_vector", "zeros_int_array",
      "zeros_array", "zeros_vector", "zeros_row_vector", "uniform_simplex",

      ## Slicing and blocking functions
      "col", "row", "block", "sub_col", "sub_row", "head", "tail",
      "segment",

      ## Matrix concatenation
      "append_col", "append_row",

      ## Special matrix functions
      "softmax", "log_softmax", "cumulative_sum",

      ## Covariance functions
      "cov_exp_quad",

      ## Linear algebra functions and solvers
      "mdivide_left_tri_low", "mdivide_right_tri_low", "mdivide_left_spd",
      "mdivide_right_spd", "matrix_exp", "matrix_exp_multiply",
      "scale_matrix_exp_multiply", "matrix_power", "trace", "determinant",
      "log_determinant", "inverse", "inverse_spd", "chol2inv",
      "generalized_inverse", "eigenvalues_sym", "eigenvectors_sym",
      "qr_thin_Q", "qr_thin_R", "qr_Q", "qr_R", "cholesky_decompose",
      "singular_values", "svd_U", "svd_V",

      # Sparse Matrix Operations

      ## Conversion functions
      "csr_extract_w", "csr_extract_v", "csr_extract_u",
      "csr_to_dense_matrix",

      ## Sparse matrix arithmetic
      "csr_matrix_times_vector",

      # Mixed Operations
      "to_matrix", "to_vector", "to_row_vector", "to_array_2d",
      "to_array_1d",

      # Higher-Order Functions

      ## Algebraic equation solver
      "algebra_solver", "algebra_solver_newton",

      ## Ordinary differential equation
      "ode_rk45", "ode_rk45_tol", "ode_ckrk", "ode_ckrk_tol", "ode_adams",
      "ode_adams_tol", "ode_bdf", "ode_bdf_tol", "ode_adjoint_tol_ctl",

      ## 1D integrator
      "integrate_1d",

      ## Reduce-sum function
      "reduce_sum", "reduce_sum_static",

      ## Map-rect function
      "map_rect",

      # Deprecated Functions
      "integrate_ode_rk45", "integrate_ode", "integrate_ode_adams",
      "integrate_ode_bdf",

      # Hidden Markov Models
      "hmm_marginal", "hmm_latent_rng", "hmm_hidden_state_prob"
    ]
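
For illustration, a list like the one above could be compiled into a single pattern rather than maintained as a hand-written regex. This is only a minimal sketch under assumptions: `hofNames` is a short excerpt of the list, `buildFunctionPattern` is a hypothetical helper, and none of it is taken from the Prism or Rouge sources.

```typescript
// Hypothetical helper: build one word-boundary regex from a list of names.
// A short excerpt stands in for the full list.
const hofNames: string[] = [
  "algebra_solver", "algebra_solver_newton",
  "ode_rk45", "ode_rk45_tol",
  "integrate_1d",
  "reduce_sum", "reduce_sum_static",
  "map_rect",
  "log_sum_exp", "log_sum_exp", // the long list repeats some names
];

function buildFunctionPattern(names: string[]): RegExp {
  // Drop duplicates. The names contain only letters, digits and "_",
  // so no regex escaping is needed.
  const unique = names.filter((n, i) => names.indexOf(n) === i);
  // The trailing \b stops "ode_rk45" from matching inside "ode_rk45_tol";
  // the lookahead requires a call-like "(" after optional whitespace.
  return new RegExp("\\b(?:" + unique.join("|") + ")\\b(?=\\s*\\()", "g");
}

const functionPattern = buildFunctionPattern(hofNames);
const sampleCall = "y = ode_rk45_tol(f, y0) + reduce_sum (g, x) + my_ode_rk45(z);";
const foundFunctions: string[] = sampleCall.match(functionPattern) || [];
console.log(foundFunctions); // the two matches: ode_rk45_tol and reduce_sum
```

Because of the leading `\b`, the user-defined `my_ode_rk45` in the sample is not picked up as a built-in.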

Code snippet

test page

functions {
  // Add the scalar y to every element of x.
  array[] real add_array(array[] real x, real y) {
    int n = num_elements(x);
    array[n] real z;

    for (i in 1:n) {
      z[i] = x[i] + y;
    }

    return z;
  }
}
data {
  int n;
  array[n] real x;
}
parameters {
  real y;
}
transformed parameters {
  array[n] real z = add_array(x, y);
}
model {
  y ~ std_normal();
}

The BNF grammars page was updated, and I see that it is referenced in the Prism Stan code. The updated file is the Stan BNF grammars for 2.28.


RunDevelopment · 2021-11-28T12:47:51Z

Hi @spinkney!

I made a PR adding the missing keywords and HOFs (#3238).

> I believe the Stan code needs an additional parameter to indicate the "blocks" of the language:

"Parameter"? You mean a token, right? If so, then this will be difficult. Program blocks are context-free, so we can't easily detect them using regexes.

What do you need the new token for?

spinkney · 2021-11-28T12:57:13Z

First, thank you for being so responsive! It's a Sunday and I really appreciate it.

Yes, program blocks. I'm thinking it would mimic the highlighting contexts of VS Code, so those program blocks would have separate highlighting. Maybe just match the keyword for each block when a "{" follows it, after skipping over any whitespace?

RunDevelopment · 2021-11-28T13:02:04Z

Ah, that's what you want to do. That's pretty easy. I'll add an alias to those keywords, so you can give them a different color.
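
As a rough sketch of the "keyword followed by `{`" idea discussed above (an assumed pattern for illustration, not the one in the Prism grammar; `blockKeyword` and `findBlockHeaders` are hypothetical names):

```typescript
// Sketch only: matches a block name when the next non-whitespace
// character is "{". Not the pattern used in the Prism grammar.
const blockKeyword =
  /\b(?:functions|transformed\s+data|data|transformed\s+parameters|parameters|model|generated\s+quantities)\b(?=\s*\{)/g;

function findBlockHeaders(source: string): string[] {
  const headers: string[] = [];
  let m: RegExpExecArray | null;
  blockKeyword.lastIndex = 0; // reset the state of the global regex
  while ((m = blockKeyword.exec(source)) !== null) {
    // Collapse internal whitespace so "transformed   data" reads uniformly.
    headers.push(m[0].replace(/\s+/g, " "));
  }
  return headers;
}

const stanSource = [
  "functions {",
  "  real f(data real x) { return x; }",
  "}",
  "transformed   data",
  "{",
  "  real d = 1;",
  "}",
  "model { }",
].join("\n");

console.log(findBlockHeaders(stanSource));
// finds: functions, transformed data, model
```

The lookahead is what keeps the `data` qualifier on the function argument from being treated as a block header. The sketch does not skip comments or strings, so a real grammar would need to tokenize those first.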

spinkney · 2021-11-28T14:16:52Z

Also, we need to make sure that complex numbers are highlighted correctly.
VS Code shows this:

Issues:

- complex is not highlighted
- to_complex is not highlighted
- get_real is not highlighted
- 3i, 40e-3i, and 1e10i in 3i - 40e-3i + 1e10i should each be highlighted "green", just like regular numbers
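
For reference, a number pattern that accepts an optional trailing `i` might look roughly like the sketch below. It is an assumption made for illustration (`stanNumber` and `findNumbers` are hypothetical names) and not the pattern shipped in any highlighter.

```typescript
// Sketch only: integer or decimal digits, an optional exponent, and an
// optional imaginary suffix "i". The (?!\w) guard rejects things like "3in".
const stanNumber = /\b\d+(?:\.\d*)?(?:[eE][+-]?\d+)?i?(?!\w)/g;

function findNumbers(source: string): string[] {
  const hits = source.match(stanNumber);
  return hits === null ? [] : hits.slice();
}

console.log(findNumbers("3i - 40e-3i + 1e10i")); // 3i, 40e-3i, 1e10i
console.log(findNumbers("x1 = 2.5 + 7;"));       // 2.5, 7
```

The leading `\b` keeps the digit in an identifier such as `x1` from being matched as a number.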

RunDevelopment · 2021-11-28T15:29:31Z

@spinkney You might also want to comment on the VSCode grammar of Stan. That being said, I added support for imaginary number literals.

spinkney · 2021-11-28T15:46:10Z

Yes, it is already done :) ivan-bocharov/stan-vscode#7 (comment).

Thank you!

rok-cesnovar · 2021-11-28T16:17:41Z

Thank you so much @RunDevelopment!

spinkney added the language-definitions label Nov 27, 2021

spinkney changed the title ~~Stan highlighting~~ Stan highlighting needs fixing Nov 27, 2021

RunDevelopment mentioned this issue Nov 28, 2021

Stan: Added missing keywords and HOFs #3238

Merged

spinkney mentioned this issue Nov 28, 2021

stanc3 online demo for pedantic mode, canonicalizer, auto format, etc stan-dev/stanc3#577

Closed

RunDevelopment added the bug label Nov 28, 2021

RunDevelopment closed this as completed in #3238 Dec 7, 2021
