How do I round trip ron files preserving formatting and comments? #216

vi · 2020-04-22T20:31:18Z

Are there Ron parsers for making automated edits to Ron files?

As far as I understand, whitespace-preserving parsing should be based on something other that Serde. Does Ron have some Ron-specific crate that goes beyond Serde's API?

kvark · 2020-04-22T22:13:37Z

Oh interesting! I'm not aware of anything like that, no.

xMAC94x · 2020-09-29T07:47:19Z

I would also like to see a feature to modify ron files preserving the comments. (format would be nice-to-have).
One option would be to edit the file directly.
However preserving comments (but not format) would be possible by saving comments to in deserialize, and print them back on serialize.
This could allow config files with alternatives

{
   // supported: http, https, ftp
   "protocol": "http"
}

kvark · 2020-09-29T13:19:38Z

Could you tell more? What RON would you modify programmatically, and how, while preserving comments?

xMAC94x · 2020-09-29T13:36:06Z

I contribute to a rust game called veloren (https://gitlab.com/veloren/veloren/)
We are storing our settings file with the help of RON format.
E.g.

$cat server_settings.ron 
(
    gameserver_address: "0.0.0.0:14004",
    metrics_address: "0.0.0.0:14005",
    auth_server_address: Some("https://auth.veloren.net"),
    max_players: 100,
    world_seed: 59686,
    server_name: "Veloren Alpha aa",
    server_description: "dddd",
    start_time: 32400,
    admins: [

        "Pfau",
        "zesterer",
        "xMAC94x",
        "Timo",
        "Songtronix",
        "slipped",
        "Sharp",
        "Acrimon",
        "imbris",
        "YuriMomo",
        "Vechro",
        "AngelOnFira",
        "Nancok",
        "Qutrin",
        "Mckol",
        "Treeco",    ],
    map_file: None,
    persistence_db_dir: "saves",
    max_view_distance: Some(30),
    banned_words_files: [],
    max_player_group_size: 6,
    player_timeout: (
        secs: 2,
        nanos: 0,
    ),
)

now the user has the option to modify his settings during the game. e.g. by writing in the chat
/adminify foovar to add user foobar to the admin list.
When we add someone to the admin list we want to store this. so we are saving the information back to the file.

So we are reading and writing to our settings file.

In this process, all comments get lost.

However, it would be awsome, to allow comments.
E.g. we as developers, might want to put the default world_seed as a comment.

Or a user wants to temporarrily removes an admin from the list. but when he comments a person out, and restarts the game (and the file got saved again) the name is completly lost.

kvark · 2020-10-02T14:34:32Z

Interesting, thank you for the info!
I wonder how difficult this would be. Is there a precedent in other formats (JSON/YAML/TOML) based on Serde that would preserve the comments?

torkleyy · 2020-10-06T08:20:29Z

Yes: https://github.com/ordian/toml_edit

This crate allows you to parse and modify toml documents, while preserving comments, spaces* and relative order* or items.

torkleyy · 2020-10-06T08:21:12Z

I don't think it's using serde though.

vi · 2020-10-06T10:03:46Z

Maybe like with cbor rewrite, RON should have two crates: low-level serde-independent and high-level serde-based?

kvark · 2020-10-06T15:07:39Z

A full rewrite on non-serde would also open up some possibilities, but that's a huge amount of work that we don't know how to allocate resources into. So this would unlikely happen until somebody motivated would do it.

Boscop · 2020-11-18T14:20:17Z

I'd also really like to have this feature, also for modifying user-written config files.

Btw, is there something like serde that preserves whitespace & comments, that any format can piggyback on? If not, how hard would it be to write?
So that not every format crate has to reinvent the wheel..

Boscop · 2020-11-18T14:23:26Z

One workaround I can think of, to get this feature with the current crates, (but with some overhead) is:

Use serde to parse, then serialize it again to string.
"zip" the original source with the re-serialized one, to generate a source map (containing spans, also keep comments)
edit the data
serialize the data to string
inverse apply the source map to the serialized string to restore original structure

The source map would have to have a tree structure (matching the type nesting) so that it won't result in dumb string comparison when reapplying the source map.
It could be generated with a proc-macro from the types.

Actually I think the process can be simplified with a custom Serializer:

Use serde to parse, then serialize it with a custom Serializer which works like this:
It's like the normal RON serializer but injected with some additional code:
At each nesting level, it compares the original string with the serialized string, and stores any discrepancies in whitespace & comments. After this, it returns this info to use later.
(It's like a tree zipper, like a visitor that visits both trees in parallel.)
edit the data
Use the gathered whitespace&comment-diff info when serializing the data to string

This info can be stored in types that are isomorphic to the original types: Instead of the original types, each field has the type WhitespaceAndCommentData. These isomorphic types can be generated with a proc-macro.

@xMAC94x Would you be interested in this workaround? :)

kvark · 2020-11-18T14:24:12Z

Might be worth looking at Serde alternatives for this, such as https://github.com/dtolnay/miniserde and https://github.com/not-fl3/nanoserde

Boscop · 2020-11-18T14:34:21Z

@kvark It looks like those 2 crates also don't preserve whitespace/comment information.

xMAC94x · 2020-11-20T16:32:38Z

@Boscop sounds like a feasible workaround. it would be cool if there exist a procmacro/Serializer or at least an example coding for that :)

Boscop · 2020-11-24T12:13:24Z

@xMAC94x I implemented this approach once, for my transpiler from Rust to "Rust with pythonic syntax" because I needed to have the exact spans of all tokens, to be able to generate a sourcemap, so that I could backtranslate the lines/columns of compilation errors back to lines/columns in the pythonic source code (in my cargo proxy).
(At the time there was no crate that could parse Rust and would give me the spans of all tokens, only a way to get TokenStreams and a parser on top of that, but it threw away comments and whitespace info.)
This approach worked very well!

coderedart · 2021-09-22T15:48:18Z

Just to express interest in this old issue, I want comments in the config file that explain what the specific option does. Ideally, it would just take the doc comments of the field and insert those into the serialized format. pretty much all existing crates like toml/yaml/json/xml parsers have atleast one issue asking for this. but it seems this is not happening as serde doesn't provide doc comments to the backend crates :(

having comments in the config file really helps users to be a little more daring and try out various options, so its a good thing to have imo. one of the core strengths of Rust is documentation of libraries. compared to older languages, documentation was part of the codebase for rust from early on and it makes sense for RON to consider it :)

vi · 2021-09-22T18:42:03Z

but it seems this is not happening as serde doesn't provide doc comments to the backend crates :(

What if, to store data about comments and whitespace style, do one of:

Store things in adjacent "hidden" elements in parent dicts. This would allow editing them prior to serialisation;
Store things in root dictionary being serialized/deserialized. This would make attaching round-trippable deserialisation into exsting system easier, but less convenient for editing comments from code;
Make deserialization produce two things: main object and special metadata block that can be optionally used for serialisation to fine-tune it?

Zireael07 · 2022-05-01T10:36:23Z

Bump. Still an issue. Also, serde independence (or ability to use something smaller) would be very welcome.

tryoxiss · 2024-01-25T18:00:00Z

Given ron is designed to support all of serdes data model, and serde is a rust standard, I think serde independence is likely a bad idea, at least for the main parser (although an altnerate fork such as ron-noserde maybe, but as mentioned that is a lot of work).

I think the work around suggested by xMAC94x could be good! Though I highly suspect that a non-destructive serialisation would come with performance downsides, so it should be a feature that swaps the standard functions for a non-destructive set.

vi · 2024-01-25T18:43:03Z

altnerate fork such as ron-noserde

Maybe it should be something like ron-lowlevel, which can be used directly (for tricky cases) or through ron facade (for basic use cases), with ron depending on ron-lowlevel.

serde is a rust standard

Serde is something like "greatest common divisor" of all parsing formats - it allows swapping formats for simple cases, but when you need something tricky (human-editable files, super performance, support of somewhat incorrect files), you often skip Serde and work directly with a low-level library.

Each format being represented as a Serde library (if possible) is a good idea; formats being limited to what Serde can is not, even for formats which are directly inspired by Serde.

juntyr · 2024-01-25T18:52:07Z

serde is a rust standard

Serde is something like "greatest common divisor" of all parsing formats - it allows swapping formats for simple cases, but when you need something tricky (human-editable files, super performance, support of somewhat incorrect files), you often skip Serde and work directly with a low-level library.

Each format being represented as a Serde library (if possible) is a good idea; formats being limited to what Serde can is not, even for formats which are directly inspired by Serde.

I think that's a good way to think about it. @torkleyy in the past put some work into a rewrite of ron that has its own AST parsing with more functionality and then uses that to also provide serde features. As we are trying to make ron more robust, in particular to enhance its Value type, I do think we'll end up somewhere in this direction. Especially since I've been working on fuzzing all of ron, including its interaction with serde attributes, I've realised just how much the current serde limitations also impact ron. While serde could be extended to fix most of these issues, now that impl trait in trait function return types is stable, this may take a while. Once I have some more time and energy to put into ron again (or if anyone wants to push this feature earlier), slowly transitioning towards a non-serde-driven parser might be a good direction to take.

tryoxiss · 2024-01-25T21:13:18Z

altnerate fork such as ron-noserde

Maybe it should be something like ron-lowlevel, which can be used directly (for tricky cases) or through ron facade (for basic use cases), with ron depending on ron-lowlevel.

serde is a rust standard

Serde is something like "greatest common divisor" of all parsing formats - it allows swapping formats for simple cases, but when you need something tricky (human-editable files, super performance, support of somewhat incorrect files), you often skip Serde and work directly with a low-level library.

Each format being represented as a Serde library (if possible) is a good idea; formats being limited to what Serde can is not, even for formats which are directly inspired by Serde.

and

serde is a rust standard

Serde is something like "greatest common divisor" of all parsing formats - it allows swapping formats for simple cases, but when you need something tricky (human-editable files, super performance, support of somewhat incorrect files), you often skip Serde and work directly with a low-level library.
Each format being represented as a Serde library (if possible) is a good idea; formats being limited to what Serde can is not, even for formats which are directly inspired by Serde.

I think that's a good way to think about it. @torkleyy in the past put some work into a rewrite of ron that has its own AST parsing with more functionality and then uses that to also provide serde features. As we are trying to make ron more robust, in particular to enhance its Value type, I do think we'll end up somewhere in this direction. Especially since I've been working on fuzzing all of ron, including its interaction with serde attributes, I've realised just how much the current serde limitations also impact ron. While serde could be extended to fix most of these issues, now that impl trait in trait function return types is stable, this may take a while. Once I have some more time and energy to put into ron again (or if anyone wants to push this feature earlier), slowly transitioning towards a non-serde-driven parser might be a good direction to take.

I did not realise that, I thought serde did some of the heavier lifting for parts. As long as there is a simple serde interface I would have no problems of it being independent now that I know that. (Even though I see myself working with lowlevel-ron directory a considerable amount for non-destructive serialisation, a serde interface is very nice to have for more plug-and-play use cases).

And having serde independce would allow for non-destructive serialisation without workarounds, which is good imo!

kvark added the question label Apr 22, 2020

CAD97 mentioned this issue May 2, 2020

Bump version to 0.6.0 #196

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How do I round trip ron files preserving formatting and comments? #216

How do I round trip ron files preserving formatting and comments? #216

vi commented Apr 22, 2020

kvark commented Apr 22, 2020

xMAC94x commented Sep 29, 2020

kvark commented Sep 29, 2020

xMAC94x commented Sep 29, 2020

kvark commented Oct 2, 2020

torkleyy commented Oct 6, 2020

torkleyy commented Oct 6, 2020

vi commented Oct 6, 2020

kvark commented Oct 6, 2020

Boscop commented Nov 18, 2020

Boscop commented Nov 18, 2020 •

edited

kvark commented Nov 18, 2020

Boscop commented Nov 18, 2020

xMAC94x commented Nov 20, 2020

Boscop commented Nov 24, 2020 •

edited

coderedart commented Sep 22, 2021

vi commented Sep 22, 2021

Zireael07 commented May 1, 2022

tryoxiss commented Jan 25, 2024

vi commented Jan 25, 2024 •

edited

juntyr commented Jan 25, 2024

tryoxiss commented Jan 25, 2024 •

edited

How do I round trip ron files preserving formatting and comments? #216

How do I round trip ron files preserving formatting and comments? #216

Comments

vi commented Apr 22, 2020

kvark commented Apr 22, 2020

xMAC94x commented Sep 29, 2020

kvark commented Sep 29, 2020

xMAC94x commented Sep 29, 2020

kvark commented Oct 2, 2020

torkleyy commented Oct 6, 2020

torkleyy commented Oct 6, 2020

vi commented Oct 6, 2020

kvark commented Oct 6, 2020

Boscop commented Nov 18, 2020

Boscop commented Nov 18, 2020 • edited

kvark commented Nov 18, 2020

Boscop commented Nov 18, 2020

xMAC94x commented Nov 20, 2020

Boscop commented Nov 24, 2020 • edited

coderedart commented Sep 22, 2021

vi commented Sep 22, 2021

Zireael07 commented May 1, 2022

tryoxiss commented Jan 25, 2024

vi commented Jan 25, 2024 • edited

juntyr commented Jan 25, 2024

tryoxiss commented Jan 25, 2024 • edited

Boscop commented Nov 18, 2020 •

edited

Boscop commented Nov 24, 2020 •

edited

vi commented Jan 25, 2024 •

edited

tryoxiss commented Jan 25, 2024 •

edited