This repository has been archived by the owner on Mar 25, 2024. It is now read-only.

double quotes lost when deserializing and serializing strings containing only numbers on serde_yaml 0.9 #347

Closed
horacimacias opened this issue Dec 31, 2022 · 6 comments · Fixed by #383

Comments

@horacimacias

horacimacias commented Dec 31, 2022

I'm reading some YAML, editing it, and then saving it again after calling to_string.
For a reason I don't understand, some String values that contain only digits lose their double quotes after to_string.
I'm feeding this YAML to something that expects Strings, not numbers, and I expected serde_yaml to preserve whichever type the value had.
Here's a minimal reproducible test, which passes with serde_yaml 0.8.26 but fails with 0.9:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=6de99a49da249819a76e5fcd25342ec3

extern crate serde;
extern crate serde_yaml;


#[test]
fn can_serialize_with_quotes() {
    use serde_yaml::Mapping;
    let original_config = r#"---
configuration:
  agent: "007"
"#;
    let config: Mapping = serde_yaml::from_str(original_config).expect("should parse yaml");
    let config = serde_yaml::to_string(&config).expect("should serialize");
    assert_eq!(config, original_config);
}

basically this input:

configuration:
  agent: "007"

ends up as:

configuration:
  agent: 007

(there is also another difference between 0.8 and 0.9 regarding having or not having ---\n at the beginning of the String, but this is not relevant as far as I can tell; the issue is that a String value ends up being converted to a Number value).

Is there any way to avoid this String->Number conversion or some other way to make this test pass?

horacimacias changed the title from "double quotes lost when deserializing and serializing strings containing only numbers" to "double quotes lost when deserializing and serializing strings containing only numbers on serde_yaml 0.9" on Dec 31, 2022
@dtolnay
Owner

dtolnay commented Dec 31, 2022

This is behaving correctly as far as I can tell. 007 is a !!str in yaml, not a !!int. If a different library you are using is interpreting it as an int, that is a bug in the other library.

Here is the spec section that determines what untagged scalars are int: https://yaml.org/spec/1.2.2/#1022-tag-resolution.

dtolnay closed this as completed on Dec 31, 2022
@horacimacias
Author

sorry, I should have checked this properly. I think you're right. thanks!

@james-jra

james-jra commented Jul 20, 2023

I'm not sure I agree with this resolution. @dtolnay any chance you can take another look?

From the linked article I see that:

Scalars with the “?” non-specific tag (that is, plain scalars) are matched with an extended list of regular expressions.

One of which is [-+]? [0-9]+, which resolves to tag:yaml.org,2002:int (Base 10).

I think 007 is a plain scalar and matches the above regular expression, so it should be interpreted as an int. Therefore the round-trip of "007" to 007 is not valid, and parsing 007 as an integer is valid.
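To make the argument above concrete, here is a minimal sketch of the base-10 integer pattern `[-+]? [0-9]+` as a hand-written check using only the standard library. The function name `matches_core_schema_int` is made up for illustration; it is not part of serde_yaml or yaml-rust, and it covers only this one pattern, not the full tag-resolution table.

```rust
/// Returns true if `s` matches the base-10 integer pattern `[-+]? [0-9]+`
/// quoted from the YAML 1.2.2 tag-resolution section.
/// Hypothetical helper for illustration only, not a serde_yaml API.
fn matches_core_schema_int(s: &str) -> bool {
    // Drop at most one leading sign character.
    let digits = s
        .strip_prefix(|c: char| c == '-' || c == '+')
        .unwrap_or(s);
    // What remains must be one or more ASCII digits.
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}
```

Under this check the plain scalar 007 matches, which is why emitting the string "007" without quotes changes how a spec-following parser resolves it.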

I've tested this with yaml-rust, yq, and PyYAML, which all agree so far that 007 is an int.

E.g.

use serde_yaml; // 0.9.24
use yaml_rust;  // 0.4.5

fn main() {
    use serde_yaml::Mapping;

    let original_config = r#"---
agent: "007"
"#;

    let parsed_serde_yaml: Mapping = serde_yaml::from_str(original_config).unwrap();

    // serde_yaml knows it's a string
    assert_eq!(parsed_serde_yaml["agent"], serde_yaml::Value::String("007".into()));
    let serialized_serde_yaml = serde_yaml::to_string(&parsed_serde_yaml).unwrap();

    // Serializes it back to 007, no quotes.
    assert_eq!(serialized_serde_yaml, r#"agent: 007
"#);

    // serde_yaml parses it back as a string, so we're self-consistent.
    let parsed_serde_yaml: Mapping = serde_yaml::from_str(&serialized_serde_yaml).unwrap();
    assert_eq!(parsed_serde_yaml["agent"], serde_yaml::Value::String("007".into()));

    // But yaml_rust parses it as an integer
    let parsed_yaml_rust = yaml_rust::YamlLoader::load_from_str(&serialized_serde_yaml).unwrap();
    let doc = &parsed_yaml_rust[0];
    // thread 'main' panicked at 'assertion failed: `(left == right)`
    // left: `Integer(7)`,
    // right: `String("007")`',
    assert_eq!(doc["agent"], yaml_rust::Yaml::String("007".into()));
}

@dtolnay
Owner

dtolnay commented Jul 20, 2023

Yeah, good call.

Fixed in serde_yaml 0.9.25.

@Ramilito

Ramilito commented Dec 6, 2023

I'm getting similar behaviour with 0.9.26 (but not with 0.8.26): deserializing and then serializing produces an unquoted string in the output.

The input:

          - name: KUBERNETES
            value: "yes"

Output:

          - name: KUBERNETES
            value: yes

@deadmilkman

Having a similar issue with a key containing a string with the single character y. Even if I create the mapping value as a string, y is not quoted when the YAML is output. It ends up being interpreted as a bool instead of a string value by kubectl.

Reference: https://yaml.org/type/bool.html

Question: Is there a way of forcing values of type string to always be quoted when outputting as a string?
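For reference, a rough sketch of the kind of check a caller could apply before emitting a value, assuming the consumer follows the YAML 1.1 bool type linked above. The list of spellings comes from that page; the names `YAML11_BOOLS`, `is_yaml11_bool`, and `quote_if_bool_like` are hypothetical and are not serde_yaml APIs. This does not answer whether serde_yaml itself offers such an option.

```rust
/// Spellings that the YAML 1.1 bool type (https://yaml.org/type/bool.html)
/// resolves to a boolean.
const YAML11_BOOLS: &[&str] = &[
    "y", "Y", "yes", "Yes", "YES",
    "n", "N", "no", "No", "NO",
    "true", "True", "TRUE",
    "false", "False", "FALSE",
    "on", "On", "ON",
    "off", "Off", "OFF",
];

/// Hypothetical helper: true if a YAML 1.1 parser would read the plain
/// scalar `s` as a bool.
fn is_yaml11_bool(s: &str) -> bool {
    YAML11_BOOLS.contains(&s)
}

/// Hypothetical helper: wraps bool-like strings in double quotes and
/// returns everything else unchanged. The bool spellings contain no
/// characters that need escaping, so plain quoting is enough for them.
fn quote_if_bool_like(s: &str) -> String {
    if is_yaml11_bool(s) {
        format!("\"{}\"", s)
    } else {
        s.to_string()
    }
}
```

With this sketch, `quote_if_bool_like("yes")` yields the quoted form while `quote_if_bool_like("KUBERNETES")` is returned as-is. It handles only bool-like strings; numeric-looking strings such as 007 would need a separate check.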
