Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect deserialization with yaml feature and keys containing periods #417

Open
Halkcyon opened this issue Jan 19, 2023 · 7 comments
Open

Comments

@Halkcyon
Copy link

Halkcyon commented Jan 19, 2023

When using yaml as a config format, the deserialization incorrectly stops at periods in keys. I do not observe this behavior when manually deserializing the string with serde_yaml or other deserializers (such as ruamel.yaml or pyyaml in Python). As a result, I am getting errors about keys not being present in the mapped value even though they definitely exist.

MCVE:

192.168.1.1:
  an_arr:
    - 1
    - 2
    - 3
  another_value: 10
#[derive(Debug, serde::Deserialize)]
struct Conf {
  an_arr: Vec<u32>,
  another_value: u32,
}

fn main() {
  let settings = config::Config::builder()
    .add_source(config::File::from("config.yml"))
    .build()
    .unwrap();

  println!(
    "{:?}",
    settings
      .try_deserialize::<HashMap<String, Conf>>()
      .unwrap()
  );
}

Result:

The application panicked (crashed).
Message:  called `Result::unwrap()` on an `Err` value: missing field `another_value`
Location: src/main.rs:51

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                                ⋮ 9 frames hidden ⋮                               
  10: core::result::Result<T,E>::unwrap::hdf9dbc5c9c26ec5d
      at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/result.rs:1113
  11: ui::main::he2f965d0d6a736c9
      at /home/redacted/src/main.rs:49
        47 │         println!(
        48 │             "{:?}",
        49 >             settings
        50 │                 .try_deserialize::<HashMap<String, Conf>>()
        51 │                 .unwrap()
@matthiasbeyer
Copy link
Collaborator

Hi! Thanks for your report. Would you mind filing a patch showcasing this? I would be especially interested in a comparison of config-rs and the plain use of the yaml crate in use by config-rs (serde_yaml IIRC).

Can all be in one testcase, in one file in /tests.

If you cannot spare the time for this, just ping me and I'll do it myself.

@Halkcyon
Copy link
Author

I expected serde_yaml, too, but it appears config-rs is using yaml-rust @matthiasbeyer

config-rs/Cargo.toml

Lines 20 to 35 in f12b93f

yaml = ["yaml-rust"]
ini = ["rust-ini"]
json5 = ["json5_rs"]
convert-case = ["convert_case"]
preserve_order = ["indexmap", "toml/preserve_order", "serde_json/preserve_order", "ron/indexmap"]
async = ["async-trait"]
[dependencies]
lazy_static = "1.0"
serde = "1.0.8"
nom = "7"
async-trait = { version = "0.1.50", optional = true }
toml = { version = "0.5", optional = true }
serde_json = { version = "1.0.2", optional = true }
yaml-rust = { version = "0.4", optional = true }

@matthiasbeyer
Copy link
Collaborator

Ah yeah, then that crate of course.

Sorry, I maintain so many projects that I lose track sometimes and I didn't look it up. My bad.

@Halkcyon
Copy link
Author

Digging into this deeper, it also appears yaml-rust has been unmaintained since 2020, so it's unlikely to be fixed upstream. I'll submit a patch test a bit later today.

@matthiasbeyer
Copy link
Collaborator

matthiasbeyer commented Jan 19, 2023

Awesome!

So it seems that we should switch YAML implementations at some point. Maybe it is worth doing this even though there's a rewrite ongoing (although slowly, #376) where we could use serde_yaml from the get-go.

... just thinking loud here.

Halkcyon added a commit to Halkcyon/config-rs that referenced this issue Jan 19, 2023
@Halkcyon
Copy link
Author

Alright, repro added to tests with rust-yaml vs. serde_yaml @matthiasbeyer

@polarathene
Copy link
Collaborator

polarathene commented Oct 19, 2023

For reference, the source of the problem has been identified: #418 (comment)

Switching YAML parser won't resolve this.

UPDATE: You can use keys with dots/periods if they're not defined at top-level. Although as referenced below, it's considered a bug (even though it'd seem necessary for the serde rename feature to work correctly to avoid config-rs internally splitting the key into nested tables).


This issue is also a duplicate where the cause was explained:

#[derive(Debug, Deserialize)]
struct Test {
  toplevel: Nested,
}

#[derive(Debug, Deserialize)]
struct Nested {
  #[serde(rename = "192.168.1.1")]
  ip_key: String,
  #[serde(rename = "hello.world")]
  hello_world: String,
  nested: String,
}

Input:

toplevel:
  hello.world: example
  192.168.1.1: a string value
toplevel.nested: valid

Outputs:

Test {
    toplevel: Nested {
        ip_key: "a string value",
        hello_world: "example",
        nested: "valid",
    },
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants