Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Need class 'Dictionary' for listing multiple key-value pairs #773

Open
VenkatTechnologist opened this issue May 10, 2024 · 5 comments

Comments

@VenkatTechnologist
Copy link
Contributor

See #769.

For things like hyperparameters, metrics, and metric thresholds, we need a class called 'Dictionary' which can hold a list of 'DictionaryEntry' values.

@zvr
Copy link
Member

zvr commented May 10, 2024

As you write, we already have DictionaryEntry for exactly this reason and we have implemented "dictionaries" by having multiple instances of this type.

I'm not sure exactly why there is a need for a class holding them. Can you explain (maybe with a simple example)?

@VenkatTechnologist
Copy link
Contributor Author

VenkatTechnologist commented May 10, 2024 via email

@zvr
Copy link
Member

zvr commented May 10, 2024

If the whole purpose is to simply group the entries together, I don't think there is a need for a "dictionary" class.

For example, when we record a build, it has a number of parameters. Therefore the Build class, has a parameters property (of type DictionaryEntry). Or we want to record all the environment -- again, a single property environment of type DictionaryEntry is enough, since you can have many of them.
Take a look at the definition.

This allows to record a "dictionary" of key/value pairs, without the need of a new class.

Some example data for such an object would be:

{
    "type": "Build",
    "buildId": "my_id",
    "buildType": "https://github.com/actions/deploy",
    "buildStartTime": "2024-05-10T12:34:45Z",
    "buildEndTime": "2024-05-10T12:34:54Z",
    "parameters": [
        { "key": "param1", "value": "val1" },
        { "key": "param2", "value": "val2" },
        { "key": "param3", "value": "val3" }
    ],
    "environment": [
        { "key": "user", "value": "guest" },
        { "key": "cwd", "value": "/home/guest" },
        { "key": "arch", "value": "x86_64" }
    ]
}

Is this clearer now?

@VenkatTechnologist
Copy link
Contributor Author

VenkatTechnologist commented May 10, 2024 via email

@zvr
Copy link
Member

zvr commented May 11, 2024

OK, let me try and write the alternative explicitly.

So, in the current state where environment is a 0..* DictionaryEntry, it is serialized, as shown in the previous comment, this way:

    "environment": [
        { "key": "user", "value": "guest" },
        { "key": "cwd", "value": "/home/guest" },
        { "key": "arch", "value": "x86_64" }
    ]

If we had defined a Dictionary class, how would we have done it? We want to represent a list of DictionaryEntry entries. In the RDF object model, there are no lists -- we simply allow properties to appear more than once. Therefore the definition of Dictionary would only contain an "entry" property, that could appear multiple times:

# Dictionary

## Properties

- entry:
  - type: DictionaryEntry
  - minCount: 1 (or 0)
  - maxCount: *

If we had this definition, the type of environment would be a 0..1 Dictionary.

But then the actual data would be serialized as:

    "environment": {
	"type": "Dictionary",
	"entry": [
	    { "key": "user", "value": "guest" },
	    { "key": "cwd", "value": "/home/guest" },
	    { "key": "arch", "value": "x86_64" }
	]
    }

As you can see, by defining the Dictionary class we only introduce an extra level in JSON, without any real benefit. That's why we defined only the DictionaryEntry class and not a collection of these.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants