Accurately treat species chemical composition #587

Open
daanboer opened this issue Nov 29, 2023 · 0 comments

After accidentally losing the previous massive write-up on this topic, here is a shorter version. PR #547 introduced the possibility of generating species schemas that more accurately treat a species' chemical composition. This is not yet done, as all species simply use the

const SimpleParticle = object({
  particle: string().min(1),
  charge: number().int(),
});

zod type to represent their "chemical composition". A more accurate representation is beneficial, as it allows for element-based queries and search interfaces, and opens the door to the treatment of isotopes, checking element conservation in given reaction equations, etc.

This functionality can be incorporated in multiple ways. I am currently considering the following three options, with a current preference for the third.

1. A uniform Composition type

This type should capture all possible compositions and a composition property of this type would essentially replace the particle property in SimpleParticle. Such a Composition type could have the following form (represented as a zod type):

const Composition = array(tuple([Element, number().int().positive()]));

where Element is a union over all element symbols ($\mathrm{H}$, $\mathrm{He}$, etc.), and the numeric entry represents the stoichiometric coefficient. Note that more intricate chemical compositions, such as HMDSO, $\mathrm{O}\left[\mathrm{Si}\left(\mathrm{CH}_3\right)_3\right]_2$, and the description of isotopes, e.g. ${}^2\mathrm{H}_2\mathrm{O}$, are not yet supported. However, they can be added later by allowing additional metadata in each element entry and by supporting recursion.

The advantage of this approach is that the species composition is universally queryable. The downsides are that generated schemas cannot strictly enforce the expected composition for a given species type (a homonuclear diatom should list a single element with a stoichiometric coefficient of $2$ as its composition), and that the description of atomic compositions (a single element) is somewhat verbose.
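The kind of universal querying and element-conservation checking that this representation enables can be sketched as follows. This is a minimal illustration in plain TypeScript, not code from the repository: the `Element` union is abbreviated, the `Composition` type is a hand-written mirror of the zod type above, and the helpers `containsElement`, `countElements`, and `conservesElements` are hypothetical names introduced here.

```typescript
// Abbreviated element union; a full version would list all element symbols.
type Element = "H" | "He" | "C" | "O" | "Si";

// Plain TypeScript mirror of the zod `Composition` type of option 1.
type Composition = [Element, number][];

// One term of a reaction equation: a species composition and how often it occurs.
type Term = { composition: Composition; count: number };

// Hypothetical helper: does the composition contain the given element?
function containsElement(composition: Composition, element: Element): boolean {
  return composition.some(([symbol]) => symbol === element);
}

// Hypothetical helper: total count per element over one side of a reaction.
function countElements(side: Term[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const { composition, count } of side) {
    for (const [symbol, coefficient] of composition) {
      totals[symbol] = (totals[symbol] ?? 0) + count * coefficient;
    }
  }
  return totals;
}

// Hypothetical helper: elements are conserved when both sides of the
// reaction carry the same total count for every element.
function conservesElements(lhs: Term[], rhs: Term[]): boolean {
  const left = countElements(lhs);
  const right = countElements(rhs);
  const symbols = Object.keys(left).concat(Object.keys(right));
  return symbols.every((symbol) => (left[symbol] ?? 0) === (right[symbol] ?? 0));
}

const h2: Composition = [["H", 2]];
const o2: Composition = [["O", 2]];
const h2o: Composition = [["H", 2], ["O", 1]];

// 2 H2 + O2 -> 2 H2O
const balanced = conservesElements(
  [{ composition: h2, count: 2 }, { composition: o2, count: 1 }],
  [{ composition: h2o, count: 2 }],
);
```

Because every species exposes the same array-of-pairs shape, neither helper needs to know the species type.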

2. Specific composition types for each species type

This approach consists of defining specific composition interfaces for different species types, e.g. the description of the species $\mathrm{H}_2$ (of type HomonuclearDiatom) could be simplified from

{
  "type": "HomonuclearDiatom",
  "composition": [["H", 2]],
  "charge": 0
}

to

{
  "type": "HomonuclearDiatom",
  "element": "H",
  "charge": 0
}

as the stoichiometric coefficient of $2$ is implicit.

The major benefit of this approach is that the generated schemas can now strictly enforce an expected composition given a species type, i.e. it is impossible to define a homonuclear, diatomic species with an incorrect composition. The downside is that the species composition is not universally queryable, which usually results in inconvenient and inefficient implementations when using data that adheres to the resulting schemas.
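To illustrate the querying downside, here is a rough sketch of an element count over species that follow this approach. Only the HomonuclearDiatom shape with its element property is taken from the example above; the `Atom` and `HeteronuclearDiatom` shapes, the abbreviated `Element` union, and the `elementCount` helper are assumptions made for this sketch.

```typescript
// Abbreviated element union; a full version would list all element symbols.
type Element = "H" | "He" | "C" | "O";

// Species shapes in the style of option 2. `Atom` and `HeteronuclearDiatom`
// are hypothetical; only `HomonuclearDiatom` is shown in the text.
type Atom = { type: "Atom"; element: Element; charge: number };
type HomonuclearDiatom = {
  type: "HomonuclearDiatom";
  element: Element;
  charge: number;
};
type HeteronuclearDiatom = {
  type: "HeteronuclearDiatom";
  elements: [Element, Element];
  charge: number;
};
type Species = Atom | HomonuclearDiatom | HeteronuclearDiatom;

// Hypothetical helper: every query has to branch on the species type,
// because each type stores its composition differently.
function elementCount(species: Species, element: Element): number {
  switch (species.type) {
    case "Atom":
      return species.element === element ? 1 : 0;
    case "HomonuclearDiatom":
      // The stoichiometric coefficient of 2 is implicit in the type.
      return species.element === element ? 2 : 0;
    case "HeteronuclearDiatom":
      return species.elements.filter((symbol) => symbol === element).length;
  }
}

const h2: Species = { type: "HomonuclearDiatom", element: "H", charge: 0 };
const co: Species = {
  type: "HeteronuclearDiatom",
  elements: ["C", "O"],
  charge: 0,
};
const he: Species = { type: "Atom", element: "He", charge: 0 };
```

Each new species type adds another branch to every such query, which is the inconvenience described above.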

3. A combination of both

Presumably, combining both approaches can result in a universally queryable composition interface, while allowing the generated schemas to strictly enforce the expected composition based on the given species type. The idea is that specific composition types should always represent a subset of the values permitted by the universal Composition type. The Composition type can e.g. be implemented as follows:

const Composition = union([
  Element,
  array(tuple([Element, number().int().positive()])),
]);

Examples of specific composition types then include

const AtomComposition = Element;

for atomic species, and

const HomonuclearDiatomComposition = tuple([tuple([Element, literal(2)])]);

for homonuclear diatoms. Indeed, values that adhere to AtomComposition or HomonuclearDiatomComposition also adhere to Composition.
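The subset relation can be sketched without any dependencies by mirroring the zod types as plain TypeScript types with hand-written guards. This is only an approximation of what the zod schemas would check: the element list is abbreviated, and all guard function names are hypothetical.

```typescript
// Abbreviated element list; a full version would list all element symbols.
const ELEMENTS = ["H", "He", "C", "N", "O"] as const;
type Element = (typeof ELEMENTS)[number];

// Plain TypeScript mirrors of the zod types above.
type Composition = Element | [Element, number][];
type AtomComposition = Element;
type HomonuclearDiatomComposition = [[Element, 2]];

function isElement(value: unknown): value is Element {
  return (
    typeof value === "string" &&
    (ELEMENTS as readonly string[]).indexOf(value) !== -1
  );
}

// Mirrors tuple([Element, number().int().positive()]).
function isEntry(value: unknown): value is [Element, number] {
  return (
    Array.isArray(value) &&
    value.length === 2 &&
    isElement(value[0]) &&
    typeof value[1] === "number" &&
    Math.floor(value[1]) === value[1] &&
    value[1] > 0
  );
}

function isComposition(value: unknown): value is Composition {
  return (
    isElement(value) ||
    (Array.isArray(value) && value.every((entry) => isEntry(entry)))
  );
}

function isAtomComposition(value: unknown): value is AtomComposition {
  return isElement(value);
}

function isHomonuclearDiatomComposition(
  value: unknown,
): value is HomonuclearDiatomComposition {
  return (
    Array.isArray(value) &&
    value.length === 1 &&
    isEntry(value[0]) &&
    value[0][1] === 2
  );
}

// The compiler accepts both specific types wherever a Composition is
// expected, which reflects the subset relation at the type level.
const atom: AtomComposition = "He";
const diatom: HomonuclearDiatomComposition = [["H", 2]];
const compositions: Composition[] = [atom, diatom];
```

The converse does not hold: a value such as `[["H", 3]]` passes `isComposition` but is rejected by `isHomonuclearDiatomComposition`, which is the strictness the specific types are meant to add.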

To simplify querying the database even further, single-element compositions (atoms) can be transformed behind the scenes according to Element => tuple([tuple([Element, literal(1)])]). The resulting in-database Composition type is then equivalent to the version presented in item 1.
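A minimal sketch of such a normalization step, written as a plain TypeScript function rather than as part of a zod schema. The names `StoredComposition` and `normalize` are hypothetical, and the `Element` union is abbreviated.

```typescript
// Abbreviated element union; a full version would list all element symbols.
type Element = "H" | "He" | "C" | "O";

// Input form of option 3: a bare element or a list of [element, count] pairs.
type Composition = Element | [Element, number][];

// Hypothetical in-database form, identical in shape to option 1.
type StoredComposition = [Element, number][];

// Rewrite a bare element as a single entry with coefficient 1 and pass
// list-form compositions through unchanged.
function normalize(composition: Composition): StoredComposition {
  return typeof composition === "string" ? [[composition, 1]] : composition;
}

const helium = normalize("He");
const hydrogen = normalize([["H", 2]]);
```

After this step, queries only ever see the array-of-pairs form, regardless of how the composition was written in the input document.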
