After accidentally losing the previous massive write-up on this topic, here is a shorter version. PR #547 introduced the possibility of generating species schemas that treat a species' chemical composition more accurately. This is not yet done, as all species simply use the same `zod` type to represent their "chemical composition". A more accurate representation is beneficial because it allows for element-based queries and search interfaces, and opens the door to the treatment of isotopes, checking element conservation in given reaction equations, etc.
This functionality can be incorporated in multiple ways. I am currently considering the following three options, with a preference for the third.
1. A uniform Composition type
This type should capture all possible compositions, and a `composition` property of this type would essentially replace the `particle` property in `SimpleParticle`. Represented as a `zod` type, such a `Composition` type could be a list of entries that each pair an `Element` with a numeric entry, where `Element` is a union over all element short forms ($\mathrm{H}$, $\mathrm{He}$, etc.), and the numeric entry represents the stoichiometric coefficient. Note that more intricate chemical compositions such as HMDSO, $\mathrm{O}\left[\mathrm{Si}\left(\mathrm{CH}_3\right)_3\right]_2$, and the description of isotopes, e.g. ${}^2\mathrm{H}_2\mathrm{O}$, are not yet supported. However, they can be added later by allowing additional metadata in each element entry, and by supporting recursion.
The advantage of this approach is that the species composition is universally queryable. The downsides are that generated schemas cannot strictly enforce the expected composition for a given species type (a homonuclear diatom should list a single element with a stoichiometric coefficient of $2$ as its composition), and that the description of atomic compositions (a single element) is somewhat verbose.
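As a rough, dependency-free sketch of option 1: the code below uses plain TypeScript with hand-written guards rather than the `zod` schema itself. The abbreviated `ELEMENTS` list and the helper names `isComposition` and `containsElement` are assumptions made for illustration only.

```typescript
// Hypothetical, abbreviated element list; the real union would cover all elements.
const ELEMENTS = ["H", "He", "C", "N", "O", "Si", "Ar"] as const;
type Element = (typeof ELEMENTS)[number];

// Each entry pairs an element with its stoichiometric coefficient.
type CompositionEntry = [Element, number];
type Composition = CompositionEntry[];

function isElement(value: unknown): value is Element {
  return (
    typeof value === "string" &&
    (ELEMENTS as readonly string[]).indexOf(value) !== -1
  );
}

function isPositiveInteger(value: unknown): boolean {
  return (
    typeof value === "number" &&
    isFinite(value) &&
    Math.floor(value) === value &&
    value > 0
  );
}

// Runtime check: a non-empty list of [Element, positive integer] pairs.
function isComposition(value: unknown): value is Composition {
  return (
    Array.isArray(value) &&
    value.length > 0 &&
    value.every(
      (entry) =>
        Array.isArray(entry) &&
        entry.length === 2 &&
        isElement(entry[0]) &&
        isPositiveInteger(entry[1]),
    )
  );
}

// Universal queryability: a single function serves every species type.
function containsElement(composition: Composition, element: Element): boolean {
  return composition.some(([el]) => el === element);
}

const water: Composition = [["H", 2], ["O", 1]];
const argon: Composition = [["Ar", 1]]; // verbose for a single atom
const brokenH2: Composition = [["H", 3]]; // passes, although a diatom needs 2
```

Here `containsElement(water, "O")` works without knowing the species type, while `brokenH2` illustrates the first downside: nothing in the uniform type ties the coefficient to the species type.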
2. Specific composition types for each species type
This approach consists of defining specific composition interfaces for different species types, e.g. the description of the species $\mathrm{H}_2$ (of type `HomonuclearDiatom`) could be simplified, as the stoichiometric coefficient of $2$ is implicit.
The major benefit of this approach is that the generated schemas can now strictly enforce an expected composition given a species type, i.e. it is impossible to define a homonuclear diatomic species with an incorrect composition. The downside is that the species composition is not universally queryable, which usually results in inconvenient and inefficient implementations when using data that adheres to the resulting schemas.
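A minimal sketch of option 2 in plain TypeScript, to make the trade-off concrete. Only `HomonuclearDiatom` is a species type named above; the `Atom` and `HeteronuclearDiatom` tags, the shape of each `composition` property, and the `elementCount` helper are assumptions for illustration.

```typescript
// Hypothetical, abbreviated element union.
type Element = "H" | "He" | "C" | "N" | "O" | "Ar";

// Assumed species shapes: the `type` tag selects the composition shape.
interface AtomSpecies {
  type: "Atom";
  composition: Element;
}

interface HomonuclearDiatomSpecies {
  type: "HomonuclearDiatom";
  composition: Element; // the coefficient of 2 is implicit
}

interface HeteronuclearDiatomSpecies {
  type: "HeteronuclearDiatom";
  composition: [Element, Element];
}

type Species =
  | AtomSpecies
  | HomonuclearDiatomSpecies
  | HeteronuclearDiatomSpecies;

// Querying needs one branch per species type, because the composition
// shape differs between them.
function elementCount(species: Species, element: Element): number {
  switch (species.type) {
    case "Atom":
      return species.composition === element ? 1 : 0;
    case "HomonuclearDiatom":
      return species.composition === element ? 2 : 0;
    case "HeteronuclearDiatom":
      return species.composition.filter((el) => el === element).length;
  }
}

const h2: Species = { type: "HomonuclearDiatom", composition: "H" };
const co: Species = { type: "HeteronuclearDiatom", composition: ["C", "O"] };
```

An incorrect coefficient for `h2` cannot be expressed at all, but every consumer of the data has to repeat the per-type branching shown in `elementCount`.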
3. A combination of both
Presumably, combining both approaches can result in a universally queryable composition interface, while allowing the generated schemas to strictly enforce the expected composition based on the given species type. The idea is that specific composition types should always present a subset of the values presented by the universal `Composition` type. Examples of specific composition types then include `AtomComposition` for atomic species and `HomonucleardiatomComposition` for homonuclear diatoms, and indeed, values that adhere to `AtomComposition` or `HomonucleardiatomComposition` also adhere to `Composition`.
To simplify querying the database even further, single-element compositions (atoms) can be transformed according to `Element => tuple([tuple([Element, literal(1)])])` behind the scenes. The resulting in-database `Composition` type is then equivalent to the version presented in item 1.
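The combined approach and the atom transformation can be sketched in plain TypeScript as follows. This assumes that the universal `Composition` accepts either a bare `Element` or the list form of item 1; the abbreviated element union and the names `GeneralComposition` and `toDatabaseForm` are hypothetical.

```typescript
// Hypothetical, abbreviated element union.
type Element = "H" | "He" | "C" | "N" | "O" | "Ar";

// Specific composition types, enforced per species type.
type AtomComposition = Element;
type HomonucleardiatomComposition = [[Element, 2]];

// Assumed universal type: a bare element or a list of [element, coefficient].
type GeneralComposition = [Element, number][];
type Composition = AtomComposition | GeneralComposition;

const helium: AtomComposition = "He";
const hydrogen: HomonucleardiatomComposition = [["H", 2]];

// Compiles because both specific types are subsets of Composition.
const universal: Composition[] = [helium, hydrogen];

// Mirrors Element => tuple([tuple([Element, literal(1)])]): atoms are
// expanded so that stored values always use the list form of item 1.
function toDatabaseForm(composition: Composition): GeneralComposition {
  return typeof composition === "string" ? [[composition, 1]] : composition;
}

const stored: GeneralComposition[] = universal.map(toDatabaseForm);
```

After `toDatabaseForm`, every stored value has the same list shape, so a single query routine can serve all species types, while the schema for each species type still rejects compositions that do not match it.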