I would like to be able to automatically detect fields that may be useful for fuzzy record linking.
For instance, we may be able to join two datasets from completely different source databases on fields such as dob and first_name.
However, these fields will generally have different names in each source.
This may involve:
- Extending the schema to allow such fields to be identified.
- Agreeing on a standard alias for these fields, so that the column name in the dataset can be translated into the standardised version. For example, dob and birthdate may both be standardised to date_of_birth.
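The translation step could look roughly like the sketch below. It assumes each column's metadata is a dict with a `name` key and an optional `alias` key holding the standardised name. That layout, the helper names, and the two example tables are all hypothetical illustrations, not the schema's confirmed format.

```python
def standard_name(column):
    """Return the column's alias if it has one, otherwise its own name."""
    return column.get("alias") or column["name"]


def linkable_fields(table_a, table_b):
    """Map each shared standard name to the (table_a, table_b) column names."""
    names_a = {standard_name(c): c["name"] for c in table_a["columns"]}
    names_b = {standard_name(c): c["name"] for c in table_b["columns"]}
    shared = sorted(names_a.keys() & names_b.keys())
    return {std: (names_a[std], names_b[std]) for std in shared}


# Hypothetical metadata for two tables from different source databases.
people = {
    "columns": [
        {"name": "dob", "alias": "date_of_birth"},
        {"name": "first_name"},
        {"name": "postcode"},
    ]
}
offenders = {
    "columns": [
        {"name": "birthdate", "alias": "date_of_birth"},
        {"name": "forename", "alias": "first_name"},
        {"name": "offence_code"},
    ]
}

matches = linkable_fields(people, offenders)
for std, (col_a, col_b) in matches.items():
    print(f"{std}: {col_a} <-> {col_b}")
```

Columns without a shared standard name (here `postcode` and `offence_code`) are simply left out, so the result is a candidate list for fuzzy linking rather than a guarantee that the fields are comparable.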
We've added an `alias` property for columns, so if a column has a non-standard name, the alias can provide the standard name. But we don't yet know who might maintain a definitive list of those aliases, or where.
An idea in the readme that we haven't done anything with yet, and that might (?) facilitate linking, is to specify some kind of URI for columns, such as `<repo>/metadata/<folder>/<table_name>/<column_name>` or `<databasename>:<tablename>:<columnname>`.
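To make the two URI shapes concrete, here is a minimal sketch that builds each one and parses the colon-separated form back into its parts. The function names and the example values (`crime_db`, `offenders`, and so on) are assumptions for illustration only. Neither format has been agreed or implemented.

```python
def path_style_uri(repo, folder, table_name, column_name):
    """Build a <repo>/metadata/<folder>/<table_name>/<column_name> identifier."""
    return f"{repo}/metadata/{folder}/{table_name}/{column_name}"


def colon_style_uri(database_name, table_name, column_name):
    """Build a <databasename>:<tablename>:<columnname> identifier."""
    return f"{database_name}:{table_name}:{column_name}"


def parse_colon_style_uri(uri):
    """Split a colon-style identifier into its three named parts."""
    parts = uri.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected database:table:column, got {uri!r}")
    database_name, table_name, column_name = parts
    return {
        "database": database_name,
        "table": table_name,
        "column": column_name,
    }


# Hypothetical example values.
path_uri = path_style_uri("my_repo", "crime", "offenders", "birthdate")
colon_uri = colon_style_uri("crime_db", "offenders", "birthdate")
parsed = parse_colon_style_uri(colon_uri)
print(path_uri)
print(colon_uri)
```

The colon-style form is shorter and trivial to parse, but it assumes that database, table, and column names never contain a colon themselves.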
Re this issue in the metadata_schema repo:
@RobinL "Extending the schema to allow 'fuzzy' relationships between disparate tables - standard names": moj-analytical-services/metadata_schema#2