Feature request: terminology matcher with normalisation #62

bdura · 2022-04-25T17:27:35Z

Feature type

Matcher pipeline to handle the single label/multiple subconcepts use-case.

Description

As discussed in #58, we would certainly benefit from having EDS-NLP handle the nitty-gritty detail of matching a terminology with automatic concept normalisation.

For now, it is reasonably easy to match a terminology wherein the label is the normalisation. However, we could use the kb_id_ attribute (see spaCy documentation) to include a more hierarchical structure.

For instance, paracetamol/tylenol should probably get the label drug and a kb_id_ like ATC=N02BE01.
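
To make the use-case concrete, here is a minimal, library-independent sketch of the behaviour being requested: every synonym gets the same coarse label, and the concept identifier is carried separately. The `Match` class, the `match_terminology` function and the shape of the `terms` dictionary are hypothetical names chosen for illustration. They are not part of EDS-NLP or spaCy. In a spaCy pipeline the two fields would presumably end up in `Span.label_` and `Span.kb_id_`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    text: str    # matched surface form
    label: str   # coarse category shared by every concept (e.g. "drug")
    kb_id: str   # normalised concept identifier (e.g. "ATC=N02BE01")


def match_terminology(text, terms, label):
    """Find every synonym in `text` and normalise it to its concept id.

    `terms` maps a concept identifier to its list of synonyms
    (hypothetical format, assumed for this sketch).
    """
    # Reverse index: lowercased synonym -> concept identifier.
    lookup = {
        synonym.lower(): kb_id
        for kb_id, synonyms in terms.items()
        for synonym in synonyms
    }
    if not lookup:
        return []
    # Try longer synonyms first so a long term wins over a shorter prefix.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b",
        re.IGNORECASE,
    )
    return [
        Match(m.start(), m.end(), m.group(), label, lookup[m.group().lower()])
        for m in pattern.finditer(text)
    ]


terms = {"ATC=N02BE01": ["paracetamol", "tylenol"]}
matches = match_terminology(
    "Patient took Tylenol, then paracetamol 1g.", terms, label="drug"
)
for m in matches:
    print(m.text, m.label, m.kb_id)
```

Keeping the label and the identifier in separate fields lets downstream code filter on the broad category while still having access to the precise concept.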

Proposition

We could modify the eds.matcher component to handle this case natively, or create a new component.

gozat · 2022-04-26T12:28:14Z

In the spirit of spaCy, I wonder whether such information should be stored in custom attributes or handled by the EntityLinker, which relates a Span to a KnowledgeBase (as far as I understand; tell me if I missed something).

Taking paracetamol as an example (an ingredient in the ROMEDI nomenclature), there are several ATC codes, for instance: https://www.romedi.fr/romedi/IN7310nlprjlh2sb3t0apdjfvtk6u0ifp3. The precise ATC code may or may not be resolvable by the EntityLinker using information from the rest of the Doc. In addition, the user may or may not want this entity resolved at all; they might be interested in ingredients and prefer mapping drugs to their ingredients.

In short, instead of thinking in terms of terminology, perhaps one could think of entities in terms of a graph, and try to understand to what extent graph properties can be imported into the spaCy machinery.

bdura added the enhancement and discussion labels on Apr 26, 2022

percevalw mentioned this issue Jul 25, 2022

Simple terminology matcher #75

Merged

3 tasks
