Correct eds.history pipeline to distinguish "medical history" from "history of current disease" #219

Open
marieverdoux opened this issue Sep 28, 2023 · 2 comments

@marieverdoux
Contributor

As currently implemented, when the eds.history pipeline is configured with use_section=True, every entity found under an "antécédents", "antécédents familiaux" or "histoire de la maladie" section is tagged as "history".

The problem is that:

  • "histoire de la maladie" refers to the history of the current disease, not to medical history.
  • "antécédents familiaux" refers to diseases in the patient's family, not to the patient's own medical history.

I suggest removing "histoire de la maladie" and "antécédents familiaux" from the section_history list in edsnlp/pipelines/qualifiers/history/patterns.py.

If an entity refers to the history of the current disease, that can still be identified from the section title.

Thank you!
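To make the proposed change concrete, here is a minimal standalone sketch in plain Python. It does not call EDS-NLP: the two lists and the `is_history` helper are hypothetical stand-ins, assuming the list contains only the three section names mentioned in this issue.

```python
# Hypothetical stand-in for the section list described in this issue.
# The real list lives in edsnlp/pipelines/qualifiers/history/patterns.py
# and may contain other entries.
SECTION_HISTORY_CURRENT = [
    "antécédents",
    "antécédents familiaux",
    "histoire de la maladie",
]

# Proposed list: keep only the section that holds the patient's own
# past conditions.
SECTION_HISTORY_PROPOSED = ["antécédents"]


def is_history(section_title, history_sections):
    """Return True if an entity under `section_title` would be tagged as history."""
    return section_title in history_sections


for title in SECTION_HISTORY_CURRENT:
    before = is_history(title, SECTION_HISTORY_CURRENT)
    after = is_history(title, SECTION_HISTORY_PROPOSED)
    print(f"{title}: current={before}, proposed={after}")
```

With the proposed list, only entities under "antécédents" would be tagged as history through the section rule; the other two sections would leave the tag unchanged.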

@Aremaki
Collaborator

Aremaki commented Sep 28, 2023

Hello Marie,

Thank you for your feedback.
The pipe's name is ambiguous...
The idea of this pipe was to detect whether an event (such as a disease) occurred before the present time of the document.

Could you elaborate on what you are using the history pipe for?

@marieverdoux
Contributor Author

Thanks Adam.

Usually in medical records, "antécédents" refers to previous diseases that are no longer of interest for the current visit, while "histoire de la maladie" details the history of the disease the visit is about. They are distinct categories of the medical record.

I think it can be useful in many contexts to distinguish the two cases, for instance if you want to know whether the disease is still active.

In my current application, I want to select documents related to a currently active disease, so I use the eds.history pipeline to filter out documents that relate to old diseases. If I use the pipeline with the use_section config, I end up filtering out all the entities under "histoire de la maladie", even though I want to keep them.
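As a rough illustration of this use case, the sketch below post-filters entities so that those under "histoire de la maladie" are kept even when they carry the history flag. It is plain Python and does not use EDS-NLP: the `Entity` class, its `history` and `section` fields, and `active_entities` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical minimal entity: text, history flag, enclosing section title."""
    text: str
    history: bool
    section: Optional[str] = None


# Sections that describe the disease the visit is about (assumption for this sketch).
CURRENT_DISEASE_SECTIONS = {"histoire de la maladie"}


def active_entities(entities: List[Entity]) -> List[Entity]:
    """Keep entities that are not history, or that sit in a current-disease section."""
    return [
        ent
        for ent in entities
        if not ent.history or ent.section in CURRENT_DISEASE_SECTIONS
    ]


entities = [
    Entity("asthme", history=True, section="antécédents"),
    Entity("pneumopathie", history=True, section="histoire de la maladie"),
    Entity("fièvre", history=False, section=None),
]
kept = [ent.text for ent in active_entities(entities)]
print(kept)
```

This is only a workaround on the consumer side; changing the section list itself would avoid the need for such a post-filter.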
