🔨 This repository hosts a comprehensive suite of tools designed to streamline the processing and manipulation of medical imaging data.

MIMBCD-UI/data-pipeline

Data Pipeline


This repository contains a data pipeline for processing medical imaging data. It includes modules for anonymizing DICOM files, encrypting patient IDs, extracting metadata, and processing the data. The pipeline is designed to be extensible, so users can customize and expand it for specific project requirements, and it aims to handle large volumes of medical imaging data efficiently. Its modular design promotes code reuse and makes maintenance and future enhancements easier.

Below are the key functionalities encapsulated within the pipeline:

  1. Anonymization Module: This module is responsible for anonymizing DICOM files, ensuring the removal of sensitive patient-related information while adhering to regulatory compliance standards. It sanitizes the data by eliminating identifiable attributes, thereby safeguarding patient privacy.

  2. Encryption Module: The encryption module adds an extra layer of security by encrypting patient IDs, thus enhancing data protection measures. By encrypting sensitive identifiers, the module ensures that patient information remains confidential and inaccessible to unauthorized parties.

  3. Metadata Extraction: This module facilitates the extraction of metadata from DICOM files, enabling users to access valuable information embedded within the imaging data. It parses the DICOM headers to retrieve essential metadata attributes, providing insights into the imaging parameters and acquisition details.

  4. Data Processing: The data processing module orchestrates the sequential execution of various operations, including preprocessing, analysis, and transformation of medical imaging data. It streamlines the processing pipeline, enabling seamless integration of diverse data processing tasks.
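
The first two modules can be illustrated with a minimal sketch. It is not the code in anonymizer.py or encryption.py: a plain dict stands in for a DICOM header, the IDENTIFYING_KEYS list and both function names are hypothetical, and keyed hashing (HMAC-SHA256 from the Python standard library) is swapped in for encryption. Unlike real encryption, the keyed hash is one-way and cannot be reversed to recover the original ID.

```python
import hashlib
import hmac

# Hypothetical list of identifying attributes; the real anonymizer.py
# may remove a different (and likely longer) set.
IDENTIFYING_KEYS = {"PatientName", "PatientBirthDate", "PatientAddress"}


def pseudonymize_id(patient_id: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of the patient ID (one-way, not reversible)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def anonymize(header: dict, secret_key: bytes) -> dict:
    """Drop identifying attributes and replace PatientID with its pseudonym."""
    # Build a new dict so the caller's header is left untouched.
    cleaned = {k: v for k, v in header.items() if k not in IDENTIFYING_KEYS}
    if "PatientID" in cleaned:
        cleaned["PatientID"] = pseudonymize_id(str(cleaned["PatientID"]), secret_key)
    return cleaned


if __name__ == "__main__":
    # Made-up header values, for illustration only.
    header = {"PatientID": "12345", "PatientName": "DOE^JANE", "Modality": "MG"}
    print(anonymize(header, secret_key=b"demo-key-only"))
```

Because the same ID and key always produce the same pseudonym, records belonging to one patient can still be grouped after anonymization. The key would need to be stored securely and never committed to the repository.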

Together, these modules provide a framework for managing medical imaging data: anonymizing patient information, encrypting identifiers, extracting metadata, and processing imaging data as required by medical and biomedical imaging workflows (10.1007/s10278-021-00522-6). Because the architecture is modular, the pipeline can be integrated into existing healthcare systems and customized for specific use cases and requirements.

Modules

  • anonymizer.py: Module for anonymizing DICOM files by removing patient-related information and renaming them according to a specified format.
  • encryption.py: Module for encrypting patient IDs.
  • extractor.py: Module for extracting metadata from DICOM files.
  • main.py: Main script for executing the data processing pipeline.
  • processor.py: Module for processing medical imaging data.
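
One way such modules could be chained by a main script is sketched below. This is an assumption about the overall shape, not the actual contents of main.py, extractor.py, or processor.py: plain dicts stand in for DICOM headers, and METADATA_KEYS, extract_metadata, strip_patient_name, and run_pipeline are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable

# Hypothetical attributes of interest; the real extractor.py may read others.
METADATA_KEYS = ("Modality", "StudyDate", "Rows", "Columns")

# A pipeline step takes a header dict and returns a new header dict.
Step = Callable[[dict], dict]


def extract_metadata(header: dict, keys: Iterable[str] = METADATA_KEYS) -> dict:
    """Keep only the requested attributes; missing ones are reported as None."""
    return {key: header.get(key) for key in keys}


def strip_patient_name(header: dict) -> dict:
    """Toy anonymization step that removes a single identifying attribute."""
    return {k: v for k, v in header.items() if k != "PatientName"}


def run_pipeline(header: dict, steps: Iterable[Step]) -> dict:
    """Apply each step in order, feeding the output of one into the next."""
    result = header
    for step in steps:
        result = step(result)
    return result


if __name__ == "__main__":
    # Made-up header values, for illustration only.
    header = {"PatientName": "DOE^JANE", "Modality": "MG", "Rows": 3328}
    print(run_pipeline(header, [strip_patient_name, extract_metadata]))
```

Treating each stage as a function from header to header keeps the stages independent, so steps can be reordered, replaced, or tested in isolation.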

Usage

To use the data pipeline, follow these steps:

  1. Clone the repository:
git clone https://github.com/MIMBCD-UI/data-pipeline.git
  2. Create and activate a virtual environment, then install the packages listed in requirements.txt (on Windows, activate with .venv\Scripts\activate instead):
cd data-pipeline
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
  3. Run the main script to execute the data processing pipeline:
python main.py

Contributing

Contributions are welcome! If you'd like to contribute to this project, please fork the repository and submit a pull request with your proposed changes.

License

This project is licensed under the MIT License.

Team

Our team shares ideas and a common purpose, which helps us do better work together. This section lists the people who matter most to this repository, along with their respective links.

Authors

Promoters

Companions

  • Hugo Lencastre
  • Nádia Mourão
  • Miguel Bastos
  • Pedro Diogo
  • João Bernardo
  • Madalena Pedreira
  • Mauro Machado
  • Bruno Dias
  • Bruno Oliveira
  • Luís Ribeiro Gomes

Acknowledgements

This work was partially supported by national funds through FCT, under both the UID/EEA/50009/2013 and LARSyS - FCT Project 2022.04485.PTDC (MIA-BREAST) projects hosted by IST, as well as the BL89/2017-IST-ID and PD/BD/150629/2020 grants. We are indebted to those who gave their time and expertise to evaluate our work and who, among others, are giving us crucial information for the BreastScreening project.

Supporting

Our organization is non-profit, but our activity has many needs, from infrastructure to services. Contributions of time and help support our team and projects.

Contributors

This project exists thanks to all the people who contribute. [Contribute].

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]


FCT, FCCN, ULisboa, IST, HFF, CHTMAD
Departments
DEI
Laboratories
SIPG, ISR, LARSyS, ITI, INESC-ID
Domain
EU, PT