Skip to content

Loads healthcare provider information from the National Plan and Provider Enumeration System (NPPES) into SQL Server.

License

Notifications You must be signed in to change notification settings

neilharvey94044/NPPES-Data-Load

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NPPES Data Load

Project: Batch Process for Loading Monthly NPPES Data Into SQL Server

Author: Neil Harvey

Date: April 2020 (during Covid-19 Stay At Home Order)

Tools: Powershell 7 scripts, Transact-SQL, bcp, SQL Server 2019

What is NPPES-Data-Load?

The National Plan and Provider Enumeration System (NPPES) maintains identifiers for all healthcare providers in the United States. Each provider obtains a National Provider Identifier (NPI) stored in the NPPES. This data provides detailed information about each provider, such as licensing, specialties, practice locations, points of contact, etc..
More information on these files can be found at: http://download.cms.gov/nppes/NPI_Files.html. The NPPES also provides a REST API. Additional info at CMS Data Dissemination

At time of this writing Taxonomy Codes can be found at Taxonomy Codes, this is likely to change.

This project downloads and loads these files into a SQL Server Database. The SQL Server I used is running in Docker desktop using image "mcr.microsoft.com/mssql/server:2019-CU3-ubuntu-18.04". This approach requires that you have Docker Desktop running on a machine with Hyper-V, e.g. Windows 10 Pro, or Windows Server. However, there is nothing stopping using these loads against any SQL Server implementation.

Project Goals

  • Each step should be scripted, requiring minimal or no human intervention
  • Download and unzipping of the Monthly NPPES .csv data files:
    • National Provider Identifier (NPI) File
    • Other Name Reference File
    • Practice Location Reference File
    • Endpoint Reference File
    • Taxonomy Codes
  • Wrangle the .csv files into a format that can be loaded using bcp
  • Optionally filter the data by a single state (e.g. CA)
  • Create required database and tables in SQL Server
  • bcp load into the SQL Server tables
  • Provide some representative SQL queries against the data

Files and Steps

  • Put-SQLCredential.ps1

    • Run this script to capture your SQL server credentials and server name. These are encrypted, stored in passwords.json, and will automatically be used by all subsequent scripts.
  • RunRefresh.ps1

    • Runs all child scripts and processes necessary to download, clean, reformat, create necessary tables, and load the data. Review the steps in this script to determine what steps to uncomment. For example, starting the SQL Server instance is commented out.
  • npi_queries.sql

    • Contains several self contained SQL queries against the loaded data, with instructions. I used Microsoft SQL Server Management Studio to run these.
  • Other files

    • There are numerous other .sql, .ps1, and .py files. However, all of the load is driven by RunRefresh.ps1 which should be your starting place to explore the process.

    See Also

    CAHealthFacilityDBLoad - process to load California healthcare facility, services, beds information to SQL Server.

    NPPES-REST-API - a REST API for the healthcare provider data. Written in C# using ASP.NET Core.

    CAHealthQueries - C# queries against healthcare data.

About

Loads healthcare provider information from the National Plan and Provider Enumeration System (NPPES) into SQL Server.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published