--skip-errors doesn't work for packages #1639

Open
diego-oncoramedical opened this issue Feb 5, 2024 · 2 comments

diego-oncoramedical commented Feb 5, 2024

Overview

Edit: In my case, I only tried foreign key checks, but as @fjuniorr noted below, --skip-errors appears to be broken for all errors when checking a package.

When validating a package using the CLI, --skip-errors does not appear to disable foreign key checks. Validation passes if and only if the foreign keys are commented out in each table schema file.

I'm running the following command:

frictionless validate --trusted --limit-errors 50 --skip-errors [see below] $OUTPUT_DIR/package.json

For the error slug, I've tried:

  • foreign-key (from docs)
  • foreign-key-error (mentioned here)
  • foreignKey (from source code)
  • foreignKeyError (by analogy with foreign-key-error)

I've also tried all four at the same time, separated by commas with no intervening spaces.
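For anyone wanting to repeat these attempts, the runs can be generated systematically. The following is only a sketch: `build_command` is a hypothetical helper, `package.json` stands in for the real package path, and the flags are the ones from the command above.

```python
import shlex

# Slug spellings tried in this report.
SLUGS = ["foreign-key", "foreign-key-error", "foreignKey", "foreignKeyError"]


def build_command(package_path, skip_errors, limit_errors=50):
    """Return the argv list for one validation run (hypothetical helper)."""
    return [
        "frictionless", "validate", "--trusted",
        "--limit-errors", str(limit_errors),
        # Multiple slugs are joined by commas with no intervening spaces.
        "--skip-errors", ",".join(skip_errors),
        package_path,
    ]


# One run per slug, then one run with all four slugs at once.
attempts = [[slug] for slug in SLUGS] + [SLUGS]
for slugs in attempts:
    # "package.json" is a placeholder for the real package path.
    print(shlex.join(build_command("package.json", slugs)))
```

Each printed line can be pasted into a shell, or the argv list can be handed to `subprocess.run` directly.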

Sample output:

                                              dataset                                              
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name                 ┃ type  ┃ path                                                  ┃ status  ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ medical_patient      │ table │ /var/data-pkg/output/Patient_20240112150044.csv       │ VALID   │
│ medical_encounter    │ table │ /var/data-pkg/output/Encounter_20240112150042.csv     │ VALID   │
│ medical_medications  │ table │ /var/data-pkg/output/Medications_20240112150809.csv   │ VALID   │
│ medical_problem      │ table │ /var/data-pkg/output/Problem_20240112150453.csv       │ VALID   │
│ medical_toxicity     │ table │ /var/data-pkg/output/Toxicity_20240112151505.csv      │ INVALID │
│ medical_observations │ table │ /var/data-pkg/output/Observations_20240112155005.csv  │ VALID   │
│ medical_vitals       │ table │ /var/data-pkg/output/Vitals_20240112150819.csv        │ VALID   │
└──────────────────────┴───────┴───────────────────────────────────────────────────────┴─────────┘

                                                                                       medical_toxicity
┏━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row ┃ Field ┃ Type        ┃ Message                                                                                                                                                     ┃
┡━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 2   │ None  │ foreign-key │ Row at position "2" violates the foreign key: for "EMPI": values "......" not found in the lookup table "medical_patient" as "EMPI"                         │
│ 2   │ None  │ foreign-key │ Row at position "2" violates the foreign key: for "MRN": values "......" not found in the lookup table "medical_patient" as "MRN"                           │
│ 2   │ None  │ foreign-key │ Row at position "2" violates the foreign key: for "EncounterNumber": values "......" not found in the lookup table "medical_encounter" as "EncounterNumber" │
│ 3   │ None  │ foreign-key │ Row at position "3" violates the foreign key: for "EMPI": values "......" not found in the lookup table "medical_patient" as "EMPI"                         │

...etc

Info

Environment:

App is running inside the official Python 3.12.1 Alpine Linux Docker image.

The requirements.txt file, in its entirety:

chardet==5.2.0          # Character encoding detection
click==8.1.7            # CLI
frictionless==5.16.1    # Validation
pandas==2.2.0           # CSV loading and cleaning
pyyaml==6.0.1           # Configuration file loading

Package

The package consists of a few unremarkable CSVs:

  • All leading and trailing whitespace is stripped from each field, so we know that's not the issue.
  • All column names are valid Python identifiers.
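The two properties above can be checked mechanically. This is a minimal sketch using only the standard library, not the cleaning code actually used for this package; `check_csv` is a hypothetical name and the sample column names are taken from the error output above.

```python
import csv
import io


def check_csv(text):
    """Return a list of problems found in CSV text (hypothetical helper)."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return problems
    header = rows[0]
    # Every column name should be a valid Python identifier.
    for name in header:
        if not name.isidentifier():
            problems.append(f"column name {name!r} is not a valid identifier")
    # No field should carry leading or trailing whitespace.
    for line_no, row in enumerate(rows[1:], start=2):
        for name, value in zip(header, row):
            if value != value.strip():
                problems.append(f"row {line_no}, {name}: surrounding whitespace")
    return problems


# Sample data is made up; only the column names come from this report.
print(check_csv("EMPI,MRN\n1,2\n"))
```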

Package JSON, presented as YAML for readability:

resources:
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_patient
  path: /var/data-pkg/output/Patient_20240112150044.csv
  schema: /app/schemas/medical/patient.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_encounter
  path: /var/data-pkg/output/Encounter_20240112150042.csv
  schema: /app/schemas/medical/encounter.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_medications
  path: /var/data-pkg/output/Medications_20240112150809.csv
  schema: /app/schemas/medical/medications.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_problem
  path: /var/data-pkg/output/Problem_20240112150453.csv
  schema: /app/schemas/medical/problem.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_toxicity
  path: /var/data-pkg/output/Toxicity_20240112151505.csv
  schema: /app/schemas/medical/toxicity.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_observations
  path: /var/data-pkg/output/Observations_20240112155005.csv
  schema: /app/schemas/medical/observations.yaml
  type: table
- encoding: utf-8
  format: csv
  mediatype: text/csv
  name: medical_vitals
  path: /var/data-pkg/output/Vitals_20240112150819.csv
  schema: /app/schemas/medical/vitals.yaml
  type: table

@fjuniorr (Contributor)

This looks like a more general problem: no error can be skipped when validating a package with the CLI. With frictionless 5.17.0 and this reprex I get:

frictionless validate --skip-errors "blank-label" https://raw.githubusercontent.com/splor-mg/reprex/main/reprex/20231228T143527/datapackage.json
────────────────────────────────────────────────────────────── Dataset ───────────────────────────────────────────────────────────────
               dataset               
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ type  ┃ path     ┃ status  ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━┩
│ data │ table │ data.csv │ INVALID │
└──────┴───────┴──────────┴─────────┘
─────────────────────────────────────────────────────────────── Tables ───────────────────────────────────────────────────────────────
                                         data                                         
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row  ┃ Field ┃ Type        ┃ Message                                               ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ None │ 2     │ blank-label │ Label in the header in field at position "2" is blank │
└──────┴───────┴─────────────┴───────────────────────────────────────────────────────┘

When I validate the data file (or a standalone resource), the check is skipped as expected:

frictionless validate --skip-errors "blank-label" https://raw.githubusercontent.com/splor-mg/reprex/main/reprex/20231228T143527/data.csv
─────────────────────────────────────────────────────── Dataset ────────────────────────────────────────────────────────
              dataset               
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓
┃ name ┃ type  ┃ path     ┃ status ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩
│ data │ table │ data.csv │ VALID  │
└──────┴───────┴──────────┴────────┘
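Until the flag is honored for packages, one possible workaround is to filter the report after the fact. The sketch below assumes the report is available as a dict with the shape `{"valid", "errors", "tasks": [{"name", "valid", "errors": [{"type", ...}]}]}`. That shape is an assumption and should be checked against the real output. `skip_errors_in_report` is a hypothetical helper, not part of frictionless.

```python
def skip_errors_in_report(report, skip):
    """Return a copy of a report dict without errors whose type is in `skip`.

    The report layout is an assumption, not a documented contract.
    """
    skip = set(skip)
    tasks = []
    for task in report.get("tasks", []):
        kept = [e for e in task.get("errors", []) if e.get("type") not in skip]
        # A task counts as valid once no errors remain after filtering.
        tasks.append({**task, "errors": kept, "valid": not kept})
    # Report-level errors (outside any task) still make the report invalid.
    valid = not report.get("errors") and all(t["valid"] for t in tasks)
    return {**report, "tasks": tasks, "valid": valid}


# Made-up report mirroring the reprex above.
report = {
    "valid": False,
    "tasks": [{"name": "data", "valid": False, "errors": [{"type": "blank-label"}]}],
}
print(skip_errors_in_report(report, ["blank-label"])["valid"])
```

This only hides the errors after validation has run. It does not avoid the cost of the checks, and it leaves any summary statistics in the report untouched.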

@diego-oncoramedical (Author)

Good catch. I'll change the title of the ticket to reflect this.

diego-oncoramedical changed the title from "Can't disable foreign key checks in CLI" to "--skip-errors doesn't work for packages" on Apr 30, 2024