neeraj3029/chi-sq-test

Chi-Squared tests

This package runs chi-squared hypothesis tests on frequency data: a goodness-of-fit test against expected frequencies, and a test of independence for a two-way table of observed frequencies.

How to install

This library can be integrated into your project manually. Alternatively, it can be installed from npm or GitHub Packages.

npm install chi-sq-test

How to use

To run a chi-squared goodness-of-fit test on a given dataset:

ChiSqTest.gof(fObs, fExp, ddof)
Documentation
  • fObs: [Array] Array of observed frequencies for each category
          Default: none; this argument is required
  • fExp: [Array] Array of expected frequencies for each category
          Default: all categories are assumed equally likely, so the expected frequency of each category is the mean of the observed frequencies.
  • ddof: [number] Delta degrees of freedom.
          Effective degrees of freedom = k - 1 - ddof, where k is the number of observed frequencies.
          Default: 0
The arguments mirror those of SciPy's scipy.stats.chisquare(f_obs, f_exp, ddof).
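To make the parameter descriptions above concrete, here is a minimal sketch of how the goodness-of-fit statistic and the effective degrees of freedom can be computed by hand. The function name gofStatistic is hypothetical and is not part of chi-sq-test. The sketch assumes the standard Pearson statistic, the sum of (observed - expected)² / expected, and it does not compute the p-value.

```javascript
// Hypothetical helper for illustration only; not the library's implementation.
// Returns the Pearson chi-squared statistic and the effective degrees of freedom.
function gofStatistic(fObs, fExp, ddof = 0) {
  const k = fObs.length;

  // Documented default: every category is equally likely, so each expected
  // frequency is the mean of the observed frequencies.
  const mean = fObs.reduce((sum, o) => sum + o, 0) / k;
  const expected = fExp === undefined ? fObs.map(() => mean) : fExp;

  if (expected.length !== k) {
    throw new Error('fObs and fExp must have the same length');
  }

  // Sum of (observed - expected)^2 / expected over all categories.
  const value = fObs.reduce(
    (sum, o, i) => sum + (o - expected[i]) ** 2 / expected[i],
    0
  );

  // Effective degrees of freedom = k - 1 - ddof.
  return { value, dof: k - 1 - ddof };
}

console.log(gofStatistic([2, 3, 4], [3, 4, 5])); // value is about 0.7833, dof is 2
console.log(gofStatistic([2, 3, 4]));            // value is about 0.6667, dof is 2
```

These inputs are the ones used in the example that follows, so the statistic can be compared against the value field that ChiSqTest.gof reports there.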

Example:

const ChiSqTest = require('chi-sq-test');

const obs = [2, 3, 4]; // observed frequencies
const exp = [3, 4, 5]; // expected frequencies
const ddof = 0;        // delta degrees of freedom (degrees of freedom = 3 - 1 - 0 = 2)

const testres1 = ChiSqTest.gof(obs, exp, ddof);
console.log(testres1);

/*
=> { value: 0.7833333333333332, pValue: 0.6759293959125221 }
*/

const testres2 = ChiSqTest.gof(obs); // fExp defaults to the mean of fObs for every category
console.log(testres2);

/*
=> { value: 0.6666666666666666, pValue: 0.7165313783925148 }
*/

Output:

The gof function returns an object containing the chi-squared statistic (value) and the p-value (pValue) for the given dataset.

Chi-squared test for independence

ChiSqTest.independence(fObs, ddof)
Documentation
  • fObs: [2D Array] 2D array of observed frequencies of intersecting categories Tij = (Ai ∩ Bj)
          Default: none; this argument is required
  • ddof: [number] Delta degrees of freedom.
          Effective degrees of freedom = (k - 1)(m - 1) - ddof, where k and m are the number of categories in sets A and B respectively.
          Default: 0
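As a rough illustration of what a test of independence computes, the sketch below derives the expected frequency of each cell from the row and column totals (row total × column total / grand total) and sums the Pearson terms. The function name independenceStatistic is hypothetical and is not part of chi-sq-test. The sketch assumes no continuity correction is applied, and it leaves out the p-value.

```javascript
// Hypothetical helper for illustration only; not the library's implementation.
// Assumes a rectangular 2D array of observed frequencies and no continuity correction.
function independenceStatistic(fObs, ddof = 0) {
  // Marginal totals for each row and each column.
  const rowTotals = fObs.map((row) => row.reduce((sum, x) => sum + x, 0));
  const colTotals = fObs[0].map((_, j) =>
    fObs.reduce((sum, row) => sum + row[j], 0)
  );
  const total = rowTotals.reduce((sum, x) => sum + x, 0);

  let value = 0;
  fObs.forEach((row, i) => {
    row.forEach((observed, j) => {
      // Expected frequency of cell (i, j) if the two classifications are independent.
      const expected = (rowTotals[i] * colTotals[j]) / total;
      value += (observed - expected) ** 2 / expected;
    });
  });

  // Effective degrees of freedom = (k - 1)(m - 1) - ddof.
  const dof = (rowTotals.length - 1) * (colTotals.length - 1) - ddof;
  return { value, dof };
}

// A 2 x 2 table with row totals 400 and 600 and column totals 300 and 700.
console.log(independenceStatistic([[160, 240], [140, 460]])); // value is about 31.746, dof is 1
```

For this table the expected frequencies are 120, 280, 180 and 420, and each cell differs from its expected frequency by 40.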

Example

Statement: We have an email dataset that is classified in two ways.
  A = {With Images, Without Images}
  B = {Spam, No Spam}

fObs(i,j)    With Images    Without Images
Spam         160            240
No Spam      140            460

Hypotheses:
  H0: Being spam and having image attachments are independent.
  HA: Being spam and having image attachments are dependent.

const ChiSqTest = require('chi-sq-test');
const obs = [
    [160, 240],
    [140, 460]
];

console.log(
    ChiSqTest.independence(obs, 0)
);

/*
=> { value: 31.746031746031747, pValue: 1.7570790822318827e-8 }
*/

Output:

The independence function returns an object containing the chi-squared statistic (value) and the p-value (pValue) for the hypothesis of independence.
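One way to act on the returned object is to compare pValue with a chosen significance level. The sketch below hard-codes the result printed in the email example rather than calling the library, and the 0.05 threshold is an assumed convention, not something chi-sq-test defines.

```javascript
// Result object copied from the email example output above.
const result = { value: 31.746031746031747, pValue: 1.7570790822318827e-8 };

// Assumed significance level; choose one appropriate to your analysis.
const alpha = 0.05;

// Reject H0 when the p-value falls below the significance level.
const rejectNull = result.pValue < alpha;

console.log(
  rejectNull
    ? 'Reject H0: the data suggest that spam status and image attachment are dependent.'
    : 'Fail to reject H0: no evidence against independence at this level.'
);
```

With the email data the p-value is far below 0.05, so this rule rejects H0.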