Issue with UtilityParity indexing #1338

Open
adrinjalali opened this issue Jan 24, 2024 · 4 comments

Comments

adrinjalali (Member) commented Jan 24, 2024

We have this code:

        self.pos_basis = pd.DataFrame()
        self.neg_basis = pd.DataFrame()
        self.neg_basis_present = pd.Series(dtype="float64")
        zero_vec = pd.Series(0.0, self.index)
        i = 0

        for e in event_vals:
            # Constraints on the final group are redundant, so they are not
            # included in the basis.
            for g in group_vals[:-1]:
                self.pos_basis[i] = 0 + zero_vec
                self.neg_basis[i] = 0 + zero_vec
                self.pos_basis[i]["+", e, g] = 1
                self.neg_basis[i]["-", e, g] = 1
                self.neg_basis_present.at[i] = True
                i += 1

which causes a few issues due to indexing and copy-on-write (a minimal sketch of the pitfall follows the rewritten code below), so a better way of writing the same code seems to be:

        self.neg_basis_present = pd.Series(dtype="float64")
        col_count = len(event_vals) * (len(group_vals) - 1)
        self.pos_basis = pd.DataFrame(0.0, index=self.index, columns=range(col_count))
        self.neg_basis = pd.DataFrame(0.0, index=self.index, columns=range(col_count))

        i = 0

        for e in event_vals:
            # Constraints on the final group are redundant, so they are not
            # included in the basis.
            for g in group_vals[:-1]:
                self.pos_basis.loc[("+", e, g), i] = 1
                self.neg_basis.loc[("-", e, g), i] = 1
                self.neg_basis_present.at[i] = True
                i += 1
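
For context, here is a minimal sketch of the copy-on-write pitfall the original code runs into (assuming pandas 2.x; this is not fairlearn code, and the toy sign/event/group_id keys just mimic the index in the outputs below). With copy-on-write enabled, the chained assignment df[i][key] = 1 writes to a temporary copy of the column and never reaches the frame, whereas a single .loc call does:

    import pandas as pd

    # Opt in to copy-on-write explicitly so the behaviour does not depend on the
    # pandas version being used.
    pd.set_option("mode.copy_on_write", True)

    index = pd.MultiIndex.from_tuples(
        [("+", "label=0", "0,0"), ("+", "label=0", "1,1")],
        names=["sign", "event", "group_id"],
    )
    df = pd.DataFrame(0.0, index=index, columns=[0])

    # Chained assignment, as in the original code: the write lands on a temporary
    # copy of column 0 and never reaches df.
    df[0]["+", "label=0", "0,0"] = 1
    print(df[0].sum())  # 0.0

    # Single .loc call, as in the rewrite: the write sticks.
    df.loc[("+", "label=0", "0,0"), 0] = 1
    print(df[0].sum())  # 1.0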

However, the original code produces:

                         0    1    2    3    4    5    6    7    8    9
sign event   group_id                                                  
+    label=0 0,0       1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             1,1       0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             2,2       0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     label=1 1,1       0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0
             3,3       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0
             4,4       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0
             5,5       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
-    label=0 0,0       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             1,1       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             2,2       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     label=1 1,1       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             3,3       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             4,4       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             5,5       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

while the rewritten code produces:

                         0    1    2    3    4    5    6    7    8    9
sign event   group_id                                                  
+    label=0 0,0       1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             1,1       0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             2,2       0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     label=1 1,1       0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0
             3,3       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0
             4,4       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0
             5,5       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
-    label=0 0,0       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             1,1       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             2,2       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     label=1 1,1       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             3,3       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             4,4       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
             5,5       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
+    label=0 3,3       NaN  NaN  NaN  1.0  NaN  NaN  NaN  NaN  NaN  NaN
             4,4       NaN  NaN  NaN  NaN  1.0  NaN  NaN  NaN  NaN  NaN
     label=1 0,0       NaN  NaN  NaN  NaN  NaN  1.0  NaN  NaN  NaN  NaN
             2,2       NaN  NaN  NaN  NaN  NaN  NaN  NaN  1.0  NaN  NaN

I don't really understand this part of the code. It seems @MiroDudik wrote it, but I'm not sure whether he has time to check. Anybody? @fairlearn/fairlearn-maintainers

Note that this change is necessary with newer pandas releases.

romanlutz (Member) commented

Do you have the example code which produces this output? It looks like the last 4 rows are added in the second one. Reading the code, it's not obvious to me why that is. Might need to step through.

adrinjalali (Member, Author) commented

This test is the one generating the data for the above matrices: TestEqualizedOdds::test_many_sensitive_feature_groups_warning

Here is the code to trigger the issue: #1339, although the issue was originally triggered when the previous pandas PR was merged, so to compare you need to check out the PR where pyarrow was added, or anything before that.

riedgar-ms (Member) commented

I have a feeling this is related to the trouble I'm having with #1351. That is eventually failing due to a vector being the wrong size for multiplication... and the reason it's the wrong size seems to be a few NaN entries.
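
A rough sketch of that failure mode, assuming the extra NaN-filled rows shown above are what reach the multiplication (the sizes 14, 4 and 10 are made up to mirror the tables in this issue): once the basis frame gains rows that a vector built from the original index does not have, the shapes no longer line up.

    import numpy as np
    import pandas as pd

    # 14 rows mirrors the original index above; 4 extra all-NaN rows mirrors the
    # rows appended in the second output.
    basis = pd.DataFrame(0.0, index=range(14), columns=range(10))
    extra = pd.DataFrame(np.nan, index=range(14, 18), columns=range(10))
    basis_with_extra_rows = pd.concat([basis, extra])

    weights = np.ones(14)  # built from the original 14-row index

    print(basis.T.values @ weights)  # fine: (10, 14) @ (14,)
    try:
        basis_with_extra_rows.T.values @ weights  # (10, 18) @ (14,)
    except ValueError as exc:
        print("shape mismatch:", exc)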

riedgar-ms (Member) commented

Weirdly, pushing the pandas version back isn't 'fixing' the issue either. I'm not sure why that is :-(
