data can be loaded only once #1210

Open
anilsh opened this issue Feb 11, 2023 · 2 comments
Labels
Waiting for OP's Response: Waiting for original poster's response, and will close if that doesn't happen for a while.

Comments


anilsh commented Feb 11, 2023

When I try to run GridSearch twice, or ExponentiatedGradient after GridSearch, the constraints object raises the following error:

AssertionError: data can be loaded only once

The full stack trace is:

File ~\OneDrive - EY\fairness2\FSRM-shEYzam-repos\shazamlib\group_fairness.py:313, in GroupFairness.reduction_grid_search(self, base_model)
    311 # train model 
    312 print ('Training base model on specified constraints..')
--> 313 model_gridsearch.fit(self.X_train, self.y_train, sensitive_features=self.S_train)
    314 self.inprocess['model'] = model_gridsearch
    316 # make predictions

File ~\Anaconda3\envs\sheyzam-fairness-env\lib\site-packages\fairlearn\reductions\_grid_search\grid_search.py:143, in GridSearch.fit(self, X, y, **kwargs)
    141 # Prep the parity constraints and objective
    142 logger.debug("Preparing constraints and objective")
--> 143 self.constraints.load_data(X, y, **kwargs)
    144 objective = self.constraints.default_objective()
    145 objective.load_data(X, y, **kwargs)

File ~\Anaconda3\envs\sheyzam-fairness-env\lib\site-packages\fairlearn\reductions\_moments\utility_parity.py:333, in DemographicParity.load_data(self, X, y, sensitive_features, control_features)
    331 base_event = pd.Series(data=_ALL, index=y_train.index)
    332 event = _merge_event_and_control_columns(base_event, cf_train)
--> 333 super().load_data(X, y_train, event=event, sensitive_features=sf_train)

File ~\Anaconda3\envs\sheyzam-fairness-env\lib\site-packages\fairlearn\reductions\_moments\utility_parity.py:146, in UtilityParity.load_data(self, X, y, sensitive_features, event, utilities)
    123 def load_data(
    124     self,
    125     X,
   (...)
    130     utilities=None,
    131 ):
    132     """Load the specified data into this object.
    133 
    134     This adds a column `event` to the `tags` field.
   (...)
    144 
    145     """
--> 146     super().load_data(X, y, sensitive_features=sensitive_features)
    147     self.tags[_EVENT] = event
    148     if utilities is None:

File ~\Anaconda3\envs\sheyzam-fairness-env\lib\site-packages\fairlearn\reductions\_moments\moment.py:42, in Moment.load_data(self, X, y, sensitive_features)
     30 def load_data(self, X, y: pd.Series, *, sensitive_features: pd.Series = None):
     31     """Load a set of data for use by this object.
     32 
     33     Parameters
   (...)
     40         The sensitive feature vector (default None)
     41     """
---> 42     assert self.data_loaded is False, "data can be loaded only once"
     43     if sensitive_features is not None:
     44         assert isinstance(sensitive_features, pd.Series)

AssertionError: data can be loaded only once
@hildeweerts added the "Waiting for OP's Response" label on Apr 17, 2023.
@hildeweerts (Contributor) commented:

Hi @anilsh. Please follow the instructions from the bug report template to print all dependencies.

@romanlutz (Member) commented:

This is by design, AFAIK. If we allowed loading multiple times, you could have something like:

constraint = DemographicParity()
eg = ExponentiatedGradient(constraints=constraint, ...)
eg.fit(...)  # calls load_data and sets fields internal to the moment
constraint.load_data(different_data)

In other words, one could mess up the constraint object in weird ways. I can see two changes we could make:

  1. Perhaps load_data should be _load_data to avoid giving people the impression that it's something they could use (?), and
  2. perhaps we should clone the constraints object before using it internally. That way, we could pass the same constraints object to several different mitigators without corrupting it in the process.

@MiroDudik wdyt?
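The cloning idea in suggestion 2 can be sketched without Fairlearn at all. The toy `Moment` below reproduces the same load-once guard from `moment.py`, and the toy `Mitigator` deep-copies the constraint before loading data into it, so the caller's object is never consumed and can be reused across mitigators. The class names and the use of `copy.deepcopy` are illustrative assumptions, not Fairlearn's actual API (Fairlearn might prefer an sklearn-style `clone`):

```python
import copy

class Moment:
    """Toy stand-in for fairlearn's Moment: data may be loaded only once."""
    def __init__(self):
        self.data_loaded = False

    def load_data(self, X, y):
        # Same guard that produces the error reported in this issue.
        assert self.data_loaded is False, "data can be loaded only once"
        self.X, self.y = X, y
        self.data_loaded = True

class Mitigator:
    """Toy mitigator that clones its constraint before using it (suggestion 2)."""
    def __init__(self, constraints):
        self.constraints = constraints

    def fit(self, X, y):
        # Work on a private copy so the caller's constraint stays pristine.
        self._constraints = copy.deepcopy(self.constraints)
        self._constraints.load_data(X, y)

constraint = Moment()
Mitigator(constraint).fit([1, 2], [0, 1])
Mitigator(constraint).fit([3, 4], [1, 0])  # second fit succeeds: no AssertionError
```

Without the `deepcopy`, the second `fit` would trip the `data can be loaded only once` assertion exactly as in the traceback above, because both mitigators would call `load_data` on the same shared object.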
