type_of_target are not proper to decide multiclass vs continuous #296

Open
chunqishi opened this issue Jan 4, 2024 · 2 comments
chunqishi commented Jan 4, 2024

In a continuous (regression) case, the target values are sometimes floats with no fractional part. However, type_of_target treats [1.0, 2.0, 3.0, 4.0, 5.0] as multiclass.
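The behaviour described here can be reproduced with scikit-learn alone. This is a minimal sketch, assuming a reasonably recent scikit-learn; the variable names are illustrative only.

```python
from sklearn.utils.multiclass import type_of_target

# Floats that all hold whole-number values are treated as discrete labels.
integer_valued = type_of_target([1.0, 2.0, 3.0, 4.0, 5.0])

# A single fractional value is enough to change the inferred type.
fractional = type_of_target([1.0, 2.0, 3.0, 4.0, 5.5])

print(integer_valued)  # multiclass
print(fractional)      # continuous
```

The inference depends only on whether every value equals its integer cast, not on the dtype or on how many distinct values there are.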

Could you provide a setting to assign self._target_dtype explicitly? Alternatively, the result of type_of_target could be changed to continuous when the target has a float dtype and the number of unique values is larger than 10. The target type is currently inferred in _fit:

def _fit(self, X, y, sample_weight, check_input):
    time_init = time.perf_counter()

    if self.verbose:
        logger.info("Binning process started.")
        logger.info("Options: check parameters.")

    _check_parameters(**self.get_params())

    # check X dtype
    if not isinstance(X, (pd.DataFrame, np.ndarray)):
        raise TypeError("X must be a pandas.DataFrame or numpy.ndarray.")

    # check target dtype
    self._target_dtype = type_of_target(y)
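
The two options requested above (an explicit override, or a float-dtype heuristic with a threshold of 10 unique values) could be sketched roughly as follows. The helper `resolve_target_dtype` and its parameters `target_dtype` and `max_unique` are hypothetical names used for illustration; they are not part of optbinning or scikit-learn.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target


def resolve_target_dtype(y, target_dtype=None, max_unique=10):
    """Hypothetical helper: infer the target type with an optional override."""
    # Option 1: the caller states the target type explicitly.
    if target_dtype is not None:
        return target_dtype

    inferred = type_of_target(y)
    y_arr = np.asarray(y)

    # Option 2: a float target with many distinct values is assumed to be
    # continuous even if every value happens to be a whole number.
    if (
        inferred == "multiclass"
        and y_arr.dtype.kind == "f"
        and np.unique(y_arr).size > max_unique
    ):
        return "continuous"
    return inferred


few_values = resolve_target_dtype([1.0, 2.0, 3.0, 4.0, 5.0])
many_values = resolve_target_dtype(np.arange(1, 31, dtype=float))
forced = resolve_target_dtype([1.0, 2.0, 3.0], target_dtype="continuous")
```

With the default threshold, the five-value example from this issue would still be inferred as multiclass, so an explicit override is the more predictable of the two options.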

guillermo-navas-palencia added the question label Jan 8, 2024

guillermo-navas-palencia (Owner) commented

Hi @chunqishi.

You are not the first one to encounter this issue with scikit-learn's type_of_target (https://scikit-learn.org/stable/modules/generated/sklearn.utils.multiclass.type_of_target.html). I will think about it.

guillermo-navas-palencia added this to the v0.19.0 milestone Jan 8, 2024

guillermo-navas-palencia (Owner) commented

I think it makes sense to implement this parameter in the Binning Process.
