Describe the workflow you want to enable
While scikit-learn excels in predictive modeling, users often need detailed statistical summaries to interpret their regression results.
I propose we develop options for users wanting comprehensive statistical reports for models such as LinearRegression(), without impacting model performance.
Describe your proposed solution
Modular Design:
Introduce optional modules or mixins for secondary features.
Users can enable them explicitly when needed.
Feature Flags:
Allow users to toggle specific functionalities.
Lazy Evaluation:
Compute secondary features only when requested.
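To make the mixin and lazy-evaluation ideas concrete, here is a minimal sketch of how an opt-in summary could look. `RegressionSummaryMixin`, `SummarizedLinearRegression`, and the `summary_` attribute are hypothetical names chosen for illustration and are not part of scikit-learn. The sketch also assumes a 1-D target and `fit_intercept=True` when computing the residual degrees of freedom.

```python
from functools import cached_property

import numpy as np
from sklearn.linear_model import LinearRegression


class RegressionSummaryMixin:
    """Hypothetical opt-in mixin; not part of scikit-learn."""

    def fit(self, X, y, **kwargs):
        # Drop any summary cached by a previous fit, then keep references to
        # the training data so the summary can be computed later on demand.
        self.__dict__.pop("summary_", None)
        self._X_train, self._y_train = np.asarray(X), np.asarray(y)
        return super().fit(X, y, **kwargs)

    @cached_property
    def summary_(self):
        # Runs only on first access, so fit() itself does no extra computation.
        residuals = self._y_train - self.predict(self._X_train)
        n, p = self._X_train.shape
        rss = float(residuals @ residuals)
        tss = float(((self._y_train - self._y_train.mean()) ** 2).sum())
        return {
            "n_samples": n,
            "n_features": p,
            "r2": 1.0 - rss / tss,
            # Assumes an intercept was fitted, hence n - p - 1 degrees of freedom.
            "residual_std": (rss / (n - p - 1)) ** 0.5,
        }


class SummarizedLinearRegression(RegressionSummaryMixin, LinearRegression):
    """Users who want the summary pick this class explicitly."""


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.5, -2.0]) + 0.5 + rng.normal(scale=0.1, size=50)

model = SummarizedLinearRegression().fit(X, y)
print(model.summary_)
```

Plain `LinearRegression` is untouched in this design, and the extra cost is paid only by users who opt in and actually read `summary_`.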
Describe alternatives you've considered, if relevant
While statsmodels provides comprehensive summaries (including p-values!), having an integrated solution within scikit-learn would be valuable. The synergy between the two libraries benefits users seeking both prediction and statistical inference.
Using the existing metrics is inconvenient -- I often find myself copying the same code across projects for printing out all the evaluations. Statisticians would appreciate the full summary output.
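For reference, the kind of snippet that ends up copied between projects looks roughly like the sketch below. `regression_report` and `format_report` are hypothetical helper names, not scikit-learn functions; only the imported metric functions come from `sklearn.metrics`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Hypothetical helper: collect common regression metrics in one dict."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
    }


def format_report(report):
    # One "name  value" line per metric, with names padded to a common width.
    width = max(len(name) for name in report)
    return "\n".join(
        f"{name:<{width}}  {value:.4f}" for name, value in report.items()
    )


# Toy values, for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
report = regression_report(y_true, y_pred)
print(format_report(report))
```

A built-in equivalent would remove the need to carry such a helper around, which is the convenience this request is after.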
Additional context
No response
Could you list the exact features that you would like to see reported? I think this matters more than the design questions, which can be settled afterwards.
If this is about reporting metrics and inspection methods, I think we want to explore something along the lines of model cards, which should provide that kind of reporting.
If it is specifically about p-values for linear models, then I recall that this was discussed in the past and a decision was made not to include them.
That is why I think it would be great to know the exact information that you would expect in the report.