Enhancement: Add Summary Output for Linear Regression Models #28996

IsabelBody · 2024-05-11T06:06:26Z

Describe the workflow you want to enable

While scikit-learn excels in predictive modeling, users often need detailed statistical summaries to interpret their regression results.
I propose we develop options for users wanting comprehensive statistical reports for models such as LinearRegression(), without impacting model performance.

Describe your proposed solution

Modular Design: Introduce optional modules or mixins for secondary features, which users enable explicitly when needed.
Feature Flags: Allow users to toggle specific functionalities.
Lazy Evaluation: Compute secondary features only when requested.
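
To make the three ideas above concrete, here is a minimal sketch of how an opt-in, lazily evaluated summary could look. `RegressionSummaryMixin`, `SummarizedLinearRegression`, and the `summary_` attribute are hypothetical names invented for illustration; none of them exist in scikit-learn, and the reported fields are only an assumption about what a summary might contain.

```python
from functools import cached_property

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score


class RegressionSummaryMixin:
    """Hypothetical opt-in mixin; not part of scikit-learn."""

    def fit(self, X, y, sample_weight=None):
        super().fit(X, y, sample_weight=sample_weight)
        # Keep the training data so the summary can be built later on demand.
        self._X_fit, self._y_fit = np.asarray(X), np.asarray(y)
        # Drop any summary cached from a previous fit.
        self.__dict__.pop("summary_", None)
        return self

    @cached_property
    def summary_(self):
        # Runs only on first access, so fit/predict alone pay no extra cost.
        y_pred = self.predict(self._X_fit)
        return {
            "n_samples": int(self._X_fit.shape[0]),
            "n_features": int(self._X_fit.shape[1]),
            "r2": float(r2_score(self._y_fit, y_pred)),
            "mse": float(mean_squared_error(self._y_fit, y_pred)),
        }


class SummarizedLinearRegression(RegressionSummaryMixin, LinearRegression):
    """LinearRegression with the hypothetical summary mixin enabled."""
```

Users who never touch `summary_` get plain `LinearRegression` behaviour; the summary is computed once, on first access, and recomputed only after the next call to `fit`.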

Describe alternatives you've considered, if relevant

While statsmodels provides comprehensive summaries (including p-values!), having an integrated solution within scikit-learn would be valuable. The synergy between the two libraries benefits users seeking both prediction and statistical inference.
Using the existing metrics is inconvenient -- I often find myself copying the same code across projects to print out all the evaluation metrics. Statisticians would appreciate the full summary output.
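
For reference, the kind of helper that ends up copied between projects looks roughly like the sketch below. The function name `regression_report` and the choice of metrics are assumptions made for illustration; only the `sklearn.metrics` functions it calls are real.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Return a small plain-text table of common regression metrics."""
    metrics = {
        "R^2": r2_score(y_true, y_pred),
        "MSE": mean_squared_error(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
    }
    # One line per metric: left-aligned name, right-aligned value.
    return "\n".join(
        f"{name:<5}{float(value):>12.4f}" for name, value in metrics.items()
    )


if __name__ == "__main__":
    print(regression_report([1, 2, 3, 4], [1, 2, 3, 5]))
```

This covers goodness-of-fit metrics only; it does not produce coefficient standard errors or p-values, which is where statsmodels goes further.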

Additional context

No response

glemaitre · 2024-05-15T17:24:33Z

Could you specify exactly which features you would like to see reported? I think this matters more than the design questions, which can be addressed afterwards.

Basically, if this is about reporting metrics and inspection methods, I think we want to explore something along the lines of model cards, which should provide such reporting.

If it is related specifically to p-values of the linear models, then I recall that this discussion happened in the past and a decision was made to not include them.

That's why I think it would be great to know exactly what information you would expect in the report.

lorentzenchr · 2024-05-15T20:09:54Z

x-refs for p-values #16802

IsabelBody added the Needs Triage (Issue requires triage) and New Feature labels on May 11, 2024

glemaitre added the Needs Decision - Include Feature (Requires decision regarding including feature) label and removed the Needs Triage (Issue requires triage) label on May 15, 2024
