[Question] Setting values for linear coefficient #6390

velezbeltran · 2024-03-27T15:47:03Z

Summary

Hello! Thank you for the library; it has been invaluable to my work for the past couple of years!

I was wondering whether, from the Python interface, it is possible to manually set the linear model at a leaf when fitting a linear tree. That is, if we train the model with linear_tree=True, is it possible to modify the linear model at each leaf afterwards? If not, I think it would be useful.

Motivation

This is useful if you want to compute the derivatives of the tree and use the linear model as an approximation. That is what we are planning to use it for. In that case, we can differentiate by modifying the linear model and setting some values to 0.
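To make the idea concrete, here is a minimal sketch, assuming the linear model in a leaf has the form intercept + sum of slope_j * x[feature_j]. The function names and the example numbers are made up for illustration and are not part of LightGBM.

```python
def leaf_linear_predict(x, const, features, coeffs):
    """Evaluate one leaf's linear model at the feature vector x."""
    return const + sum(c * x[f] for f, c in zip(features, coeffs))


def leaf_linear_gradient(n_features, features, coeffs):
    """Gradient of the leaf's linear model with respect to x.

    Inside a leaf the model is linear, so the gradient is the slope for
    each feature used by the leaf and 0 for every other feature.
    """
    grad = [0.0] * n_features
    for f, c in zip(features, coeffs):
        grad[f] += c
    return grad


def zero_out(features, coeffs, drop):
    """Return a copy of coeffs with the slopes of the features in `drop` set to 0."""
    return [0.0 if f in drop else c for f, c in zip(features, coeffs)]


# Hypothetical leaf: slopes on features 0, 3 and 4 of a 5-feature input.
features = [0, 3, 4]
coeffs = [0.89, 0.12, 3.14]
grad = leaf_linear_gradient(5, features, coeffs)
reduced = zero_out(features, coeffs, drop={3})
```

Doing the same thing on a trained model would require writing the modified slopes back into each leaf, which is the part that is missing today.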

Description

Essentially, having some function that is similar to set_leaf_output but for the coefficients.

jameslamb · 2024-03-27T16:31:13Z

Thanks for using LightGBM.

I've edited your post to actually use set_leaf_output in plaintext, so this could be found from search engines.

I think this is an interesting idea. Could you write some pseudo-code showing what you'd like the interface to look like? For example, would it be like this?

Booster.set_linear_leaf_coefficients(
   tree_id=1234,
   leaf_id=5,
   constant=100.5,
   beta=0.89
)

(I don't recall if LightGBM linear models have a constant, would have to double-check)

aagrande · 2024-03-27T18:57:41Z

Thank you for the prompt reply @jameslamb! I collaborate with @velezbeltran.

When linear_tree=True, each leaf has:

leaf_const: intercept of the linear model.
leaf_features: indices of the numerical features in the leaf's branch.
leaf_coeff: slopes of the linear model, one for each feature.

So the interface may look like this:

Booster.set_leaf_linear_model(
   tree_id=1234,
   leaf_id=5,
   constant=100.5,
   features=[0, 3, 4],
   coefficients=[0.89, 0.12, 3.14]
)

To modify the coefficients within a leaf, we need to know which features appear in the leaf's linear model. So the set method would be paired with a get method (similar to get_leaf_output and set_leaf_output):

Booster.get_leaf_linear_model(
   tree_id=1234,
   leaf_id=5
)
  """
  Return intercept, features, and slopes of the linear model.
  """

My understanding is that at the moment the only method to access the linear coefficients is via Booster.dump_model().
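As a rough sketch of that read-only route, the helper below walks the dict returned by Booster.dump_model() and collects leaf_const, leaf_features, and leaf_coeff for every leaf. The helper name collect_linear_leaves is hypothetical, and the key names for the tree structure ("tree_info", "tree_structure", "left_child", "right_child", "leaf_index") are what I believe the dump contains, so please check them against your LightGBM version.

```python
def collect_linear_leaves(model_dump):
    """Map (tree_index, leaf_index) -> linear model of that leaf.

    `model_dump` is expected to be the dict returned by Booster.dump_model()
    for a model trained with linear_tree=True.
    """
    leaves = {}

    def walk(node, tree_index):
        if "left_child" in node:  # internal split node
            walk(node["left_child"], tree_index)
            walk(node["right_child"], tree_index)
            return
        # Leaf node. A tree with a single leaf may have no "leaf_index" key.
        leaf_index = node.get("leaf_index", 0)
        leaves[(tree_index, leaf_index)] = {
            "leaf_const": node.get("leaf_const"),
            "leaf_features": node.get("leaf_features", []),
            "leaf_coeff": node.get("leaf_coeff", []),
        }

    for tree in model_dump["tree_info"]:
        walk(tree["tree_structure"], tree["tree_index"])
    return leaves


# Usage with a trained model (not run here):
#   leaves = collect_linear_leaves(booster.dump_model())
#   leaves[(0, 2)]["leaf_coeff"]
```

This only reads the values; changing the returned dict does not change the Booster, which is why a set method would still be needed.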

jameslamb · 2024-04-01T03:35:13Z

Thanks for that, makes sense to me!

We'd have to figure out specifics like how much validation to do and how to test this, but in general I think this would be a great addition to the library: it would give linear models functionality similar to what you can get for regular single-value leaf nodes with set_leaf_output().

I think we'd want to add this at the level of the C API and keep the logic on the Python side as minimal as possible.

@guolinke @shiyu1994 @jmoralez @borchero @btrotta what do you think about this? I think I should not be the one to decide alone whether or not we accept an expansion of the library's API like this.

borchero · 2024-04-11T23:05:06Z

I personally cannot gauge the usefulness of this feature and believe that this is quite a niche requirement. That being said, I also don't see a reason not to expose the coefficients of the linear models via the Python API and, similarly, to allow these values to be modified.

Regarding testing, I don't have a lot of concerns: it seems to me like this would essentially be about implementing "getter/setter" methods for the coefficients.

shiyu1994 · 2024-04-15T16:28:09Z

> I think we'd want to add this at the level of the C API and keep the logic on the Python side as minimal as possible.

I agree. I can help to implement this feature in the C API.

jameslamb added the feature request label Mar 27, 2024
