better handle float value comparison #407

vandyxiaowei · 2024-05-03T03:42:30Z

No description provided.

HuanzhiMao

Thanks for the PR!

However, we cannot merge it because the line value = float(value) will convert any value to a float when the parameter is expecting a float value, even if the original model-provided value is of type int. This compromises the purpose of type-checking in the AST evaluation process.

The reason why we think strict typing is necessary can be found in a previous issue reply. The TLDR is that, without looking at the source code, it’s not 100% safe to say that the Python auto-conversion between float and int would always work as expected. Thus in the AST test category, we need to be precise with typing. There are certainly many scenarios where providing a float value for an integer parameter would work perfectly fine, and that’s exactly the purpose of the executable test category, where we are concerned only with the execution result (as opposed to how the result is obtained).

PS, you might noticed that in line 397, we manually convert the value to list when the expected type is tuple, eg value = list(value) —- This is technically a workaround rather than best practice (and it does compromise the purpose of type-checking as well). The conversion is necessary there because any tuple in the possible answer would become a list after being processed through json.dump() and json.load(). I will have a more robust solution to address this issue :)

danieljannai21 · 2024-05-26T06:56:21Z

Hi @HuanzhiMao,

I just read again your response to the attached issue (which I opened) and I still disagree. I'm commenting here because the attached issue is already closed.

In your response you demonstrate why it's not okay to send a float to an argument that's expecting an int value, but the discussed scenario is the other way around -- sending an int value to a float argument, which I can't think of a reason of why it would be wrong. On top of that, in JSON schema, which is the standard way of representing a function in the API of OpenAI and others, the type used for representing floats is number, which also accepts integers (like 42 in this example in the official documentation.

HuanzhiMao · 2024-05-26T16:20:45Z

Hey @danieljannai21,

Sorry that issue was closed. I don't have admin access to re-open it, RIP:/ But you can always open a new follow-up issue :)

Regarding your first concern on why the Python auto-conversion from int to float would not always work:

The TLDR is that you cannot represent every int as float in Python.
Python int objects have infinite/arbitrary precision (it is bounded only by memory). On the other hand, float objects do not have infinite precision; numbers are represented as an exponent and a significant, with the latter building up a number from 53 binary fractions. Thus, there are some int values that cannot be accurately represented by a float object. For example, consider a large int 809880023823263820. Note how the value of the last two digits got changed from 20 to 72 after conversion.

>>> x = 809880023823263820
>>> type(x)
<class 'int'>
>>> y = float(x)
>>> type(y)
<class 'float'>
>>> print(x)
809880023823263820
>>> print(y)
8.098800238232639e+17
>>> print(format(y, "f"))
809880023823263872.000000

Now, suppose I have the following function (you probably have seen such interest-rate-calculation-function in the BFCL dataset quite a lot haha):

def calculate_compound_interest(principal: float):
    """
    This function calculates the compound interest on a given principal over 10 periods.
    """
    for _ in range(10):
        principal += principal * 0.05
    return principal

Since we are talking about accuracy here, let's use a high-precision version of the above function.

def calculate_high_precision_compound_interest(principal: float):
    """
    This function calculates the compound interest on a given principal over 10 periods, using Python's decimal module for high precision.
    """
    import decimal
    
    principal = decimal.Decimal(principal)
    rate = decimal.Decimal(0.05)
    for _ in range(10):
        principal += principal * rate
    return principal

The result is below. Sending an int value to a float argument results in an unexpected/wrong value.

>>> large_int_principal = 809880023823263820
>>> large_float_principal = 809880023823263820.0
>>> print(calculate_high_precision_compound_interest(large_int_principal))
1319209219140100709.664993269
>>> print(calculate_high_precision_compound_interest(large_float_principal))
1319209219140100794.367513861

Now regarding your second concern that the OpenAPI standard allows float in number type:

Our evaluation dataset is not limited to the types defined in OpenAPI. We use the "BFCL Type", which for Python is "boolean", "array", "string", "integer", "float", "tuple", "any", "dict".
For models that accept OpenAPI input schema, the conversion from float to OpenAPI's number is similar to how we handle/convert the type for Java and JavaScript functions (a detailed explanation can be found in Java/Javascript split questions #424). Although the parameter type is the broad number, we specify the detailed type limitation in the parameter description. And thus the model should be capable of realizing such restrictions. The code that implements this is here

Let me know if you have more questions! Happy to discuss this further with you :)

danieljannai21 · 2024-05-28T11:20:29Z

Hey @HuanzhiMao,

Thanks for taking the time to answer my concerns.

The problem in your example is that the number was inaccurately the moment it entered a float variable, and by the time it was converted to decimal.Decimal it was already the wrong number. The calculation that started with an int is actually closer to the correct result (which is 1319209219140100674.793173045685546875 according to this calculator). Regardless, all the examples in the evaluation suit are a lot smaller, so if this is indeed the reason you insist on matching the exact type, I think it would be best to add this type of edge case.

handle float value comparison

015e2fa

HuanzhiMao suggested changes May 8, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

better handle float value comparison #407

better handle float value comparison #407

vandyxiaowei commented May 3, 2024

HuanzhiMao left a comment

danieljannai21 commented May 26, 2024

HuanzhiMao commented May 26, 2024 •

edited

danieljannai21 commented May 28, 2024

better handle float value comparison #407

Are you sure you want to change the base?

better handle float value comparison #407

Conversation

vandyxiaowei commented May 3, 2024

HuanzhiMao left a comment

Choose a reason for hiding this comment

danieljannai21 commented May 26, 2024

HuanzhiMao commented May 26, 2024 • edited

danieljannai21 commented May 28, 2024

HuanzhiMao commented May 26, 2024 •

edited