Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better handle float value comparison #407

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vandyxiaowei
Copy link
Contributor

No description provided.

Copy link
Contributor

@HuanzhiMao HuanzhiMao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vandyxiaowei,

Thanks for the PR!

However, we cannot merge it because the line value = float(value) will convert any value to a float when the parameter is expecting a float value, even if the original model-provided value is of type int. This compromises the purpose of type-checking in the AST evaluation process.

The reason why we think strict typing is necessary can be found in a previous issue reply. The TLDR is that, without looking at the source code, it’s not 100% safe to say that the Python auto-conversion between float and int would always work as expected. Thus in the AST test category, we need to be precise with typing. There are certainly many scenarios where providing a float value for an integer parameter would work perfectly fine, and that’s exactly the purpose of the executable test category, where we are concerned only with the execution result (as opposed to how the result is obtained).

PS, you might noticed that in line 397, we manually convert the value to list when the expected type is tuple, eg value = list(value) —- This is technically a workaround rather than best practice (and it does compromise the purpose of type-checking as well). The conversion is necessary there because any tuple in the possible answer would become a list after being processed through json.dump() and json.load(). I will have a more robust solution to address this issue :)

@danieljannai21
Copy link

Hi @HuanzhiMao,

I just read again your response to the attached issue (which I opened) and I still disagree. I'm commenting here because the attached issue is already closed.

In your response you demonstrate why it's not okay to send a float to an argument that's expecting an int value, but the discussed scenario is the other way around -- sending an int value to a float argument, which I can't think of a reason of why it would be wrong. On top of that, in JSON schema, which is the standard way of representing a function in the API of OpenAI and others, the type used for representing floats is number, which also accepts integers (like 42 in this example in the official documentation.

@HuanzhiMao
Copy link
Contributor

HuanzhiMao commented May 26, 2024

Hey @danieljannai21,

Sorry that issue was closed. I don't have admin access to re-open it, RIP:/ But you can always open a new follow-up issue :)

Regarding your first concern on why the Python auto-conversion from int to float would not always work:

The TLDR is that you cannot represent every int as float in Python.
Python int objects have infinite/arbitrary precision (it is bounded only by memory). On the other hand, float objects do not have infinite precision; numbers are represented as an exponent and a significant, with the latter building up a number from 53 binary fractions. Thus, there are some int values that cannot be accurately represented by a float object. For example, consider a large int 809880023823263820. Note how the value of the last two digits got changed from 20 to 72 after conversion.

>>> x = 809880023823263820
>>> type(x)
<class 'int'>
>>> y = float(x)
>>> type(y)
<class 'float'>
>>> print(x)
809880023823263820
>>> print(y)
8.098800238232639e+17
>>> print(format(y, "f"))
809880023823263872.000000

Now, suppose I have the following function (you probably have seen such interest-rate-calculation-function in the BFCL dataset quite a lot haha):

def calculate_compound_interest(principal: float):
    """
    This function calculates the compound interest on a given principal over 10 periods.
    """
    for _ in range(10):
        principal += principal * 0.05
    return principal

Since we are talking about accuracy here, let's use a high-precision version of the above function.

def calculate_high_precision_compound_interest(principal: float):
    """
    This function calculates the compound interest on a given principal over 10 periods, using Python's decimal module for high precision.
    """
    import decimal
    
    principal = decimal.Decimal(principal)
    rate = decimal.Decimal(0.05)
    for _ in range(10):
        principal += principal * rate
    return principal

The result is below. Sending an int value to a float argument results in an unexpected/wrong value.

>>> large_int_principal = 809880023823263820
>>> large_float_principal = 809880023823263820.0
>>> print(calculate_high_precision_compound_interest(large_int_principal))
1319209219140100709.664993269
>>> print(calculate_high_precision_compound_interest(large_float_principal))
1319209219140100794.367513861

Now regarding your second concern that the OpenAPI standard allows float in number type:

  1. Our evaluation dataset is not limited to the types defined in OpenAPI. We use the "BFCL Type", which for Python is "boolean", "array", "string", "integer", "float", "tuple", "any", "dict".
  2. For models that accept OpenAPI input schema, the conversion from float to OpenAPI's number is similar to how we handle/convert the type for Java and JavaScript functions (a detailed explanation can be found in Java/Javascript split questions #424). Although the parameter type is the broad number, we specify the detailed type limitation in the parameter description. And thus the model should be capable of realizing such restrictions. The code that implements this is here

Let me know if you have more questions! Happy to discuss this further with you :)

@danieljannai21
Copy link

Hey @HuanzhiMao,

Thanks for taking the time to answer my concerns.

The problem in your example is that the number was inaccurately the moment it entered a float variable, and by the time it was converted to decimal.Decimal it was already the wrong number. The calculation that started with an int is actually closer to the correct result (which is 1319209219140100674.793173045685546875 according to this calculator). Regardless, all the examples in the evaluation suit are a lot smaller, so if this is indeed the reason you insist on matching the exact type, I think it would be best to add this type of edge case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants