Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

349: Implemented round() function #359

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

VidhyaVarshanyJS
Copy link

I have made changes the files
image
Kindly review my implementation.

Copy link

google-cla bot commented Jan 30, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Collaborator

@ianspektor ianspektor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start! 🚀

Missing:

  • Tests
    • Be thorough
    • Test edge cases
    • Test passing an EventSet with a single feature
    • Test passing an EventSet with several features
    • Test passing an EventSet with features of non-accepted types (e.g. string, ints)
    • Test passing an EventSet with some float32 and some float64 features, ensure output types are the same as input ones
  • Adding the new operation to the docs

See last two points in https://temporian.readthedocs.io/en/latest/contributing/#developing-a-new-operator for guidance and examples.

Let's keep this PR for the round() function only, and open a new one for floor() and ceil() once done.

Thanks!

temporian/core/event_set_ops.py Outdated Show resolved Hide resolved
temporian/core/event_set_ops.py Outdated Show resolved Hide resolved
Comment on lines 190 to 199
@classmethod
def allowed_dtypes(cls) -> List[DType]:
return [
DType.INT32,
DType.INT64,
]

@classmethod
def get_output_dtype(cls, feature_dtype: DType) -> DType:
return feature_dtype
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look off:

  • allowed_dtypes specifies the types that the operator can consume - should be DType.FLOAT32 and DType.FLOAT64
  • get_output_dtype returns the output dtype, given the input dtype - in this case it should return the same type as received (I said it should return ints in the issue, edited it, needs to return floats because they have a much larger range than ints)

operator_lib.register_operator(InvertOperator)
operator_lib.register_operator(IsNanOperator)
operator_lib.register_operator(NotNanOperator)
operator_lib.register_operator(AbsOperator)
operator_lib.register_operator(LogOperator)
operator_lib.register_operator(RoundOperator)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra newline here too

@VidhyaVarshanyJS
Copy link
Author

This is a great start! 🚀

Missing:

  • Tests

    • Be thorough
    • Test edge cases
    • Test passing an EventSet with a single feature
    • Test passing an EventSet with several features
    • Test passing an EventSet with features of non-accepted types (e.g. string, ints)
    • Test passing an EventSet with some float32 and some float64 features, ensure output types are the same as input ones
  • Adding the new operation to the docs

See last two points in https://temporian.readthedocs.io/en/latest/contributing/#developing-a-new-operator for guidance and examples.

Let's keep this PR for the round() function only, and open a new one for floor() and ceil() once done.

Thanks!

Thanks for your time!
Shall I create a separate test_round.py or implement test cases within the test_unary.py itself?

@ianspektor
Copy link
Collaborator

Shall I create a separate test_round.py or implement test cases within the test_unary.py itself?

Let's do test_unary.py, since they're implemented in the same file too.

@VidhyaVarshanyJS
Copy link
Author

Hi..
I have added the test cases for the round() in test_unary.py for the all the above combinations that is mentioned.
Kindly review my changes.
Thanks

Comment on lines 133 to 134
with self.assertRaises(TypeError):
_ = evset
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing calling round() on the evset here - this shouldn't be raising anything? did you manage to run the tests locally to ensure they pass?

)
expected = event_set(
timestamps=[1, 2],
features={"a": [11, 12], "b": [1, 3]},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these will pass, since a list of ints will yield an eventset with int features, which won't be equal to the float results of round(). Ensure your new tests are passing by running bazel test //temporian/core/operators/test:test_unary --config=macos --test_output=errors or bazel test //temporian/core/operators/test:test_unary --config=linux --test_output=errors depending on your OS

Comment on lines 136 to 145
def test_round_float32_and_float64_features(self):
evset = event_set(
timestamps=[1, 2],
features={"a": [10.5, 11.7], "b": [1.2, 2.9]},
)
expected = event_set(
timestamps=[1, 2],
features={"a": [11.0, 12.0], "b": [1.0, 3.0]},
same_sampling_as=evset,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which of these are being defined as f32 and which as f64? see the f64 and f32 methods in temporian/test/utils.py to explicitly create an eventset with the desired feature types

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need a clarification. Can I use the numpy for specifying the float32 and float64 separately as

float32

def test_round_float32(self):
    evset = event_set(
        timestamps=[1, 2],
        features={"a": np.array([10.5, 11.7], dtype=np.float32), "b": np.array([1.2, 2.9], dtype=np.float32)},
    )
    expected = event_set(
        timestamps=[1, 2],
        features={"a": np.array([11.0, 12.0], dtype=np.float32), "b": np.array([1.0, 3.0], dtype=np.float32)},
        same_sampling_as=evset,
    )
    assertOperatorResult(self, evset.round(), expected)
    assertOperatorResult(self, round(evset), expected)  # __round__ magic

float64

def test_round_float64(self):
    evset = event_set(
        timestamps=[1, 2],
        features={"a": np.array([10.5, 11.7], dtype=np.float64), "b": np.array([1.2, 2.9], dtype=np.float64)},
    )
    expected = event_set(
        timestamps=[1, 2],
        features={"a": np.array([11.0, 12.0], dtype=np.float64), "b": np.array([1.0, 3.0], dtype=np.float64)},
        same_sampling_as=evset,
    )
    assertOperatorResult(self, evset.round(), expected)
    assertOperatorResult(self, round(evset), expected)  # __round__ magic

Comment on lines 195 to 196
DType.INT32,
DType.INT64,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ints shouldn't be allowed

Copy link
Collaborator

@ianspektor ianspektor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work so far, please run tests locally to make sure they pass before submitting

self: EventSetOrNode,
) -> EventSetOrNode:
"""Rounds the values of an [`EventSet`][temporian.EventSet]'s features to the nearest integer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another line specifying that only float types are allowed, and the output type will always be the same as the input's

temporian/core/event_set_ops.py Show resolved Hide resolved
@ianspektor
Copy link
Collaborator

One more thing, please merge main branch into your PR's branch so that tests are ran on it (fixed in #363)

@javiber javiber changed the title implemented round() function 349: Implemented round() function Mar 4, 2024
Copy link
Collaborator

@javiber javiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, There are some leftovers from the merge on index.md
Black should take care of the incorrect indentations and unnecessary empty lines. If you have any issue running black reach out to me on discord

assertOperatorResult(self, evset.round(), expected)
assertOperatorResult(self, round(evset), expected) # __round__ magic

def test_correct_sin(self) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation of this line is off. Also no need for type annotations in these test

Suggested change
def test_correct_sin(self) -> None:
def test_correct_sin(self):

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should run black . on the root of the repository (might need to call poetry shell or pre-ped poetry run) to fix the other formatting issues

@@ -44,6 +44,8 @@ Check the index on the left for a more detailed description of any symbol.
| ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| [`tp.combine()`][temporian.combine] | Combines events from [`EventSets`][temporian.EventSet] with different samplings. |
| [`tp.glue()`][temporian.glue] | Concatenates features from [`EventSets`][temporian.EventSet] with the same sampling. |
| [`EventSet.abs()`][temporian.EventSet.abs] | Computes the absolute value of the features.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

abs is repeated on line 49 below

@@ -44,6 +44,8 @@ Check the index on the left for a more detailed description of any symbol.
| ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| [`tp.combine()`][temporian.combine] | Combines events from [`EventSets`][temporian.EventSet] with different samplings. |
| [`tp.glue()`][temporian.glue] | Concatenates features from [`EventSets`][temporian.EventSet] with the same sampling. |
| [`EventSet.abs()`][temporian.EventSet.abs] | Computes the absolute value of the features.
| [`EventSet.add_index()`][temporian.EventSet.add_index] | Adds indexes to an [`EventSet`][temporian.EventSet]. |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_index is also repeated below

self: EventSetOrNode,
) -> EventSetOrNode:
"""Rounds the values of an [`EventSet`][temporian.EventSet]'s features to the nearest integer.
only float types are allowed and the output. Output type wil be same as the input type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is incorrect, black should fix this

assertOperatorResult(self, evset.round(), expected)
assertOperatorResult(self, round(evset), expected) # __round__ magic

def test_round_float64(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_round_float64 and test_round_float32 are doing the same thing. You can use functions f64 and f32 to create an array with that type (link).
f64 is used above in test_correct_isnan

@@ -272,6 +277,7 @@ class ArcTanOperator(BaseUnaryOperator):
def op_key_definition(cls) -> str:
return "ARCTAN"


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary empty line

@@ -414,5 +431,6 @@ def arctan(
assert isinstance(input, EventSetNode)

return ArcTanOperator(

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary empty line

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants