Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise Error After Failed Cast in GEOS Impls #341

Merged
merged 10 commits into from
Nov 11, 2022
Merged

Conversation

keithdoggett
Copy link
Member

Summary

Raise errors instead of returning nil when objects cannot be properly cast to GEOS geometries in both the capi and FFI implementations.

These methods are primarily called when a method requires a second geometry (ex. intersects? takes a rhs argument). So now if invalid data is given in rhs an error will be raised.

…Remove null checks from methods which call convert_to_fg_geometry
…o_convert_to_geos_geometry to raise InvalidGeometry instead. Remove NULL checks from functions that call rgeo_convert_to_geos_geometry.
Comment on lines 835 to 836
rb_raise(rb_eRGeoInvalidGeometry,
"Unable to cast the geometry to the GEOS Factory");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests were failing because they check that an InvalidGeometry error is raised rather than any error, so I left the conditional in for now. We can remove it all together and allow Check_TypedStruct to raise, but we'll need to fix the tests.

}

Check_TypedStruct(object, &rgeo_geometry_type);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May not be necessary, but added it for safety.

Comment on lines 425 to 428
prep = rgeo_request_prepared_geometry(self_data);
if (prep)
val = GEOSPreparedDisjoint(prep, rhs_geom);
else
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed all of the checks since we should no longer have a NULL rhs_geom.

Comment on lines 69 to 71
if (state || !geom) {
break;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok to remove since rgeo_convert_to_detached_geos_geometry will raise on this condition.

Comment on lines 256 to 261
if (state) {
rb_exc_raise(rb_errinfo());
}
if (!exterior_geom) {
return Qnil;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think this is safe to remove because rgeo_convert_to_detached_geos_geometry checks for both of these, but I'm not quite sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm fully understanding (for now), You are ignoring the state, which seems wrong to me

if !fg
false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar situation here. We can remove the nil checks

@keithdoggett keithdoggett changed the title Geos raise on null geom Raise Error After Failed Cast in GEOS Impls Oct 29, 2022
args.klasses = &klass;
args.state = &state;

geom = rb_protect(rgeo_convert_to_detached_geos_geometry_wrapper,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're raising now, this function and the coord_seq_from_array function had memory leaks because we allocated data before calling rgeo_convert_* in each of them. This wraps those calls in rb_protect so that we can handle the error, free the existing data and then re-raise.

I could go with rb_rescue, but rb_protect feels a little nicer to handle all of the error logic within the original caller.

Comment on lines 256 to 261
if (state) {
rb_exc_raise(rb_errinfo());
}
if (!exterior_geom) {
return Qnil;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm fully understanding (for now), You are ignoring the state, which seems wrong to me

Comment on lines 867 to 868
rb_raise(rb_eRGeoInvalidGeometry,
"Unable to cast the geometry to the GEOS Factory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay now I got it. You are now raising here if the state is changed.

If think this needs a tiny bit of rethinking. For one contributor of our c extension, it would feel weird to pass a state, and not have to care about it.

I'd go either of two ways:

1. Propagate the state

Use the state, propagate up and raise in the upper method. So here, we'd just rb_raise if the object is NIL_P, or actually we'd set the state to one and set the correct error to the last ruby error. For instance using rb_set_errinfo

Get rid of the state in the function parameters

Have an intern state only. This raises the concern you had: now, calling this method can raise.

My preference

I'd rather go for (1.) and say that none of our c utility method should raise, but all should act as rb_protect. This, for me, is more intuitive and you can see in the method arguments that you'll have to handle a potential error. Rather than with (2.), you have to know the implementation details of the methods to know that a method raises or not.

However, (2.) might be way less verbose, in that case
I'd be ok to have some compromise that relies on naming (e.g. a utility that ends with _bang can raise for instance)

Copy link
Member

@BuonOmo BuonOmo Oct 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For (1.), if we are scared of using rb_set_errinfo (because rb_protect actually seems to do much more), we could:

  if (NIL_P(object))
    rb_protect(rb_exc_raise, rb_exc_new_str(rb_eRGeoInvalidGeometry, "Unable to cast the geometry to the GEOS Factory"), state);

  if (*state)
    return NULL;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution. We'd have to do a NULL check but at least we can raise right then and halt execution instead of letting it propagate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look at this soon and see if any other issues come up with it, but this seems like a fairly straightforward fix. Do you suggest doing this for both of the rgeo_convert_* methods or just the detached version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess both of them get used in a similar way that we see as an issue here so I can do both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BuonOmo I'm going through this a little more now and I'm not sure we can meet the guarantee of the rgeo_convert_* methods not raising since we use Check_TypedStruct to validate the data types which may raise.

I can modify the detached version so that if it's NULL or state is returned, the user should handle the error, and the state variable doesn't just disappear, but I don't think it makes sense to pull the Check_TypedStruct calls out of these functions. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6f24106 This is the commit where we a added that check btw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BuonOmo looked at this some more. It seems like I can replace these Check_TypedStruct calls with the rgeo_is_geos_object or rgeo_check_geos_object functions we defined and wrap it in rb_protect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keithdoggett it seems like there was a lot more to dig than what I expected. Maybe raising is the simpler option?

If you feel like you have the problem under control however, I fully agree with your choices.

My last concern is about performances, they may be a bit degraded when using protect. But tbh I don't know, and I'm not sure for now how to handle that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can look into rb_protect and see if performance is going to be really degraded. I think rb_protect is intended for calling ruby methods from C code, but the interface is much nicer since you don't have to define a callback function with specific args like you do with rb_rescue.

I would say from a "coding flow" standpoint it feels good right now and the 2 rgeo_convert_* functions have a clear error handling pattern through state, but I can investigate rb_protect.

@keithdoggett
Copy link
Member Author

keithdoggett commented Nov 4, 2022

@BuonOmo my most recent changes propagate state through the internal rgeo_convert_* functions and the caller must determine how to handle an error. This makes cleanup a lot easier in cases where some data may have already been allocated.

Thanks for the suggestion on that

ext/geos_c_impl/factory.c Outdated Show resolved Hide resolved
ext/geos_c_impl/factory.c Outdated Show resolved Hide resolved
ext/geos_c_impl/geometry.c Outdated Show resolved Hide resolved
Copy link
Member

@BuonOmo BuonOmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we have something much cleaner now!

@keithdoggett
Copy link
Member Author

@BuonOmo thanks for the review! The only thing that is a minor concern (although it doesn't seem to be a big deal) is that rb_protect expects the given function to return a VALUE. The compiler gives a warning about this, but as far as I can tell it doesn't cause any crashes or other issues.

If we want to match the return type, we could wrap rb_raise/rb_exc_raise in a function like this

VALUE rb_raise_value(VALUE error_type, char* cstr) {
  rb_raise(error_type, cstr);
  return Qnil;
}

@BuonOmo
Copy link
Member

BuonOmo commented Nov 4, 2022

I think I'd add the method to avoid having too much warnings. However, not such if that will really silent the error: you might have another error then saying that rb_raise_value is not returning and should be declared as such

@keithdoggett
Copy link
Member Author

@BuonOmo I added a function that returns a VALUE. Similar to this use case from the ruby library

https://github.com/ruby/ruby/blob/d92f09a5eea009fa28cd046e9d0eb698e3d94c5c/ext/-test-/exception/ensured.c#L21-L33

I also cleaned up a few other warnings.

Comment on lines +15 to +16
VALUE
rb_exc_raise_value(VALUE exc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a short comment here for the reason?

@keithdoggett keithdoggett merged commit ca634db into main Nov 11, 2022
@keithdoggett keithdoggett deleted the geos-raise-on-null-geom branch November 11, 2022 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants