Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor globals and GC mark them #282

Merged
merged 2 commits into from
Jan 9, 2022
Merged

Conversation

teckwan
Copy link
Contributor

@teckwan teckwan commented Jan 7, 2022

Summary

Refactor the globals, to make them global in C, and remove the link between factory and globals. This avoids passing references around, and treating ruby as a sort of storage for these globals.

Reproduction

The script below would segv every time (every rgeo / geos version that can compile nowadays)

pt=RGeo::Geos.factory.point(1,0)
RGeo::Geos::CAPIFactory::INTERNAL_CGLOBALS = nil
def polygon(pts, factory: RGeo::Geos.factory)
  pts = pts.each_slice(2) if pts.first.is_a?(Numeric) && pts.size.even?
  factory.polygon(
    ring(pts, factory: factory)
  )
end
polygon([0, 0,    0, 1,    1, 1,    1, 0,    0, 0]).contains?(pt)

Even though this is a pretty dumb script, pure ruby should not crash, moreover, using GC.compact a few times (or verify_compaction_references) would generate the same issue at some point.

Comment on lines +851 to +855
/* No need to cache the internal here (https://ips.fastruby.io/4x) */
result = rb_funcall(factory, rb_intern("eql?"), 1, RGEO_GEOMETRY_DATA_PTR(obj2)->factory);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We extended the usage of this benchmark to "hash" and "cast" as well, the idea being that their usage does not call for such precise performance optimisation (otherwise, we should not be calling rb_funcall either).

Performance matters, however we're working with geometries that are much computation heavier than featching a value in a global symbol cache, hence we decided to pick our fights and prefer readability here.

The benchmark was made with:

require "benchmark/ips"
load "#{__dir__}/lib/rgeo.rb"

# Puts a Centered, bold and inverted text for better visibility.
def shout_out(string)
  puts "\n\e[7;1m#{string.center(75)}\e[27;0m\n\n"
end

shout_out("SMALL POLYGON")

def ring(pts, factory: RGeo::Geos.factory)
  factory.linear_ring(
    pts.map { factory.point(_1, _2) }
  )
end

def polygon(pts, factory: RGeo::Geos.factory)
  pts = pts.each_slice(2) if pts.first.is_a?(Numeric) && pts.size.even?
  factory.polygon(
    ring(pts, factory: factory)
  )
end

poly = polygon([0, 0,    0, 1,    1, 1,    1, 0,    0, 0])

Benchmark.ips do |x|
  x.config(time: 20)
  x.report("rb_intern") { poly == poly }
  x.report("cached") { poly == poly }
  x.hold!("temp")
  x.compare!
end

and commenting the polygon equality part to only have this class used in c.

@@ -634,33 +576,16 @@ RGeo_Globals* rgeo_init_geos_factory()
rb_define_module_function(geos_factory_class, "_supports_unary_union?", cmethod_factory_supports_unary_union, 0);

// Pre-define implementation classes and set up allocation methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@teckwan I think we should update this comment

wkt_generator = rb_funcall(
rb_const_get_at(globals->geos_module, rb_intern("Utils")),
if (NIL_P(psych_wkt_generator)) {
psych_wkt_generator = rb_funcall(
Copy link
Member

@BuonOmo BuonOmo Jan 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be marked as well ? Note that however, in the current case, I did not find a way to generate a segv.

I tried some variations of :

RGeo::Geos::Utils.singleton_class.undef_method :psych_wkt_generator
RGeo::WKRep::WKTGenerator.undef_method :generate
RGeo::Geos::Utils = nil
f = RGeo::Geos.factory(has_z_coordinate: true)
pt = f.point(1, 0, 0);nil
GC.start
GC.verify_compaction_references(double_heap: true, toward: :empty)
f.write_for_psych(pt)

ObjectSpace.each_object(RGeo::WKRep::WKTGenerator).count # run this between every line to see evolution

At some point, I just found that RGeo::Geos.capi_supported? is false. It looks like there is an issue with our new organisation:

image


EDIT: my issue was local, a clean and compile fixed it. However:

f = RGeo::Geos.factory(has_z_coordinate: true)
pt = f.point(1, 0, 0);nil
GC.start
GC.verify_compaction_references(double_heap: true, toward: :empty)
f.write_for_psych(pt)

segfaults at some point, we'll have to mark it (i guess when setting it to Qnil)

Copy link
Contributor Author

@teckwan teckwan Jan 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Effectively there was a need to mark this as well. However i think that since the value mutates, simply using rb_gc_register_mark_object was not sufficient (still segfaults using your reproduction). By using rb_gc_register_address it seems to be better.

Copy link
Member

@BuonOmo BuonOmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment to add IMHO (reflecting the last answer you gave me) otherwise, LGTM!

@keithdoggett could you review this one as well ?

Comment on lines +550 to +554
psych_wkt_generator = Qnil;
rb_gc_register_address(&psych_wkt_generator);
marshal_wkb_generator = Qnil;
rb_gc_register_address(&marshal_wkb_generator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
psych_wkt_generator = Qnil;
rb_gc_register_address(&psych_wkt_generator);
marshal_wkb_generator = Qnil;
rb_gc_register_address(&marshal_wkb_generator);
/* We favor rb_gc_register_address over rb_gc_register_mark_object because the value changes at runtime */
psych_wkt_generator = Qnil;
rb_gc_register_address(&psych_wkt_generator);
marshal_wkb_generator = Qnil;
rb_gc_register_address(&marshal_wkb_generator);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that makes sense to add a note. I didn't realize you could mark objects with the rb_gc_register_address and rb_gc_mark functions. Pretty cool

Copy link
Member

@keithdoggett keithdoggett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great thanks for the work on this! The notes mentioned in there make sense and this seems a lot better than passing around the globals struct.

@BuonOmo BuonOmo merged commit a171630 into rgeo:master Jan 9, 2022
@BuonOmo BuonOmo deleted the compact-bug branch January 9, 2022 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants