Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifecycle Management of RemoteServiceAdmin #653

Open
PengZheng opened this issue Sep 24, 2023 · 4 comments
Open

Lifecycle Management of RemoteServiceAdmin #653

PengZheng opened this issue Sep 24, 2023 · 4 comments
Labels
component/remote-services Categorizes an issue or PR relevant to remote services. kind/bug Categorizes issue or PR as related to a bug.

Comments

@PengZheng
Copy link
Contributor

PengZheng commented Sep 24, 2023

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_TESTING=ON -DENABLE_ADDRESS_SANITIZER=ON ..
make -j8
ctest

The following tests FAILED:
         49 - test_cxx_remote_services_integration (SEGFAULT)

UpdateCTestConfiguration  from :/home/peng/Downloads/git/mycelix/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/peng/Downloads/git/mycelix/build/DartConfiguration.tcl
Test project /home/peng/Downloads/git/mycelix/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 49
    Start 49: test_cxx_remote_services_integration

49: Test command: /home/peng/Downloads/git/mycelix/build/bundles/cxx_remote_services/integration/gtest/test_cxx_remote_services_integration
49: Test timeout computed to be: 10000000
49: Running main() from /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest_main.cc
49: [==========] Running 2 tests from 1 test suite.
49: [----------] Global test environment set-up.
49: [----------] 2 tests from RemoteServicesIntegrationTestSuite
49: [ RUN      ] RemoteServicesIntegrationTestSuite.StartStopFrameworks
49: [2023-09-24T13:30:44] [   info] [celix_framework] [framework_start:476] Celix framework started
49: [2023-09-24T13:30:44] [   info] [celix_framework] [framework_start:476] Celix framework started
49: [2023-09-24T13:30:44] [   info] [celix_psa_zmq_v2] [PSA_ZMQ] Using 127.0.0.1 for service annunciation
49: [2023-09-24T13:30:44] [   info] [celix_psa_zmq_v2] [PSA_ZMQ] Using base till max port: 5501 till 6000
49: [2023-09-24T13:30:44] [   info] [celix_framework] Starting TestExportImportRemoteServiceFactory
49: [2023-09-24T13:30:44] [   info] [celix_psa_zmq_v2] [PSA_ZMQ] Using 127.0.0.1 for service annunciation
49: [2023-09-24T13:30:44] [   info] [celix_psa_zmq_v2] [PSA_ZMQ] Using base till max port: 5501 till 6000
49: [2023-09-24T13:30:44] [   info] [celix_framework] Starting TestExportImportRemoteServiceFactory
49: AddressSanitizer:DEADLYSIGNAL
49: =================================================================
49: ==107712==ERROR: AddressSanitizer: SEGV on unknown address 0x7f36b248f9b8 (pc 0x7f36b24ab4ac bp 0x7f36b7afdb90 sp 0x7f36b7afdb90 T1)
49: ==107712==The signal is caused by a READ memory access.
49:     #0 0x7f36b24ab4ac in std::default_delete<celix::rsa::IImportRegistration>::operator()(celix::rsa::IImportRegistration*) const /usr/include/c++/11/bits/unique_ptr.h:85
49:     #1 0x7f36b24ab4ac in std::_Sp_counted_deleter<celix::rsa::IImportRegistration*, std::default_delete<celix::rsa::IImportRegistration>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/include/c++/11/bits/shared_ptr_base.h:442
49:     #2 0x7f36b24a439d in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/include/c++/11/bits/shared_ptr_base.h:168
49:     #3 0x7f36b24a439d in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/include/c++/11/bits/shared_ptr_base.h:705
49:     #4 0x7f36b24a439d in std::__shared_ptr<celix::rsa::IImportRegistration, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/include/c++/11/bits/shared_ptr_base.h:1154
49:     #5 0x7f36b24a439d in std::shared_ptr<celix::rsa::IImportRegistration>::~shared_ptr() /usr/include/c++/11/bits/shared_ptr.h:122
49:     #6 0x7f36b24a439d in celix::rsa::RemoteServiceAdmin::removeEndpoint(std::shared_ptr<celix::rsa::EndpointDescription> const&) /home/peng/Downloads/git/mycelix/bundles/cxx_remote_services/admin/src/RemoteServiceAdmin.cc:81
49:     #7 0x7f36b24eadd9 in std::function<void (std::shared_ptr<celix::rsa::EndpointDescription> const&, std::shared_ptr<celix::Properties const> const&)>::operator()(std::shared_ptr<celix::rsa::EndpointDescription> const&, std::shared_ptr<celix::Properties const> const&) const /usr/include/c++/11/bits/std_function.h:590
49:     #8 0x7f36b24eadd9 in celix::dm::ServiceDependency<celix::rsa::RemoteServiceAdmin, celix::rsa::EndpointDescription>::setupCallbacks()::{lambda(void*, void*, hashMap const*)#3}::operator()(void*, void*, hashMap const*) const /home/peng/Downloads/git/mycelix/libs/framework/include/celix/dm/ServiceDependency_Impl.h:581
49:     #9 0x7f36b24eadd9 in celix::dm::ServiceDependency<celix::rsa::RemoteServiceAdmin, celix::rsa::EndpointDescription>::setupCallbacks()::{lambda(void*, void*, hashMap const*)#3}::_FUN(void*, void*, hashMap const*) /home/peng/Downloads/git/mycelix/libs/framework/include/celix/dm/ServiceDependency_Impl.h:569
49:     #10 0x7f36bd68c6e2 in celix_dmServiceDependency_invokeRemove /home/peng/Downloads/git/mycelix/libs/framework/src/dm_service_dependency.c:321
49:     #11 0x7f36bd68ae22 in celix_dmComponent_handleRemove /home/peng/Downloads/git/mycelix/libs/framework/src/dm_component_impl.c:669
49:     #12 0x7f36bd68ae22 in celix_private_dmComponent_handleEvent /home/peng/Downloads/git/mycelix/libs/framework/src/dm_component_impl.c:538
49:     #13 0x7f36bd68b1da in serviceDependency_removeServiceTrackerCallback /home/peng/Downloads/git/mycelix/libs/framework/src/dm_service_dependency.c:313
49:     #14 0x7f36bd68042a in serviceTracker_invokeRemovingService /home/peng/Downloads/git/mycelix/libs/framework/src/service_tracker.c:597
49:     #15 0x7f36bd68042a in serviceTracker_untrackTracked /home/peng/Downloads/git/mycelix/libs/framework/src/service_tracker.c:560
49:     #16 0x7f36bd680ae2 in serviceTracker_close /home/peng/Downloads/git/mycelix/libs/framework/src/service_tracker.c:231
49:     #17 0x7f36bd680caa in celix_serviceTracker_destroy /home/peng/Downloads/git/mycelix/libs/framework/src/service_tracker.c:712
49:     #18 0x7f36bd65593c in celix_bundleContext_removeServiceTracker /home/peng/Downloads/git/mycelix/libs/framework/src/bundle_context.c:789
49:     #19 0x7f36bd661642 in fw_handleEventRequest /home/peng/Downloads/git/mycelix/libs/framework/src/framework.c:1297
49:     #20 0x7f36bd661642 in fw_handleEvents /home/peng/Downloads/git/mycelix/libs/framework/src/framework.c:1342
49:     #21 0x7f36bd661cdb in fw_eventDispatcher /home/peng/Downloads/git/mycelix/libs/framework/src/framework.c:1509
49:     #22 0x7f36bc094b42 in start_thread nptl/pthread_create.c:442
49:     #23 0x7f36bc1269ff  (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff)
49:
49: AddressSanitizer can not provide additional info.
49: SUMMARY: AddressSanitizer: SEGV /usr/include/c++/11/bits/unique_ptr.h:85 in std::default_delete<celix::rsa::IImportRegistration>::operator()(celix::rsa::IImportRegistration*) const
49: Thread T1 created by T0 here:
49:     #0 0x7f36bcc58685 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
49:     #1 0x7f36bd60c7c9 in celixThread_create /home/peng/Downloads/git/mycelix/libs/utils/src/celix_threads.c:36
49:     #2 0x7f36bd66992d in fw_init /home/peng/Downloads/git/mycelix/libs/framework/src/framework.c:396
49:     #3 0x7f36bd66e241 in framework_start /home/peng/Downloads/git/mycelix/libs/framework/src/framework.c:449
49:     #4 0x7f36bd684512 in celix_frameworkFactory_createFramework /home/peng/Downloads/git/mycelix/libs/framework/src/celix_framework_factory.c:40
49:     #5 0x5568886ef033 in celix::createFramework(celix::Properties const&) /home/peng/Downloads/git/mycelix/libs/framework/include/celix/FrameworkFactory.h:40
49:     #6 0x5568886f0ead in RemoteServicesIntegrationTestSuite::RemoteServicesIntegrationTestSuite() /home/peng/Downloads/git/mycelix/bundles/cxx_remote_services/integration/gtest/src/RemoteServicesIntegrationTestSuite.cc:39
49:     #7 0x5568886f2620 in RemoteServicesIntegrationTestSuite_StartStopFrameworks_Test::RemoteServicesIntegrationTestSuite_StartStopFrameworks_Test() /home/peng/Downloads/git/mycelix/bundles/cxx_remote_services/integration/gtest/src/RemoteServicesIntegrationTestSuite.cc:166
49:     #8 0x5568886f2620 in testing::internal::TestFactoryImpl<RemoteServicesIntegrationTestSuite_StartStopFrameworks_Test>::CreateTest() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/include/gtest/internal/gtest-internal.h:472
49:     #9 0x5568887c3e5a in testing::Test* testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*) /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:2607
49:     #10 0x5568887c3e5a in testing::Test* testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*) /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:2643
49:     #11 0x5568887969a5 in testing::TestInfo::Run() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:2851
49:     #12 0x556888798a06 in testing::TestSuite::Run() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:3015
49:     #13 0x55688879b208 in testing::internal::UnitTestImpl::RunAllTests() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:5855
49:     #14 0x5568887c489a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:2607
49:     #15 0x5568887c489a in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:2643
49:     #16 0x556888797213 in testing::UnitTest::Run() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest.cc:5438
49:     #17 0x5568886e25c2 in RUN_ALL_TESTS() /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/include/gtest/gtest.h:2490
49:     #18 0x5568886e25c2 in main /home/peng/Downloads/git/mycelix/build/_deps/googletest-src/googletest/src/gtest_main.cc:52
49:     #19 0x7f36bc029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
49:
49: ==107712==ABORTING
1/1 Test #49: test_cxx_remote_services_integration ...***Failed    1.35 sec

@PengZheng
Copy link
Contributor Author

PengZheng commented Sep 24, 2023

It has been bisected to be introduced by 6bb8de5.

But as the commit log said, celix_framework_stopBundleEntry is unnecessary, since celix_framework_uninstallBundleEntry will stop the bundle first. It seems more like a testing/C++ RSA issue. @pnoltes

@PengZheng
Copy link
Contributor Author

PengZheng commented Sep 24, 2023

I've verified that adding celix_framework_stopBundleEntry back will "fix" the problem.

    size = celix_arrayList_size(stopEntries);
    for (int i = size-1; i >= 0; --i) { //note loop in reverse order -> stop later installed bundle first
        celix_framework_bundle_entry_t *entry = celix_arrayList_get(stopEntries, i);

        //NOTE possible starvation.
//        fw_bundleEntry_waitTillUseCountIs(entry, 1);  //note this function has 1 use count.

        //note race between condition (use count == 1) and bundle stop, meaning use count can be > 1 when
        //celix_framework_stopBundleEntry is called.

        bundle_state_e state = celix_bundle_getState(entry->bnd);
        if (state == CELIX_BUNDLE_STATE_ACTIVE || state == CELIX_BUNDLE_STATE_STARTING) {
            celix_framework_stopBundleEntry(fw, entry);
        }
    }
    for (int i = size-1; i >= 0; --i) { //note loop in reverse order -> uninstall later installed bundle first
        celix_framework_bundle_entry_t *entry = celix_arrayList_get(stopEntries, i);
        celix_framework_uninstallBundleEntry(fw, entry, false);
    }
    celix_arrayList_destroy(stopEntries);

And it can be narrowed down to the consumer bundle. The following can manifest the issue:

TEST_F(RemoteServicesIntegrationTestSuite, StartStopFrameworks) {
//    installProviderBundles();
    installConsumerBundles();
}

@PengZheng
Copy link
Contributor Author

PengZheng commented Sep 24, 2023

I suspect this is a lifecycle management bug of TestExportImportRemoteServiceFactory/C++ RSA.

Note that ComponentImportRegistration produced by CalculatorImportServiceFactory::importService is NOT managed by CalculatorImportServiceFactory. That means if TestExportImportRemoteServiceFactory bundle is uninstalled, RemoteServiceAdmin will keep references of ComponentImportRegistration, whose destructor is not in address space anymore.

Note also that any bundle should be able to be uninstalled independent of other bundles.

This shows that any object passing through Celix service should be managed by the object owner bundle some way.
WDYT? @pnoltes @rlenferink

@PengZheng PengZheng added kind/bug Categorizes issue or PR as related to a bug. component/testing Categorizes an issue or PR relevant to testing. labels Sep 24, 2023
@PengZheng
Copy link
Contributor Author

PengZheng commented Sep 25, 2023

Currently, IImportRegistration can not tell which IImportServiceFactory it comes from.
Otherwise, we can remove it from RemoteServiceAdmin when the factory is stopped.
Thus, it turns out to be a design issue of cxx_remote_services.

@PengZheng PengZheng added component/remote-services Categorizes an issue or PR relevant to remote services. and removed component/testing Categorizes an issue or PR relevant to testing. labels Sep 25, 2023
@PengZheng PengZheng changed the title test_cxx_remote_services_integration crash Lifecycle Management of RemoteServiceAdmin Sep 25, 2023
PengZheng added a commit that referenced this issue Sep 25, 2023
It helps catch issues like #653.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/remote-services Categorizes an issue or PR relevant to remote services. kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

1 participant