Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some unit tests are broken on Ruby 3.0 #3263

Closed
kenhys opened this issue Feb 19, 2021 · 27 comments · Fixed by #3392
Closed

Some unit tests are broken on Ruby 3.0 #3263

kenhys opened this issue Feb 19, 2021 · 27 comments · Fixed by #3392
Assignees

Comments

@kenhys
Copy link
Contributor

kenhys commented Feb 19, 2021

Check CONTRIBUTING guideline first and here is the list to help us investigate the problem.

Describe the bug

Drone CI is continuously broken for two months.
See https://cloud.drone.io/fluent/fluentd

To Reproduce

Follow steps in .drone.yml

See https://github.com/fluent/fluentd/blob/master/.drone.yml#L26-L35

Expected behavior

There are 3 test case failures.

===============================================================================
Failure: test: it does not create accessor methods for parameters(Fluent::Config::TestConfigurable:::skip_accessor option)
/drone/src/test/config/test_configurable.rb:1456:in `block (2 levels) in <class:TestConfigurable>'
     1453:         @example = ConfigurableSpec::ExampleWithSkipAccessor.new
     1454:         @example.configure(config_element('ROOT'))
     1455:         assert_equal 'example7', @example.instance_variable_get(:@name)
  => 1456:         assert_raise NoMethodError.new("undefined method `name' for #{@example}") do
     1457:           @example.name
     1458:         end
     1459:       end
===============================================================================
Failure: test: when out_forward has @id(ForwardOutputTest):
  Exception raised:
  ArgumentError(<wrong number of arguments (given 2, expected 1)>)
  /drone/src/lib/fluent/plugin.rb:120:in `new_sd'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/method_dispatches/method_dispatch.rb:30:in `call_original_method'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/method_dispatches/method_dispatch.rb:41:in `call_implementation'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/method_dispatches/method_dispatch.rb:16:in `call'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/injections/double_injection.rb:183:in `dispatch_method'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/injections/double_injection.rb:40:in `dispatch_method'
  /drone/src/vendor/bundle/ruby/3.0.0/gems/rr-1.2.1/lib/rr/injections/double_injection.rb:148:in `new_sd'
  /drone/src/lib/fluent/plugin_helper/service_discovery/manager.rb:37:in `block in configure'
===============================================================================
Failure: test: Do not create connection per send_data(ForwardOutputTest::keepalive):
  socket_create_tcp("127.0.0.1", 38557, anything)
  Called 2 times.
  Expected 1 times.
/drone/src/test/plugin/test_out_forward.rb:1188:in `block (2 levels) in <class:ForwardOutputTest>'
     1185:
     1186:       begin
     1187:         chunk = Fluent::Plugin::Buffer::MemoryChunk.new(Fluent::Plugin::Buffer::Metadata.new(nil, nil, nil))
  => 1188:         mock.proxy(d.instance).socket_create_tcp(TARGET_HOST, TARGET_PORT, anything) { |sock| mock(sock).close.once; sock }.once
     1189:
     1190:         target_input_driver.run(timeout: 15) do
     1191:           d.run(shutdown: false) do
===============================================================================

Your Environment

  • Fluentd or td-agent version: fluentd --version or td-agent --version
  • Operating system: cat /etc/os-release
  • Kernel version: uname -r
$ ruby -v
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [aarch64-linux]
$ bundle exec gem list

*** LOCAL GEMS ***

async (1.28.9)
async-http (0.54.1)
async-io (1.30.2)
async-pool (0.3.5)
bundler (2.2.11)
concurrent-ruby (1.1.8)
console (1.10.1)
cool.io (1.7.1)
docile (1.3.5)
fiber-local (1.0.0)
flexmock (2.3.6)
fluentd (1.12.1)
http_parser.rb (0.6.0)
msgpack (1.4.2)
nio4r (2.5.5)
oj (3.11.2)
parallel (1.20.1)
parallel_tests (0.15.4)
power_assert (2.0.0)
protocol-hpack (1.4.2)
protocol-http (0.21.0)
protocol-http1 (0.13.2)
protocol-http2 (0.14.2)
rake (13.0.3)
rr (1.2.1)
serverengine (2.2.3)
sigdump (0.2.4)
simplecov (0.21.2)
simplecov-html (0.12.3)
simplecov_json_formatter (0.1.2)
strptime (0.2.5)
test-unit (3.4.0)
test-unit-rr (1.0.5)
timecop (0.9.4)
timers (4.3.3)
tzinfo (2.0.4)
tzinfo-data (1.2021.1)
yajl-ruby (1.4.1)

Your Configuration

N/A

Your Error Log

I've attached logs

Additional context

N/A
arm64v8-test.log

@kenhys
Copy link
Contributor Author

kenhys commented Feb 19, 2021

#3262 will fix the first NoMethodError failure.

@ashie
Copy link
Member

ashie commented Feb 19, 2021

#3262 will fix the first NoMethodError failure.

It still seems fail: https://cloud.drone.io/fluent/fluentd/1045/2/2

Failure: test: it does not create accessor methods for parameters(Fluent::Config::TestConfigurable:::skip_accessor option)
/drone/src/test/config/test_configurable.rb:1456:in `block (2 levels) in <class:TestConfigurable>'
     1453:         @example = ConfigurableSpec::ExampleWithSkipAccessor.new
     1454:         @example.configure(config_element('ROOT'))
     1455:         assert_equal 'example7', @example.instance_variable_get(:@name)
  => 1456:         assert_raise NoMethodError.new("undefined method `name' for #{@example.to_s}") do
     1457:           @example.name
     1458:         end
     1459:       end

<NoMethodError(<undefined method `name' for #<ConfigurableSpec::ExampleWithSkipAccessor:0x0000aaaad0a7eb50>>)> expected but was
<NoMethodError(<undefined method `name' for #<ConfigurableSpec::ExampleWithSkipAccessor:0x0000aaaad0a7eb50 @name="example7", @config=name:ROOT, arg:, {}, [], @config_root_section=<Fluent::Config::Section {"name":"example7"}>>>)

@kenhys
Copy link
Contributor Author

kenhys commented Feb 22, 2021

First test case:
It is strange that the following sample code execution result is the same on amd64 and arm64v8. 😕

@example = ConfigurableSpec::ExampleWithSkipAccessor.new
@example.configure(config_element('ROOT'))

p @example.inspect

Then ruby -Ilib test/config/test_configurable.rb

  • ruby 2.6.6 amd64 ng
  • ruby 2.7.2 amd64 ng
  • ruby 3.0 amd64 ok
  • ruby 3.0 arm64 ok

@kenhys
Copy link
Contributor Author

kenhys commented Feb 22, 2021

Failure: test: when out_forward has @id(ForwardOutputTest): test case will be fixed by specifying parent explicitly.

@kenhys
Copy link
Contributor Author

kenhys commented Feb 22, 2021

Revised #3262 for fixing 2nd test case.

@kenhys kenhys self-assigned this Feb 24, 2021
@ashie
Copy link
Member

ashie commented Mar 5, 2021

There are 3 test case failures.

They aren't Drone or Arm specific, it's caused by combination of Ruby 3.0 and RR.
As I described in #3262 (comment) RR's Proxy doesn't seem support Ruby3.0's kwargs, we need to fix RR.
As a first aid I omitted these test on Ruby 3.0.

@ashie ashie changed the title Fix continuously broken Drone CI test cases Some unit tests are broken on Ruby 3.0 Mar 9, 2021
@kenhys
Copy link
Contributor Author

kenhys commented Mar 12, 2021

FIXME:

  • ruby -Ilib test/plugin/test_out_forward.rb -v -n /after/
  • ruby -Ilib test/plugin_helper/test_server.rb -v -n /cannot/

@kenhys
Copy link
Contributor Author

kenhys commented Mar 15, 2021

ruby -Ilib test/plugin_helper/test_server.rb -v -n /cannot/

In this failure case, it seems that server_create_tcp_socket doesn't raise an expected exception.

sock = if shared
server_socket_manager_client.listen_tcp(bind, port)
else
# TCPServer.new doesn't set IPV6_V6ONLY flag, so use Addrinfo class instead.
# backlog will be set by the caller, we don't need to set backlog here
tsock = Addrinfo.tcp(bind, port).listen
tsock.autoclose = false
TCPServer.for_fd(tsock.fileno)
end

@kenhys
Copy link
Contributor Author

kenhys commented Mar 15, 2021

In this failure case, it seems that server_create_tcp_socket doesn't raise an expected exception.
https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin_helper/server.rb#L352-L360

server_create_tcp_socket depends server_socket_manager_client

def server_socket_manager_client
socket_manager_path = ENV['SERVERENGINE_SOCKETMANAGER_PATH']
if Fluent.windows?
socket_manager_path = socket_manager_path.to_i
end
ServerEngine::SocketManager::Client.new(socket_manager_path)
end

@kenhys
Copy link
Contributor Author

kenhys commented Mar 16, 2021

#3263 (comment) It is not clarified the problem well. it may differ between Ruby 2.7 (raise EADDRINUSE) and Ruby 3.0 (NO error)

#<Socket::Option: INET SOCKET REUSEADDR 0> is not set both of them (2.7/3.0)

@kenhys
Copy link
Contributor Author

kenhys commented Mar 17, 2021

I'll try a reproducible minimum script.

@kenhys
Copy link
Contributor Author

kenhys commented Mar 17, 2021

It seems SocketManager.send_peer succeeds (TCPSocket writes packed data - 91 bytes), but SocketManager.recv_peer behaves differently.

https://github.com/treasure-data/serverengine/blob/6e4dc985f83829811532934f136d5f2204740daa/lib/serverengine/socket_manager.rb#L51-L53

@kenhys
Copy link
Contributor Author

kenhys commented Mar 18, 2021

Both of 2.7/3.0, WinSock.bind(sock.handle, sock_addr, sock_addr.bytesize) returns -1, but RbWinSock.raise_last_error("bind(2)") raise Errno::NOERROR in 3.0.

https://github.com/treasure-data/serverengine/blob/6e4dc985f83829811532934f136d5f2204740daa/lib/serverengine/socket_manager_win.rb#L85-L87

@kenhys
Copy link
Contributor Author

kenhys commented Mar 19, 2021

@kenhys
Copy link
Contributor Author

kenhys commented Mar 22, 2021

I've found bind return -1, but WinSock.WSAGetLastError returns 0.

Thread.current is same (bind(2) and WinSock.WSAGetLastError()) then mutex.synchronize do block doesn't resolve the issue.

@kenhys
Copy link
Contributor Author

kenhys commented Mar 23, 2021

I got feedback to reproduce it

module WinSock
  require 'fiddle/import'
  extend Fiddle::Importer
  dlload 'ws2_32.dll'
  extern 'int bind(int, void *, int)'
  extern 'int WSAGetLastError(void)'
end

p WinSock.bind(0, nil, 0)
p WinSock.WSAGetLastError

Ruby 2.7

PS C:\work\fluentd> ridk use
1 - C:/Ruby25-x64       ruby 2.5.8p224 (2020-03-31 revision 67882) [x64-mingw32]
2 - C:/Ruby26-x64       ruby 2.6.6p146 (2020-03-31 revision 67876) [x64-mingw32]
3 - C:/Ruby27-x64       ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x64-mingw32]
4 - C:/Ruby30-x64       ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x64-mingw32]
Select ruby version to enable: 3
Disable C:/Ruby26-x64
Disable C:/Ruby30-x64
Enable C:/Ruby27-x64
PS C:\work\fluentd> ruby .\test-wsagetlasterror.rb
-1
10038

Ruby 3.0

PS C:\work\fluentd> ridk use
1 - C:/Ruby25-x64       ruby 2.5.8p224 (2020-03-31 revision 67882) [x64-mingw32]
2 - C:/Ruby26-x64       ruby 2.6.6p146 (2020-03-31 revision 67876) [x64-mingw32]
3 - C:/Ruby27-x64       ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x64-mingw32]
4 - C:/Ruby30-x64       ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x64-mingw32]
Select ruby version to enable: 4
Disable C:/Ruby26-x64
Disable C:/Ruby27-x64
Enable C:/Ruby30-x64
PS C:\work\fluentd> ruby .\test-wsagetlasterror.rb
-1
0

@kenhys
Copy link
Contributor Author

kenhys commented Mar 23, 2021

Additiional experiment with #3263 (comment)

ruby 3.0 fiddle 1.0.6 NG (WinSock.WSAGetLastError returns 0)
ruby 3.0 fiddle 1.0.7 NG (WinSock.WSAGetLastError returns 0)
ruby 2.7 fiddle 1.0.0 OK
ruby 2.7 fiddle 1.0.7 OK

@kenhys
Copy link
Contributor Author

kenhys commented Apr 2, 2021

FYI: ruby/fiddle#72

I've forwarded to upstream about this issue. #3263 (comment)

@ashie
Copy link
Member

ashie commented Apr 6, 2021

rr's issue is fixed in #3304

Remaining issue is

FYI: ruby/fiddle#72

I've forwarded to upstream about this issue. #3263 (comment)

In addition, following error may be also Ruby 3.0 on Windows specific?

https://github.com/fluent/fluentd/runs/2253559936

2021-04-02T11:52:26.6046267Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:40:in `write'
2021-04-02T11:52:26.6047438Z Error: test: backup chunk with different type secondary(BufferedOutputBackupTest::buffered output for broken chunks): Timeout::Error: execution expired
2021-04-02T11:52:26.6048928Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:1136:in `try_flush'
2021-04-02T11:52:26.6049695Z <internal:dir>:135:in `glob'
2021-04-02T11:52:26.6050461Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:1448:in `flush_thread_run'
2021-04-02T11:52:26.6051309Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:175:in `block in wait_flush'
2021-04-02T11:52:26.6052298Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:462:in `block (2 levels) in start'
2021-04-02T11:52:26.6053154Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:112:in `block in waiting'
2021-04-02T11:52:26.6054631Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2021-04-02T11:52:26.6056004Z C:/hostedtoolcache/windows/Ruby/3.0.0/x64/lib/ruby/3.0.0/timeout.rb:112:in `timeout'
2021-04-02T11:52:26.6057731Z 2016-04-13 18:33:32 -0700 [warn]: [backup_test_with_diff_secondary] got an error in secondary for unrecoverable error error_class=Fluent::UnrecoverableError error="yay, your secondary #write must fail"
2021-04-02T11:52:26.6059325Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:111:in `waiting'
2021-04-02T11:52:26.6061401Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:298:in `block (4 levels) in <class:BufferedOutputBackupTest>'
2021-04-02T11:52:26.6062488Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:173:in `wait_flush'
2021-04-02T11:52:26.6064082Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:67:in `write'
2021-04-02T11:52:26.6065917Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:304:in `block (3 levels) in <class:BufferedOutputBackupTest>'
2021-04-02T11:52:26.6067542Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:1165:in `rescue in try_flush'
2021-04-02T11:52:26.6068391Z D:/a/fluentd/fluentd/lib/fluent/system_config.rb:98:in `overwrite_system_config'
2021-04-02T11:52:26.6069532Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:1116:in `try_flush'
2021-04-02T11:52:26.6071201Z D:/a/fluentd/fluentd/test/plugin/test_output_as_buffered_backup.rb:284:in `block (2 levels) in <class:BufferedOutputBackupTest>'
2021-04-02T11:52:26.6072622Z   2016-04-13 18:33:32 -0700 [warn]: D:/a/fluentd/fluentd/lib/fluent/plugin/output.rb:1448:in `flush_thread_run'

@kenhys
Copy link
Contributor Author

kenhys commented Apr 6, 2021

In addition, following error may be also Ruby 3.0 on Windows specific?

It seems so.

@kenhys
Copy link
Contributor Author

kenhys commented Apr 6, 2021

In addition, following error may be also Ruby 3.0 on Windows specific?

It seems so.

PS C:\work\fluentd\fluentd.work> ruby -Ilib .\test\plugin\test_output_as_buffered_backup.rb -v fails with f03611d and ruby 3.0.0p0 on windows.

Finished in 61.2189148 seconds.
-------------------------------------------------------------------------------
12 tests, 13 assertions, 0 failures, 10 errors, 0 pendings, 1 omissions, 0 notifications
9.09091% passed
-------------------------------------------------------------------------------
0.20 tests/s, 0.21 assertions/s

It depends whether wait_flush succeeds or not. (Timeout Error when target directory keeps empty)

@kenhys
Copy link
Contributor Author

kenhys commented Apr 6, 2021

#3316 will solve \test\plugin\test_output_as_buffered_backup.rb Timeout:Error #3263 (comment)

@kenhys
Copy link
Contributor Author

kenhys commented Apr 7, 2021

As #3316 was merged, the rest of TODO: fiddle related one.

@ashie
Copy link
Member

ashie commented Apr 22, 2021

TODO: fiddle related one.

Fix for ServerEngine: treasure-data/serverengine#112

@ashie
Copy link
Member

ashie commented May 24, 2021

I've released ServerEngine 2.2.4: https://rubygems.org/gems/serverengine/versions/2.2.4
Need to specify RUBYLIB to enable it like this: treasure-data/serverengine@6aad8b2

@ashie
Copy link
Member

ashie commented May 25, 2021

I'm trying serverengine 2.2.4 & fiddle 1.0.8 at ashie#14
Failed tests are passed, but it seems crash on certain tests.
It reproduces on both my local environment & GitHub Actions.

@ashie
Copy link
Member

ashie commented May 25, 2021

ashie added a commit to ashie/fluentd that referenced this issue May 25, 2021
On Ruby 3.0, we need to use fiddle 1.0.8 or later to retrieve correct
error code. In addition, we have to specify the path of fiddle by RUBYLIB
because RubyInstaller loads Ruby's bundled fiddle before initializing gem.
See also:
* ruby/fiddle#72
* https://bugs.ruby-lang.org/issues/17813
* https://github.com/oneclick/rubyinstaller2/blob/8225034c22152d8195bc0aabc42a956c79d6c712/lib/ruby_installer/build/dll_directory.rb

Fix fluent#3263

Signed-off-by: Takuro Ashie <ashie@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants