Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

oracledb.connect_async hangs forever #320

Open
kisv-SOPTIM opened this issue Apr 3, 2024 · 7 comments
Open

oracledb.connect_async hangs forever #320

kisv-SOPTIM opened this issue Apr 3, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@kisv-SOPTIM
Copy link

  1. What versions are you using?

python-oracledb 2.1.1
python 3.9.16

  1. Is it an error or a hang or a crash?

hang

  1. What error(s) or behavior you are seeing?

Calling oracledb.connect_async hangs forever. We only experience this issue in two of three (identical) environments, the third one works normally. Other software components can connect to the database in the affected environment normally.

Please see the following log: https://gist.github.com/kisv-SOPTIM/18d91a699f6050dbadfd6da50bc12407

  1. Does your application call init_oracle_client()?

no

Any hint regarding a possible cause is highly appreciated.

@kisv-SOPTIM kisv-SOPTIM added the bug Something isn't working label Apr 3, 2024
@anthony-tuininga
Copy link
Member

One possibility: the database you are connecting to has NNE enabled. Can you check with the queries mentioned in this link? It is possible that the code for asyncio isn't addressing that correctly.

@kisv-SOPTIM
Copy link
Author

Thanks for the quick response!

NNE is not enabled.

@cjbj
Copy link
Member

cjbj commented Apr 8, 2024

Can you set PYO_DEBUG_PACKETS and provide some traces to @anthony-tuininga ? See https://python-oracledb.readthedocs.io/en/latest/user_guide/tracing.html#low-level-python-oracledb-driver-tracing and various other references in Issues and Discussions.

@kisv-SOPTIM
Copy link
Author

Can you set PYO_DEBUG_PACKETS and provide some traces to @anthony-tuininga ? See https://python-oracledb.readthedocs.io/en/latest/user_guide/tracing.html#low-level-python-oracledb-driver-tracing and various other references in Issues and Discussions.

The gist linked in my first comment already contains a log excerpt with PYO_DEBUG_PACKETS enabled. Do you require any further debugging information?

Please note that the stack trace at the end of the log is caused by canceling the task externally in an attempt to work around the issue.

@anthony-tuininga
Copy link
Member

A few more questions, the answers to which may be helpful:

  • do you have any difficulty connecting using a regular (not asyncio) connection?
  • does the failure occur consistently in this environment? Or is it intermittent?
  • which "other software components" connect normally?
  • what database versions are being used?
  • are you familiar with getting server traces? It looks like the server is not responding (the wait is occurring right after the handoff from the listener to the database)
  • are you familiar with Wireshark? You can see whether or not the server is responding but perhaps not responding the way the driver is expecting
  • have you tried with thick mode?

@kisv-SOPTIM
Copy link
Author

Some general context first: We only experience this problem in our staging and production environment, but neither in our identical test environment, nor in our development environments. The affected software component is part of a project in a highly regulated industry. We only act as the software provider for the project, and the infrastructure, including the database server, is managed by a third party. We only have very limited access to the staging and prod environments, and every change we want to deploy there requires a very long lead time, which makes low level debugging nearly impossible.

  • do you have any difficulty connecting using a regular (not asyncio) connection?

We originally used the non async connection variant before switching to async, and expirienced no problems with it. We cant easily verify that this would still work today tho.

  • does the failure occur consistently in this environment? Or is it intermittent?

The failure occurs consistently in the two affected environments, even with different users/schemas

  • which "other software components" connect normally?

We have about 40 services written in java which can connect normally

  • what database versions are being used?

select BANNER_FULL from v$version
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.11.0.0.0

The database is an active/standby dataguard cluster. Our python application tries to connect to each of the two nodes separately, to determine their status. We get an (expected) ORA-12514 error from the inactive one, and experience the hang problem on the active one.

  • are you familiar with getting server traces? It looks like the server is not responding (the wait is occurring right after the handoff from the listener to the database)

No, because of our situation described above, I don't think we will get access to them.

  • are you familiar with Wireshark? You can see whether or not the server is responding but perhaps not responding the way the driver is expecting

We can't easily perform packet captures in these environments.

  • have you tried with thick mode?

No

@anthony-tuininga
Copy link
Member

Hmm, that sounds like debugging this issue could be a nightmare! Based on the traces you gave earlier it seems like the server has either given up or has sent a partial packet for some reason. If you can't perform a packet capture you could add a print() statement to the data_received() method of the BaseAsyncProtocol class -- just to see if any data is actually being received! If you are able to try thick mode, that might also be informative. And if you can verify that a regular connection still works, then I think there is something haywire with the asyncio implementation -- that would be helpful as well. Hopefully you can do some of this testing so we can track down what is causing this issue for you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants