Function not found EXCEPTION: [Errno 38] when using helpers.parallel_bulk in aws lambda #2319

Open
Bryson14 opened this issue Sep 28, 2023 · 3 comments


@Bryson14

Describe the feature:

Elasticsearch version (bin/elasticsearch --version):

elasticsearch-py version (elasticsearch.__versionstr__):
elasticsearch==7.19.9
Please make sure the major version matches the Elasticsearch server you are running.

Description of the problem including expected versus actual behavior:
We were using the elasticsearch helper parallel_bulk inside an AWS Lambda function. It worked with this same version of elasticsearch-py on the Python 3.7 runtime.

Now that we have upgraded to the Python 3.11 runtime, I get this error when it tries to execute parallel_bulk:

EXCEPTION: [Errno 38] Function not implemented
It might be because Lambda doesn't allow the use of some parts of the Python multiprocessing package.

Steps to reproduce:
Try uploading documents to Elasticsearch from an AWS Lambda function on the Python 3.11 runtime using the parallel_bulk helper.

@pquentin
Member

pquentin commented Oct 2, 2023

Thanks for the report. We may want to try to make this work on AWS Lambda, but I'm confused: how could this work on Python 3.7, since multiprocessing.pool.ThreadPool does not work on AWS Lambda?

Also, can you please share the full exception/traceback?
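For anyone who wants to gather that traceback, here is a minimal sketch that checks whether multiprocessing.pool.ThreadPool can be created in the current runtime and returns the full traceback text if it cannot. The function name probe_thread_pool is hypothetical, and the assumption that the failure surfaces as an OSError (or an ImportError on platforms without semaphore support) when the pool is constructed is mine, not something confirmed in this thread.

```python
import traceback
from multiprocessing.pool import ThreadPool


def probe_thread_pool():
    """Return None if a ThreadPool can be created, else the full traceback text.

    Hypothetical helper: assumes the failure shows up while the pool is
    being constructed, as OSError (e.g. Errno 38) or ImportError.
    """
    try:
        pool = ThreadPool(1)
    except (OSError, ImportError):
        # Capture the complete traceback so it can be logged or pasted here.
        return traceback.format_exc()
    # The pool was created successfully; shut it down cleanly.
    pool.close()
    pool.join()
    return None
```

Calling probe_thread_pool() at the top of a Lambda handler and printing a non-None result should put the traceback in the function's logs.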

@pquentin
Member

Closing, but I'll reopen if I get more details. Thank you!

@pquentin closed this as not planned on Nov 30, 2023
@pquentin
Member

Got another report that this indeed fails starting with Python 3.8, and the links above give possible workarounds to support AWS Lambda. I also now understand why it works on Python 3.7: Python 3.8 and above use SemLock, which isn't supported by AWS Lambda.

We still want to use the faster ThreadPool when possible, but fall back to Pipe when it's not available.

This is very unlikely to be backported to elasticsearch-py 7.x, but should be available in a later elasticsearch-py 8.x version. (The migration path is easier now, with changes to the body parameter that went in 8.12.)
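The "ThreadPool when possible, otherwise fall back" idea can be sketched roughly as follows. This is not the elasticsearch-py implementation: the function parallel_map is hypothetical, and instead of the Pipe-based fallback mentioned above it swaps in concurrent.futures.ThreadPoolExecutor, on the assumption that it does not rely on SemLock.

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.pool import ThreadPool


def parallel_map(func, items, thread_count=4):
    """Apply func to every item, preferring ThreadPool when it is usable.

    Hypothetical sketch: falls back to ThreadPoolExecutor (an assumption,
    not the Pipe approach) when the ThreadPool cannot be created.
    """
    items = list(items)
    try:
        pool = ThreadPool(thread_count)
    except (OSError, ImportError):
        # ThreadPool is unavailable (e.g. Errno 38), so use the fallback.
        with ThreadPoolExecutor(max_workers=thread_count) as executor:
            return list(executor.map(func, items))
    try:
        # Preferred path: results come back in input order.
        return pool.map(func, items)
    finally:
        pool.close()
        pool.join()
```

Both paths return results in input order, so callers should not need to know which one was taken.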

@pquentin reopened this on Mar 12, 2024