RESOURCE_EXHAUSTED: Maximum length exceeded #493

Open
pieterjanpintens opened this issue Sep 19, 2022 · 1 comment

Comments


pieterjanpintens commented Sep 19, 2022

We are seeing errors like this in our fluentd logs:

2022-09-18 21:50:00.944377087 +0000 fluent.warn: {"error":"3:Data decompression failed with decompression status: RESOURCE_EXHAUSTED: Maximum length exceeded: 10485760; at byte 755395; at uncompressed byte 10485760. debug_error_string:{\"created\":\"@1663537800.943751927\",\"description\":\"Error received from peer ipv4:216.239.34.174:443\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":905,\"grpc_message\":\"Data decompression failed with decompression status: RESOURCE_EXHAUSTED: Maximum length exceeded: 10485760; at byte 755395; at uncompressed byte 10485760\",\"grpc_status\":3}","error_code":"3","message":"Dropping 4805 log message(s) error=\"3:Data decompression failed with decompression status: RESOURCE_EXHAUSTED: Maximum length exceeded: 10485760; at byte 755395; at uncompressed byte 10485760. debug_error_string:{\\\"created\\\":\\\"@1663537800.943751927\\\",\\\"description\\\":\\\"Error received from peer ipv4:216.239.34.174:443\\\",\\\"file\\\":\\\"src/core/lib/surface/call.cc\\\",\\\"file_line\\\":905,\\\"grpc_message\\\":\\\"Data decompression failed with decompression status: RESOURCE_EXHAUSTED: Maximum length exceeded: 10485760; at byte 755395; at uncompressed byte 10485760\\\",\\\"grpc_status\\\":3}\" error_code=\"3\""}

Our setup is a batch-like system that processes big log files from S3.
Our config is shown below. We tried setting buffer_chunk_limit low, but it does not help (see the note after the config).

<match **>
    @type google_cloud
    @log_level debug
    # prevents errors in logs,it will fail anyway
    use_metadata_service false
    label_map {
      "environment": "environment",
      "project": "project",
      "branch": "branch",
      "function": "function",
      "program": "program",
      "stream": "log"
    }
    # Set the chunk limit conservatively to avoid exceeding the recommended
    # chunk size of 10MB per write request. The API request size can be a few
    # times bigger than the raw log size.
    buffer_chunk_limit 512KB
    # Flush logs every 5 seconds, even if the buffer is not full.
    flush_interval 5s
    # Enforce some limit on the number of retries.
    disable_retry_limit false
    # After 3 retries, a given chunk will be discarded.
    retry_limit 3
    # Wait 10 seconds before the first retry. The wait interval will be doubled on
    # each following retry (20s, 40s...) until it hits the retry limit.
    retry_wait 10
    # Never wait longer than 5 minutes between retries. If the wait interval
    # reaches this limit, the exponentiation stops.
    # Given the default config, this limit should never be reached, but if
    # retry_limit and retry_wait are customized, this limit might take effect.
    max_retry_wait 300
    # Use multiple threads for processing.
    num_threads 8
    # Use the gRPC transport.
    use_grpc true
    # Try to limit the size of the uploaded data
    grpc_compression_algorithm gzip
    # If a request is a mix of valid log entries and invalid ones, ingest the
    # valid ones and drop the invalid ones instead of dropping everything.
    partial_success true
    <buffer>
      @type memory
      timekey 60
      timekey_wait 10
      overflow_action block
    </buffer>
</match>
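
For what it's worth, with an explicit <buffer> section present I am not sure the top-level buffer_chunk_limit even takes effect; whether it does is an assumption worth verifying. The v1-style equivalent would be chunk_limit_size inside the section, roughly like this (value illustrative):

    <buffer>
      @type memory
      timekey 60
      timekey_wait 10
      overflow_action block
      # v1-style equivalent of buffer_chunk_limit; same conservative value as above
      chunk_limit_size 512KB
    </buffer>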

Looking further down the line, it seems you can specify a channel option on the gRPC channel: GRPC_ARG_MAX_SEND_MESSAGE_LENGTH. Reading about it, I wonder whether setting this option would solve the problem.
It is currently not exposed in the fluentd config, and by default it appears to be -1 (unlimited). I am not sure whether gRPC would split the message or just turn the server-side error into a client-side error...
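
For illustration only (the plugin itself is Ruby, so this is not its actual code), here is a minimal Python sketch of how a channel argument like GRPC_ARG_MAX_SEND_MESSAGE_LENGTH is typically passed when a gRPC channel is created; the endpoint and limits are placeholders. As far as I understand, this option does not split messages, it only makes the client reject oversized messages locally, so it would likely turn the server error into a client error rather than avoid it:

import grpc

# Placeholder target; the plugin really talks to the Cloud Logging API.
TARGET = "logging.googleapis.com:443"

# "grpc.max_send_message_length" is the string form of
# GRPC_ARG_MAX_SEND_MESSAGE_LENGTH. gRPC's default for sends is -1
# (unlimited), which matches the "-1" mentioned above.
channel = grpc.secure_channel(
    TARGET,
    grpc.ssl_channel_credentials(),
    options=[
        ("grpc.max_send_message_length", 10 * 1024 * 1024),
        ("grpc.max_receive_message_length", 10 * 1024 * 1024),
    ],
)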

We are looking for guidance on how we should proceed.

pieterjanpintens (Author) commented:

Looking at the code, it seems that log entries are bundled per tag before being sent out. Would it make sense to set a limit on the number of entries in each send operation and split the entries across multiple send operations when needed? I think this would allow limiting the outgoing message size.
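
To make the idea concrete, here is a rough sketch (hypothetical names and limits, not the plugin's actual code) of splitting the entries bundled for one tag into several bounded batches, each of which would then go out in its own write call:

from typing import Iterable, Iterator, List

MAX_ENTRIES_PER_REQUEST = 1000            # illustrative cap on entry count
MAX_BYTES_PER_REQUEST = 5 * 1024 * 1024   # stay well under the ~10 MB limit


def split_into_batches(entries: Iterable[str],
                       max_entries: int = MAX_ENTRIES_PER_REQUEST,
                       max_bytes: int = MAX_BYTES_PER_REQUEST) -> Iterator[List[str]]:
    """Yield batches that respect both an entry-count and a size budget."""
    batch: List[str] = []
    batch_bytes = 0
    for entry in entries:
        entry_bytes = len(entry.encode("utf-8"))
        # Start a new batch if adding this entry would exceed either budget.
        if batch and (len(batch) >= max_entries
                      or batch_bytes + entry_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(entry)
        batch_bytes += entry_bytes
    if batch:
        yield batch

# Each batch would then be sent separately, e.g.:
#   for batch in split_into_batches(pending_entries):
#       client.write_log_entries(batch)   # hypothetical send call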
