Ambry compaction and uploading very large files #2560

Open
Menolith opened this issue Aug 31, 2023 · 0 comments

I've fallen into a bit of a rabbit hole when trying to deal with large files in Ambry, and specifically trying to compact them out of logs properly. Namely, compaction does not seem to recognize deleted blobs if they were uploaded in parts.

To illustrate, I have a 3.8 GB test file which is too large to be read into memory for uploading. Using curl's -T flag, I can successfully upload the file into the system:
curl -i -H "x-ambry-service-id: CUrlUpload" -H "x-ambry-owner-id: whoami" -H "x-ambry-content-type: text/plain" -H "x-ambry-um-description: Demonstration File" http://localhost:1174/ -T ./a_very_large_file.txt -X POST

The blob can be retrieved with a GET as expected, but something seems to be wrong: if I send a HEAD request, the x-ambry-request-cost header claims that the blob is 0.0 GB:
x-ambry-request-cost: WRITE_CAPACITY_UNIT=906.0; STORAGE_IN_GB=0.0
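
To check this value from a script rather than by eye, the header can be split into its components. The following is a minimal sketch that assumes the header always has the `NAME=value; NAME=value` shape seen in the single sample above; `parse_request_cost` is a hypothetical helper, not part of Ambry.

```python
def parse_request_cost(header_value):
    """Split an x-ambry-request-cost value into a {name: float} dict.

    Assumes the 'NAME=value; NAME=value' layout seen in the sample above.
    """
    costs = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, _, value = part.partition("=")
        costs[name.strip()] = float(value)
    return costs


sample = "WRITE_CAPACITY_UNIT=906.0; STORAGE_IN_GB=0.0"
cost = parse_request_cost(sample)
if cost["STORAGE_IN_GB"] == 0.0:
    print("blob reports zero storage cost")
```

A zero `STORAGE_IN_GB` for a multi-gigabyte blob is the symptom described here; the sketch only detects it and says nothing about its cause.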

Despite this, deletion seems to work properly, since subsequent GETs return 410 Gone.
The problem arises when compaction tries to run: the Ambry server only logs the following:
[2023-08-31 13:30:06,291] INFO StoreId: Partition[0]. DataDir: /home/meno/testambry/db/0. Capacity: 10737418240 is not eligible for compaction due to empty compaction details (com.github.ambry.store.CompactionManager$CompactionExecutor)

Thus, the blob stays in the log, ostensibly permanently.

The default curl command provided in the quickstart guide works: the request-cost shows as non-zero and compaction runs successfully. However, that command uses the --data-binary flag, which requires the blob to be small enough to be fully read into memory, and that is not possible for many of the blobs I'm working with.
I've also tried using the requests library in Python to upload the file without curl, but that runs into the same issues with large files.
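
For anyone trying to reproduce this without curl or third-party packages, a streaming upload can be sketched with the standard library's `http.client`, which reads a file object in blocks instead of loading it whole. This is an illustration under assumptions, not a fix: the helper names are hypothetical, the header values are copied from the curl command above, and reading the blob ID from the `Location` response header is an assumption about the frontend's reply.

```python
import http.client
import os


def build_upload_headers(path, service_id="CUrlUpload", owner_id="whoami",
                         content_type="text/plain"):
    """Build the x-ambry-* headers used in the curl command above.

    Content-Length is set explicitly; without it, http.client sends a
    file body with chunked transfer encoding.
    """
    return {
        "x-ambry-service-id": service_id,
        "x-ambry-owner-id": owner_id,
        "x-ambry-content-type": content_type,
        "Content-Length": str(os.path.getsize(path)),
    }


def upload_streaming(path, host="localhost", port=1174):
    """POST the file at `path`, streaming it from disk in blocks."""
    headers = build_upload_headers(path)
    conn = http.client.HTTPConnection(host, port)
    try:
        with open(path, "rb") as body:
            # A file object body is read and sent block by block.
            conn.request("POST", "/", body=body, headers=headers)
        response = conn.getresponse()
        response.read()  # drain the response body
        # Assumption: the new blob ID comes back in the Location header.
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `upload_streaming("./a_very_large_file.txt")` against the frontend on port 1174 would mirror the `-T` upload; whether it shows the same zero storage cost would need to be checked with a HEAD request afterwards.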

Additionally, and this may or may not be related, all of the upload operations (whether using --data-binary, -T or the Python script) result in cryptic quota errors:

[2023-08-31 13:36:40,513] WARN Could not get recommendation for quota [READ_CAPACITY_UNIT, WRITE_CAPACITY_UNIT] due to exception: Could not recommend for request / with resourceid -1 due to Couldn't find quota for resource: -1 (com.github.ambry.quota.AmbryQuotaManager)
[2023-08-31 13:36:41,180] WARN Exception com.github.ambry.quota.QuotaException: Could not charge for request DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
POST / HTTP/1.1
Host: localhost:1174
User-Agent: curl/7.81.0
Accept: */*
x-ambry-service-id: CUrlUpload
x-ambry-owner-id: whoami
x-ambry-content-type: text/plain
x-ambry-um-description: Demonstration File
Content-Type: application/x-www-form-urlencoded
content-length: 385159 due to Couldn't find quota for resource: -1 while charging for [READ_CAPACITY_UNIT, WRITE_CAPACITY_UNIT] quotas. (com.github.ambry.quota.AmbryQuotaManager)
[2023-08-31 13:36:41,209] WARN Could not get recommendation for quota [READ_CAPACITY_UNIT, WRITE_CAPACITY_UNIT] due to exception: Could not recommend for request / with resourceid -1 due to Couldn't find quota for resource: -1 (com.github.ambry.quota.AmbryQuotaManager)

These errors repeat by the hundreds for large files. I don't know what to make of this, as the upload/download operations still work just fine; only compaction causes issues. Any insights?
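
To confirm that the hundreds of warnings are the same few messages repeating, the server log can be tallied. The sketch below assumes each log entry starts with a `[timestamp] LEVEL message` line as in the excerpts above, and it skips continuation lines such as the dumped request headers; `tally_warnings` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the first line of an entry, e.g. "[2023-08-31 13:36:40,513] WARN ..."
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def tally_warnings(lines, prefix_len=40):
    """Count WARN entries, grouped by the first `prefix_len` chars of the message."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == "WARN":
            counts[match.group("msg")[:prefix_len]] += 1
    return counts
```

Feeding it the lines of the server log (for example `tally_warnings(open(log_path))`, where `log_path` is wherever the server writes its log) gives a count per distinct message prefix.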
