Describe the bug
We noticed that the compactor is failing to compact two specific 12-hour blocks into a 24-hour block with the error below:
At the same time, objects with a `segment` prefix are being created in our Swift object storage.

We tried increasing the `request_timeout`, but it didn't help; we just got a different error:
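For reference, the timeout change was roughly the following (a minimal sketch of the relevant part of the Mimir configuration; field names assume Mimir's Swift blocks-storage options, and the endpoint/container values are placeholders):

```yaml
blocks_storage:
  backend: swift
  swift:
    auth_url: https://keystone.example.internal/v3  # placeholder
    container_name: mimir-blocks                    # placeholder
    connect_timeout: 10s
    # Raised from the default while debugging the failed index uploads:
    request_timeout: 5m
```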
It seems that the issue occurs when the size of the index is larger than 1 GiB, and from what we have been able to find, the segment objects are coming from thanos-io, which Mimir uses to talk to Swift:
Check this
According to this:

> By default, OpenStack Swift has a limit for maximum file size of 5 GiB. Thanos index files are often larger than that. To resolve this issue, Thanos uses Static Large Objects (SLO), which are uploaded as segments. These are by default put into the `segments` directory of the same container. The default limit for using SLO is 1 GiB, which is also the maximum size of a segment. If you don't want to use the same container for the segments (best practice is to use `<container_name>_segments` to avoid polluting the listing of the container objects), you can use the `large_file_segments_container_name` option to override the default and put the segments into another container. In rare cases you can switch to Dynamic Large Objects (DLO) by setting `use_dynamic_large_objects` to true, but use it with caution, since it relies even more on eventual consistency.

To overcome the problem, we have introduced grouping and sharding in our compactors, which seems to reduce the size of the indexes (sketched below). We had not been using sharding until now, as we have ~4.5M metrics and the recommendation is 1 group per 8M metrics.
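Applying the first option from the quoted docs would look roughly like this (a minimal sketch of a Swift bucket config; only the two quoted option names are taken from the docs, the container names are placeholders, and the exact field layout may differ between Thanos/Mimir versions):

```yaml
type: SWIFT
config:
  container_name: mimir-blocks                        # placeholder
  # Keep SLO segments out of the main container's listing, as the
  # quoted docs recommend:
  large_file_segments_container_name: mimir-blocks_segments
  # Leave DLO off; per the quote it relies even more on eventual
  # consistency:
  use_dynamic_large_objects: false
```

The compactor-side change was along these lines (a hedged sketch assuming Mimir's split-and-merge compactor limits; option names and values are illustrative, not taken verbatim from our setup):

```yaml
limits:
  # With ~4.5M metrics and a guideline of 1 group per 8M, a single
  # group would normally suffice; splitting further shrinks each index.
  compactor_split_groups: 2
  compactor_split_and_merge_shards: 2
```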
Please note that, using the OpenStack swiftclient, we can upload files larger than 1 GiB in a few seconds, so the issue does not seem to be on the storage side.
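The manual check was essentially the following (container and file names are placeholders):

```sh
# Upload a >1 GiB file as a Static Large Object with 1 GiB segments,
# using python-swiftclient:
swift upload --use-slo --segment-size 1073741824 mimir-blocks ./index
```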
To Reproduce
1. Integrate Mimir with Swift object storage.
2. Try to compact blocks whose resulting index will be larger than 1 GiB.
Expected behavior
Files larger than 1 GiB should be successfully uploaded to Swift object storage.
Environment
Additional Context
Compactor Logs
Source blocks