[Bug]: Indexnode is oomkilled when concurrent upserting and searching #33005

Closed
ThreadDao opened this issue May 13, 2024 · 13 comments

ThreadDao commented May 13, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.4-20240509-263d3cc9
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):  pulsar  
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus 2.4.0rc53
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

  1. deploy milvus with config
    indexNode:
      replicas: 2
      resources:
        limits:
          cpu: "8"
          memory: 8Gi
        requests:
          cpu: "4"
          memory: 4Gi
  config:
    dataCoord:
      enableActiveStandby: true
      segment:
        enableLevelZero: true
    indexCoord:
      enableActiveStandby: true
    log:
      level: debug
    queryCoord:
      enableActiveStandby: true
    quotaAndLimits:
      flushRate:
        enabled: true
        max: 0.1
    rootCoord:
      enableActiveStandby: true
    trace:
      exporter: jaeger
      jaeger:
        url: http://tempo-distributor.tempo:14268/api/traces
      sampleFraction: 1
  2. create collection with 2 shards -> build HNSW index {'M': 8, 'efConstruction': 200} -> insert 30m-128d data -> flush -> index again -> load
  3. concurrent: upsert + flush + search
    [screenshot]
  4. two indexNodes were OOM-killed many times (a pymilvus sketch of steps 2-4 follows the pod listing below):
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-4rhb6     1/1     Running            706 (9m46s ago)   3d12h   10.104.23.173   4am-node27   <none>           <none>
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-phctq     1/1     Running            703 (2m56s ago)   3d12h   10.104.21.117   4am-node24   <none>           <none>
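
For reference, a minimal pymilvus sketch of steps 2-4 above, assuming the ORM API; the collection name, uri, and tiny batch size are placeholders (the real test inserts 30m 128-dim rows in batches and runs the upsert/flush/search phase from many concurrent workers):

# Hypothetical repro sketch; names, uri, and batch size are placeholders.
import random
from pymilvus import (Collection, CollectionSchema, FieldSchema, DataType,
                      connections)

connections.connect(uri="http://localhost:19530")

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vec", DataType.FLOAT_VECTOR, dim=128),
])
coll = Collection("oom_repro", schema, num_shards=2)
coll.create_index("vec", {"index_type": "HNSW", "metric_type": "L2",
                          "params": {"M": 8, "efConstruction": 200}})

ids = list(range(1000))
vecs = [[random.random() for _ in range(128)] for _ in ids]
coll.insert([ids, vecs])   # repeated in batches up to 30m rows in the real test
coll.flush()
coll.load()

# concurrent phase (run from several workers): upsert + flush + search
coll.upsert([ids, vecs])
coll.search(data=[vecs[0]], anns_field="vec",
            param={"metric_type": "L2", "params": {"ef": 32}}, limit=10)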

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

level-zero-upsert-op-92-2277-etcd-0                               1/1     Running            0                 3d13h   10.104.27.56    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-etcd-1                               1/1     Running            0                 3d13h   10.104.16.203   4am-node21   <none>           <none>
level-zero-upsert-op-92-2277-etcd-2                               1/1     Running            0                 3d13h   10.104.24.210   4am-node29   <none>           <none>
level-zero-upsert-op-92-2277-milvus-datanode-584cf4b79c-9wql8     1/1     Running            0                 3d12h   10.104.27.71    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-milvus-datanode-584cf4b79c-ccq6h     1/1     Running            0                 3d12h   10.104.30.141   4am-node38   <none>           <none>
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-4rhb6     1/1     Running            706 (9m46s ago)   3d12h   10.104.23.173   4am-node27   <none>           <none>
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-phctq     1/1     Running            703 (2m56s ago)   3d12h   10.104.21.117   4am-node24   <none>           <none>
level-zero-upsert-op-92-2277-milvus-mixcoord-578db558d4-kz9hr     1/1     Running            0                 3d12h   10.104.27.72    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-milvus-proxy-6545c9f4b6-qhbmd        1/1     Running            0                 3d12h   10.104.16.222   4am-node21   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d75zs58   1/1     Running            0                 3d12h   10.104.4.232    4am-node11   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d77rqvk   1/1     Running            0                 3d12h   10.104.23.174   4am-node27   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d7b7jdf   1/1     Running            0                 3d12h   10.104.24.225   4am-node29   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d7djc6n   1/1     Running            0                 3d12h   10.104.6.112    4am-node13   <none>           <none>
level-zero-upsert-op-92-2277-minio-0                              1/1     Running            0                 3d13h   10.104.16.204   4am-node21   <none>           <none>
level-zero-upsert-op-92-2277-minio-1                              1/1     Running            0                 3d13h   10.104.34.135   4am-node37   <none>           <none>
level-zero-upsert-op-92-2277-minio-2                              1/1     Running            0                 3d13h   10.104.24.212   4am-node29   <none>           <none>
level-zero-upsert-op-92-2277-minio-3                              1/1     Running            0                 3d13h   10.104.30.129   4am-node38   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-bookie-0                      1/1     Running            0                 3d13h   10.104.24.213   4am-node29   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-bookie-1                      1/1     Running            0                 3d13h   10.104.34.137   4am-node37   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-bookie-2                      1/1     Running            0                 3d13h   10.104.27.61    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-bookie-init-pmpnr             0/1     Completed          0                 3d13h   10.104.6.88     4am-node13   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-broker-0                      1/1     Running            0                 3d13h   10.104.6.91     4am-node13   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-proxy-0                       1/1     Running            0                 3d13h   10.104.17.132   4am-node23   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-pulsar-init-znkjf             0/1     Completed          0                 3d13h   10.104.6.89     4am-node13   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-recovery-0                    1/1     Running            0                 3d13h   10.104.19.83    4am-node28   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-zookeeper-0                   1/1     Running            0                 3d13h   10.104.18.163   4am-node25   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-zookeeper-1                   1/1     Running            0                 3d13h   10.104.16.214   4am-node21   <none>           <none>
level-zero-upsert-op-92-2277-pulsar-zookeeper-2                   1/1     Running            0                 3d13h   10.104.17.152   4am-node23   <none>           <none>

Anything else?

No response

ThreadDao added the kind/bug and needs-triage labels May 13, 2024
ThreadDao added the priority/critical-urgent label May 13, 2024
ThreadDao added this to the 2.4.2 milestone May 13, 2024
ThreadDao added the triage/accepted label and removed the needs-triage label May 13, 2024
ThreadDao (Contributor, Author) commented:

@XuanYang-cn
[screenshot]

ThreadDao (Contributor, Author) commented:

The same issue occurs in another test scene: create a collection with 2 shards and a partition-key field (num_partitions=32) -> build HNSW index {'M': 8, 'efConstruction': 200} -> insert 30m-128d data -> index again, and the indexnode is OOM-killed (a schema sketch follows the pod listing below):

level-zero-upsert-op-92-2277-milvus-datanode-584cf4b79c-9wql8     1/1     Running            0                 3d12h   10.104.27.71    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-milvus-datanode-584cf4b79c-ccq6h     1/1     Running            0                 3d12h   10.104.30.141   4am-node38   <none>           <none>
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-4rhb6     1/1     Running            706 (9m46s ago)   3d12h   10.104.23.173   4am-node27   <none>           <none>
level-zero-upsert-op-92-2277-milvus-indexnode-b657cf956-phctq     1/1     Running            703 (2m56s ago)   3d12h   10.104.21.117   4am-node24   <none>           <none>
level-zero-upsert-op-92-2277-milvus-mixcoord-578db558d4-kz9hr     1/1     Running            0                 3d12h   10.104.27.72    4am-node31   <none>           <none>
level-zero-upsert-op-92-2277-milvus-proxy-6545c9f4b6-qhbmd        1/1     Running            0                 3d12h   10.104.16.222   4am-node21   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d75zs58   1/1     Running            0                 3d12h   10.104.4.232    4am-node11   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d77rqvk   1/1     Running            0                 3d12h   10.104.23.174   4am-node27   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d7b7jdf   1/1     Running            0                 3d12h   10.104.24.225   4am-node29   <none>           <none>
level-zero-upsert-op-92-2277-milvus-querynode-0-55c44868d7djc6n   1/1     Running            0                 3d12h   10.104.6.112    4am-node13   <none>           <none>
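
A hedged sketch of the schema difference for this partition-key scene, again with placeholder names and assuming the same pymilvus ORM calls as in the earlier sketch:

# Hypothetical sketch of the partition-key variant; names are placeholders.
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("key", DataType.INT64, is_partition_key=True),
    FieldSchema("vec", DataType.FLOAT_VECTOR, dim=128),
])
coll = Collection("oom_repro_pk", schema, num_shards=2, num_partitions=32)
coll.create_index("vec", {"index_type": "HNSW", "metric_type": "L2",
                          "params": {"M": 8, "efConstruction": 200}})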

yanliang567 removed their assignment May 13, 2024

XuanYang-cn commented May 13, 2024

The segment causing the indexnode OOM was compacted as expected; the sum of its binlog sizes is < 1 GB:

Log Size: 19569012       Entry Num: 125299       TimeRange:2024-05-09 23:43:50.752 +0800-2024-05-09 23:44:01.402 +0800
Path: ROOT_PATH/insert_log/449647020700604643/449647020700604644/449647020700807478/101/449647020700807909
Log Size: 14532680       Entry Num: 92466        TimeRange:2024-05-09 23:44:01.402 +0800-2024-05-09 23:44:07.001 +0800
--- Field Log Size: 967.335433 MB
=== Segment Total Binlog Size:  1006.485847 MB
**************************************

XuanYang-cn (Contributor) commented:

Segment ID: 449647020700807478
Segment State: Flushed  Segment Level: L1
Collection ID: 449647020700604643               PartitionID: 449647020700604644
Insert Channel:level-zero-upsert-op-92-2277-rootcoord-dml_0_449647020700604643v0
Num of Rows: 6482715            Max Row Num: 2003249
Last Expire Time: 2024-05-10 00:03:10.001 +0800

Two questions

  1. 1 GB of binlog files corresponds to 6,482,715 rows, way more than expectedMaxSize (2,003,249 rows). Why?
  2. An 8 GB indexnode panicked on 1 GB of segment binlogs. Is there an 8x fanout when the indexnode builds indexes?

xiaofan-luan (Contributor) commented:

(Quoting the segment details and the two questions from the comment above.)

An 8 GB indexnode failing on a 1 GB segment might be expected; we usually see about 8x memory usage when building an index.
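
A back-of-envelope check with the numbers already in this thread (the ~8x factor is only the rule of thumb quoted above, not a measured value):

rows, dim = 6482715, 128
raw_bytes = rows * dim * 4        # float32 vectors held in memory
print(raw_bytes / 2**30)          # ~3.09 GiB of raw vector data
print(1006.485847 / 1024)         # ~0.98 GiB of binlogs on disk for this segment
print(8 * 1006.485847 / 1024)     # ~7.9 GiB estimated peak at 8x binlog size
# Either way, the 8 GiB indexnode limit leaves little or no headroom for the build.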

xiaofan-luan (Contributor) commented:

Should we revert the segment size change and keep it at 512 MB?

xiaocai2333 (Contributor) commented:

Path: files/insert_log/449647020700604643/449647020700604644/449647020700807478/101/449647020700807909
Log Size: 14532680       Entry Num: 92466        TimeRange:2024-05-09 23:44:01.402 +0800-2024-05-09 23:44:07.001 +0800
--- Field Log Size: 967.335433 MB
=== Segment Total Binlog Size:  1006.485847 MB

From the binlog size info in birdwatcher, we get an almost 3.25x compression ratio. Is this normal?

>>> 92466*128*4/14532680
3.257664243621961

xiaocai2333 (Contributor) commented:

If a compression ratio of 3.25 times is normal, then we probably should not use binlog size to determine the compaction strategy.

xiaocai2333 (Contributor) commented:

In my local test, the calculated compression ratio is 1.

Path: files/insert_log/449730971532001323/449730971532001324/449730971532201366/101/449730971532201376
Log Size: 51200004       Entry Num: 100000       TimeRange:2024-05-13 16:05:18.699 +0800-2024-05-13 16:05:18.699 +0800

xiaocai2333 (Contributor) commented:

I tested the case where the vector contents are all 0 and found that the binlog size after compaction is indeed much smaller than the data size.
Also note that the binlog size of an inserted segment records the memory size, not the file size, while the binlog size of a compacted segment records the file size (a generic illustration follows the log output below).

segment: **449730971534417859**
Field 101:
Path: files/insert_log/449730971534217828/449730971534217829/449730971534417859/101/449730971532201506
Log Size: 51200004       Entry Num: 100000       TimeRange:2024-05-13 16:59:53.849 +0800-2024-05-13 16:59:53.849 +0800
--- Field Log Size: 48.828129 MB


segment: **449730971532201555**
Field 101:
Path: files/insert_log/449730971534217828/449730971534217829/449730971532201555/101/449730971532201557
Log Size: 2445   Entry Num: 125299       TimeRange:2024-05-13 16:59:53.849 +0800-2024-05-13 17:00:13.099 +0800
Path: files/insert_log/449730971534217828/449730971534217829/449730971532201555/101/449730971532201561
Log Size: 2445   Entry Num: 125299       TimeRange:2024-05-13 17:00:05.449 +0800-2024-05-13 17:00:13.099 +0800
Path: files/insert_log/449730971534217828/449730971534217829/449730971532201555/101/449730971532201565
Log Size: 2445   Entry Num: 49402        TimeRange:2024-05-13 17:00:05.449 +0800-2024-05-13 17:00:05.449 +0800

Segment 449730971532201555 was compacted from 449730971534417859 and two other segments.
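
As a generic illustration (plain gzip here, not Milvus's actual binlog encoding) of why all-zero vectors can have a file size far below their in-memory size:

import gzip

rows, dim = 100000, 128
raw = bytes(rows * dim * 4)         # 100k all-zero 128-dim float32 vectors
print(len(raw))                     # 51,200,000 bytes in memory
print(len(gzip.compress(raw)))      # ~50 KB compressed: file size << memory size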

XuanYang-cn added a commit to XuanYang-cn/milvus that referenced this issue May 13, 2024
See also: milvus-io#33005

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
XuanYang-cn (Contributor) commented:

/unassign

sre-ci-robot pushed a commit that referenced this issue May 15, 2024
issue: #33005
1. Add a `MemorySize` field for insert binlogs.
2. `LogSize` means the file size of the binlog object in storage.
3. `MemorySize` means the size of the data in memory.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
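
A conceptual sketch of the distinction this fix records (plain Python, not Milvus's actual Go types; the field names follow the PR description above):

# Conceptual sketch only; these are not Milvus's real structs.
from dataclasses import dataclass

@dataclass
class Binlog:
    log_size: int     # file size of the binlog object in storage
    memory_size: int  # size of the decoded data in memory

def segment_memory_size(binlogs: list[Binlog]) -> int:
    # Size-based decisions (e.g. picking compaction targets) can use the
    # in-memory size instead of the compressed on-disk size.
    return sum(b.memory_size for b in binlogs)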
xiaocai2333 (Contributor) commented:

The fix PR has been merged, please verify.
/assign @ThreadDao

ThreadDao (Contributor, Author) commented:

Fixed in 2.4-20240515-b2d83d33.
[screenshot]
