[Bug]: [benchmark][standalone][LRU] hybrid_search failed and querynode restart in diskann case #33007

Open
1 task done
wangting0128 opened this issue May 13, 2024 · 0 comments
Labels
kind/bug Issues or changes related to a bug test/benchmark benchmark test triage/accepted Indicates an issue or PR is ready to be actively worked on.

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: milvus-io-lru-dev-9234a94-20240506
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):  pulsar  
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.0rc66
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: lru-fouramf-4c5j7
test case name: test_hybrid_search_locust_shard1_float_dql_diskann_standalone

server:

NAME                                                              READY   STATUS        RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
lru-scene17-etcd-0                                                1/1     Running       0               4m27s   10.104.19.219   4am-node28   <none>           <none>
lru-scene17-milvus-standalone-86878b77d6-dmdlk                    1/1     Running       0               4m27s   10.104.26.87    4am-node32   <none>           <none>
lru-scene17-minio-985878f45-khn9g                                 1/1     Running       0               4m27s   10.104.19.218   4am-node28   <none>           <none>
lru-scene17-minio-update-prometheus-secret-8cgbf                  0/1     Completed     0               28s     10.104.9.33     4am-node14   <none>           <none>
lru-scene17-pulsar-bookie-0                                       1/1     Running       0               4m27s   10.104.21.182   4am-node24   <none>           <none>
lru-scene17-pulsar-bookie-1                                       1/1     Running       0               4m27s   10.104.17.54    4am-node23   <none>           <none>
lru-scene17-pulsar-bookie-2                                       1/1     Running       0               4m27s   10.104.30.124   4am-node38   <none>           <none>
lru-scene17-pulsar-bookie-init-8847w                              0/1     Completed     0               4m27s   10.104.5.192    4am-node12   <none>           <none>
lru-scene17-pulsar-broker-0                                       1/1     Running       0               4m27s   10.104.5.191    4am-node12   <none>           <none>
lru-scene17-pulsar-proxy-0                                        1/1     Running       0               4m27s   10.104.9.25     4am-node14   <none>           <none>
lru-scene17-pulsar-pulsar-init-j6b8r                              0/1     Completed     0               4m27s   10.104.5.190    4am-node12   <none>           <none>
lru-scene17-pulsar-recovery-0                                     1/1     Running       0               4m27s   10.104.6.27     4am-node13   <none>           <none>
lru-scene17-pulsar-zookeeper-0                                    1/1     Running       0               4m27s   10.104.17.53    4am-node23   <none>           <none>
lru-scene17-pulsar-zookeeper-1                                    1/1     Running       0               3m46s   10.104.33.100   4am-node36   <none>           <none>
lru-scene17-pulsar-zookeeper-2                                    1/1     Running       0               3m11s   10.104.19.230   4am-node28   <none>           <none>
[Screenshot 2024-05-13 14 23 48] [Screenshot 2024-05-13 14 25 04]

client log:
[Screenshot 2024-05-13 14 22 55]

test result:

[2024-05-10 07:09:06,831 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-05-10 07:09:06,831 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-05-10 07:09:06,831 -  INFO - fouram]: grpc     hybrid_search                                                                    740 740(100.00%) |  96626   90001  153830  91000 |    0.21        0.21 (stats.py:789)
[2024-05-10 07:09:06,832 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-05-10 07:09:06,832 -  INFO - fouram]:          Aggregated                                                                       740 740(100.00%) |  96626   90001  153830  91000 |    0.21        0.21 (stats.py:789)
[2024-05-10 07:09:06,832 -  INFO - fouram]:  (stats.py:790)
[2024-05-10 07:09:06,835 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'standalone',
            'config_name': 'standalone_16c64m',
            'config': {'standalone': {'resources': {'limits': {'cpu': '2',
                                                               'memory': '8Gi',
                                                               'ephemeral-storage': '70Gi'},
                                                    'requests': {'cpu': '2',
                                                                 'memory': '8Gi'}},
                                      'messageQueue': 'pulsar',
                                      'extraEnv': [{'name': 'LOCAL_STORAGE_SIZE',
                                                    'value': '70'}],
                                      'disk': {'size': {'enabled': True}}},
                       'cluster': {'enabled': False},
                       'etcd': {'replicaCount': 1,
                                'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'minio': {'mode': 'standalone',
                                 'metrics': {'podMonitor': {'enabled': True}},
                                 'persistence': {'size': '320Gi'}},
                       'pulsar': {'enabled': True},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'extraConfigFiles': {'user.yaml': 'queryNode:\n'
                                                         '  '
                                                         'diskCacheCapacityLimit: '
                                                         '51539607552\n'
                                                         '  mmap:\n'
                                                         '    mmapEnabled: '
                                                         'true\n'
                                                         '  lazyloadEnabled: '
                                                         'true\n'
                                                         '  '
                                                         'useStreamComputing: '
                                                         'true\n'
                                                         '  cache:\n'
                                                         '    warmup: sync\n'
                                                         '  '
                                                         'lazyloadWaitTimeout: '
                                                         '300000\n'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus',
                                         'tag': 'milvus-io-lru-dev-9234a94-20240506'}}},
            'host': 'lru-scene17-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_hybrid_search_locust_shard1_float_dql_diskann_standalone',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 2048,
                                                    'max_length': 10,
                                                    'scalars_index': {'int64_1': {},
                                                                      'id': {'index_type': 'INVERTED'},
                                                                      'bool_3': {'index_type': 'INVERTED'}},
                                                    'vectors_index': {'float_vector_1': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'},
                                                                      'float_vector_2': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'},
                                                                      'float_vector_3': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 7}},
                                                                       'array_int16_1': {'params': {'max_capacity': 7}},
                                                                       'array_int32_1': {'params': {'max_capacity': 7}},
                                                                       'array_int64_1': {'params': {'max_capacity': 7}},
                                                                       'array_double_1': {'params': {'max_capacity': 7}},
                                                                       'array_float_1': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 7}},
                                                                       'array_bool_1': {'params': {'max_capacity': 7}},
                                                                       'array_int8_2': {'params': {'max_capacity': 7}},
                                                                       'array_int16_2': {'params': {'max_capacity': 7}},
                                                                       'array_int32_2': {'params': {'max_capacity': 7}},
                                                                       'array_int64_2': {'params': {'max_capacity': 7}},
                                                                       'array_double_2': {'params': {'max_capacity': 7}},
                                                                       'array_float_2': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_2': {'params': {'max_capacity': 7}},
                                                                       'array_bool_2': {'params': {'max_capacity': 7}},
                                                                       'array_int8_3': {'params': {'max_capacity': 7}},
                                                                       'array_int16_3': {'params': {'max_capacity': 7}},
                                                                       'array_int32_3': {'params': {'max_capacity': 7}},
                                                                       'array_int64_3': {'params': {'max_capacity': 7}},
                                                                       'array_double_3': {'params': {'max_capacity': 7}},
                                                                       'array_float_3': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_3': {'params': {'max_capacity': 7}},
                                                                       'array_bool_3': {'params': {'max_capacity': 7}}},
                                                    'dataset_name': 'local',
                                                    'dataset_size': 1500000,
                                                    'ni_per': 100},
                                 'collection_params': {'other_fields': ['float_vector_1',
                                                                        'float_vector_2',
                                                                        'float_vector_3',
                                                                        'int8_1',
                                                                        'int16_1',
                                                                        'int32_1',
                                                                        'int64_1',
                                                                        'double_1',
                                                                        'float_1',
                                                                        'varchar_1',
                                                                        'bool_1',
                                                                        'json_1',
                                                                        'array_int8_1',
                                                                        'array_int16_1',
                                                                        'array_int32_1',
                                                                        'array_int64_1',
                                                                        'array_double_1',
                                                                        'array_float_1',
                                                                        'array_varchar_1',
                                                                        'array_bool_1',
                                                                        'int8_2',
                                                                        'int16_2',
                                                                        'int32_2',
                                                                        'int64_2',
                                                                        'double_2',
                                                                        'float_2',
                                                                        'varchar_2',
                                                                        'bool_2',
                                                                        'json_2',
                                                                        'array_int8_2',
                                                                        'array_int16_2',
                                                                        'array_int32_2',
                                                                        'array_int64_2',
                                                                        'array_double_2',
                                                                        'array_float_2',
                                                                        'array_varchar_2',
                                                                        'array_bool_2',
                                                                        'int8_3',
                                                                        'int16_3',
                                                                        'int32_3',
                                                                        'int64_3',
                                                                        'double_3',
                                                                        'float_3',
                                                                        'varchar_3',
                                                                        'bool_3',
                                                                        'json_3',
                                                                        'array_int8_3',
                                                                        'array_int16_3',
                                                                        'array_int32_3',
                                                                        'array_int64_3',
                                                                        'array_double_3',
                                                                        'array_float_3',
                                                                        'array_varchar_3',
                                                                        'array_bool_3',
                                                                        'varchar_tail_1',
                                                                        'varchar_tail_2',
                                                                        'varchar_tail_3',
                                                                        'varchar_tail_4',
                                                                        'varchar_tail_5',
                                                                        'varchar_tail_6',
                                                                        'varchar_tail_7',
                                                                        'varchar_tail_8'],
                                                       'shards_num': 1},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'DISKANN',
                                                  'index_param': {}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '1h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 100,
                                                                  'reqs': [{'search_param': {'search_list': 30},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': 'id '
                                                                                    '> '
                                                                                    '150000',
                                                                            'top_k': 10},
                                                                           {'search_param': {'search_list': 100},
                                                                            'anns_field': 'float_vector_1',
                                                                            'expr': 'int64_1 '
                                                                                    '<= '
                                                                                    '1350000',
                                                                            'top_k': 50},
                                                                           {'search_param': {'search_list': 1500},
                                                                            'anns_field': 'float_vector_2',
                                                                            'expr': 'array_length(array_int8_2) '
                                                                                    '== '
                                                                                    '7',
                                                                            'top_k': 1000},
                                                                           {'search_param': {'search_list': 20000},
                                                                            'anns_field': 'float_vector_3',
                                                                            'expr': 'bool_3 '
                                                                                    '== '
                                                                                    'True',
                                                                            'top_k': 16384}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['float_vector'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 90,
                                                                  'random_data': True}}]},
            'run_id': 2024050683566738,
            'datetime': '2024-05-06 06:52:36.047869',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 91499.8555,
                                      'float_vector_1': {'RT': 108520.8358},
                                      'float_vector_2': {'RT': 126870.0384},
                                      'float_vector_3': {'RT': 6847.9264},
                                      'int64_1': {'RT': 1.0224},
                                      'id': {'RT': 1.0149},
                                      'bool_3': {'RT': 0.5121}},
                            'insert': {'total_time': 4024.4509,
                                       'VPS': 372.7217,
                                       'batch_time': 0.2683,
                                       'batch': 100},
                            'flush': {'RT': 2.5421},
                            'load': {'RT': 4.5266},
                            'Locust': {'Aggregated': {'Requests': 740,
                                                      'Fails': 740,
                                                      'RPS': 0.21,
                                                      'fail_s': 1.0,
                                                      'RT_max': 153830.53,
                                                      'RT_avg': 96626.43,
                                                      'TP50': 91000.0,
                                                      'TP99': 154000.0},
                                       'hybrid_search': {'Requests': 740,
                                                         'Fails': 740,
                                                         'RPS': 0.21,
                                                         'fail_s': 1.0,
                                                         'RT_max': 153830.53,
                                                         'RT_avg': 96626.43,
                                                         'TP50': 91000.0,
                                                         'TP99': 154000.0}}}}}
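
A quick sanity check of the numbers in the report, as a sketch only. All values are copied from the Locust summary and the report data above; the variable names are made up for illustration. It shows that the minimum latency sits just above the 90 s client timeout, which is consistent with (though not proof of) every request running until it timed out, and that the reported insert VPS matches dataset_size / total_time.

```python
# Values copied from the Locust summary and report data above.
REQUESTS, FAILS = 740, 740
RT_MIN_MS = 90001                      # "Min" column of the Locust table
CLIENT_TIMEOUT_S = 90                  # 'timeout' in the hybrid_search params
DATASET_SIZE = 1_500_000               # 'dataset_size'
INSERT_TOTAL_S = 4024.4509             # 'insert' -> 'total_time'
REPORTED_VPS = 372.7217                # 'insert' -> 'VPS'
DISK_CACHE_LIMIT_BYTES = 51539607552   # queryNode.diskCacheCapacityLimit

# Fraction of requests that failed.
failure_rate = FAILS / REQUESTS

# True when even the fastest request took at least as long as the client timeout.
min_exceeds_timeout = RT_MIN_MS >= CLIENT_TIMEOUT_S * 1000

# Insert throughput recomputed from the row count and total insert time.
derived_vps = DATASET_SIZE / INSERT_TOTAL_S

# Disk cache limit expressed in GiB for comparison with the 70Gi storage limit.
disk_cache_gib = DISK_CACHE_LIMIT_BYTES / 2**30

print(f"failure rate: {failure_rate:.2%}")
print(f"fastest request >= client timeout: {min_exceeds_timeout}")
print(f"derived insert VPS: {derived_vps:.4f} (reported {REPORTED_VPS})")
print(f"diskCacheCapacityLimit: {disk_cache_gib:.0f} GiB")
```

The check says nothing about the root cause; it only narrows where to look, namely whether the querynode answers at all within the timeout rather than whether it answers slowly.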

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :purpose:  `shard_num=1, float_vector DQL`
            verify a concurrent DQL scenario with 4 float_vector fields (DISKANN) and 60 scalar fields

        :test steps:
            1. create collection with fields:
                'float_vector': 2048dim,
                'float_vector_1': 2048dim,
                'float_vector_2': 2048dim,
                'float_vector_3': 2048dim,
                all scalar fields: varchar max_length=10, array max_capacity=7
            2. build indexes:
                DISKANN: 'float_vector', 'float_vector_1', 'float_vector_2', 'float_vector_3'
                default_scalar_index: 'int64_1'
                INVERTED: 'id', 'bool_3'
            3. insert 1.5M rows (dataset_size=1500000 in the report data, ni_per=100)
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 1
            7. concurrent request:
                - hybrid_search

Milvus Log

No response

Anything else?

No response

@wangting0128 wangting0128 added kind/bug Issues or changes related to a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels May 13, 2024
MrPresent-Han added a commit to MrPresent-Han/milvus that referenced this issue May 13, 2024
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
MrPresent-Han added a commit to MrPresent-Han/milvus that referenced this issue May 13, 2024
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 14, 2024
@yanliang567 yanliang567 added this to the 2.4.lru milestone May 14, 2024
@yanliang567 yanliang567 removed their assignment May 14, 2024
MrPresent-Han added a commit to MrPresent-Han/milvus that referenced this issue May 14, 2024
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>