Existing replication factor increase implementation does not work with raft #4840

Closed
1 task done
parkerduckworth opened this issue May 3, 2024 · 1 comment
parkerduckworth commented May 3, 2024

Updating the replication factor of an existing collection doesn't work as expected in Weaviate v1.25, the Raft-based release.

How to reproduce this bug?

  1. Set up a multi-node Weaviate cluster locally, start the server, and connect a client.
import weaviate
import weaviate.classes.config as wvc
client = weaviate.connect_to_local()
  1. Create a collection with a replication factor of 1.
if client.collections.exists("Testing_replication"):
    # Delete the existing "Testing_replication" collection - THIS WILL DELETE THE COLLECTION AND ALL ITS DATA
    client.collections.delete("Testing_replication")  # Replace with your collection name

collection = client.collections.create(
    name="Testing_replication",
    vector_index_config=wvc.Configure.VectorIndex.hnsw(),
    properties=[
        wvc.Property(name="title", data_type=wvc.DataType.TEXT, vectorize_property_name=True),
        wvc.Property(name="genres", data_type=wvc.DataType.TEXT),
        wvc.Property(name="keywords", data_type=wvc.DataType.TEXT, vectorize_property_name=True),
        wvc.Property(name="popularity", data_type=wvc.DataType.TEXT),
        wvc.Property(name="runtime", data_type=wvc.DataType.TEXT),
        wvc.Property(name="cast", data_type=wvc.DataType.TEXT),
        wvc.Property(name="language", data_type=wvc.DataType.TEXT, vectorize_property_name=True),
        wvc.Property(name="tagline", data_type=wvc.DataType.TEXT),
    ],
    replication_config=wvc.Configure.replication(factor=1),
    multi_tenancy_config=wvc.Configure.multi_tenancy(False),
)
  1. Update the collection to a replication factor of 3.
from weaviate.classes.config import Reconfigure

# Get the "Testing_replication" collection object
collection = client.collections.get("Testing_replication")

# Update the collection configuration
collection.config.update(
    # Note: use Reconfigure here (not Configure)
    replication_config=Reconfigure.replication(
        factor=3
    )
)
  1. Insert objects into the collection.
# Insert objects to the collection
import random
import string
import time


def get_random_string(length):
    # Build a random string of lowercase ASCII letters
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for _ in range(length))


def generate_data_object(number_of_records):
    # Generate random objects whose keys match the collection's properties
    print(number_of_records)
    data_objects = [
        {
            "title": "title" + get_random_string(10),
            "genres": "genre" + get_random_string(3),
            "keywords": "keywords" + get_random_string(3),
            "popularity": "popularity" + get_random_string(3),
            "runtime": "runtime" + get_random_string(3),
            "cast": "cast" + get_random_string(3),
            "language": "language" + get_random_string(3),
            "tagline": "tagline" + get_random_string(3),
        } for _ in range(number_of_records)
    ]
    return data_objects


def load_records(client: weaviate.WeaviateClient, number_of_record, class_name):
    data_objects = generate_data_object(number_of_record)
    start_time = time.time()
    # Batch-import the objects; the batch is flushed when the context exits
    with client.batch.dynamic() as batch:
        for data_object in data_objects:
            batch.add_object(
                collection=class_name,
                properties=data_object,
            )
    end_time = time.time()
    total_time = end_time - start_time
    print(total_time)
    # Fail loudly if any object was rejected during the import
    assert len(client.batch.failed_objects) == 0
    client.close()

load_records(client, 100, "Testing_replication")
  1. Send a GraphQL Aggregate query to get the collection's object count after the objects are inserted.
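
The report does not include the query used in the last step. A minimal sketch of one way to send it follows. It uses only the standard library and assumes the server is reachable at http://localhost:8080 (the default for connect_to_local). The helper names build_aggregate_count_query, extract_count and aggregate_count are hypothetical and not part of any Weaviate client.

```python
import json
import urllib.request


def build_aggregate_count_query(collection: str) -> str:
    # GraphQL Aggregate query that asks only for the object count
    return "{ Aggregate { %s { meta { count } } } }" % collection


def extract_count(response: dict, collection: str) -> int:
    # Surface GraphQL errors (such as "shard ... already shut or dropped")
    # instead of silently returning a wrong count
    if response.get("errors"):
        messages = "; ".join(e.get("message", "") for e in response["errors"])
        raise RuntimeError(f"GraphQL resolved to an error: {messages}")
    return response["data"]["Aggregate"][collection][0]["meta"]["count"]


def aggregate_count(collection: str, base_url: str = "http://localhost:8080") -> int:
    # base_url is an assumption; adjust it to the node you want to query
    payload = json.dumps({"query": build_aggregate_count_query(collection)}).encode()
    request = urllib.request.Request(
        f"{base_url}/v1/graphql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return extract_count(json.load(reply), collection)


if __name__ == "__main__":
    print(aggregate_count("Testing_replication"))
```

With the bug present, this call would be expected to raise on the errors entry rather than return the count.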

What is the expected behavior?

After the replication factor is updated, an Aggregate query should return the correct object count for the collection.

What is the actual behavior?

The /nodes API shows the correct shard distribution and object count when the scale-out is executed after objects already exist. However, regardless of whether the replication factor is changed before or after objects are inserted, fetching the object count with Aggregate always fails with the error shard <name>: shard \"<shard>\" already shut or dropped, which suggests that the new replica shards aren't being initialized properly.
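
To compare what the /nodes API reports against the Aggregate result, the shard placement can be summarized per shard. The sketch below is illustrative only: it assumes the verbose nodes response (GET /v1/nodes?output=verbose) carries a "nodes" list whose entries have "name" and "shards", and that each shard has "name", "class" and "objectCount". The helper names and the base URL are hypothetical.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_nodes(base_url: str = "http://localhost:8080") -> dict:
    # Assumed endpoint and query parameter for per-shard details
    with urllib.request.urlopen(f"{base_url}/v1/nodes?output=verbose") as reply:
        return json.load(reply)


def replicas_per_shard(nodes_response: dict, collection: str) -> dict:
    # Map shard name -> {node name: object count} for one collection
    placement = defaultdict(dict)
    for node in nodes_response.get("nodes", []):
        for shard in node.get("shards") or []:
            if shard.get("class") == collection:
                placement[shard["name"]][node["name"]] = shard.get("objectCount", 0)
    return {name: dict(nodes) for name, nodes in placement.items()}


def under_replicated(placement: dict, factor: int) -> list:
    # Shards that are present on fewer nodes than the replication factor
    return sorted(name for name, nodes in placement.items() if len(nodes) < factor)


if __name__ == "__main__":
    placement = replicas_per_shard(fetch_nodes(), "Testing_replication")
    print(placement)
    print(under_replicated(placement, factor=3))
```

If every shard appears on three nodes here while Aggregate still fails, the replicas are registered but not usable, which matches the behavior described above.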

The replication factor acceptance test also fails with the same error: GraphQL resolved to an error: [{"locations":[{"column":7,"line":1}],"message":"explorer: list class: search: object search at index paragraph: shard \"IvGSULRqyvKi\" already shut or dropped"

Server Version

v1.25.0

@rthiiyer82

Verified on 1.25.1 patch release. Closing this issue.
