Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Model.listSearchIndexes() #14450

Closed
2 tasks done
hlebegue opened this issue Mar 19, 2024 · 5 comments
Closed
2 tasks done

Add Model.listSearchIndexes() #14450

hlebegue opened this issue Mar 19, 2024 · 5 comments
Labels
new feature This change adds new functionality, like a new method or class
Milestone

Comments

@hlebegue
Copy link

hlebegue commented Mar 19, 2024

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the bug has not already been reported

Mongoose version

8.2.2

Node.js version

16.20.2

MongoDB server version

Atlas latest

Typescript version (if applicable)

No response

Description

When I use model.createSearchIndex() with an index that already exists, my node application is going down with a SIGTERM

Steps to Reproduce

Use an existing test case to create a search index, but attempt to create it twice with the same name

//just there 
const vectorModelsSupported = [
		{
			name: 'text_embedding_3_small',
			size: 1536,
			enabled: true
		},
		{ 
			name: 'text_embedding_3_small', 
			size: 1536,
			enabled: true
		}
	]

const EmbeddingSchema = new Schema({
	chunkText: { type: String, maxLength: 32764, default: "" },
	embedding_text_embedding_3_small: { type: Array, maxLength: 1536 },
	embedding_text_embedding_3_large: { type: Array, },
	embedding_text_embedding_ada_002: { type: Array, maxLength: 1536 },
	chunkLen: Number,
	chunkIndex: Number,
	upload: {
		type: ObjectId,
		ref: "upload",
	},
	chunkUrl: { type: String, maxLength: 1000, }, /*ex: #page=2 for PDF or #id for HTML*/
	created: { type: Date, default: Date.now },
	usage: { type: Number },
	status: { type: String, default: "inactive" }, // inactive, active | error, selected, deleted
})

const createSearchIndexes = async () => { 
	
	//debug("createSearchIndexes", "This is the list of indexes:", await embedding.listIndexes())
	debug("createSearchIndexes", "This is the list of indexes:", await embedding.searchIndexes)
	if (!modelsSupported) return;
	
	modelsSupported.forEach((model) =>{
		if (model.enabled){
			
			//const field =
			
			const fields = {}
			fields[`embedding_${model.name}`]=  [
				{
					"dimensions": model.size, // 1536,
					"similarity": "euclidean",
					"type": "knnVector"
				}
			]

			try{
				embedding.createSearchIndex({
					name: model.name, // "text_embedding_3_small",
					definition: {
						"mappings": {
							"fields": fields
							}
						}
					}
				)
			}
			catch (err){
				debug("createSearchIndexes", "Failed to create index for model ",model, err)
			}
		
		}

	})
}

embedding.on('index',(event) => {
	debug("Embedding", "===> This is an index event", event)
	
	createSearchIndexes()

})

Expected Behavior

the createSearchIndex should throw an error I can catch, or there should be an API that returns the search index.

The Model.listIndexes does not return the search indexes.

@vkarpov15 vkarpov15 added this to the 8.2.4 milestone Mar 25, 2024
@vkarpov15 vkarpov15 added the has repro script There is a repro script, the Mongoose devs need to confirm that it reproduces the issue label Mar 25, 2024
@vkarpov15 vkarpov15 modified the milestones: 8.2.4, 8.2.5 Mar 28, 2024
@IslandRhythms IslandRhythms added can't reproduce Mongoose devs have been unable to reproduce this issue. Close after 14 days of inactivity. and removed has repro script There is a repro script, the Mongoose devs need to confirm that it reproduces the issue labels Apr 5, 2024
@IslandRhythms
Copy link
Collaborator

IslandRhythms commented Apr 5, 2024

const mongoose = require('mongoose');

const { Schema } = mongoose;

//just there 
const vectorModelsSupported = [
  {
    name: 'text_embedding_3_small',
    size: 1536,
    enabled: true
  },
  { 
    name: 'text_embedding_3_small', 
    size: 1536,
    enabled: true
  }
]

const EmbeddingSchema = new Schema({
chunkText: { type: String, maxLength: 32764, default: "" },
embedding_text_embedding_3_small: { type: Array, maxLength: 1536 },
embedding_text_embedding_3_large: { type: Array, },
embedding_text_embedding_ada_002: { type: Array, maxLength: 1536 },
chunkLen: Number,
chunkIndex: Number,
upload: {
  type: mongoose.Schema.Types.ObjectId,
  ref: "upload",
},
chunkUrl: { type: String, maxLength: 1000, }, /*ex: #page=2 for PDF or #id for HTML*/
created: { type: Date, default: Date.now },
usage: { type: Number },
status: { type: String, default: "inactive" }, // inactive, active | error, selected, deleted
});

const Test = mongoose.model('Test', EmbeddingSchema);


async function run() {
  await mongoose.connect('mongodb://localhost:27017');

  vectorModelsSupported.forEach((model) =>{
		if (model.enabled){
			
			//const field =
			
			const fields = {}
			fields[`embedding_${model.name}`]=  [
				{
					"dimensions": model.size, // 1536,
					"similarity": "euclidean",
					"type": "knnVector"
				}
			]

			try{
				Test.createSearchIndex({
					name: model.name, // "text_embedding_3_small",
					definition: {
						"mappings": {
							"fields": fields
							}
						}
					}
				)
			}
			catch (err){
				console.log('err', err);
			}
		
		}

	})
}

run();

@vkarpov15 vkarpov15 added needs repro script Maybe a bug, but no repro script. The issue reporter should create a script that demos the issue and removed can't reproduce Mongoose devs have been unable to reproduce this issue. Close after 14 days of inactivity. labels Apr 5, 2024
@vkarpov15 vkarpov15 modified the milestones: 8.3.1, 8.3.2 Apr 5, 2024
@hlebegue
Copy link
Author

hlebegue commented Apr 5, 2024

With the latest Mongoose, I was able to catch error directly treating as a promise so it didn't send a SIG TERM


  embedding.createSearchIndex({
	  name: model.name, // "text_embedding_3_small",
	  definition: {
		  "mappings": {
			  "fields": fields
			  }
		  }
  }).then( (success)=> {
	  console.log("Succeeded!", success)
  }).catch((err) => {
	  //console.log("This is the part that preventsa SIGTERM")
  })

I suppose the real issue is that I don't get the list of search indexes when I use

await embedding.listIndexes()

or accessing this property

await embedding.searchIndexes

The list returned is always empty.

Question:
How can I determine if a searchIndex already exists so I don't have to rely on creating it even if it exists?
What API should I use? If none are available, can 1 be added?

Thank you

@IslandRhythms
Copy link
Collaborator

IslandRhythms commented Apr 5, 2024

Please modify my script to demonstrate your issue. Also when using code blocks, use three ` to have it appear as a code block like so:
"```"
const name = 'Test'

"```"

@vkarpov15
Copy link
Collaborator

The issue in the original post is that you need to await on embedding.createSearchIndex. Since createSearchIndex is async, you can't handle errors with try/catch unless you use await.

In terms of listing existing search indexes, we're currently missing an implementation of listSearchIndexes(). We will add that.

@vkarpov15 vkarpov15 added new feature This change adds new functionality, like a new method or class and removed needs repro script Maybe a bug, but no repro script. The issue reporter should create a script that demos the issue labels Apr 10, 2024
@vkarpov15 vkarpov15 modified the milestones: 8.3.2, 8.4 Apr 10, 2024
@vkarpov15 vkarpov15 changed the title Model.createSearchIndex() create SIGTERM on DuplicateIndex Add Model.listSearchIndexes() Apr 10, 2024
vkarpov15 added a commit that referenced this issue Apr 22, 2024
@hlebegue
Copy link
Author

Fantastic ! Thank you for adding the Model.listSearchIndex() and respond to this issue. Best wishes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature This change adds new functionality, like a new method or class
Projects
None yet
Development

No branches or pull requests

3 participants