
Can this work offline? #3

Open
linonetwo opened this issue May 28, 2023 · 4 comments

@linonetwo

Do you have any suggestions for replacing the OpenAI API with a local npm package?

@poweroutlet2

Check these two repos out:

https://github.com/xenova/transformers.js
https://github.com/do-me/SemanticFinder

I've tried to use transformers.js in a Chrome extension service worker just for embeddings, to no avail 😞. Let me know if you figure it out.

@iskandarreza
Copy link

> Check these two repos out:
>
> https://github.com/xenova/transformers.js
> https://github.com/do-me/SemanticFinder
>
> I've tried to use transformers.js in a Chrome extension service worker just for embeddings, to no avail 😞. Let me know if you figure it out.

Oh wow, I didn't know about Xenova. I've heard of ONNX, though, and transformers.js uses ONNX under the hood. From the description it doesn't look like it does embeddings, but apparently it can do summarization, and that is something I definitely want to test out.

@poweroutlet2

From the SemanticFinder repo, it looks like you can do embeddings. They just named the pipeline "feature-extraction":
https://github.com/xenova/transformers.js/blob/main/src/pipelines.js#L1384

But maybe I'm wrong... That would explain why I haven't gotten it to work.

@nico-martin

nico-martin commented Apr 24, 2024

I found a completely offline solution using transformersJS:

import {
  pipeline,
  FeatureExtractionPipeline,
  Tensor,
} from "@xenova/transformers";

const localEmbedTexts = async (texts: string[]): Promise<number[][]> => {
  // Downloads the model on first use; cached for subsequent calls.
  const extractor: FeatureExtractionPipeline = await pipeline(
    "feature-extraction",
    "Xenova/all-MiniLM-L6-v2",
  );

  // Mean-pool token embeddings into a single vector per text.
  const getEmbedding = (text: string): Promise<Tensor> =>
    extractor(text, { pooling: "mean" });

  const embeddings = await Promise.all(texts.map(getEmbedding));
  // Tensor.data is a Float32Array; convert it to a plain number[]
  // so the return type actually matches number[][].
  return embeddings.map((e) => Array.from(e.data as Float32Array));
};

const vectorStore = new VectorStorage({
  embedTextsFn: localEmbedTexts,
});
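Once the texts come back as plain number[] vectors, offline similarity search reduces to comparing them with cosine similarity. The store handles this internally, but as a sketch of what happens under the hood (cosineSimilarity here is a hypothetical standalone helper, not part of the library's API):

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns a value in [-1, 1]; 1 means the vectors point the same way.
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};
```

Ranking query results is then just sorting stored vectors by their similarity to the query embedding.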
