Krishna2709/insight-stream

Navigation through the repository files:

  • Creating the vector store using DeepLake: creating_arxiv_dataset_vector_store.ipynb (a build sketch follows this list)
  • Insight Stream main code: server/src/insight.py
  • FastAPI Endpoints: server/src/app.py
  • Docker File to deploy endpoints: server/Dockerfile
  • Streamlit web application: static/app.py
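
A minimal sketch of how the arXiv vector store could be built (not the notebook's exact code). It uses the legacy LlamaIndex API seen elsewhere in this README; the dataset path, sample paper data, and metadata fields are illustrative assumptions.

    # Build a DeepLake vector store from parsed arXiv records (illustrative sketch)
    from llama_index import Document, StorageContext, VectorStoreIndex
    from llama_index.vector_stores import DeepLakeVectorStore

    # Hypothetical input: (title, abstract) pairs parsed from the Kaggle arXiv dump
    papers = [("Attention Is All You Need", "The dominant sequence transduction models ...")]
    docs = [Document(text=abstract, metadata={"title": title}) for title, abstract in papers]

    # Create a writable DeepLake dataset and index the documents into it
    vector_store = DeepLakeVectorStore(dataset_path="hub://<username>/arxiv", overwrite=True)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    arxiv_index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)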

Project Overview (basic version)

  • Goal: An insight bot for YouTube videos that summarizes the video, generates questions for the speaker, and retrieves relevant information via RAG (retrieval-augmented generation).
  • Motivation: Whenever I attend webinars, I always have questions on specific topics, and sometimes I don't even know what questions I could ask. I have long thought of building a bot that can listen to webinars and generate questions along with relevant content.
  • Dataset: arXiv Dataset (https://www.kaggle.com/datasets/Cornell-University/arxiv/data)
  • Technology Stack:
      • Python
      • LlamaIndex (to build the RAG system)
      • DeepLake (vector database)
      • FastAPI (to create and expose the endpoints)
      • OpenAI (for LLMs)
      • Google Cloud Run (to deploy the FastAPI endpoints)
      • Streamlit (for the front end)
    
  • Working: In this basic version, whenever the user enters a YouTube link:
    1. The app sends an API call to the FastAPI server with the YouTube link as the payload (a server-side sketch appears after this list).
    2. The vector database (arXiv dataset) is loaded. An index is created from this vector store.
    3. Using the YoutubeTranscriptReader, the video is loaded, transcribed, and indexed.
    4. This index is instantiated as a query engine to communicate with the video transcript.
    5. This query engine is used to get the video summary and speaker questions:
        # Load and transcribe the YouTube video, then index the transcript
        loader = YoutubeTranscriptReader(is_remote=True)
        youtube_docs = loader.load_data(ytlinks=[youtube_url])
        youtube_index = VectorStoreIndex.from_documents(youtube_docs, service_context=self.service_context)

        # Query the transcript for a summary plus technical questions,
        # parsed into a structured output class
        youtube_engine = youtube_index.as_query_engine(output_cls=<Output Parser>)
        youtube_summarize = youtube_engine.query("Summarize the video and generate technical questions for the video to ask the speaker.")
    
    6. The vector database index is also instantiated as a query engine and queried to extract relevant papers based on the summary:
        # Open the arXiv DeepLake dataset in read-only mode
        db = DeepLakeVectorStore(
            dataset_path=<dataset_url>,
            read_only=True,
            overwrite=False
        )
        arxiv_index = VectorStoreIndex.from_vector_store(db, service_context=self.service_context)

        # RAG vector engine over the arXiv papers, returning the top 5 matches
        vector_engine = arxiv_index.as_query_engine(output_cls=<Output Parser>, similarity_top_k=5)

        query = f"Extract research papers relevant to the video summary: {youtube_summarize.summary}"
        papers = vector_engine.query(query)
    
    7. The summary, questions, and relevant papers are displayed to the user.
    8. A chat interface then appears where the user can interact with the RAG system.
    9. When the user asks a question, the app calls the query endpoint with the prompt as the payload and gets back the response along with papers relevant to it.
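
The server side of step 1 might look like the sketch below. The route name, payload schema, and stubbed handler body are assumptions for illustration; the real endpoints live in server/src/app.py.

    # Hypothetical analyzer endpoint; in the real app the handler calls the
    # RAG pipeline in server/src/insight.py (steps 2-6 above)
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class AnalyzeRequest(BaseModel):
        youtube_url: str  # assumed payload field name

    @app.post("/analyze")  # assumed route; check the deployed /docs page
    def analyze(req: AnalyzeRequest):
        # Stubbed response to keep the sketch self-contained and runnable
        return {
            "summary": f"Summary for {req.youtube_url}",
            "questions": [],
            "papers": [],
        }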

Application Architecture

[Diagram: Insight Stream Architecture]

Web Application Design

[Diagram: Web app design]

Steps to interact with the application and endpoints


Make sure to interact with the Analyzer endpoint before the Query endpoint.

  1. FastAPI Endpoints:
https://insight-stream-service-kvbcxmn5bq-uc.a.run.app/docs
  2. Webapp (https://github.com/Krishna2709/insight-stream-webapp):
https://insight-stream.streamlit.app
  3. (Optional) Using Postman for API interaction (⚠️ will update soon)
  4. Examples (see the client sketch below)
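
A client-side sketch of the interaction order (Analyzer first, then Query). The route names and payload fields here are assumptions; the live /docs page above lists the actual signatures.

    import requests

    BASE = "https://insight-stream-service-kvbcxmn5bq-uc.a.run.app"

    # 1. Analyzer endpoint first: it transcribes and indexes the video
    #    ("/analyze" and "youtube_url" are assumed names)
    analysis = requests.post(
        f"{BASE}/analyze",
        json={"youtube_url": "https://www.youtube.com/watch?v=<video_id>"},
    )
    print(analysis.json())  # summary, questions, relevant papers

    # 2. Query endpoint afterwards, for follow-up questions against the RAG
    answer = requests.post(f"{BASE}/query", json={"prompt": "What datasets does the speaker mention?"})
    print(answer.json())  # response plus papers relevant to it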

Neglected Improvements due to time constraints


⚠️ Will update the Question Generation prompt soon, since the current prompt is unclear: the model generates questions from the video transcription (like quizzes) instead of questions for the speaker.

  • Using gpt-3.5-turbo to reduce latency/response time
  • Prompt engineering for more effective responses
  • UI/UX optimization
  • Chat functionality with memory
  • Logging for evaluation
  • Improving retrieval, e.g., with a rerank wrapper (see the sketch after this list)
  • LLM variants
  • Asynchronous methods
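
One way the rerank item above could look, reusing arxiv_index from the snippet earlier. SentenceTransformerRerank is LlamaIndex's cross-encoder postprocessor; its import path varies across versions, so treat the exact import as an assumption.

    # Over-retrieve from the vector store, then rerank with a cross-encoder
    # and keep only the best hits before answering
    from llama_index.postprocessor import SentenceTransformerRerank

    rerank = SentenceTransformerRerank(
        model="cross-encoder/ms-marco-MiniLM-L-6-v2",  # a common cross-encoder choice
        top_n=5,
    )
    vector_engine = arxiv_index.as_query_engine(
        similarity_top_k=20,  # fetch extra candidates for the reranker
        node_postprocessors=[rerank],
    )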

Further Improvements


  • Chat ability using a Chat Engine, not just a Query Engine (see the sketch after this list)
  • Prompt engineering
  • RAG and chat evaluations; logging
  • Reducing latency
  • Live video transcription and RAG querying
  • Using Bark (Suno's open-source text-to-speech model) to ask questions
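
A minimal sketch of the Chat Engine idea, again reusing arxiv_index from the earlier snippet. The chat_mode value is one of LlamaIndex's built-in modes; exact mode names may differ across versions.

    # A chat engine keeps conversation history, unlike a one-shot query engine
    chat_engine = arxiv_index.as_chat_engine(chat_mode="condense_question")

    first = chat_engine.chat("Which papers relate to retrieval-augmented generation?")
    print(first)
    follow_up = chat_engine.chat("Summarize the first one.")  # uses chat history
    print(follow_up)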
