This is a custom-made documentation Chatbot based on the NERSC documentation.
This bot is not made to replace the documentation but rather to improve information discoverability. Our goals are to:
- answer questions on NERSC with up-to-date, sourced answers,
- run fully open-source technologies on-premise, giving us control, security, and privacy,
- serve this model to NERSC users in production with acceptable performance,
- make it fairly easy to switch the underlying models (embeddings and LLM) in order to follow a rapidly evolving technology.
- clone the repo,
- use the `environment.yml` file to install dependencies with `conda`,
- clone the NERSC doc repository into a folder.
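The steps above might look like the following session (the repository URLs, folder names, and environment name are placeholders, not the actual values):

```shell
# clone this repository (placeholder URL) and enter it
git clone <url-of-this-repository>
cd <repository-folder>

# install the dependencies listed in environment.yml with conda
conda env create -f environment.yml
conda activate <environment-name>

# clone the NERSC documentation repository into a folder (placeholder URL)
git clone <url-of-the-nersc-doc-repository> <doc-folder>
```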
These scripts are meant to be run locally, mainly by developers of the project:

- `chatbot.py`: a basic local question-answering loop,
- `chatbot_dev.py`: a more feature-rich version of the local loop, making it easy to run test questions and switch models around,
- `update_database.py`: updates the vector database (for a given LLM, sentence embedder, and vector database),
- `token_counter.py`: measures the size of questions and answers for a given tokenizer.
On NERSC supercomputers, you might want to run `module load python cudatoolkit cudnn pytorch` before using those commands.
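Put together, a local developer session on a NERSC login node might look like this (the environment name is a placeholder):

```shell
# load the modules mentioned above
module load python cudatoolkit cudnn pytorch
conda activate <environment-name>

# start the basic local question-answering loop
python chatbot.py
```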
These scripts are meant to be used with the Superfacility API:

- `api_client.py`: a demonstration client, calling the chatbot via the Superfacility API,
- `api_consumer.py`: a worker, answering questions asked to the Superfacility API in a loop.
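The worker pattern behind `api_consumer.py` can be sketched as below. This is a hypothetical stand-in: the in-memory queue and the `answer` function replace the real Superfacility API calls and the actual chatbot.

```python
# Hypothetical sketch of a question-answering worker loop.
# In the real api_consumer.py, the queue would be the Superfacility API
# and answer() would call the chatbot; both are stubbed here.
from queue import Queue, Empty

def answer(question: str) -> str:
    """Stand-in for the chatbot's answering function."""
    return f"Answer to: {question}"

def worker_loop(tasks: Queue, results: list, max_idle_polls: int = 3) -> None:
    """Poll the task queue, answer each question, stop after repeated idle polls."""
    idle = 0
    while idle < max_idle_polls:
        try:
            question = tasks.get(timeout=0.01)
        except Empty:
            idle += 1
            continue
        idle = 0
        results.append(answer(question))

tasks = Queue()
tasks.put("How do I log in to Perlmutter?")
results: list = []
worker_loop(tasks, results)
print(results)  # → ['Answer to: How do I log in to Perlmutter?']
```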
In no particular order:

- move to a container,
- get flash attention back to working,
- refresh the prompt (and move information chunks elsewhere?),
- refuse `svg` and `out` files from the doc (which file types are in the doc?),
- document the inner workings,
- establish a canonical list of test questions / conversations,
- move this code to the NERSC GitHub,
- try fine-tuning the sentence embedder,
- add a PageRank type of score to documentation items? (to be integrated with the vector search)
- try a home-trained model.
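One of the ideas above, weighting vector-search results with a PageRank-style score, could be sketched as follows. This is only an illustration: the chunk format, the `alpha` mixing weight, and the plain cosine similarity are assumptions, not the project's actual retrieval code.

```python
# Hypothetical: blend cosine similarity with a precomputed
# PageRank-style importance score for each documentation chunk.
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec: list, chunks: list, alpha: float = 0.8) -> list:
    """Rank chunks (embedding, pagerank_score, text) by a mix of
    similarity to the query and document importance."""
    scored = [
        (alpha * cosine(query_vec, emb) + (1 - alpha) * pr, text)
        for emb, pr, text in chunks
    ]
    return [text for _, text in sorted(scored, reverse=True)]

chunks = [([1.0, 0.0], 0.1, "a"), ([0.0, 1.0], 0.9, "b")]
print(rank([1.0, 0.0], chunks))  # → ['a', 'b']
```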
- Nestor Demeure: leading the effort and writing the glue code,
- Ermal Rrapaj: fine-tuning and testing home-made models,
- Gabor Torok: writing the Superfacility API integration and web front-end,
- Andrew Naylor: scaling the service to production throughputs.