This is a custom-made documentation Chatbot based on the NERSC documentation.
This bot is not made to replace the documentation but rather to improve information discoverability. Our goals are to:
- answer questions on NERSC with up-to-date, sourced answers,
- run fully open-source technologies on-premise, giving us control, security, and privacy,
- serve this model to NERSC users in production with acceptable performance,
- make it fairly easy to switch the underlying models (embeddings and LLM) in order to follow a rapidly evolving technology.
- clone the repo,
- use the `environment.yml` file to install dependencies with `conda`,
- clone the NERSC doc repository into a folder.
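The steps above might look like the following session (the repository URLs, folder names, and environment name are placeholders, not the actual values):

```shell
# clone this repository (placeholder URL) and enter it
git clone <url-of-this-repository>
cd <repository-folder>

# install the dependencies listed in environment.yml with conda
conda env create -f environment.yml
conda activate <environment-name>

# clone the NERSC documentation repository into a folder (placeholder URL)
git clone <url-of-the-nersc-doc-repository> <doc-folder>
```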
These scripts are meant to be run locally, mainly by developers of the project:

- `chatbot.py`: a basic local question-answering loop,
- `chatbot_dev.py`: a more feature-rich version of the local loop, making it easy to run test questions and switch models around,
- `update_database.py`: updates the vector database (for a given LLM, sentence embedder, and vector database),
- `token_counter.py`: measures the size of questions and answers for a given tokenizer.
On NERSC supercomputers, you might want to run `module load python cudatoolkit cudnn pytorch` before using those commands.
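Put together, a local developer session on a NERSC login node might look like this (the environment name is a placeholder):

```shell
# load the modules mentioned above
module load python cudatoolkit cudnn pytorch
conda activate <environment-name>

# start the basic local question-answering loop
python chatbot.py
```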
These scripts are meant to be used with the Superfacility API:

- `api_client.py`: a demonstration client, calling the chatbot via the Superfacility API,
- `api_consumer.py`: a worker, answering questions asked to the Superfacility API in a loop.
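The worker pattern behind `api_consumer.py` can be sketched as below. This is a hypothetical stand-in: the in-memory queue and the `answer` function replace the real Superfacility API calls and the actual chatbot.

```python
# Hypothetical sketch of a question-answering worker loop.
# In the real api_consumer.py, the queue would be the Superfacility API
# and answer() would call the chatbot; both are stubbed here.
from queue import Queue, Empty

def answer(question: str) -> str:
    """Stand-in for the chatbot's answering function."""
    return f"Answer to: {question}"

def worker_loop(tasks: Queue, results: list, max_idle_polls: int = 3) -> None:
    """Poll the task queue, answer each question, stop after repeated idle polls."""
    idle = 0
    while idle < max_idle_polls:
        try:
            question = tasks.get(timeout=0.01)
        except Empty:
            idle += 1
            continue
        idle = 0
        results.append(answer(question))

tasks = Queue()
tasks.put("How do I log in to Perlmutter?")
results: list = []
worker_loop(tasks, results)
print(results)  # → ['Answer to: How do I log in to Perlmutter?']
```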
In no particular order:

- move to a container,
- get flash attention back to working,
- refresh the prompt (and move information chunks elsewhere?),
- refuse `svg` and `out` files from the doc (which file types are in the doc?),
- document the inner workings,
- establish a canonical list of test questions / conversations,
- move this code to the NERSC GitHub,
- try fine-tuning the sentence embedder,
- add a PageRank type of score to documentation items? (to be integrated with the vector search)
- try a home-trained model.
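One of the ideas above, weighting vector-search results with a PageRank-style score, could be sketched as follows. This is only an illustration: the chunk format, the `alpha` mixing weight, and the plain cosine similarity are assumptions, not the project's actual retrieval code.

```python
# Hypothetical: blend cosine similarity with a precomputed
# PageRank-style importance score for each documentation chunk.
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec: list, chunks: list, alpha: float = 0.8) -> list:
    """Rank chunks (embedding, pagerank_score, text) by a mix of
    similarity to the query and document importance."""
    scored = [
        (alpha * cosine(query_vec, emb) + (1 - alpha) * pr, text)
        for emb, pr, text in chunks
    ]
    return [text for _, text in sorted(scored, reverse=True)]

chunks = [([1.0, 0.0], 0.1, "a"), ([0.0, 1.0], 0.9, "b")]
print(rank([1.0, 0.0], chunks))  # → ['a', 'b']
```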
- Nestor Demeure: leading the effort and writing the glue code,
- Ermal Rrapaj: fine-tuning and testing home-made models,
- Gabor Torok: writing the Superfacility API integration and web front-end,
- Andrew Naylor: scaling the service to production throughputs.