
llm-api



A fast CPU-based API for OpenChat 3.6, hosted on Hugging Face Spaces. For faster inference, we use CTranslate2 as our inference engine.

Usage

Simply cURL the endpoint as follows.

curl -N 'https://winstxnhdw-llm-api.hf.space/api/v1/generate' \
     -H 'Content-Type: application/json' \
     -d \
     '{
         "instruction": "What is the capital of Japan?"
      }'
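If you prefer to call the API from code, the same request can be sketched in Python. This is a minimal sketch using only the standard library; the endpoint URL and the `instruction` field come from the cURL example above, while the `build_request` helper name is our own for illustration.

```python
import json
from urllib import request

API_URL = "https://winstxnhdw-llm-api.hf.space/api/v1/generate"


def build_request(instruction: str) -> request.Request:
    # Build the same POST request that the cURL example sends:
    # a JSON body with a single "instruction" field.
    payload = json.dumps({"instruction": instruction}).encode()
    return request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("What is the capital of Japan?")
# Send it with: request.urlopen(req), then read the response body.
```

Note that the cURL example passes `-N` to disable output buffering, which suggests the endpoint may stream its response; if so, read the response incrementally rather than all at once.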

Development

First, install the required dependencies for your editor with the following.

poetry install

Now, you can access the Swagger UI at localhost:7860/api/docs after spinning the server up locally with the following.

docker build -f Dockerfile.build -t llm-api .
docker run --rm -e APP_PORT=7860 -p 7860:7860 llm-api