If you ask a question with very long content, the server never replies. The server logs show that the request failed, but nothing is displayed to the user.
ValueError: Requested tokens (4237) exceed context window of 2048
INFO: 10.88.0.1:37666 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request
Exception: Requested tokens (4237) exceed context window of 2048
Traceback (most recent call last):
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
response = await original_route_handler(request)
File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/server/app.py", line 462, in create_chat_completion
] = await run_in_threadpool(llama.create_chat_completion, **kwargs)
File "/usr/local/lib/python3.9/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/llama.py", line 1657, in create_chat_completion
return handler(
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/llama_chat_format.py", line 599, in chat_completion_handler
completion_or_chunks = llama.create_completion(
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/llama.py", line 1493, in create_completion
completion: Completion = next(completion_or_chunks) # type: ignore
File "/usr/local/lib64/python3.9/site-packages/llama_cpp/llama.py", line 972, in _create_completion
raise ValueError(
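One way to avoid hitting this `ValueError` is to trim the conversation history client-side before sending the request, so the estimated prompt size stays under the model's context window. Below is a minimal sketch; `estimate_tokens` and `trim_messages` are hypothetical helpers (not part of llama-cpp-python), and the 4-characters-per-token heuristic is a rough assumption — a real client should count tokens with the model's actual tokenizer.

```python
# Sketch: drop the oldest messages until the estimated prompt token count
# fits within the context window, leaving room for the model's reply.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. Replace with the model's
    # tokenizer for an exact count.
    return max(1, len(text) // 4)

def trim_messages(messages, context_window=2048, reserve_for_reply=512):
    budget = context_window - reserve_for_reply
    kept, total = [], 0
    # Walk from the newest message backwards, keeping as many as fit.
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))
```

With a 2048-token window this would silently drop an 8000-character message while keeping the newest short one, instead of letting the server reject the whole request with a 400.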