When scraping fairly large websites, we hit the token limit and receive the GGML_ASSERT error:
n_tokens_all <= cparams.n_batch
For smaller websites this isn't an issue.
We should think about decomposing the website into chunks if it exceeds a certain length threshold, summarising each chunk using the local language model, and then stitching these summaries together coherently with the model once more. A rough sketch of this is below.
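Here is a minimal sketch of that map-then-reduce approach, assuming a `summarise` callback that stands in for whatever completion call the scraper already makes to the local model. The chunk size, prompts, and function names are illustrative, and splitting is done by characters rather than tokens for simplicity:

```ts
const CHUNK_SIZE = 8_000; // characters per chunk; tune so chunks stay under the batch/token limit

function chunkText(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

async function summariseLargePage(
  pageText: string,
  summarise: (prompt: string) => Promise<string>
): Promise<string> {
  // Small pages can go straight to the model as before.
  if (pageText.length <= CHUNK_SIZE) return summarise(pageText);

  // Summarise each chunk independently so no single prompt exceeds the limit.
  const partials: string[] = [];
  for (const chunk of chunkText(pageText)) {
    partials.push(await summarise(`Summarise this part of a web page:\n\n${chunk}`));
  }

  // One final pass stitches the partial summaries into a coherent whole.
  return summarise(
    `Combine the following partial summaries into one coherent summary:\n\n${partials.join("\n\n")}`
  );
}
```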
Another thought I've had is to take screenshots with playwright instead and run some text recognition on them. Or perhaps even better, use a playwright method that extracts only the text content and leaves out the HTML entirely.
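On that second idea, Playwright can already return just the rendered text via `page.innerText()`, so no OCR would be needed. A minimal sketch (the wrapper function, URL handling, and navigation options are illustrative):

```ts
import { chromium } from "playwright";

// Extract only the visible text of a page, skipping the raw HTML entirely.
async function extractPageText(url: string): Promise<string> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: "domcontentloaded" });
    return await page.innerText("body");
  } finally {
    await browser.close();
  }
}
```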
The example https://news.ycombinator.com actually runs into this. I get a GGML_ASSERT: D:\a\node-llama-cpp\node-llama-cpp\llama\llama.cpp\llama.cpp:11163: n_tokens_all <= cparams.n_batch error.
Also getting this with GPT-4 Turbo on some web pages. It only seems to hit the context length limit with mode: "html", but I find that mode: "text" isn't as accurate.