When scraping fairly large websites, we hit the token limit and receive the GGML_ASSERT error:
n_tokens_all <= cparams.n_batch
For smaller websites this isn't an issue.
We should think about decomposing the website into chunks if it exceeds a certain length threshold, summarising each chunk using the local language model, and then stitching these summaries together coherently with the model once more. A rough sketch of this is below.
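Here is a minimal sketch of that map-then-reduce approach, assuming a `summarise` callback that stands in for whatever completion call the scraper already makes to the local model. The chunk size, prompts, and function names are illustrative, and splitting is done by characters rather than tokens for simplicity:

```ts
const CHUNK_SIZE = 8_000; // characters per chunk; tune so chunks stay under the batch/token limit

function chunkText(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

async function summariseLargePage(
  pageText: string,
  summarise: (prompt: string) => Promise<string>
): Promise<string> {
  // Small pages can go straight to the model as before.
  if (pageText.length <= CHUNK_SIZE) return summarise(pageText);

  // Summarise each chunk independently so no single prompt exceeds the limit.
  const partials: string[] = [];
  for (const chunk of chunkText(pageText)) {
    partials.push(await summarise(`Summarise this part of a web page:\n\n${chunk}`));
  }

  // One final pass stitches the partial summaries into a coherent whole.
  return summarise(
    `Combine the following partial summaries into one coherent summary:\n\n${partials.join("\n\n")}`
  );
}
```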
Another thought I've had is to take screenshots with playwright instead and run some text recognition on them. Or perhaps even better, use a playwright method that extracts only the text content and leaves out the HTML entirely.
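On that second idea, Playwright can already return just the rendered text via `page.innerText()`, so no OCR would be needed. A minimal sketch (the wrapper function, URL handling, and navigation options are illustrative):

```ts
import { chromium } from "playwright";

// Extract only the visible text of a page, skipping the raw HTML entirely.
async function extractPageText(url: string): Promise<string> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: "domcontentloaded" });
    return await page.innerText("body");
  } finally {
    await browser.close();
  }
}
```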
The example https://news.ycombinator.com actually runs into this. I get a GGML_ASSERT: D:\a\node-llama-cpp\node-llama-cpp\llama\llama.cpp\llama.cpp:11163: n_tokens_all <= cparams.n_batch error.
Also getting this with GPT-4 Turbo on some web pages. It only seems to hit the context length limit with mode: "html", but I find that mode: "text" isn't as accurate.