The local llama chat response can sometimes take minutes. If you want to tweak a request and resend it, you have to wait a long time before you can retry. Add a way to send an interrupt signal from the UI to cancel the request.
Mini-update. I looked into this today and found that gpt4all only supports short-circuiting the model response after tokens have already started emitting. So you can't stop it from 'thinking', so to speak, once it has been given a query. Given that, I'll update the UI so that you can cancel the query once tokens are being emitted, but not before then.
Hopefully the time-to-first-token issue will be less of a headache for folks using Mistral. That will become the default model (see commit 0f1ebca) in the next release.
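For anyone curious what "cancel once tokens are being emitted" could look like, here is a rough sketch, not the actual implementation. It assumes the model output is exposed as a plain Python iterator of tokens. `fake_token_stream` and `stream_until_cancelled` are hypothetical names made up for this illustration, and the fake stream stands in for the real model so the snippet runs without gpt4all installed.

```python
import threading
from typing import Iterable, Iterator


def fake_token_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in for a model's streaming output (not a gpt4all call)."""
    for token in tokens:
        yield token


def stream_until_cancelled(
    token_stream: Iterator[str], cancel_event: threading.Event
) -> Iterator[str]:
    """Relay tokens until cancel_event is set.

    The flag is only checked after a token has arrived, so a cancel
    requested during the initial 'thinking' phase takes effect only
    once the first token shows up.
    """
    for token in token_stream:
        if cancel_event.is_set():
            break  # stop relaying; remaining tokens are never requested
        yield token


# Simulate a UI that requests cancellation after two tokens have been displayed.
cancel = threading.Event()
received = []
stream = fake_token_stream(["Hello", ",", " world", "!"])
for token in stream_until_cancelled(stream, cancel):
    received.append(token)
    if len(received) == 2:
        cancel.set()  # e.g. the user clicked a "stop" button

print(received)
```

In a real UI the cancel flag would be set from a separate request handler or thread rather than from inside the loop. The point of the sketch is only that the check sits between tokens, which is why the wait before the first token can't be interrupted this way.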
See relevant discussion on Discord.