See the discussion in #100 (reply in thread): the official example + ggml-vic7b-q5_1.bin produces garbled output, while running the same model directly with llama.cpp does not.
Using:

```javascript
// Escape non-ASCII characters as \uXXXX so garbled output can be inspected.
function toUnicode(string_) {
  return string_
    .split('')
    .map((value) => {
      const code = value.charCodeAt(0).toString(16).toUpperCase();
      if (code.length > 2) {
        // Pad to 4 hex digits so unicodeToChar's \uXXXX regex can round-trip
        // 3-digit code points as well (the unpadded version could not).
        return '\\u' + code.padStart(4, '0');
      }
      return value;
    })
    .join('');
}

// Inverse: turn \uXXXX escapes back into characters.
function unicodeToChar(text) {
  return text.replace(/\\u[\dA-F]{4}/gi, (match) =>
    String.fromCharCode(parseInt(match.replace(/\\u/g, ''), 16))
  );
}
```
Looking at the output, the garbled characters are all `\uFFFD\uFFFD\uFFFD`, which renders as `�` (see https://www.fileformat.info/info/unicode/char/fffd/index.htm).
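For context, U+FFFD is the Unicode REPLACEMENT CHARACTER: a lenient UTF-8 decoder substitutes it for any byte sequence it cannot decode, and at that point the original bytes are unrecoverable. A minimal Node.js sketch (nothing beyond standard globals; the sample string is hypothetical):

```javascript
// Sample of garbled model output (hypothetical example).
const garbled = '非常\uFFFD\uFFFD\uFFFD活';

// Counting U+FFFD occurrences is a cheap way to detect mojibake in output.
const badCount = (garbled.match(/\uFFFD/g) || []).length;
console.log(badCount); // 3

// Re-encoding '�' gives its own UTF-8 bytes EF BF BD, not the original
// bytes of the source character -- the information is already lost here.
const bytes = Array.from(new TextEncoder().encode('\uFFFD')).map(
  (b) => '0x' + b.toString(16).toUpperCase()
);
console.log(bytes.join(' ')); // 0xEF 0xBF 0xBD
```

So the corruption must happen earlier, at the point where the binding decodes the model's token bytes, not in the display layer.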
```
% node "/Users/linonetwo/Desktop/repo/TiddlyGit-Desktop/scripts/tryllm.mjs"
llama.cpp: loading model from /Users/linonetwo/Documents/languageModel/ggml-vic7b-q5_1.bin
llama_model_load_internal: format = ggjt v2 (pre #1508)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.07 MB
llama_model_load_internal: mem required = 6612.59 MB (+ 2052.00 MB per state)
llama_init_from_file: kv self size = 1024.00 MB
[Wed, 19 Jul 2023 05:53:18 +0000 - INFO - llama_node_cpp::context] - AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
[Wed, 19 Jul 2023 05:53:18 +0000 - INFO - llama_node_cpp::llama] - tokenized_stop_prompt: None
Tiddlywiki是一种非常���活的单页网站,���可以用于记录任何类型的信息,包���文本、���加文件、URL等等。���的特点是:
1. 非常���活:Tiddlywiki可以根据自���的需求进行定制,可以添加任何类型的元素。
2. ���于使用:Tiddlywiki的界面非常���单,只需要点击一下���可开始编���。
3. 可���展性���:Tiddlywiki可以通过���件���展其功能,可以���展到任何需要。
4. 可���性���:Tiddlywiki使用文本文件存���数据,可以在任何时间和任何地方���问。
���的来说,Tiddlywiki是一种非常实用的工���,可以用于���种场景,包���个人用户、学生、工作人员等等。
<end>
```

(The model output above is left verbatim, including the `�` characters, since it is the evidence of the bug.)
Using https://belladoreai.github.io/llama-tokenizer-js/example-demo/build/ you can see that in `Tiddlywiki是一种非常灵活的单页网站`, the character `灵` should be tokenized into the byte tokens `<0xE7><0x81><0xB5>`, but these may not be handled correctly on the way back out.