No noticeable speedup when using Turbo for named entity recognition #245

Open
Hap-Zhang opened this issue Aug 11, 2021 · 9 comments

Comments

@Hap-Zhang

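Hi feifeibear,

Following the example you provided (bert_for_sequence_classification_example.py), I wrote a test program for named entity recognition. After running it, I found that the speedup from Turbo is not significant. Are there any caveats I should be aware of?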
image

@feifeibear
Collaborator

Maybe the input is too short? Do you get any speedup with onnxrt?

@Hap-Zhang
Author

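Roughly how many Chinese characters does the input need before the effect shows up? I can test again.

onnxrt does give some speedup, although I don't know why the Torch time is longer in the onnxrt run...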
image

@Hap-Zhang
Author

I have dumped the timeline. Could you help take a look at which part looks most suspicious?

image

@feifeibear
Collaborator

It looks fairly normal. Run the test several more times so that warmup overhead is excluded, and try setting the number of OMP threads.
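A minimal sketch of the measurement advice above: discard a few warmup runs, time several repeats, and pin the OpenMP thread count before the numeric libraries are loaded. The helper `benchmark`, the stand-in `dummy_inference`, and the thread count of 4 are all hypothetical and not part of TurboTransformers; in a real test, `dummy_inference` would be replaced by a call into the Torch or Turbo model.

```python
import os
import statistics
import time

# Assumption: OpenMP-based libraries read OMP_NUM_THREADS when they are loaded,
# so it is set before importing them. The value 4 is only an illustration.
os.environ.setdefault("OMP_NUM_THREADS", "4")


def benchmark(fn, warmup=5, repeats=20):
    """Return (mean, stdev) latency of fn() in seconds, excluding warmup runs."""
    for _ in range(warmup):
        fn()  # warmup runs are executed but not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_inference():
    # Hypothetical stand-in for e.g. torch_model(input_ids) or turbo_model(input_ids).
    return sum(i * i for i in range(10_000))


mean_s, stdev_s = benchmark(dummy_inference)
print(f"mean {mean_s * 1e3:.3f} ms, stdev {stdev_s * 1e3:.3f} ms")
```

Comparing both models with the same helper, the same inputs, and the same thread setting keeps the comparison fair; reporting the standard deviation helps tell a real speedup from run-to-run noise.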

@Hap-Zhang
Author

OK. Does the number of OMP threads default to the number of CPUs on the machine?

@feifeibear
Collaborator

@Hap-Zhang
Author

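OK, thank you very much!!
One last question: what did turbo mainly change to get its speedup in a CPU environment? Is there a related paper? I only found one about GPUs: TurboTransformers: An Efficient GPU Serving System For Transformer Models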

@feifeibear
Collaborator

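The PyTorch code was rewritten in C++ with operator fusion added; matrix multiplication uses MKL, and the other operations are parallelized with OMP.
There is no related paper.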
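To illustrate what operator fusion means in general, here is a small pure-Python sketch, assuming a bias-add followed by GELU as the pair of operators. It is not TurboTransformers' actual C++ kernel, and the function names are hypothetical. The unfused version makes two passes and materializes an intermediate buffer; the fused version computes the same result in one pass.

```python
import math


def gelu(x):
    # Exact GELU using the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def add_bias_then_gelu_unfused(xs, bias):
    # Pass 1: write an intermediate buffer holding x + b.
    tmp = [x + b for x, b in zip(xs, bias)]
    # Pass 2: read the intermediate buffer back and apply the activation.
    return [gelu(t) for t in tmp]


def add_bias_gelu_fused(xs, bias):
    # Single pass: no intermediate buffer is written or re-read.
    return [gelu(x + b) for x, b in zip(xs, bias)]


xs = [-1.0, 0.0, 2.0]
bias = [0.5, 0.5, 0.5]
assert add_bias_then_gelu_unfused(xs, bias) == add_bias_gelu_fused(xs, bias)
```

In a compiled kernel the benefit comes from fewer memory round trips and fewer kernel or function launches, which matters most for small, memory-bound element-wise operations.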

@suxue

suxue commented Sep 4, 2021

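Please note in the README that the current version of turbo is actually just calling onnxruntime. Building the binary is quite a hassle, and it is not even used in the code, so one might as well call onnxruntime directly.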
