
Add graph mode, which benefits from ipex optimize, to embedding models #359

Open · wants to merge 4 commits into base: main
Conversation

@devpramod commented:

Graph mode (TorchScript) is an additional step that can further accelerate a workload that has already been optimized with ipex and bfloat16 mixed precision. Graph mode utilizes AMX on SPR (Sapphire Rapids) for additional speedups.

This PR modifies the config file to expose graph mode as an option to end users. The graph is compiled in the constructor of the embedding module so that it is ready for execution when requests arrive.
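A minimal sketch of the approach described above, compiling the TorchScript graph in the embedding module's constructor. The class and parameter names here are illustrative assumptions, not caikit-nlp's actual API; the ipex import is guarded so the sketch still runs where intel_extension_for_pytorch is not installed:

```python
import torch


class GraphModeEmbeddingModule(torch.nn.Module):
    """Hypothetical embedding wrapper: optionally applies ipex.optimize,
    then traces and freezes the model at construction time so the compiled
    graph is ready before the first request arrives."""

    def __init__(self, model: torch.nn.Module, use_graph_mode: bool = True):
        super().__init__()
        model = model.eval()
        try:
            # Optional dependency; on Intel hardware this enables AMX-friendly kernels.
            import intel_extension_for_pytorch as ipex
            model = ipex.optimize(model, dtype=torch.bfloat16)
        except ImportError:
            pass  # fall back to the plain eager model
        if use_graph_mode:
            # Dummy token IDs used only to record the traced graph.
            example_input_ids = torch.randint(0, 1000, (1, 16))
            with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
                traced = torch.jit.trace(model, example_input_ids, strict=False)
                model = torch.jit.freeze(traced)
        self.model = model

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # Run under bf16 autocast to match how the graph was recorded.
        with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
            return self.model(input_ids)
```

Compiling in the constructor front-loads the one-time tracing cost to start-up, so request latency is not hit by first-call compilation.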

@gkumbhat (Collaborator) left a comment:


Thanks for the contribution @devpramod ...

A couple of small housekeeping comments for the PR:

  1. Can you rebase the PR with a sign-off? This allows GitHub to verify the commits so the DCO check passes.
  2. It seems the lint and formatting checks are failing. Can you please address those?

Contribution guidelines documenting some of these steps can be found here: https://github.com/caikit/caikit-nlp/blob/main/CONTRIBUTING.md

On the graph mode integration, a couple of questions and suggestions:

  1. When should graph mode be enabled? Can you describe a bit when it is applicable, and for which models?
  2. How much of a speedup is expected when graph mode is enabled? Can you share some results?

@devpramod (Author) commented:

Hi @gkumbhat

  1. Enabling graph mode is most applicable to models that need high performance and efficiency in production. A wide range of PyTorch operations are covered by TorchScript's graph mode. It is not recommended when the model code includes complex post-processing that relies on custom Python libraries.

  2. The speedup depends on the workload at hand. More information can be found here - https://pytorch.org/blog/optimizing-production-pytorch-performance-with-graph-transformations/
    On Intel hardware, using TorchScript along with ipex optimize takes advantage of AMX, an ISA extension with instructions that accelerate AI workloads.
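Since the speedup is workload-dependent, a simple way to answer question 2 for a given model is to benchmark eager mode against the traced graph directly. The model, tensor sizes, and iteration counts below are illustrative assumptions, not measurements from this PR:

```python
import time

import torch


def mean_latency(fn, x, iters=50, warmup=5):
    """Average per-call latency in seconds, after a short warm-up."""
    with torch.no_grad():
        for _ in range(warmup):
            fn(x)
        start = time.perf_counter()
        for _ in range(iters):
            fn(x)
    return (time.perf_counter() - start) / iters


# Toy stand-in for an embedding workload.
model = torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU()).eval()
x = torch.randn(32, 256)

# Graph mode: trace, then freeze to fold weights into the graph.
with torch.no_grad():
    traced = torch.jit.freeze(torch.jit.trace(model, x))

eager_s = mean_latency(model, x)
graph_s = mean_latency(traced, x)
print(f"eager: {eager_s * 1e3:.3f} ms, graph: {graph_s * 1e3:.3f} ms")
```

On AMX-capable hardware such as SPR, running the same comparison with ipex.optimize and bfloat16 applied would show the additional gain from the matrix extensions; numbers will vary by model and batch size.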

Working on addressing 1 & 2.

@devpramod (Author) commented:

Hi @gkumbhat
I have resolved the formatting, linting, and DCO check issues.
