Add graph mode to embedding models which benefits from ipex optimize #359
devpramod wants to merge 4 commits into
Thanks for the contribution @devpramod ...
A couple of small housekeeping comments for the PR:
- Can you rebase the PR with a sign-off? This allows GitHub to verify the commits so that the DCO check passes.
- It seems the lint and formatting checks are failing. Can you please address those?
Contribution guidelines documenting some of these steps can be found here: https://github.com/caikit/caikit-nlp/blob/main/CONTRIBUTING.md
On the graphmode integration, a couple of questions and suggestions:
- When should graph mode be enabled? Can you describe when it is applicable, for which models, etc.?
- How much of a speedup is expected with graphmode enabled? Can you share some results?
Hi @gkumbhat, working on addressing 1 & 2.
Signed-off-by: devpramod <pramod.pai@intel.com>
Hi @gkumbhat
Graph mode (TorchScript) is an additional step that can further accelerate a workload that has already been optimized with IPEX and bfloat16 mixed precision. Graph mode utilizes AMX (Intel Advanced Matrix Extensions) on SPR (Sapphire Rapids) CPUs for additional speedups.
This PR modifies the config file to expose graphmode as an option to end users. The graph is compiled in the constructor of the embedding module so that it is ready for execution when requests arrive.
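As a rough illustration of the "compile in the constructor" idea, here is a minimal sketch. It is not the PR's actual code: `EmbeddingConfig`, `EmbeddingModule`, `torchscript_compile`, and the `compile_fn` parameter are hypothetical names introduced for this example, and the real config keys in caikit-nlp may differ. The compile step uses only `torch.jit.trace` and `torch.jit.freeze`; the IPEX optimization and bfloat16 autocast steps that would normally precede it are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class EmbeddingConfig:
    # Hypothetical config fields for illustration; not the PR's real keys.
    ipex: bool = False
    bf16: bool = False
    graphmode: bool = False


def torchscript_compile(model: Any, example_inputs: Tuple[Any, ...]) -> Any:
    """Trace and freeze a model with TorchScript (requires PyTorch)."""
    import torch  # lazy import so the surrounding logic works without PyTorch

    model.eval()
    with torch.no_grad():
        # strict=False tolerates dict/list outputs, common in transformer models.
        traced = torch.jit.trace(model, example_inputs, strict=False)
        frozen = torch.jit.freeze(traced)
        # Warm-up calls so JIT optimization passes run before real requests.
        for _ in range(2):
            frozen(*example_inputs)
    return frozen


class EmbeddingModule:
    """Sketch of an embedding module that compiles its graph up front."""

    def __init__(
        self,
        model: Any,
        config: EmbeddingConfig,
        example_inputs: Tuple[Any, ...],
        compile_fn: Callable[[Any, Tuple[Any, ...]], Any] = torchscript_compile,
    ) -> None:
        self.config = config
        if config.graphmode:
            # Compile once in the constructor so no request pays the tracing cost.
            self.model = compile_fn(model, example_inputs)
        else:
            self.model = model

    def embed(self, *inputs: Any) -> Any:
        return self.model(*inputs)
```

Making `compile_fn` injectable is a design choice for this sketch only: it lets the constructor logic be exercised with a stub, without PyTorch installed. With graphmode left at its default of False, the module simply calls the eager model.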