LLM Inference in TEE
LLM (Large Language Model) inference in a TEE can protect the model, the input prompt, and the output. The key challenges are:
- whether LLM inference can run in a TEE at all;
- the performance of LLM inference in a TEE (on CPU).
With the significant LLM inference speed-up brought by BigDL-LLM, combined with the Occlum LibOS, high-performance and efficient LLM inference in a TEE is now achievable.
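As a concrete illustration, the sketch below shows how a model can be loaded with BigDL-LLM's low-bit (INT4) optimization, which is what makes CPU inference, and therefore in-TEE inference, practical. This is a minimal sketch, not the demo's exact code; the checkpoint path and prompt are placeholders.

```python
# Minimal sketch: load a model with BigDL-LLM's INT4 optimization and
# run a short generation on CPU. The model path below is a placeholder.
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/your/model"  # placeholder checkpoint path

# load_in_4bit=True applies BigDL-LLM's low-bit optimization, which is
# what makes CPU-based (and thus in-TEE) inference practical.
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_4bit=True, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "What is a Trusted Execution Environment?"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Inside Occlum, this script runs unchanged: the LibOS presents a Linux-compatible environment to the Python process while the enclave protects the model weights and prompts in memory.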
Overview
[Figure: overview of LLM inference in TEE]
The chart above shows the overall architecture and the inference flow.
For step 3, users can adopt the Occlum init-ra AECS solution, which requires no changes to the application itself.
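To see why this is non-invasive, consider the application's point of view: by the time it starts inside Occlum, the init process has already completed remote attestation against the AECS server and provisioned any secrets (for example, a model decryption key) into the enclave's filesystem. The sketch below is purely illustrative; the key path is a hypothetical placeholder, not a path defined by the demo.

```python
# Hypothetical sketch: the application contains no attestation logic.
# The Occlum init-ra process has already attested the enclave via AECS
# and written the secret into the in-enclave filesystem before the app runs.
KEY_PATH = "/etc/model_key"  # assumed path where init-ra places the secret

def load_provisioned_key(path: str = KEY_PATH) -> bytes:
    # The app simply reads the secret like an ordinary file.
    with open(path, "rb") as f:
        return f.read()

if __name__ == "__main__":
    key = load_provisioned_key()
    print(f"Loaded {len(key)}-byte key provisioned by init-ra")
```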
For more details, please refer to the LLM demo.