Lightning Talk: Exploring PiPPY, Tensor Parallel and Torchserve for Large... - Hamid Shojanazeri

Published: 18 October 2024
Channel: PyTorch

Lightning Talk: Exploring PiPPY, Tensor Parallel and Torchserve for Large Model Inference - Hamid Shojanazeri, Meta

This talk covers large model inference with Torchserve using PiPPy and Tensor Parallel, the challenges of distributed inference, and the solutions available for them. It also discusses the features that Torchserve provides today for serving LLMs in production.