Baseten is a San Francisco-based machine learning infrastructure company focused on model inference. It offers an advanced machine learning operations (MLOps) platform for model deployment, model serving, and model fine-tuning, and customers come to Baseten to run large language models (LLMs) at scale reliably and cost-efficiently. With LLM performance as a top priority, Baseten teamed up with AWS and NVIDIA to deliver measurable throughput and latency improvements.