Building an AI Training Data Pipeline with VAST Data | 07x02

Published: 21 November 2024
Channel: Gestalt IT

Model training puts serious stress on data infrastructure, but preparing that data for use is a much harder challenge. This episode of Utilizing Tech features Subramanian Kartik of VAST Data discussing the broad data pipeline with Jeniece Wnorowski of Solidigm and Stephen Foskett. The first step in building an AI model is collecting, organizing, tagging, and transforming data, yet this data is spread around the organization in databases, data lakes, and unstructured repositories. The challenge of building a data pipeline is familiar to most businesses, since a similar process is required in analytics, business intelligence, observability, and simulation, but generative AI applications have an insatiable appetite for data. These applications also demand extreme levels of storage performance, and only flash SSDs can meet this demand. A side benefit is improved power consumption and cooling compared with hard disk drives, which is especially true as massive SSDs come to market. Ultimately, the success of generative AI will drive greater collection and processing of data on the inferencing side, perhaps at the edge, and this will push AI data infrastructure further.

Hosts:
Stephen Foskett, Organizer of Tech Field Day:   / sfoskett
Jeniece Wnorowski, Datacenter Product Marketing Manager at Solidigm:   / jeniecewnorowski

Guest:
Subramanian Kartik, Ph.D., Global Systems Engineering Lead at VAST Data:   / subramanian-kartik-ph-d-1880835

Follow Utilizing Tech
Website: https://www.UtilizingTech.com/
X/Twitter:   / utilizingtech

Tech Field Day
Website: https://www.TechFieldDay.com
LinkedIn:   / tech-field-day
X/Twitter:   / techfieldday

#UtilizingTech #AIDataInfrastructure #Sponsored #AI @SFoskett @TechFieldDay @UtilizingTech @Solidigm