This week on The Data Stack Show, Eric and Kostas chat with Kevin Liu, Software Engineer at Stripe. During the episode, Kevin discusses data infrastructure challenges and the development of data products, and shares insights on the importance of metadata management and the role of catalogs in maintaining data consistency across systems. The conversation also covers open-source projects like the Python Iceberg library, the future of databases in the cloud, the ease of use of internal tools, the integration of data for builders, the balance between simplicity and functionality in user interfaces, and more.
Highlights from this week’s conversation include:
Kevin’s Background and Work at Stripe (0:31)
Evolution of Data Infrastructure at Stripe (2:18)
Kevin’s Interest in Data (5:29)
Software Engineer or Data Engineer? (8:27)
Speech Recognition Work at Amazon (11:06)
Efficiency and Cost Management (15:50)
Metadata and Query Analysis (18:38)
Surprising Discoveries in Metadata Analysis (21:43)
Optimizing Cost and Value (23:55)
Product Sizing Stripe Data (26:39)
Popular Tool for Data Interaction (30:08)
Enabling Data Infrastructure Integration (35:22)
Value of Data Pipelining for Stripe (39:32)
Next Generation Product and Technology (43:54)
Maximizing Value in a Decentralized Environment (51:34)
Future of Open-Source Projects in Data Infrastructure (57:59)
Final Thoughts and Takeaways (59:02)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
182: Building a Dynamic Data Infrastructure at Enterprise Scale Featuring Kevin Liu of Stripe