The Data Stack Show Newsletter Edition 012

Positive applications for generative AI, a 101 on time-series databases, a look at analytics and BI, and more.

Apr 05, 2023

👋 Hi everyone, a bit of news this month – we’re on Substack! One of the hard things about podcasting is that it can feel like speaking into a void. It’s not easy to get feedback or build relationships with your listeners. We hope Substack helps us get to know more of you, so please join us and comment on our stuff. We’d love to hear from you.

As far as last month's shows go, we kicked off with a timely episode. Disruptive technologies always create some level of fear. With ChatGPT, the initial doom-and-gloom chatter was about how generative AI is going to eliminate all of our jobs. Since then things have escalated, and while concerns around the technology are real, it’s helpful to remember how LLMs can drive positive progress. Gretel’s Alex Watson does just that. Check out the episode to learn about some use cases you may not be aware of, and stay tuned until the end, where Alex gives his perspective on managing the ethics of the technology.

🌯 The March Wrap

Here’s what you missed (if you missed it) on The Data Stack Show over the last month:

The Possibilities Are Endless for Synthetic Data with Alex Watson (twitter) of Gretel.ai

Why you should listen – To learn all about synthetic data and to find out how training generative AI models on data instead of language can deliver an alternative approach to data privacy.

🎧 Listen / Tweet

Databases, Data Warehouses, and Timeseries Data with David Kohn (twitter) of Timescale

Why you should listen – To find out why we need time-series databases and hear David detail the differences between time-series data and data warehouse data.

🎧 Listen / Tweet

From Business Intelligence to Product Analytics and Beyond with Vijay Ganesan (twitter) of NetSpring.io

Why you should listen – Because Vijay articulates where the lines are drawn in the analytics landscape better than anyone. You’ll come away with a clear understanding of BI vs. product analytics, why BI isn’t enough, and what’s next for analytics.

🎧 Listen / Tweet

How Data Teams Interact With Marketing Tools with Jason Davis (twitter) of Simon Data

Why you should listen – For a discussion on how data teams and marketing teams can productively collaborate, and to find out how a marketing CDP can enable marketers to become more data-driven while saving the data team the legwork.

🎧 Listen / Tweet

Data Quality and Data Contracts with Chad Sanderson of Data Quality Camp

Why you should listen – To get Chad’s expert advice on the value of data contracts, dealing with the semantic and logical layers of data, implicit contracts at companies, and how contracts fit into data infrastructure.

🎧 Listen / Tweet

🤗 Data Council Austin

Last week we recorded some in-person magic in breakout room 109 at Data Council Austin. Find out who we talked to in our Twitter thread, and make sure to subscribe to the show if you haven’t already. We’re sprinting to edit these SEVEN special episodes to publish later this month – you won’t want to miss them.

🎥 The April Preview

Get ready to hear from these brilliant minds this month:

Sammy Sidhu – Co-Founder, CEO of Eventual
H.O. Maycotte – Entrepreneur, Founder, and Investor at FeatureBase
Andy Pavlo and Dana Van Aken – Co-Founder & CTO at OtterTune
Data Council Week! – Seven(!) shows served up for you in one week (see above 🙂)

🔗 Saved to Pocket

Further reading from last month’s shows plus curated links from Eric and Kostas:

Teaching Large Language Models to Zip Their Lips – With the call to jam the brakes on AI, you’ll find this article from the team at Gretel particularly enlightening. Find out what they’re doing to preserve privacy and reduce bias in LLMs.
What Is a Time-Series Database and Why Do You Need One? – Dive deeper into time-series databases in this explainer from Timescale.
The Convergence of BI & Product Analytics – See a breakdown of Product Analytics vs. BI architectures, and find out how NetSpring is bringing the two together.
ChatGPT Will Not Replace Data Engineers – Read Chad’s latest post on LinkedIn to find out why he’s not worried about ChatGPT.
A Pipeline Stack for Deeper Analysis of Garmin data – Eric nerded out pretty hard to build a report that shows him how many hours, not miles, he’s put on various mountain bike components. Check out the write-up to find out how he used RudderStack, BigQuery, and Mixpanel to build the missing report from his Garmin data.
MLOps is Mostly Data Engineering – Read Kostas’ latest post on how Data Engineering is the foundation of ML in production.

🗓 Upcoming Events

Join once and future guests of The Data Stack Show at these upcoming events:

4/19 | DRE Con | Data Reliability Engineering Conference
4/27 | RudderStack | Pipelines & Pints – NYC

🙏 Gratitude

“We just got back from Data Council Austin. It was great to meet so many of our guests in person. I got to have some great conversations, and we got to record a few of them. We also had a chance for the first time to meet a few of our listeners at our meet & greet last Wednesday, which was amazing. Thank you all for supporting the show, and thanks to everyone who worked hard to make Data Council Austin happen.” –Kostas

Thanks for reading! If it was worth your time, please share the newsletter with your friends, and subscribe if you haven’t yet. Oh, and we’d love to hear from you. Reply to this email if you have any feedback for us or just want to connect. ✌ See ya next month.

- Brooks & The Data Stack Show Team
