Second Brain: Crafted, Curated, Connected, Compounded on October 2
Data Engineering and AI Trends

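This article explores the trend of combining data engineering with AI, with a focus on the rise of GenBI (Generative Business Intelligence). It argues that although AI can handle technical tasks, human judgment and contextual understanding remain crucial for steering and quality assurance in the GenBI workflow. It also highlights quality data, speed, and BI-as-Code as key components, along with the transformation of the developer experience. Finally, it covers the definition of data roles, the popularity of Bluesky, uses of DuckDB, the evolution of the data stack, and the importance of data modeling.

📊 GenBI (Generative Business Intelligence) combines natural language with business intelligence workflows to enable true self-serve BI, turning linear manual development into iterative human-AI collaboration.

🔍 Quality data, speed (OLAP systems such as DuckDB/MotherDuck), and BI-as-Code (declarative YAML/SQL) are the foundation for implementing GenBI effectively.

🗣️ LLMs (large language models) work best with declarative interfaces, using YAML for configuration and SQL for business logic, which gives humans and AI a common language for business context.

⏱️ GenBI reduces time-to-insight from days to minutes, letting both technical developers and business users generate dashboards in seconds while keeping the benefits of version control and automation.

🗨️ Much as ChatGPT did for question answering, GenBI addresses the self-serve BI challenge through semantic business layers and MCP integrations that provide contextual understanding.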

Here is a list of my online appearances and the slides I shared at events.

# 9. Andreas Kretz Podcast: Why Writing Is Thinking, and What Data Engineers Can Learn from It

In this episode, Andreas Kretz and I discuss:

- Writing as a thinking tool
- Recognizing the same data patterns reinvented over decades
- Building a Second Brain to grow and organize your own expertise
- Why the real value often lies beneath the tools we use

🎧 Listen as Podcast
📺 Watch on YouTube

# 8. Event: SQL Meets GenBI: AI-Powered Dashboards in a Declarative World

The Applied AI, Data & Analytics Meetup at Google Zurich brought together the worlds of data engineering, analytics, and AI on June 3, 2025.
This meetup blends the foundations of modern data infrastructure and analytics with the emerging wave of AI-assisted workflows. Whether you’re building scalable pipelines, experimenting with LLMs for analytics, or creating tools to bridge both, this event is for you!

# Key Insights

I talk about how, while AI handles technical tasks, human judgment and context remain crucial for steering and quality assurance in the GenBI workflow:

- GenBI Revolution: After 20 years of unfulfilled promises, AI finally enables true self-serve BI by combining natural language with analytics workflows, moving from linear manual development to iterative human-AI collaboration.
- Three Critical Components: Quality data (DWH/organized lake), speed (OLAP systems like DuckDB/MotherDuck), and BI-as-Code (declarative YAML/SQL) form the foundation for effective GenBI implementation.
- Declarative as AI Language: LLMs work best with declarative interfaces using YAML for configuration and SQL for business logic - creating a common language between humans and AI that provides essential business context.
- Developer Experience Transformation: GenBI reduces time-to-insight from days to minutes, enabling both technical developers and business users to generate dashboards in seconds while maintaining the benefits of version control and automation.
- The ChatGPT Moment for BI: Just as ChatGPT demonstrated complex question-answering with minimal effort, GenBI solves the self-serve BI challenge through semantic business layers and MCP integrations that provide contextual understanding.
- Self-Serve BI Finally Achieved: After two decades of evolution from pixel-perfect reports to drag-and-drop tools to modern data stacks, 2025 marks the first time true self-service BI is actually possible through AI-powered workflows.
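To make the BI-as-Code idea more concrete, here is a minimal sketch, not the setup shown in the talk. It assumes a hypothetical metric spec (a Python dict standing in for a YAML file) that is compiled into SQL, and it uses Python's built-in `sqlite3` as a stand-in for an OLAP engine such as DuckDB. The `orders` table, the spec fields, and all function names are invented for illustration.

```python
import sqlite3

# Hypothetical declarative metric spec. In a BI-as-Code repository this
# would typically live in a version-controlled YAML file.
METRIC = {
    "name": "revenue_by_country",
    "table": "orders",
    "dimensions": ["country"],
    "measures": {"revenue": "SUM(amount)", "order_count": "COUNT(*)"},
}


def compile_metric(spec: dict) -> str:
    """Render the declarative spec into a SQL query string."""
    dims = ", ".join(spec["dimensions"])
    measures = ", ".join(
        f"{expr} AS {alias}" for alias, expr in spec["measures"].items()
    )
    return (
        f"SELECT {dims}, {measures} "
        f"FROM {spec['table']} "
        f"GROUP BY {dims} ORDER BY {dims}"
    )


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (country TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("CH", 100.0), ("CH", 50.0), ("DE", 70.0)],
    )
    return conn


def run_metric(conn: sqlite3.Connection, spec: dict) -> list:
    """Compile the spec and execute it against the given connection."""
    return conn.execute(compile_metric(spec)).fetchall()


if __name__ == "__main__":
    print(compile_metric(METRIC))
    print(run_metric(demo_connection(), METRIC))
```

The point of the design is that the spec, not the query, is the artifact a human or an LLM edits: adding a measure is a one-line change that can be reviewed and versioned like any other code.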

# 7. Joe Reis Podcast: The Art of Writing about Data Engineering

Joe Reis introduction:

Ever wondered how to turn complex data engineering concepts into compelling content? Join me as I dive deep with Simon Späti, a master of writing about data engineering. We explore ‘The Art of Writing about Data Engineering,’ uncovering his journey from business intelligence to becoming a leading voice in the field.

In this episode, Joe and I discuss:

- The essential skills for writing effective data engineering content.
- How AI is transforming the writing process (and what it means for you).
- My personal journey and the lessons I learned along the way.
- The future of technical writing.
- How to build an audience through your writing.

🎧 Listen on Spotify
📺 Watch on YouTube

# 6. Data Talks on the Rocks (Rill)

I had the pleasure of talking with Mike Driscoll from Rill Data. Our topics included:

- My journey from data engineer to author
- The hype behind Bluesky🦋
- Interesting use cases for DuckDB in production
- What do you call data people? analysts? data developers? data engineers?
- Latest trends in object storage, data modeling, & declarative stacks

📺 Watch on YouTube
🎧

# Key Insights

- Data Roles: The definition of data professionals (analyst, engineer, etc.) varies significantly by region and company, with many starting in dashboards before moving deeper into the stack.
- Bluesky: Gaining popularity due to its open architecture and Twitter-like feel, running on-premise with just 20 people serving 20M+ users using SQLite databases.
- DuckDB: Functions as a fast, lightweight OLAP solution excelling as a zero-copy layer, with notable production uses at Okta, Spare Course, and Hugging Face.
- Data Stack Evolution: Trending toward object storage (S3) as foundation, with movement from unbundling to rebundling of tools, and growing adoption of declarative approaches.
- Data Modeling: Remains crucial despite “schema-on-read” trends, with emphasis on centralized metrics definitions and business rules, though cloud vendor incentives don’t align with efficient modeling.
- Future Outlook: Growing importance of human judgment alongside AI/GenBI tools, with a continued need for balancing quick wins against thoughtful architecture.
- My Journey: Discussed my path from computer science apprentice in Switzerland to data engineer, technical writer, and now author focusing on data engineering patterns.

# 5. Bedag Podcast: Open Source Meets Data Engineering

In this podcast episode, Matthias Stürmer and I dive into the topic of Open Source and Data. Matthias is a professor at the Bern University of Applied Sciences (BFH) and Head of the Institute for Public Sector Transformation. I’m grateful for his time and the insightful discussion we had.

🎧 Listen on Spotify - 🚨 Attention: This podcast is in Swiss German

# Key Insights

# Open Source: The Core of Innovation

- What is Open Source? A behind-the-scenes look: Open Source stands for access and freedom. Unlike Closed Source software like CRM or ERP, Open Source offers the opportunity to customize and further develop the code.
- The Open Model: Why it represents the perfect balance between Open Source flexibility and commercial success. No lock-in, more flexibility, and innovation, but with the awareness that more work and maintenance are required.
- Open Core or Closed Source? The choice between Buy vs. Build depends heavily on technical capabilities and long-term goals. A Managed Service supporting the Open Core model can be an attractive solution.

# Data Engineering: The Future of Data Processing

- HelloDATA: An open enterprise data platform, developed by Bedag and adopted by the Canton of Bern. We discuss the advantages of choosing Open Source for data platforms and why it can be a superior alternative to solutions like Azure/Power BI.
- Why not Azure/Power BI? A critical look at the lock-in risks and the importance of Open Standards and Open Source for a sustainable data strategy.

Join us as we explore these topics with industry experts and gain deep insights into the world of Open Source and Data Engineering. Your opinions and questions are, as always, welcome – let’s shape the future of technology together!


# 4. Open-source data engineering, productivity, writing, and philosophy

In this episode, I had the pleasure of talking to Thomas Bustos on “Let’s Talk AI.” We explored topics close to my heart, ranging from open-source data engineering to the nuances of my productivity workflow. We even touched upon the philosophies I’ve embraced through life’s journey. As a Swiss data engineer who transitioned into a writer and author, this episode encapsulates my two decades of experience and my unbridled passion for productivity and knowledge sharing.

📺 Watch on YouTube
🎧

# Key Insights

Some sneak peeks from it:

Reflecting on the Modern Data Stack: For the future of data engineering, I see a landscape where tools consolidate and integrate more seamlessly, with Kubernetes steering the ship. My fascination lies in the continuous evolution of the Semantic Layer and the untapped potential of Rust in the realm of data engineering. These advancements might redefine how we approach data engineering in the future.

Career Insights: I advocate for a “pathless path” approach, one that’s guided by curiosity and intuition. Engage in personal projects; they are fun and offer deep, practical learning. Remember, learning is a journey, not a race, so take your time to absorb and understand.

Merging Programming, Writing, and Note-Taking: In pursuing productivity, I’ve found a unique harmony in combining my love for programming with my passion for writing and note-taking. Tools like Obsidian and Vim motions are integral to my workflow, enhancing efficiency and organization. Embracing the Second Brain concept, I’ve created a comprehensive knowledge repository, uniting all these facets seamlessly.

Embracing Minimalism in Work and Writing: My work, both as a data engineer and writer, is positively influenced by minimalism. This approach helps me focus on the essentials, strip away the superfluous, and truly listen to my intuition. This minimalist philosophy aids in making more meaningful decisions in my professional and personal life.

Stoic Principles: Adopting Stoicism has been transformative, helping me manage my thoughts and emotions effectively. The stoic principle of focusing on what I can control and accepting what I can’t has been a guiding light in my work and personal life.

Advice on DE Programming Languages: In the ever-evolving landscape of data engineering, Python and SQL remain the cornerstone, with Rust emerging as a powerful tool for performant applications.

Ending Message: Stay curious, trust yourself, and embrace the journey. I’m a proponent of “Slow Productivity” – a mindful, deliberate approach to work and life.

Date: 2023-11-28

# 3. TechTalk with Databricks Denny Lee: Data Integration with Airbyte and Delta Lake

In this engaging session with Denny Lee from Databricks, I explored the intricate world of data integration using Airbyte and Delta Lake. We delved into the challenges and solutions of integrating diverse data sources into a unified data lake. This discussion was particularly enriching as it combined practical insights with theoretical understanding, bridging the gap between concept and application in the realm of data engineering.

📺 Watch on YouTube
🎧 Listen on Spotify

# Key Insights

- Integration Challenges: We discussed the complexity of integrating multiple data sources, emphasizing the importance of robust and scalable solutions like Airbyte for efficient data ingestion.
- The Power of Delta Lake: The conversation highlighted Delta Lake’s role in simplifying data management, offering a structured yet flexible approach to handling vast datasets.
- Open Source Synergy: A significant focus was on the synergy between open-source technologies like Airbyte and Delta Lake, showcasing how they empower data engineers to build more dynamic and resilient data pipelines.
- Future of Data Engineering: The discussion ventured into the future trends in data engineering, touching on the evolving landscape and the role of emerging technologies in shaping it.

# Personal Reflection

The interaction with Denny not only allowed for a deep dive into technical nuances but also offered a chance to reflect on the broader implications and future direction of data engineering. It served as a reminder of the continuous learning and adaptation required in our ever-evolving field.

Date: 2022-12-06

# 2. Data Engineer Turns His Hobby Into a Career

In this heartening podcast episode, I delved into my personal journey from a business intelligence engineer to a data engineer and eventually embraced my passion for writing as a career. The pivot from a conventional BI engineering role to the more dynamic field of data engineering was driven by a desire for automation and a deeper engagement with coding, moving away from the drag-and-drop tools of traditional BI platforms.

📺 Watch on YouTube

# Key Insights

- Transition from BI to Data Engineering: This episode highlighted my shift from business intelligence, where I worked with tools like SSIS and Oracle, to embracing the challenges and creativity of data engineering. It underscored the evolution of the field and how I adapted to these changes.
- Embracing Writing as a Career: Discussing my side hobby of writing about data, productivity, and personal knowledge management, I shared how this hobby transformed into a professional pursuit. This shift was marked by a transition from writing as a personal interest to taking up technical writing as part of my professional role.
- The Power of Continuous Learning: The conversation also focused on the importance of continuous learning and staying updated with the latest trends in technology. This aspect has been crucial in both my technical and writing endeavors.
- Innovation and Automation: I shared insights into my drive for innovation and automation, a core aspect of my work in data engineering. This reflects my approach to work, where I constantly seek to streamline processes and enhance efficiency.

# Personal Reflection

Reflecting on my journey, this podcast with Chris was a testament to the power of pursuing one’s passions and interests. My story illustrates that career paths are not linear and that embracing change and new opportunities can lead to fulfilling and dynamic career trajectories. This episode serves as an encouragement to others to explore and integrate their hobbies and passions into their professional lives.

Date: 2022-08-23

# 1. Dagster at Rohde and Schwarz Mobile Network Testing

In this podcast episode, I shared the innovative use of Dagster at Rohde & Schwarz Mobile Network Testing. My focus was on detailing the transformation in data architecture at Rohde & Schwarz, showcasing how we harnessed the power of open-source data engineering tools, orchestrated with Dagster.

📺 Watch on YouTube
View Slides

# Key Insights

- Rohde & Schwarz’s Journey: I described Rohde & Schwarz’s journey in developing smart analytics for mobile network testing, highlighting the company’s focus on electronic test equipment, broadcast and media, cybersecurity, and radio communications.
- Innovative Data Architecture: The core of my talk revolved around the innovative data architecture we implemented at Rohde & Schwarz. This included the transition of our data warehouse to the cloud, aiming for big data readiness and the integration of various open-source tools.
- Dagster’s Role: I explained how Dagster played a pivotal role in managing and orchestrating our ETL (Extract, Transform, Load) logic in a centralized manner. The move to Dagster was motivated by the need for a more scalable, cloud-compatible tool that could handle increasing data volumes due to 5G advancements.
- Custom ETL and Challenges: The custom ETL processes were a significant part of our architecture, dealing with complex data transformation and aggregation. Challenges included handling large data sets, ensuring fast query responses, and maintaining an efficient web UI for user interactions.
- Dagster Advantages: I elaborated on the advantages of using Dagster, such as its out-of-the-box features like restart capabilities, backfilling, dependency management, and a user-friendly UI. Dagster’s functional design and the reusability of code significantly streamlined our development process.
- Future Directions: Looking forward, I highlighted our plans to enhance our use of Dagster with unit testing, smoke tests, more intensive use of assets, integration of data lineage features, and the exploration of dynamic orchestration and partitioning.
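The dependency management and restart capabilities mentioned above can be illustrated without Dagster itself. The sketch below is a stand-in built on Python's standard-library `graphlib`, not Dagster's API and not the pipeline from the talk. The step names are hypothetical. It shows the two ideas an orchestrator automates: running steps in dependency order, and re-running only a failed step plus everything downstream of it.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL steps, each mapped to the set of steps it depends on.
DEPENDENCIES = {
    "raw_measurements": set(),
    "cleaned_measurements": {"raw_measurements"},
    "daily_aggregates": {"cleaned_measurements"},
    "dashboard_cube": {"daily_aggregates", "cleaned_measurements"},
}


def execution_order(deps: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def steps_to_rerun(deps: dict, failed: str) -> list:
    """Return the failed step plus all downstream steps, in run order."""
    order = execution_order(deps)
    affected = {failed}
    # Walking in topological order guarantees upstream steps are
    # classified before the steps that depend on them.
    for step in order:
        if deps[step] & affected:
            affected.add(step)
    return [step for step in order if step in affected]


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
    print(steps_to_rerun(DEPENDENCIES, "cleaned_measurements"))
```

A real orchestrator adds scheduling, retries, partitioned backfills, and a UI on top of this graph; the sketch only covers the ordering logic.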

# Personal Reflection

This presentation was a deep dive into how data engineering and architecture can be revolutionized in a corporate setting. It emphasized the importance of embracing open-source tools and technologies to stay ahead in the rapidly evolving field of data engineering. My journey with Dagster at Rohde & Schwarz serves as a case study for others in the industry, demonstrating how adapting to new tools and methodologies can lead to significant improvements in data processing and analysis.

Date: 2021-05-11


Origin: CV
References: Data Engineering Podcasts, Joe Reis Podcast with Me
Created 2023-11-27
