The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI

Astronomer

92 episodes

  • Scaling Airflow at Wix for Analytics and AI with Ethan Shalev

    2026-02-26 | 18 min.
    Modern data orchestration at scale demands reliability, speed and thoughtful adoption of new tooling. As organizations grow, keeping pipelines efficient while supporting more teams becomes a critical challenge.

    In this episode, we’re joined by Ethan Shalev, Data Engineer at Wix, to discuss how Wix operates Airflow at massive scale, migrates to Airflow 3 and uses AI to accelerate development.

    Key Takeaways:

    00:00 Introduction.
    02:13 Wix structures data engineering across multiple product-focused organizations.
    03:40 Migrating nearly 8,000 DAGs to Airflow 3 requires careful planning.
    04:31 Migration creates an opportunity to remove long-standing legacy Airflow code.
    05:32 Internal playbooks and Cursor rules standardize and speed up DAG migrations.
    07:39 Airflow 3 introduces backfills, DAG versioning and asset-aware scheduling.
    09:16 Deferrable operators reduce scheduler congestion in large Airflow environments.
    12:54 AI-generated code still requires review and strong testing practices.
    14:52 Moving to managed Airflow reduces operational burden on internal platform teams.
    15:57 Improving multi-tenancy and UI personalization remains a key Airflow need.

    Resources Mentioned:

    Ethan Shalev
    https://www.linkedin.com/in/eshalev/

    Wix | LinkedIn
    https://www.linkedin.com/company/wix-com/

    Wix | Website
    https://www.wix.com/

    Apache Airflow
    https://airflow.apache.org/

    Astronomer
    https://www.astronomer.io/

    Trino
    https://trino.io/

    Apache Iceberg
    https://iceberg.apache.org/

    Cursor
    https://cursor.sh/

    Airflow Summit
    https://airflowsummit.org/

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow
  • Using Airflow To Orchestrate Billions of Events at Addi with Carlos Daniel Puerto Niño

    2026-02-19 | 24 min.
    Strong data orchestration is as much about culture and visibility as it is about technology. As data platforms scale, teams need systems that reduce cognitive load while increasing reliability and observability.

    In this episode, Carlos Daniel Puerto Niño, Senior Analytics Engineer and Data Analyst at Addi, joins us to share how Addi uses Airflow to support batch orchestration, manage organizational complexity and improve monitoring across its data platform.

    Key Takeaways:

    00:00 Introduction.
    01:25 Changes in company strategy increase data platform complexity over time.
    04:00 Centralized data teams help manage organizational and technical change.
    06:08 Scalable architectures support growing data volumes and use cases.
    09:10 Adopting orchestration tools introduces operational and maintenance challenges.
    14:43 Abstraction layers lower technical barriers for onboarding new team members.
    15:36 Modularity and visibility improve the reliability of data pipelines.
    18:14 Integrated monitoring supports faster incident response and resolution.
    22:19 Limited access to orchestration metadata constrains proactive analysis.

    Resources Mentioned:

    Carlos Daniel Puerto Niño
    https://www.linkedin.com/in/carlospuertoni%C3%B1o/

    Addi | LinkedIn
    https://www.linkedin.com/company/addicol/

    Addi | Website
    https://www.addi.com

    Apache Airflow
    https://airflow.apache.org/

    Astronomer
    https://www.astronomer.io/

    Databricks
    https://www.databricks.com/

    dbt
    https://www.getdbt.com/

    Grafana
    https://grafana.com/

    Slack
    https://slack.com/
  • Building Event-Driven Data Pipelines With Airflow 3 at Astrafy with Andrea Bombino

    2026-02-12 | 18 min.
    Real-time data expectations are reshaping how modern data teams think about orchestration and dependencies. As event-driven architectures become more common, teams need to rethink how pipelines react to data changes rather than to schedules.

    In this episode, Andrea Bombino, Co-Founder and Head of Analytics Engineering at Astrafy, joins us to discuss how event-driven scheduling in Airflow is evolving and how Astrafy applies it to deliver faster, more responsive data pipelines.

    Key Takeaways:

    00:00 Introduction.
    02:02 Astrafy’s role in guiding clients across the modern data stack.
    03:15 Strong DAG dependencies create challenges for time-based scheduling.
    04:48 Event-driven pipelines respond to increasing real-time data demands.
    05:30 Airflow 3 introduces native support for event-driven orchestration.
    06:27 Sensor-based workflows reveal scalability and efficiency limitations.
    11:32 Event-driven assets improve efficiency and pipeline elegance.
    14:45 Governance and cross-instance coordination emerge as ongoing challenges.

    Resources Mentioned:

    Andrea Bombino
    https://www.linkedin.com/in/andrea-bombino/

    Astrafy | LinkedIn
    https://www.linkedin.com/company/astrafy/

    Astrafy | Website
    https://www.astrafy.io

    Apache Airflow
    https://airflow.apache.org/

    Google Cloud
    https://cloud.google.com/

    Google Pub/Sub
    https://cloud.google.com/pubsub

    Google BigQuery
    https://cloud.google.com/bigquery
  • Uphold’s Approach to Orchestrating Modern Data Workflows with Jaime Oliveira

    2026-02-05 | 18 min.
    A strong data-driven mindset underpins how fintech teams scale analytics, infrastructure and decision-making across the business.

    In this episode, Jaime Oliveira, Lead Data Engineer at Uphold, joins us to discuss how Uphold structures its data organization and orchestration strategy. Jaime shares how the team uses Airflow and dbt to support analytics, reporting and data activation while evolving their approach as the stack grows.

    Key Takeaways:

    00:00 Introduction.
    01:23 A data-driven mindset supports product development and business decisions.
    02:55 Diverse ingestion pipelines enable scalable analytics.
    04:18 A single orchestration platform simplifies analytics workflows.
    05:17 Early experience with orchestration tools shapes engineering practices.
    08:16 Analytics orchestration works best when aligned with transformation workflows.
    09:25 Infrastructure choices involve tradeoffs in testing, visibility and overhead.
    16:39 More collaborative workflow tools could improve accessibility and autonomy.

    Resources Mentioned:

    Jaime Oliveira
    https://www.linkedin.com/in/jaime-oliveira-b075855a/

    Uphold | LinkedIn
    https://www.linkedin.com/company/upholdinc/

    Uphold | Website
    https://uphold.com

    Apache Airflow
    https://airflow.apache.org

    dbt
    https://www.getdbt.com

    Snowflake
    https://www.snowflake.com

    Kubernetes
    https://kubernetes.io

    Astronomer Cosmos
    https://astronomer.github.io/astronomer-cosmos

    Cosmos e-book
    https://www.astronomer.io/ebooks/orchestrating-dbt-with-airflow-using-cosmos/
  • Modern Airflow Best Practices for Scalable Data Pipelines with Bhavani Ravi

    2026-01-29 | 17 min.
    Building reliable data pipelines at scale requires more than writing code. It depends on thoughtful design, infrastructure trade-offs and an understanding of how orchestration platforms evolve over time.

    In this episode, we examine Airflow best practices shaped by real-world implementation. Bhavani Ravi, Independent Software Consultant and Apache Airflow Champion, shares lessons on pipeline design, architectural decisions and the evolution of the Airflow ecosystem in modern data environments.

    Key Takeaways:

    00:00 Introduction.
    01:30 Independent consulting supports effective Airflow adoption.
    02:38 Early challenges shaped modern Airflow practices.
    03:21 Airflow setup has become significantly simpler.
    04:30 New features expanded workflow capabilities.
    06:03 Frequent releases support long-term sustainability.
    07:34 Community and providers strengthen the ecosystem.
    10:03 Pipeline design should come before coding.
    10:55 Decoupling logic requires careful trade-offs.
    13:30 Plugins extend Airflow into new use cases.

    Resources Mentioned:

    Bhavani Ravi
    https://www.linkedin.com/in/bhavanicodes/

    Apache Airflow
    https://airflow.apache.org/

    Kubernetes
    https://kubernetes.io/

    Microsoft Fabric
    https://learn.microsoft.com/en-us/fabric/

About The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI

Welcome to The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI, the podcast where we keep you up to date with the insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io/podcast/