Building Data Pipelines on Apache NiFi with Python

Description

Day 2, R0 13:15–14:00

In today's big data world, the data you need to analyze comes from diverse sources in a variety of formats. Combining and reconciling all that data is difficult. Depending on your needs, adopting a suitable, manageable ETL tool can make data integration easier.

Apache NiFi, an open source project, is a tool built to automate and manage the flow of data between systems. You can use NiFi to build streaming data pipelines between different data-related systems, including Apache Kafka, Apache Spark, various RDBMSs, and much more.

In this talk, I will start by introducing the concepts of ETL and Apache NiFi, the problems NiFi can solve, and how to use Python to extend NiFi's capabilities. Then a sample demo will show how to build a streaming data pipeline with NiFi.

Slides: https://speakerdeck.com/sucitw/building-data-pipelines-on-apache-nifi-with-python

Speaker: Shuhsi Lin

A data engineer and Python programmer, currently working on various data applications at a manufacturing company.

Research interests: IoT applications, streaming data processing, data analysis, and data visualization.

