Posts Tagged Big Data Streaming

Big Data Ingestion: Flume, Kafka and NiFi

Preliminaries

When building Big Data pipelines, we need to think about how to ingest the Volume, Variety and Velocity of data showing up at the gates of what would typically be a Hadoop ecosystem. Preliminary considerations such as scalability, reliability, adaptability, and cost in terms of development time will all come into play when deciding […]


Streaming Big Data: Storm, Spark and Samza

There are a number of distributed computation systems that can process Big Data in real time or near-real time. This article will start with a short description of three Apache frameworks, and attempt to provide a quick, high-level overview of some of their similarities and differences.

Apache Storm

In Storm, you design a graph of real-time computation called a topology, […]
