Apache Kafka fits when you need real-time data ingestion and processing, where low-latency updates matter, such as monitoring systems or event-driven applications. For example, if you are building a system that ingests live user activity logs from a website and must react to user actions within seconds, Kafka is a better choice than HDFS.
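As a minimal sketch of that activity-log scenario, the snippet below separates the per-event logic from the Kafka consumer loop so each piece is easy to follow. The topic name, broker address, record fields, and the throttling rule are all illustrative assumptions, and the consumer uses the third-party kafka-python client.

```python
import json

def handle_event(event):
    """React to one raw activity record as it arrives.

    The rule here (flag users with more than 10 rapid clicks) is a
    made-up placeholder for whatever real-time response the site needs.
    """
    if event.get("type") == "click" and event.get("count", 0) > 10:
        return {"user": event.get("user"), "action": "throttle"}
    return {"user": event.get("user"), "action": "none"}

def run_consumer():
    """Stream events from Kafka and process each one immediately.

    Requires a running broker and the kafka-python package; the topic
    name and broker address below are assumptions for the sketch.
    """
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        "user-activity",                      # assumed topic name
        bootstrap_servers="localhost:9092",   # assumed broker address
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    for msg in consumer:  # blocks, handling events as they are produced
        handle_event(msg.value)
```

The key design point is that Kafka delivers each record moments after it is produced, so `handle_event` runs continuously on fresh data rather than waiting for a nightly batch.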
HDFS is ideal when you need to store large volumes of data durably for batch processing. It suits workloads that analyze large datasets offline, such as Hadoop MapReduce jobs that mine historical data for trends.
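To make the batch side concrete, here is a hedged sketch of a trend analysis over historical logs stored in HDFS, written as plain map and reduce functions in the Hadoop Streaming style. The log record format (`date,page,user`) and the HDFS paths in the comment are assumptions.

```python
def map_record(line):
    """Map one log line of the form "date,page,user" to a (date, 1) pair.

    In a Hadoop Streaming job this would run over every line of the
    input files stored across HDFS blocks.
    """
    date, _page, _user = line.strip().split(",")
    return (date, 1)

def reduce_counts(date, counts):
    """Sum the per-date counts emitted by the mappers, yielding the
    total page views for that day."""
    return (date, sum(counts))

# Packaged as mapper/reducer scripts, these functions could be submitted
# with Hadoop Streaming over assumed HDFS paths, e.g.:
#   hadoop jar hadoop-streaming.jar \
#     -input /logs/2024/ -output /reports/daily-views \
#     -mapper mapper.py -reducer reducer.py
```

The contrast with the Kafka example is the execution model: the job reads the entire historical dataset at once and produces a result at the end, rather than responding to each record as it arrives.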