Setting Up a 2-Node Hadoop Cluster and Kafka for Distributed Data Collection and Web Log Analysis

Published in PTIT, 2024

This report outlines the steps for setting up a 2-node Hadoop cluster together with Kafka to collect distributed data, such as web logs, from multiple nodes. It also provides a guide to writing MapReduce code for web log analysis and to setting up a dashboard for data visualization. The setup is intended to support efficient data processing and real-time insight into the collected data.
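
As a rough illustration of the kind of MapReduce logic used for web log analysis, the sketch below counts requests per HTTP status code. It is not taken from the report: it assumes the logs follow the Common Log Format, the names `map_line`, `reduce_counts`, and `count_status_codes` are hypothetical, and the map and reduce steps run in memory rather than on a Hadoop cluster.

```python
import re
from collections import defaultdict

# Assumption: each log line follows the Common Log Format, e.g.
# 127.0.0.1 - - [10/Oct/2024:13:55:36 +0700] "GET /index.html HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)'
)


def map_line(line):
    """Map step: emit (status_code, 1) for each line that parses."""
    match = LOG_PATTERN.match(line)
    if match:
        yield match.group(5), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_status_codes(lines):
    """Run the map and reduce steps in memory over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)
```

Calling `count_status_codes` on a list of log lines returns a dictionary from status code to request count; lines that do not match the assumed format are skipped. On a real cluster the same two steps would be split into separate mapper and reducer programs, with Hadoop handling the grouping by key between them.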
