A Comparison of Different Big Data Processing Platforms
Cluster | Expected volume | Benchmark hardware | Project cores | Project RAM | Project # nodes | Project disk |
Source | 6 million records/month (~3 records/sec) | | | | | |
HDFS | 6 million/month | 1 namenode, 20 datanodes, 2 CPU/node, 64 GB RAM/node | 1 | 6 GB | 1 master, 3 slaves | 120% of 6 GB = 7.2 GB/month |
Kafka | 4 topics, 6 million/month per topic | 1 node @ 4 GB RAM, 1 CPU, 200 GB disk each; 16,000 msg/sec | 1 | 4 GB | 1 | 24 GB/month |
Kafka Connector for HBase | N/A | | | | | |
Kafka Connector for HDFS | N/A | | | | | |
Logstash | 6 million/month | 1 node, 3.75 GB RAM, 1 CPU core; 180 events/sec | 1 | 4 GB | 1 | |
HBase | 6 GB/month | 7 nodes, 32 GB RAM, 8 CPU cores; 60240 req/sec – 200 req/sec | 1 | 8 GB | 1 | 4 GB/month |
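
The sizing figures in the table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions that the table implies but does not state: a 30-day month, an average record size of about 1 KB (so that 6 million records come to roughly 6 GB), a 20% overhead for HDFS, and one full copy of the stream per Kafka topic. All constant and function names are hypothetical.

```python
# Capacity arithmetic behind the table (a sketch under stated assumptions).

SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # assumption: 30-day month
RECORDS_PER_MONTH = 6_000_000           # from the "Source" row
RECORD_SIZE_KB = 1                      # assumption: ~1 KB per record
KAFKA_TOPICS = 4                        # from the "Kafka" row
HDFS_OVERHEAD = 1.20                    # "120% of 6 GB" in the "HDFS" row

# Benchmark throughput quoted in the table (messages or events per second).
BENCHMARK_RATE = {"Kafka": 16_000, "Logstash": 180}


def records_per_second(records_per_month):
    """Average sustained rate over a 30-day month."""
    return records_per_month / SECONDS_PER_MONTH


def monthly_volume_gb(records_per_month, record_size_kb=RECORD_SIZE_KB):
    """Raw monthly volume in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return records_per_month * record_size_kb / 1_000_000


rate = records_per_second(RECORDS_PER_MONTH)
base_gb = monthly_volume_gb(RECORDS_PER_MONTH)
hdfs_gb = base_gb * HDFS_OVERHEAD        # raw volume plus 20% overhead
kafka_gb = base_gb * KAFKA_TOPICS        # one copy of the stream per topic

# Required rate per component; Kafka carries all four topics.
required_rate = {"Kafka": rate * KAFKA_TOPICS, "Logstash": rate}
headroom = {name: BENCHMARK_RATE[name] / required_rate[name]
            for name in BENCHMARK_RATE}

print(f"Average rate: {rate:.2f} records/sec")
print(f"HDFS disk:    {hdfs_gb:.1f} GB/month")
print(f"Kafka disk:   {kafka_gb:.1f} GB/month")
for name, factor in headroom.items():
    print(f"{name} benchmark headroom: about {factor:.0f}x")
```

On a 30-day month the average comes to a little over 2 records per second, which the table rounds up to 3. Both benchmark configurations exceed the required rate by a wide margin, which is consistent with the single-core, single-node project requirements listed for Kafka and Logstash.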