A Comparison of Different Big Data Processing Platforms
Cluster | Expected volume | Benchmark hardware | Project cores | Project RAM | Project # nodes | Project disk |
Source | 6 million records/month (~3 records/sec) | | | | | |
HDFS | 6 million/month | 1 namenode, 20 datanodes, 2 CPUs/node, 64 GB RAM/node | 1 | 6 GB | 1 master, 3 slaves | 120% of 6 GB = 7.2 GB/month |
Kafka | 4 topics, 6 million/month per topic | 1 node @ 4 GB RAM, 1 CPU, 200 GB disk; 16,000 msg/sec | 1 | 4 GB | 1 | 24 GB/month |
Kafka Connector for HBase | N/A | | | | | |
Kafka Connector for HDFS | N/A | | | | | |
Logstash | 6 million/month | 1 node, 3.75 GB RAM, 1 CPU core; 180 events/sec | 1 | 4 GB | 1 | |
HBase | 6 GB/month | 7 nodes, 32 GB RAM, 8 CPU cores; 60240 req/sec – 200 req/sec | 1 | 8 GB | 1 | 4 GB/month |
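
The sizing figures in the table follow from simple arithmetic, which the sketch below reproduces. It is an illustration rather than the method used to build the table: the 30-day month, the function names, and the reading of 6 GB/month as the raw volume per Kafka topic (inferred from 4 topics and 24 GB/month) are assumptions. Averaged over a 30-day month, 6 million records come to roughly 2.3 records/sec, which the table rounds up to ~3. The headroom ratios compare benchmark throughput against that average rate only; the benchmarks ran on different hardware, so the ratios are indicative at best.

```python
# Back-of-the-envelope check of the sizing figures in the table.
# Assumption: a 30-day month; the table does not state the month length.
SECONDS_PER_MONTH = 30 * 24 * 3600

RECORDS_PER_MONTH = 6_000_000   # "Source" row
RAW_GB_PER_MONTH = 6.0          # "6G" / "6GB/month" in the HDFS and HBase rows
HDFS_OVERHEAD = 1.20            # "120% of 6G" in the HDFS row
KAFKA_TOPICS = 4                # Kafka row


def records_per_second(records_per_month, seconds=SECONDS_PER_MONTH):
    """Average ingest rate implied by a monthly record count."""
    return records_per_month / seconds


def hdfs_disk_gb(raw_gb, overhead=HDFS_OVERHEAD):
    """Monthly HDFS disk need: raw volume scaled by the overhead factor."""
    return raw_gb * overhead


def kafka_disk_gb(raw_gb_per_topic, topics=KAFKA_TOPICS):
    """Monthly Kafka disk need, assuming each topic stores the full raw volume."""
    return raw_gb_per_topic * topics


def headroom(benchmark_rate, required_rate):
    """How many times the benchmark throughput exceeds the required rate."""
    return benchmark_rate / required_rate


if __name__ == "__main__":
    rate = records_per_second(RECORDS_PER_MONTH)
    print(f"Average ingest rate: {rate:.2f} records/sec")
    print(f"HDFS disk:  {hdfs_disk_gb(RAW_GB_PER_MONTH):.1f} GB/month")
    print(f"Kafka disk: {kafka_disk_gb(RAW_GB_PER_MONTH):.1f} GB/month")
    # Kafka carries the per-topic volume on each of its topics.
    print(f"Kafka headroom:    {headroom(16_000, rate * KAFKA_TOPICS):.0f}x")
    print(f"Logstash headroom: {headroom(180, rate):.0f}x")
```

Under these assumptions the script gives 7.2 GB/month for HDFS and 24 GB/month for Kafka, matching the table. Sizing for peak rather than average load would need a burst factor that the table does not provide.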