Columns: Cluster; Expected Volume; Benchmark hardware; Project Hardware requirements (Cores, RAM, # nodes, Disk). Source: 6 million records/month (~3 records per second). HDFS: 6 million/month; benchmark hardware: 1 namenode, 20 datanodes, 2 CPU/node, 64 GB RAM/node; project requirements: Cores 1; RAM 6G; # nodes 1 master, 3 slaves; Disk 120% of 6G = 7.2 GB/month. Kafka: 4 topics, 6 million/month per topic; 1 nodes […]
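The sizing figures above can be sanity-checked with a few lines of Python. This is only a sketch: the 30-day month, the function names, and reading "120%" as a headroom factor of 1.2 are assumptions, not something the excerpt states. Straight division gives roughly 2.3 records per second, which the excerpt rounds up to about 3.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def records_per_second(records_per_month, days_per_month=30):
    """Average ingest rate, assuming records arrive evenly over the month."""
    return records_per_month / (days_per_month * SECONDS_PER_DAY)

def disk_per_month_gb(data_gb, headroom=1.2):
    """Monthly disk need: raw data volume times a headroom factor (assumed 120%)."""
    return data_gb * headroom

if __name__ == "__main__":
    # 6 million records/month and 6 GB/month are the figures from the excerpt.
    print(f"{records_per_second(6_000_000):.2f} records/s")  # 2.31 records/s
    print(f"{disk_per_month_gb(6):.1f} GB/month")            # 7.2 GB/month
```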
With the help of big data, small businesses can gain the competitive edge they need to stay ahead of the curve. For beginners: big data refers to large data sets that can reveal insights about your customers and help you make valuable business decisions. Data can help you to compose an overall strategy for […]
Data scientists are known for having a knack for statistics and data analysis, which they use to understand and extract insights from a given dataset, usually a very large one. If you don’t think you have this skill set but the career appeals to you, you can always take a course like Springboard Data Science, which will […]
What is a Dashboard? A dashboard is a visual display of the most important information needed to achieve one or more objectives, consolidated and arranged on a single screen so the information can be monitored at a glance. It is a user interface that organizes, integrates, and presents mission-critical information, pulled from multiple sources to users in […]
I have been involved in consulting for about 20 years now, and I think one of the biggest challenges (in the long term) for consulting firms is building a sense of ownership and fairness within the company. One of the biggest problems with people, in general, is that they overvalue what they do and undervalue […]
Resource Management in Information Technology There is a whole host of technology available nowadays to ensure that your IT hardware resources are managed efficiently. You may have an in-house data center and a few cloud nodes, services, or apps, which together constitute your investment in hardware. That translates to resource capacity: memory, disk, and processor. In the […]
Synchronous vs Async pipelines Synchronous big data pipelines are a series of data processing components that get triggered when a user invokes an action on a screen, for example by clicking a button. The user typically waits until a response arrives informing them of the results. In contrast, in an asynchronous implementation, the user initiates the […]
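The contrast can be sketched in Python. This is a minimal illustration, not the article's implementation: `process`, `run_sync`, and `run_async` are hypothetical names, and the asynchronous side assumes the common pattern in which the request returns a handle immediately and the result is collected later.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process(record):
    """Hypothetical stand-in for a chain of data processing components."""
    time.sleep(0.05)  # simulate processing time
    return record.upper()

def run_sync(record):
    # Synchronous: the caller blocks here until the result is ready.
    return process(record)

_executor = ThreadPoolExecutor(max_workers=2)

def run_async(record):
    # Asynchronous (assumed pattern): return a Future right away;
    # the work continues in a background thread.
    return _executor.submit(process, record)

if __name__ == "__main__":
    print(run_sync("click"))      # waits, then prints CLICK
    future = run_async("click")   # returns immediately
    print(future.result())        # collect the result later: CLICK
```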
Logic Apps is a codeless integration service for communication between multiple platforms and services. It has an extensive set of connectors for this purpose. In this article, we are going to discuss just two of them: the Dropbox and SMTP connectors. Logic Apps can perform various actions on Dropbox, such as getting lists of files, copying files […]
In BizTalk Server, the SQL Adapter is the main component for communicating with SQL Server. In a Logic App, on the other hand, we use a connector to communicate with SQL Server. In this tutorial, we will create a table in SQL Azure and perform read and write operations using the Logic App. […]
Platform-as-a-Service in Azure We are going to discuss the Platform-as-a-Service offering from Microsoft Azure. The platform-as-a-service part of the Microsoft cloud offers a lot of services that you can use to build applications. Using Microsoft Azure, you can target any platform with any technology. You can host any application, like .NET, Node.js […]