
Big Data Hadoop Use Cases in 8 Verticals

Best Hadoop Use Cases in Eight Industries

Hadoop in Retail

Big Data Analytics is not a new concept in the retail industry. A Gartner survey conducted in June 2013 predicted that Big Data spending in Retail Analytics would cross the $232 billion mark by 2016. The same survey projected that the retail Big Data Analytics market would grow from $1.8 billion in 2014 to $4.5 billion in 2019.

Amazon is an example of a company that has successfully used Big Data to gain retail insights. Each use case below notes how Amazon benefited from Big Data technologies.

Following are the key Hadoop use cases that can benefit the retail industry:

  • Dynamic Pricing

With a boom in retail channels and growing use of social media, consumers can compare services, products, and prices whether they shop online or in retail stores.

With the advent of new social channels, consumers are moving away from traditional brick-and-mortar retail stores to online shopping arenas.

There is a need to build a dynamic pricing platform that will trigger pricing decisions among the biggest retailers. Two ways retailers can do that are:

Internal Profitability Intelligence

Every online transaction is tracked at unit-level profitability, taking into account variable costs such as vendor funding, COGS (Cost of Goods Sold), and shipping charges.

External Competitor Intelligence

For a given set of retailer products, retail analytics provide real-time intelligence information about those products on the competitor’s website with corresponding prices.
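The two inputs above can be combined into a simple repricing rule. The Python sketch below is only an illustration: the field names, the `min_margin` floor, and the `undercut` amount are assumptions made for the example, not details of any retailer's actual platform.

```python
from dataclasses import dataclass


@dataclass
class UnitCosts:
    """Variable costs for one unit (field names are illustrative)."""
    cogs: float            # cost of goods sold
    shipping: float        # shipping charge borne by the retailer
    vendor_funding: float  # vendor subsidy, which offsets cost


def unit_profit(price: float, costs: UnitCosts) -> float:
    """Internal profitability: price minus net variable cost."""
    return price - (costs.cogs + costs.shipping - costs.vendor_funding)


def reprice(costs: UnitCosts, competitor_price: float,
            min_margin: float = 1.0, undercut: float = 0.01) -> float:
    """External intelligence: undercut the competitor, but never go
    below the price that still earns `min_margin` per unit."""
    floor = costs.cogs + costs.shipping - costs.vendor_funding + min_margin
    return round(max(floor, competitor_price - undercut), 2)


costs = UnitCosts(cogs=10.0, shipping=2.0, vendor_funding=1.0)
print(reprice(costs, competitor_price=15.0))  # competitor is well above the floor
print(reprice(costs, competitor_price=11.0))  # competitor is below the floor
```

The floor keeps the rule from chasing a competitor into a loss, which is why both the internal and the external view are needed.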

The impact of building dynamic pricing for retailers is:

  • Increased profit margin
  • Competitive advantage
  • Increased customer satisfaction

Amazon’s analytical platform has a great advantage in dynamic pricing as it responds to the competitive market rapidly by changing the prices of its products every 2 minutes (if required) whilst other retailers change the prices of the products every 3 months, reports Dezyre.

  • Localize and personalize promotions
    • Personalization of promotions depends on various factors such as demographics, location-specific attributes, and purchase behavior of the customer.
    • With Big Data technologies like Hadoop, retailers can give each customer a personalized experience, which improves customer service and satisfaction.
    • Retailers can use Big Data to provide customized messages and shopping offers
    • Retailers can recommend products based on what other similar customers have bought—providing upsell, cross-sell, or “next best offer” opportunities

Amazon has pioneered a personalization strategy by using product-based collaborative retail analytics. Amazon provides data-driven recommendations to customers depending on previous purchase history, browser cookies, and wish lists.
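One simple way to read "customers who bought this also bought that" is item co-occurrence counting. The Python sketch below illustrates that idea only; it is not Amazon's algorithm, and the purchase histories and function names are made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of products appears in the same purchase history."""
    co = defaultdict(Counter)
    for history in histories:
        for a, b in combinations(sorted(set(history)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, owned, top_n=3):
    """Rank products the customer does not own by co-purchase counts."""
    scores = Counter()
    for item in owned:
        for other, count in co[item].items():
            if other not in owned:
                scores[other] += count
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase histories, one list per customer.
histories = [
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "tripod"],
    ["laptop", "mouse"],
]
co = build_cooccurrence(histories)
print(recommend(co, owned={"camera"}))
```

On a Hadoop cluster the pair counting would typically be spread across many machines, but the logic per pair stays the same.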

  • Predicting stock demands
    • Profiling consumers is just one way retailers can drive profits from big data.
    • The process of demand management can be improved by exposing the results to customers via online portals and mobile apps to get them to work together with the retail company to improve their ordering patterns and provide more information that can help enhance the accuracy of the forecasting models
    • Retailers are making the most out of Big Data technologies like Hadoop to reduce costs and maximize profitability arising from forecasting consumer demand
  • Better security/ Fraud detection
    • Retail fraud can range from fraud in returns or abuse of customer service, or credit risk for larger purchases
    • The most common frauds in retail occur in the form of the fraudulent return of products purchased and stolen credit/debit card information
    • Retailers need to protect their margins and their reputations by proactively detecting fraudulent activities
    • Hadoop MapReduce and Spark can analyze more than 50 petabytes of data to predict risk and fraud accurately.

Amazon has an intensive program to detect and prevent credit card fraud, which led to a 50% reduction in fraud within its first 6 months. Amazon developed fraud detection tools that use a scoring approach in predictive analysis. These analytics depend on huge data sets that contain not just the financial details of transactions but also browser information, users’ IP addresses, and any other related technical data that might help Amazon refine its analytic models to detect and prevent fraudulent activity.
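A scoring approach of this kind can be pictured as adding up weighted risk signals per transaction. The Python sketch below is a toy version under stated assumptions: the signals, weights, thresholds, and field names are all invented for illustration and do not describe Amazon's tools.

```python
def fraud_score(txn, profile):
    """Add up weighted risk signals; the weights are invented for illustration."""
    score = 0
    if txn["amount"] > 3 * profile["avg_amount"]:
        score += 40  # unusually large purchase for this customer
    if txn["ip_country"] != profile["home_country"]:
        score += 30  # request comes from an unfamiliar country
    if txn["browser"] not in profile["known_browsers"]:
        score += 20  # browser not seen before on this account
    if txn["shipping_address"] != profile["billing_address"]:
        score += 10  # goods sent somewhere other than the billing address
    return score


def decide(score, review_at=50, block_at=80):
    """Map a score to an action using assumed thresholds."""
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "approve"


# Hypothetical customer profile and transaction.
profile = {"avg_amount": 50.0, "home_country": "US",
           "known_browsers": {"firefox"}, "billing_address": "1 Main St"}
txn = {"amount": 400.0, "ip_country": "BR", "browser": "firefox",
       "shipping_address": "1 Main St"}
print(decide(fraud_score(txn, profile)))
```

In practice the weights would be learned from historical data rather than set by hand.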

Hadoop in Financial Services

Below are a few of the use cases that illustrate how big data and Hadoop are being integrated into the financial services industry, providing companies with insights into their operations, their customers, and their markets:

  • Customer Segmentation Analysis

Banks can create a more meaningful and effective context for marketing to customers if they can define distinct categories or “segments” to which each customer belongs.

To cover the depth and breadth of a customer’s financial portfolio, a bank needs an integrated platform. For that, financial institutions need to understand the customer context.

The context in financial services has the following two components:

  • Knowledge of the customer’s holdings
  • Understanding what type of financial service is likely to resonate with the customer

Big data helps financial services firms create “buckets” of customers, grouping them into different segments based on their banking needs.

With an accurate and up-to-date customer segmentation, the customer base of banks and financial services can be increased by:

  • Improving relationships with profitable customers
  • Increasing new product development and bundling specific offers
  • Cutting costs by understanding channel usage
  • Real-time offers and portfolio optimization

In a recent poll by CapGemini, 85% of executives said that the real issue in executing big data in financial services is not about the customer-generated volume of data, but the ability to analyze and act on the data in real-time.

Big data can be used by banks through the use of software that supports flexible and integrated processes and can accordingly increase their profits by:

  • Customizing offers to segmented customers
  • Creating targeted marketing campaigns for segmented customers

Before big data technology was available, Bank of America took the usual approach of understanding customers by relying on samples. Now, with big data technology, the bank uses transaction and propensity models to determine which customers have a credit card or mortgage that could benefit from refinancing at a competitor, and then makes an offer when the customer contacts the bank through an online, call center, or branch channel.

  • Fraud Detection
    • Fraud detection is a primary concern for financial companies since banks are the primary targets for cybercriminals and fraudsters
    • Banks have a vested interest in any technology that prevents data breaches and fraud

Hadoop and the technologies built on top of core Hadoop make it possible for banks to:

  • Capture real-time activity and immediately detect anomalies, flagging fraudulent behavior before any incident of fraud takes place
  • Combine data about parties that participate in trade with other parties, which allows banks to recognize unusual trading activities
  • Use online data to understand the creditworthiness of new customers to minimize chances of fraud
  • Extract a wealth of data to help rank a person’s likelihood of defaulting instead of relying on tools like FICO scores (a type of credit score that lenders use to assess an applicant’s credit risk and whether to extend a loan)
  • Use separate data warehouses from multiple departments and combine them into a single global repository in Hadoop for analysis
  • Construct a new score of risk in the customer portfolio since an accurate score allows banks to manage their exposure better
  • Risk Management
    • New legal requirements and increasing demand for better internal management support make risk management a key sphere of concern for banks
    • The solution is a central integrated data platform that can quickly and flexibly address new data requirements

A good data management platform can:

  • Enable cross-functional data management process
  • Support different data source strategies
  • Consolidate and share data from diverse source systems into an integrated platform


Hadoop in Manufacturing

With the introduction of advanced technology into manufacturing systems, modern plants and manufacturing equipment have grown more sophisticated and automated.

Research by MIT Sloan Management Review found that two-thirds of survey respondents reported their companies had gained a competitive advantage by making better use of big data and analytics.

In manufacturing, operations managers can use advanced analytics to take a deep dive into historical process data and identify patterns and relationships among discrete process steps.

The use of Big Data Analytics can provide benefits to the manufacturing sector in the following areas:

  1. Plant Operations and Production: Manufacturers operate at a higher capacity, with operating margins averaging 16 percent higher and unscheduled downtime reduced by 8 percent.
  2. Sales and Customer Management: Companies keep more high-value customers with more responsive service and greater consistency in quality.
  3. Asset Management and Maintenance: Manufacturers leverage equipment-condition data and predictive analytics to plan optimal maintenance time, improving overall line efficiency and utilization while reducing unplanned stoppages.
  4. Supply Chain and Inventory: Manufacturers make use of extended connections to anticipate the availability of materials and the impact of factors that may influence supply.

According to a McKinsey Report, Big data experts and scientists can bring improvements in the above areas through the use of big data analytics by:

  • Identification of initial patterns

Using data visualization techniques through the use of moving averages, distribution histograms, standard deviations, and clustering to prioritize data collection and analysis.

  • Identification of core determinants of process performance

By using correlation analysis, the initial hypothesis about the root causes of yield drop and variability can be checked.
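The two steps above, smoothing and summarizing process data and then checking a hypothesis with correlation analysis, can be sketched in a few lines of Python. The temperature and yield readings are hypothetical, and the choice of a three-batch window is an assumption for the example.

```python
from statistics import mean, stdev


def moving_average(values, window):
    """Mean of each consecutive `window`-sized slice of the series."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical readings: furnace temperature and yield per batch.
temperature = [200, 205, 210, 215, 220, 225]
yield_pct = [95, 94, 92, 91, 89, 88]

print(moving_average(yield_pct, window=3))   # smoothed yield trend
print(stdev(yield_pct))                      # variability of yield
print(pearson(temperature, yield_pct))       # close to -1 means yield falls as temperature rises
```

A strong correlation supports the initial hypothesis but does not prove the root cause, so it is usually followed by a controlled test on the line.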

Schneider Electric, a global specialist in energy management with 150,000 employees and operations in more than 100 countries, used Big Data Analytics to gather, manage, and blend in-house and third-party data, delivering deep insights to its sales team in less than half the time previously required. The results were deeper insights, through improved data quality and the incorporation of more data sources, and a more intuitive workflow, through analytic applications shared and reused between analysts, which increased productivity and shortened time-to-insight.


Hadoop in Healthcare

The last decade has seen a huge rise in the use of Big Data analytics to monitor and manage huge amounts of data and improve processes. In the healthcare industry too, Big Data is being used to harness the power of advanced analytics to predict epidemics, cure disease, improve quality of life, and avoid preventable deaths.

Healthcare organizations are leveraging big data technology to capture all of the information about a patient to get a more complete view for insight into care coordination and outcomes-based reimbursement models, population health management, and patient engagement and outreach.

In the last few years, there has been a move toward evidence-based medicine, which involves making use of all clinical data available and factoring that into clinical and advanced analytics.

Big Data Analytics can benefit the healthcare sector by:

  • Allowing prediction of outbreaks using reliable EHR information on geographical distribution and incidence of diseases as quickly as possible
  • Using pattern matching for productive patient and disease analysis
  • Correlating patient visits, diagnostics, and hospital-provider interaction across years of multiple visits
  • Helping to identify best care approaches via the usage of clinical analysis (longitudinal analysis of care across patients and diagnoses can be conducted)
  • Allowing performance of fraud analysis and identification via patterns analysis

An innovative big data search platform is helping in the fight against cancer. A tool called CordMatch (an internet-based donor matching system), by Berlin-based Cytolon (a healthcare unit), uses big data techniques and a unique matching algorithm to quickly find cord blood matches for cancer patients in need of a stem cell transplant. Cytolon also uses Neo4j, a high-performance, enterprise-level graph database. The result is a customizable search option that simplifies the matching of cord blood units by using a combination of innovative frameworks and a big data graph.


Hadoop in Telecommunications

With the explosive growth of smartphones, telecommunications service providers are seeing a huge expansion in the volume of data traveling across their networks.

These telecommunication service providers have a major opportunity to gain value from this huge amount of data across their enterprise, including customer insights for marketing and product development, network operations, sales, and risk management.

The benefits that can be gained from Big Data technologies are:

  • Revenue assurance and price optimization
  • Customer churn prevention
  • Campaign management and customer loyalty
  • Call Detail Record (CDR) analysis
  • Network performance and optimization

Big Data can be coupled with technology components to provide operational efficiency by:

  • Gaining a deeper understanding of switching, frequency utilization, and capacity use for capacity planning and management
  • Analyzing consumption of services and bandwidth in specific regions, helping with planning locations for infrastructure investment.
  • Capturing and analyzing data produced by the infrastructure and by sensors, which can accelerate network troubleshooting.

Customer churn analysis can be done with Big Data by:

  • Enabling alerts when a customer exhibits behavior that suggests imminent defection, which is a critical requirement
  • Observing multiple factors (such as comments on social media and declining usage) along with historical data that show patterns of behavior that suggest churn, so that companies can predict when a customer is at risk of defecting
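A churn alert of the kind described above can be sketched as a rule over recent usage and social media signals. This Python example is a minimal illustration: the 30 percent drop threshold, the count of negative mentions, and the minimum history length are all assumptions, not figures from any operator.

```python
def churn_risk(monthly_usage, negative_mentions, drop_threshold=0.3):
    """Flag a customer whose latest usage fell sharply or who complains publicly.

    monthly_usage: usage per month (minutes or MB), oldest first.
    negative_mentions: count of negative social media comments (assumed input).
    """
    if len(monthly_usage) < 4:
        return False  # not enough history to judge a trend
    baseline = sum(monthly_usage[:-1]) / (len(monthly_usage) - 1)
    latest = monthly_usage[-1]
    usage_drop = (baseline - latest) / baseline if baseline else 0.0
    return usage_drop >= drop_threshold or negative_mentions >= 2


# Hypothetical customers: usage history and negative mention count.
customers = {
    "c1": ([500, 480, 510, 300], 0),  # sharp drop in the latest month
    "c2": ([500, 480, 510, 490], 0),  # stable usage, no complaints
    "c3": ([500, 480, 510, 490], 3),  # stable usage, repeated complaints
}
for customer_id, (usage, mentions) in customers.items():
    if churn_risk(usage, mentions):
        print("alert:", customer_id)
```

A production system would replace the fixed thresholds with a model trained on historical churn, but the alerting flow is the same.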

Revenue generation and product development can be done by:

  • Tracking and analyzing customer click streams to understand their preferences and propensity to buy
  • Optimizing web pages to increase conversion including cross-sell opportunities

Ufone, a Pakistan-based mobile-service provider, is using a combination of big data, software services, and enterprise systems to more precisely market special offers to its customers, which has significantly helped reduce its churn rate and improve the perception of the company. The hardware it uses is an IBM Power Systems 795 along with IBM InfoSphere Streams, IBM Unica Campaign Management solution, and various IBM WebSphere middleware tools as software.


Hadoop in Hospitality

In the past, hospitality and technology rarely complemented each other, because hospitality players preferred to spend their time and resources on defined areas of operations such as improving the ambiance of a place, widening the choice of menus, and enhancing the quality of service delivery, instead of focusing on technology and big data.

Big Data provides the following benefits to the hospitality industry:

  1. More tableside transactions
    • Restaurants are now taking wireless services a step further by handling credit card transactions at tableside.
    • New POS software supports off-the-shelf tablet computers and smartphones to both place orders and accept payments from customers.
    • The transaction data is captured by the restaurant’s network not only to handle the transaction but also to capture important customer data that can be used for business analytics and customer loyalty programs.
  2. More self-service options
    • In addition to using online services to book reservations, hospitality providers are looking to integrate online services with on-site data capture solutions.
    • Restaurants can match online profile data with self-service reservation check-in and match preferences regarding seating and service as part of the self-service experience.
    • Customers get more personalized service by maintaining their online profiles, including preferences and payment options.

The American hotel chain Denihan uses IBM Big Data Analytics software to maximize profit and revenue across its 3,450 rooms by combining its own data sets with data from review sites, blogs, and social network websites. By understanding the likes and dislikes of its guests, the chain optimizes its offerings and adjusts room rates accordingly. This allowed Denihan’s New York hotels to double the room rate by 2013.


Hadoop in Logistics

Big Data and the logistics industry go hand in hand as far as operational efficiency and improvement are concerned. Logistics service providers move masses of goods, and the volume keeps growing. At the same time, they gather data sets that are just waiting to be turned into information that supports decisions. Timely and accurate delivery can only be assured if data travels ahead of every single shipment.

In a recent study on supply chain trends by McKinsey, sixty percent of the respondents stated that they are planning to invest in Big Data Analytics within the next five years. Seeing this trend, it is important to highlight the benefits that are provided to the logistics industry by the use of Big Data.

  • Operational efficiency: real-time route optimization, crowd-based pickup and delivery, strategic network planning, and operational capacity planning
  • Customer experience: customer loyalty management, continuous service improvement and product innovation, and risk evaluation and resilience planning
  • New business models: market intelligence for small and medium-sized enterprises, financial demand and supply chain analytics, address verification, and environmental intelligence

The power of big data can be used to achieve efficiency in the Logistics and Supply Chain in the following ways:

  • Customer Loyalty Management: Public customer information is mapped against business parameters to predict customer churn rate.
  • Strategic Network Planning: Long-term demand forecasts for transport capacity are generated to support strategic investment
  • Environmental Intelligence: Sensors attached to delivery vehicles produce statistics on pollution, traffic density, noise, parking spot utilization, etc.
  • Financial demand and Supply Chain Analytics: A macroeconomic view is created on global supply chain data that helps financial institutions improve their rating and investment decisions
  • Risk Evaluation and Resilience Planning: By tracking and predicting events that lead to supply chain disruptions, the resilience level of transport services is enhanced.

Big Data techniques can be correlated to supply chain benefits through:

  • Consolidated pick-up and delivery:
    • The automated control of a large number of randomly moving delivery resources requires extensive data processing capabilities
    • Big Data techniques like complex event processing and geo-correlation can be used
    • A real-time data stream is traced to assign shipments to available carriers based on their respective location and destination
    • Interfaced through a mobile application, affiliates publish their current position and accept pre-selected delivery assignments
  • Real-time route optimization:
    • When the delivery vehicle is loaded and unloaded, a dynamic calculation of the optimal delivery sequence, based on sensor-based detection of shipment items, frees the staff from manual sequencing
    • On-the-road telematics databases are tapped to automatically change delivery routes according to current traffic conditions.
    • The routing intelligence considers availability and location information posted by recipients to avoid unsuccessful delivery attempts.
  • Predictive Network and Capacity Planning:
    • Big Data techniques support network planning and optimization by analyzing comprehensive historical capacity and utilization data of transit points and transportation routes
    • Seasonal factors and emerging trends of freight flows are considered by learning algorithms that are fed with extensive statistical series
    • External data like industry-specific and regional growth forecasts are included for more accurate prediction of specific transportation capacity demand

FedEx has created SenseAware, a next-generation, first-of-its-kind information service that combines a GPS sensor device and a web-based collaboration platform. It was originally used by the healthcare and life sciences industries to track high-value and/or extremely time-sensitive shipments. SenseAware attaches digital information to packages, providing a shipment’s exact location, notification when a shipment is opened or its contents have been exposed, and real-time alerts and analytics between trusted parties regarding the vital signs of a shipment.


Hadoop in Insurance

The economic downturn, demographic shifts such as greater longevity, and new competitive challenges are all prompting many changes in the insurance business. These changes encompass the type of products being sold, how those products are marketed and advertised, how risk is assessed, and how fraud is detected.

Relatively few insurers are fully immersed in a comprehensive Big Data strategy and reaping its benefits; however, the good news is that most insurers are planning their Big Data approach.

The use of Big Data can bring benefits to the insurance industry in the following areas:

  • Risk Avoidance
    • Insurers can access a myriad of new sources of data and build statistical models to better understand and quantify risk
    • Big Data analytical applications include behavioral models based on customer profile data compiled over time and cross-referenced with other data that is relevant to specific types of products
  • Product Personalization
    • The ability to offer customers the policies they need at the most competitive premiums is a big advantage for insurers.
    • Scoring models of customer behavior based on demographics, account information, collection performance, driving records, health information, and other data can aid insurers in tailoring products and premiums for individual customers based on their needs and risk factors
    • Some insurers have begun collecting data from sensors in their customers’ cars that record average miles driven, average speed, time of day most driving occurs, and how sharply a person brakes.
    • This data is compared with other aggregate data, actuarial data, and policy and profile data to determine the best rate for each driver based on their habits, history, and degree of risk
  • Fraud Detection:
    • Using analytical techniques such as pattern analysis, graph analysis of cohort networks, and insights from social media, insurance companies can do a better job of detecting fraud
    • Collecting data on behaviors from online channels and automated systems helps determine the potential for and existence of fraud
    • Chief claims officers (CCOs) are adopting a multi-channel approach to fraud detection by looking at structured data in their claims and policy data warehouses, and combining it with textual data in adjustor notes, police reports, and social media.
  • Customer Need Analysis
    • Automating the discussion between prospects and advisors about complex insurance products such as life and annuity, based on a customer’s desire and resources, can enhance the sales process.
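The sensor-based pricing described under Product Personalization can be pictured as turning driving readings into a premium multiplier. The Python sketch below is purely illustrative: the thresholds, surcharges, and the 10 percent discount are invented for the example and do not reflect any insurer's rating rules.

```python
def risk_multiplier(avg_miles_per_day, avg_speed_mph, night_share, hard_brakes_per_100mi):
    """Combine telematics readings into a premium multiplier (weights are invented)."""
    surcharge = 0.0
    if avg_miles_per_day > 40:
        surcharge += 0.10  # more time on the road
    if avg_speed_mph > 65:
        surcharge += 0.10  # habitually fast driving
    if night_share > 0.25:
        surcharge += 0.05  # a large share of driving happens at night
    if hard_brakes_per_100mi > 5:
        surcharge += 0.15  # frequent sharp braking
    if surcharge == 0.0:
        return 0.90  # assumed discount for drivers with no risk flags
    return round(1.0 + surcharge, 2)


def premium(base_rate, **readings):
    """Apply the multiplier to a base rate taken from actuarial data."""
    return round(base_rate * risk_multiplier(**readings), 2)


# Hypothetical driver readings.
print(premium(1000.0, avg_miles_per_day=20, avg_speed_mph=50,
              night_share=0.1, hard_brakes_per_100mi=1))
```

A real rating model would blend these readings with the aggregate, actuarial, and policy data mentioned above rather than rely on fixed cut-offs.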

MetLife, a leading global provider of insurance, annuities, and employee benefit programs, selected MongoDB as the data engine for “The Wall”, an innovative customer service application. Similar to the Facebook user interface, The Wall provides a 360-degree, consolidated view of MetLife customers, including policy details and transactions across lines of business. The Wall improves customer satisfaction and boosts call center productivity.



Comments (4)

  1. Shankar
    September 10, 2015

    Sharjeel, this article is awesome and thanks for mentioning Dezyre.

    • Sharjeel Sohaib
      September 10, 2015

      Thank you Shankar for appreciation.

  2. Amit
    April 17, 2017

    Hello Sohaib , Really this is nice article for which I was looking for , But could you please elaborate more on Credit card industry which is heavily use to work with Vision plus & mainframe platform , in which area apart from Fraud , Sentiment , Segmentation we can use ?

  3. Muhammad Omer
    April 19, 2017

    Speaking specifically for credit card industry, based on customer or country/location profiling, which itself can be based on historical data, certain features could be added – on top of existing offers. For example, looking at the card (both debit and credit) historical data, it may reveal that certain age group or income bracket people are far more reliable and regular in their re-payments. Alternatively, they may tend to use it for more home appliances than, for example, online transactions. This information could be used to offer certain incentives in collaboration with manufacturers or large distributors, to offer tailored incentives.
    Another area where its particularly useful is with regulatory authorities. Looking at types and frequency of different transactions, they may be able to with-hold or deny potential non-compliant transaction – in real-time. One of main issues faced by such central authorities is non-compliance of directives, or certain “features” offered by one institution and others not offering, could potentially cause process/transaction being refused or allowed (depending on scenario), which could potentially be averted with help of intelligent analytics.
    There are other potentials too, but I think this answer hopefully should be enough.
