Extracting Insights from Big Data

Techniques to Work with Big Data

After organizations collect and process Big Data, they usually use a variety of techniques to extract insights from it that can help drive business decisions.

  • Example - To contextualize AI, let’s look at a classic example - a Turing Test. In a Turing Test -
  • Example - Supervised Machine Learning is commonly used in detecting fraud. It works at a high-level, like the following -
  • Example - Deep Learning is often used to classify images. For example, say that you want to build a Machine Learning model to determine whether an image contains a Koala. You would feed hundreds, thousands, or millions of pictures into a machine, some showing Koalas and others not. Over time, the model learns what a Koala is and what it is not, and it can identify a Koala among other images more easily and quickly.
    It is important to note that while humans might recognize Koalas by their fluffy ears or large oval-shaped noses, a machine may pick up on features that we would not consciously notice, such as patterns in the Koala’s fur or the exact shape of its eyes. It is able to make decisions quickly based on that information.
  • Example - Machine Learning and Deep Learning are common tools, among many others, in a Data Scientist’s toolbox to help extract insights from data.
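
The supervised learning idea behind the fraud detection example above can be illustrated with a minimal sketch. It assumes a tiny, invented set of labeled transactions and uses a nearest-centroid classifier, chosen here only because it fits in a few lines; it is not the method any particular fraud system uses, and the feature names and values are hypothetical.

```python
from statistics import mean

# Hypothetical labeled history: (amount in dollars, hour of day) -> label.
# All values are invented for illustration.
labeled_transactions = [
    ((12.50, 14), "legitimate"),
    ((40.00, 10), "legitimate"),
    ((23.75, 18), "legitimate"),
    ((980.00, 3), "fraud"),
    ((1500.00, 2), "fraud"),
    ((1210.00, 4), "fraud"),
]


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new transaction."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(labeled_transactions)
prediction_small = predict(model, (30.00, 12))
prediction_large = predict(model, (1100.00, 3))
print(prediction_small, prediction_large)
```

The model only ever sees labeled examples, which is what makes this supervised. A real system would use far more data and features, and would normally scale the features so that one of them (here, the amount) does not dominate the distance.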

Data Science Workflow

The Data Science Workflow is a series of steps that Data Practitioners follow to work with Big Data. It is a cyclical process that often starts with “Identifying Business Problems” and ends with “Delivering Business Value”.

  • Which of our customers are at the greatest risk for churn and why?
  • Can we save money by changing the way we are pricing our products?
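
A business question like the churn one above usually has to be turned into something that can be computed from data. As a rough sketch, assuming a hypothetical list of customer records with a plan name and a churn flag (both invented here), the churn rate per segment could be computed like this:

```python
from collections import defaultdict

# Hypothetical customer records: (customer_id, plan, churned).
customers = [
    ("c1", "basic", True),
    ("c2", "basic", True),
    ("c3", "basic", False),
    ("c4", "premium", False),
    ("c5", "premium", False),
    ("c6", "premium", True),
    ("c7", "premium", False),
]


def churn_rate_by_segment(records):
    """Return the fraction of churned customers for each segment."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for _customer_id, segment, has_churned in records:
        totals[segment] += 1
        if has_churned:
            churned[segment] += 1
    return {segment: churned[segment] / totals[segment] for segment in totals}


rates = churn_rate_by_segment(customers)
highest_risk_segment = max(rates, key=rates.get)
print(rates, highest_risk_segment)
```

This only answers the "which" part of the question at a very coarse level. The "why" part would typically need more features and a model, which is where the later steps of the workflow come in.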

Roles on a Data Science Team

Data Science teams usually include several individuals who have different skill sets and need different tools to work with Big Data. While no two Data Teams look the same, the overall mission of a Data Team is to follow the steps in the Data Science Workflow to help organizations make more informed business decisions.

  • Performing updates and maintenance work.
  • Performing health checks.
  • Keeping track of how team members are using the Big Data Platforms, for example, by setting up and monitoring Alerts.
  • Implementing best practices for managing data.
  • Different Data Storage solutions.
  • Data processing engines, like Apache Spark.
  • Machine Learning Libraries.
  • Notebook Interfaces, like Databricks Notebooks or Jupyter.
  • Systems that help them log and keep track of Machine Learning Experiments.
  • Visualization tools, like “Tableau”, “PowerBI”, “Looker”, and others.

Big Data Use Cases in Different Industries

Thousands of organizations around the world are applying Advanced Analytics to Big Data to enrich and accelerate business outcomes.

  • Predictive Maintenance - Avoiding production failures by analyzing real-time machine data, maintenance schedules, and other historical data to predict when equipment will need maintenance.
  • Improved Well Production - Analyzing geospatial data to determine optimal well placement and real-time insights to improve drill and well efficiency.
  • Disease Prediction - Using real-world evidence and public datasets to identify biomarkers that have a high probability of driving the onset of disease.
  • Claims Analysis - Applying Machine Learning to large volumes of Claims to determine preventative measures to improve patient health and identify fraud patterns.
  • Demand Forecasting - Predicting real-time demand and returns at a granular level using new and non-traditional data sources to optimize inventory.
  • Optimized Pricing - Improving campaign conversion and return-on-ad-spend by using Big Data to serve the right ad, at the right time, to the right person.
  • Upselling Services - Maximizing customer revenue by using customer usage data to drive cross-selling and upselling of services and products.
  • Fraud Prevention - Analyzing SIM cards and other data sources to minimize fraudulent transactions.
  • Personalized Banking - Delivering the right financial products and guidance to customers with real-time customer insights and predictive analytics.
  • Fraud Prevention - Detecting and preventing fraudulent activities (e.g., money laundering, credit card fraud) by leveraging Machine Learning to predict anomalies in real time.
  • Sentiment Analytics - Understanding how content is resonating in social channels and using data to find the next most popular article, show, or game.
  • Churn Management - Determining which customers are likely to churn to drive personalization and prevent them from churning.
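
Several of the use cases above, such as Fraud Prevention, rely on spotting anomalies in incoming data. The sketch below shows one very simple way to do that: flagging values that lie more than a chosen number of standard deviations from the historical mean. The transaction amounts and the threshold of 3.0 are assumptions made for illustration, and production systems generally use Machine Learning models rather than a fixed rule like this.

```python
from statistics import mean, stdev


def find_anomalies(history, new_values, threshold=3.0):
    """Flag new values that lie more than `threshold` sample standard
    deviations away from the mean of the historical values."""
    center = mean(history)
    spread = stdev(history)
    return [v for v in new_values if abs(v - center) > threshold * spread]


# Hypothetical past transaction amounts for one customer.
history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 26.0]

# Hypothetical newly arriving amounts to check.
incoming = [22.5, 480.0, 18.0]

flagged = find_anomalies(history, incoming)
print(flagged)
```

Only the 480.0 transaction is far enough from this customer's usual spending to be flagged; the other two fall within the normal range.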



Oindrila Chakraborty

I have 10+ years of experience in the IT industry. I love to learn about data and work with data. I am happy to share my knowledge with all. I hope this will be of help.