For decades, companies have based business decisions on the transactional data stored in relational databases. With the availability of Big Data processing power, it is becoming more meaningful to bring in external data sources, such as social media, web logs and sensors, and to integrate them with internal data sources. When Big Data is distilled and analyzed in combination with traditional enterprise data, enterprises can develop a more thorough and insightful understanding of their business, leading to enhanced productivity, better competitive positioning and greater innovation, all of which can positively impact the bottom line. No wonder companies are rushing to capture, refine, analyze and experiment with all types of so-far-unexplored data sources, including the data generated from customer interactions and business operations.
The challenge lies in finding the right blend of tools and techniques to harness the underlying potential of Big Data. Meanwhile, the business analysts and data scientists who understand Big Data face tremendous pressure to work with multiple disparate systems just to get an integrated view, before they can focus on the actual business problems.
The conventional top-down approach using Business Intelligence (BI) requires analysts to know in advance which queries they will run against the data warehouse. Big Data is forcing a paradigm shift toward a data-driven, bottom-up approach, in which analysts can engage in ad-hoc projects and explore data sources, even beyond corporate boundaries, to reveal undiscovered insights that help identify new opportunities and challenges. This bottom-up approach should not be seen as an alternative to the top-down approach, but rather as a complement to it. The need of the hour is an architecture that balances both approaches to optimize the entire process.
Taking Big Data into account, a typical problem-solving process involves the following steps (Davenport, 2013):
- Recognize the problem or question: identify the problem set that you are trying to solve
- Review previous findings: from the past data, build the story that the data tells
- Model the solution and select the variables: based on the past data, model the business problem
- Collect the data: collect a fresh sample of data
- Analyze the data: ad-hoc queries, graph-based analysis, predictive models, etc.
- Present and act on the results: visualize the data, deliver it, and provide tools to act on it
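The six steps above can be sketched as a small pipeline in which each step feeds the next. This is only an illustrative sketch: the function names, the question being asked, the 10% tolerance and all of the numbers are hypothetical and made up for this example, not taken from any real data set or product.

```python
from statistics import mean

# Made-up numbers for illustration only.
PAST_DAILY_ORDERS = [120, 130, 125, 128, 122]
FRESH_DAILY_ORDERS = [90, 95, 88]


def recognize_problem():
    # Step 1: state the question being asked.
    return "Have daily orders dropped relative to the historical baseline?"


def review_previous_findings(history):
    # Step 2: summarize what the past data says.
    return {"baseline": mean(history)}


def model_solution(findings, tolerance=0.10):
    # Step 3: pick the variable and a simple decision rule
    # (flag a drop if the fresh mean falls below baseline minus tolerance).
    return {
        "variable": "daily_orders",
        "threshold": findings["baseline"] * (1 - tolerance),
    }


def collect_data():
    # Step 4: collect a fresh sample (hard-coded here).
    return FRESH_DAILY_ORDERS


def analyze(model, sample):
    # Step 5: compare the fresh sample against the modeled threshold.
    observed = mean(sample)
    return {"observed": observed, "dropped": observed < model["threshold"]}


def present(question, result):
    # Step 6: turn the result into something a decision maker can act on.
    verdict = "yes" if result["dropped"] else "no"
    return f"{question} -> {verdict} (observed mean {result['observed']:.1f})"


def run_process():
    question = recognize_problem()
    findings = review_previous_findings(PAST_DAILY_ORDERS)
    model = model_solution(findings)
    sample = collect_data()
    result = analyze(model, sample)
    return present(question, result)


if __name__ == "__main__":
    print(run_process())
```

In practice each step would be far richer than a single function, but the chain of hand-offs between steps is the part that different tools tend to cover only in pieces.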
With such a comprehensive problem-solving process, many tools and technologies have emerged to address its various segments separately and effectively. Traditionally, business analysts and data scientists have been accustomed to carrying out analytics without regard to where the data sits, because most of it is internal to the enterprise. With Big Data, one cannot start analyzing right away: the data arrives in raw form and requires considerable effort to consolidate across multiple external sources before any type of analysis can be performed. This has given rise to data infrastructure issues such as data capture, data storage, data management, data analysis, data delivery and data visualization.
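To make the consolidation effort concrete, here is a minimal sketch that normalizes two differently shaped external sources and combines them with internal transactional records into one per-customer view. Everything in it is an assumption for illustration: the record shapes, the field names, the `consolidate` helper and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical internal (transactional) records.
INTERNAL_ORDERS = [
    {"customer_id": "c1", "amount": 40.0},
    {"customer_id": "c2", "amount": 15.5},
    {"customer_id": "c1", "amount": 10.0},
]

# Hypothetical external sources, each in its own raw shape.
SOCIAL_MENTIONS = [
    {"user": "c1", "text": "love it"},
    {"user": "c3", "text": "meh"},
]
WEB_LOG_LINES = ["c1 /pricing", "c2 /home", "c1 /checkout"]


def normalize_social(rows):
    # Map the social feed's "user" field onto the shared customer_id key.
    return [{"customer_id": r["user"], "source": "social"} for r in rows]


def normalize_web_logs(lines):
    # Parse "<customer_id> <path>" lines into the same event shape.
    events = []
    for line in lines:
        customer_id, _path = line.split(maxsplit=1)
        events.append({"customer_id": customer_id, "source": "web"})
    return events


def consolidate(orders, *event_batches):
    # Build one integrated record per customer across all sources.
    view = defaultdict(lambda: {"total_spend": 0.0, "social": 0, "web": 0})
    for order in orders:
        view[order["customer_id"]]["total_spend"] += order["amount"]
    for batch in event_batches:
        for event in batch:
            view[event["customer_id"]][event["source"]] += 1
    return dict(view)


if __name__ == "__main__":
    integrated = consolidate(
        INTERNAL_ORDERS,
        normalize_social(SOCIAL_MENTIONS),
        normalize_web_logs(WEB_LOG_LINES),
    )
    for customer_id, record in sorted(integrated.items()):
        print(customer_id, record)
```

Even in this toy form, most of the code is spent reshaping data rather than analyzing it, which is the overhead the paragraph above describes.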
Several companies have built platforms that address only one of these data infrastructure issues. That is understandable, as the adoption of Big Data is still in its early stages. Soon, however, organizations will seek a unified platform that fulfills all of their data infrastructure requirements and lets business analysts and data scientists concentrate on the analytics alone.
FORMCEPT has envisioned a scenario in which data scientists do not have to worry about any of the data infrastructure problems and can focus only on the business problem at hand. FORMCEPT's MECBOT is a unified analytics platform that can capture data from multiple sources, store it and provide a variety of analytics options out of the box. It supports batch processing, interactive analysis and stream processing, all of which are available in a single packaged solution. It acts like Big Data middleware that takes care of all your data infrastructure requirements, giving you the freedom to concentrate only on solving business problems.
References
- Davenport, Thomas H. (2013, July). Managing yourself: Keep up with your quants. An innumerate's guide to navigating Big Data. Harvard Business Review. http://hbr.org/2013/07/keep-up-with-your-quants/ar/1