As every year begins, several experts and analyst firms like to make predictions. Let us try to make some observations in an area much talked about lately – Big Data. So here goes:
- The Big Data quandary will continue as companies try to understand its value to the business. Just dumping all kinds of data into a data lake (read Hadoop) is not going to solve anything. The insights being sought have to carry clear business value. Therefore, much as the Data Warehousing era brought additional tools in the ETL space, there is a need for data curation and transformation tools for practical use, besides the analytics piece.
- Demand for BI and Analytics will reach new heights. The next-generation BI and analytics platform should help businesses tap into the power of their data, whether in the cloud or on-premises. This ‘Networked BI’ capability creates an interwoven data fabric that delivers business-user self-service while eliminating analytical silos, resulting in faster and more trusted decision-making. Real-time or streaming analytics will become crucial, as decisions must be taken as soon as certain events occur.
- Spark will get even hotter. I described IBM’s big endorsement of Spark last year in a blog post. Spark gives us a comprehensive, unified framework for managing big data processing requirements across data sets that are diverse both in nature (text data, graph data, etc.) and in source (batch vs. real-time streaming data). By the project's own claims, Spark enables applications in Hadoop clusters to run up to 100 times faster than Hadoop MapReduce in memory, and 10 times faster even when running on disk. In addition to Map and Reduce operations, it supports SQL queries, streaming data, machine learning and graph data processing. This also suggests that in-memory processing will continue to thrive.
- Analytics & big events will drive demand exponentially. This year’s big events like the US presidential election and the Olympics in Brazil will see the harnessing of big data to provide data-driven insights like never before.
- Protection of data itself will become paramount. It’s still too easy for hackers to circumvent perimeter defenses, steal valid user credentials, and get access to data records. In 2016, as companies protect themselves from the threat of data loss, new means of data-centric security will become mainstream to consistently control user access and credentials where it matters the most.
- A shortage of data scientists will drive companies to look for Big Data cloud services. To circumvent the need to hire more data scientists and Hadoop admins, organizations will rely on fully managed cloud services with built-in operational support, freeing up existing data science teams to focus their time and effort on analysis instead of wrangling complex Hadoop clusters.
- Finally, the shift to the cloud is becoming mainstream because of the clear ROI. At least the dev-and-test shift is happening quite fast. AWS seems to dominate production deployments, even though big data as a service is still in its infancy. Microsoft Azure, IBM’s cloud service and Oracle’s new cloud offerings will make this space quite vibrant.