Category Archives: Google

Big Data & Analytics – what’s ahead?

Recently I read somewhere this statement – As we end 2017 and look ahead to 2018, topics that are top of mind for data professionals are the growing range of data management mandates, including the EU’s new General Data Protection Regulation that is directed at personal data and privacy, the growing role of artificial intelligence (AI) and machine learning in enterprise applications, the need for better security in light of the onslaught of hacking cases, and the ability to leverage the expanding Internet of Things.

Here are the key areas as we look ahead:

  • Business owners demand outcomes – not just a data lake to store all kinds of data in its native format and API’s.
  • Data Science must produce results – Play and Explore is not enough. Learn to ask the right questions. Visualization of analytics from search.
  • Everyone wants Real Time – Days and weeks too slow, need immediate actionable outcomes. Analytics & recommendations based on real time data.
  • Everyone wants AI (artificial intelligence) – Tell me what I don’t know.
  • Systems must be secure – no longer a mere platitude.
  • ML (machine learning) and IoT at massive scale – Thousands of ML models. Need model accuracy.
  • Blockchain – need to understand its full potential to business – since it’s not transformational, but a foundational technology shift.

In the area of big data, a combination of new and long-established technologies are being put to work. Hadoop and Spark are expanding their roles within organizations. NoSQL and NewSQL databases bring their own unique attributes to the enterprise, while in-memory capabilities (such as Redis) are increasingly being utilized to deliver insights to decision makers faster. And through it all, tried-and-true relational databases continue to support many of the most critical enterprise data environments.

Cloud becomes the de-facto deployment choice for both users and developers. Serverless technology with FaaS (Function as a Service) is getting rapid adoption amongst developers. According to IDC, enterprises are undergoing IT transformation as they rethink their business operations, including how they use information and what technology to deploy. In line with that transformation, nearly 80% of large organizations already have a hybrid cloud strategy in place. The modern application architecture, sometimes referred to as SMAC (social, mobile, analytics, cloud) is becoming standard everywhere.

The DBaaS (database as a service) is still not as widespread as other cloud services. Microsoft is arguably making the strongest explicit claim for a converged database system with its Azure Cosmo DB as DBaaS. Cosmo DB claims to support four data models – key-value, column-family, document, and graph. However, databases have been slower to migrate to the cloud than other elements of computing infrastructure mainly for security and performance reasons. But DBaaS adoption is poised to accelerate. Some of these cloud based DBaaS systems – Cosmo DB, Spanner from Google, and AWS DynamoDB – now offer significant advantages over their on-premise counterparts.

One thing for sure, big data and analytics will continue to be vibrant and exciting in 2018.

Advertisements

Meet the new richest man on earth

This morning Jeff Bezos beat his nemesis from the same town Bill Gates as the richest man on the planet with his worth exceeding $90B. This was due to a huge surge in Amazon’s stock price (over $128 rise) to $1100 plus today. Their 3Q results came out yesterday and Amazon grew its revenue by 34% and profits inched up as well. There were fears that heavy investments in new warehouses and hiring workers would push it to a loss. This year Amazon’s stock started at $750. What a run!

Here are the numbers. Revenue soared 34% to a record $43.74B, a first for a non-holiday period, as the internet retail giant spread its ambitions with the acquisition of Whole Foods Market Inc. and widened its lead in cloud computing. Profit increased 1.6% to $256M, despite the costs bulging by 35%, a five-year high. I was surprised to know that Amazon employs 541,900 people, an increase from last quarter’s 382,400. Roughly 87,000 employees were added from Whole Foods. Now Amazon commands some 43.5% of e-commerce sales this year, compared with 38.1% last year.

I remember during the dot.com crash, everyone wrote off Amazon. When they ridiculed Bezos for a no-profit company with a bleak future, he jokingly replied, ” I spell profit as ‘prophet'”. He has come a long way with his prophetic vision and masterful execution.

The best addition to Amazon’s two core businesses (books and e-commerce) was the introduction of AWS as the cloud computing infrastructure back in 2004. First came S3 (simple shared storage) when Bezos convinced start-up companies to rent storage at one-hundredth of the cost of buying from big vendors. Then EC2 (Elastic Computing Cloud) was added and that took off in a big way, especially with capital-starved startups with unpredictable computing needs. Pretty soon, Amazon took the credit of being the ‘father of cloud computing’ beating big incumbents like IBM, HP, etc. Now AWS is a huge business growing fast and bringing in about $16B revenue with over 60% profit. AWS is making a difference to the bottom line. Microsoft is trying hard to catch up with its Azure cloud and so is Google with its GCE (Google Computing Cloud). Today’s AWS is a very rich stack with its own database as a service (Redshift, Dynamo, and Aurora), elastic Map-Reduce, serverless offering with Lambda, and much more.There are predictions that AWS could one day be the biggest business for Amazon.

While the pacific north-west remains to be the home of the richest man on earth, the title shifts to Bezos from Gates.

Serverless, FaaS, AWS Lambda, etc..

If you are part of the cloud development community, you certainly know about “serverless computing”, almost a misnomer. Because it implies there are no servers which is untrue. However the servers are hidden from the developers. This model eliminates operational complexity and increases developer productivity.

We came from monolithic computing to client-server to services to microservices to serverless model. In other words, our systems have slowly “dissolved” from monolithic to function-by-function. Software is developed and deployed as individual functions – a first-class object and cloud runs it for you. These functions are triggered by events which follows certain rules. Functions are written in fixed set of languages, with a fixed set of programming model and cloud-specific syntax and semantics. Cloud-specific services can be invoked to perform complex tasks. So for cloud-native applications, it offers a new option. But the key question is what should you use it for and why.

Amazon’s AWS, as usual, spearheaded this in 2014 with a engine called AWS Lambda. It supports Node, Python, C# and Java. It uses AWS API triggers for many AWS services. IBM offers OpenWhisk as a serverless solution that supports Python, Java, Swift, Node, and Docker. IBM and third parties provide service triggers. The code engine is Apache OpenWhisk. Microsoft provides similar function in its Azure Cloud function. Google cloud function supports Node only and has lots of other limitations.

This model of computing is also called “event-driven” or FaaS (Function as a Service). There is no need to manage provisioning and utilization of resources, nor to worry about availability and fault-tolerance. It relieves the developer (or devops) from managing scale and operations. Therefore, the key marketing slogans are event-driven, continuous scaling, and pay by usage. This is a new form of abstraction that boils down to function as the granular unit.

At the micro-level, serverless seems pretty simple – just develop a procedure and deploy to the cloud. However, there are several implications. It imposes a lot of constraints on developers and brings load of new complexities plus cloud lock-in. You have to pick one of the cloud providers and stay there, not easy to switch. Areas to ponder are cost, complexity, testing, emergent structure, vendor dependence, etc.

Serverless has been getting a lot of attention in last couple of years. We will wait and see the lessons learnt as more developers start deploying it in real-world web applications.

Amazon+Whole Foods – How to read this?

Last Thursday (June 15, 2017), Amazon decided to acquire Whole Foods for a whopping $13.7B ($42 per share, a 27% premium to its closing price). On Friday, stock prices of Walmart, Target, and Costco took a hit downwards, while Amazon shares went up by more than 2%. So why did Amazon buy Whole Foods? Clearly Amazon sees groceries as an important long-term driver of growth in its retail segment. What is funny is that a web pioneer with no physical retail outlet decided to get back to the brick-and-mortar model. Amazon has also started physical bookstores at a few cities. We have come full circle.

Amazon grocery business has focussed on Amazon Fresh subscription service so far to deliver online food orders. Amazon will eventually use the stores to promote private-label products, integrate and grow its AI powered Echo speakers, boost prime membership and entice more customers into the fold. Hence this acquisition is the start of a long term strategy. Amazon is known for its non-linear thinking. Just see how it started a brand new business with AWS about 12 years back and now it is a $14B business with a 50%+ margin. It commands a powerful leadership position in the cloud computing business and competitors like Microsoft Azure or Google’s GCE are trying hard to catch up.

The interesting thing to ponder is how the top tech companies are spreading their tentacles. This was a front-page article in today’s WSJ. Apple, a computer company that became a phone company, is now working on self-driving cars, TV programming, and augmented reality. It is also pushing into payments territory challenging the banks. Google parent Alphabet built Android which now runs most PC devices. It ate the maps industry; it’s working on internet-beaming balloons, energy-harvesting kites, and self-driving technologies. Facebook is creating drones, VR hardware, original TV shows, and even telepathic brain computers. Of course Elon Musk brings his tech notions to any market he pleases – finance, autos, energy, and aerospace.

What is special about Amazon is that it is willing to work on everyday problems. According to the author of the WSJ article, this may be the smarter move in the long run. While Google and Facebook have yet to drive significant revenue outside their core, Amazon has managed to create business after business that is profitable, or at least not a drag on the bottom line. The article ends with cautionary note, “Imagine a future in which Amazon, which already employs north of 340,000 people worldwide, is America’s biggest employer. Imagine we are all spending money at what’s essentially the company store, and when we get home we’re streaming Amazon’s media….”

With few tech giants controlling so many businesses, are we comfortable to get all our goods and services from the members of an oligopoly?

Secret of Sundar Pichai’s success

I watched Sundar Pichai’s recent interaction with the students at I.I.T. (Indian Institute of Technology) Kharagpur, India, where he graduated back in 1993. Besides our common country of birth, I had never heard of Sundar until his rapid rise at Google a few years back. I have never met him or listened to him at conferences. So this was the first time, I had a chance to listen to his remarks and his answers to many questions from the audience of 3500 students at his alma mater earlier this week.

Growing up not far from I.I.T. Kharagpur, I was very aware of this institution. It was the first I.I.T. in India established during the 1950s. Other I.I.T’s like at Kanpur, Delhi, Mumbai and Chennai came later. These were the original 5 Indian Institute of Technologies. Lately many new ones have been added.

Sundar did his undergraduate studies in Metallurgy (study about metals). Then how did he switch from that into software? That was one of the questions from a student. He said that he loved Fortran language during his student days and that love for programming continued. The message he was giving was for everyone to pursue their own interest & passion. He mentioned that unlike in India, students at US universities sometimes do not decide their majors, way into their 3rd or 4th year of studies. Sundar’s passion was to build products that would impact a very large number of global users. During his interview at Google, he was asked what he thought of Gmail, which he had never seen nor used. Then the fourth interviewer actually showed it to him. Subsequently, he gave his opinion to the remaining 3 interviewers on what he thought was wrong with Gmail and how to improve it. He emphasized time and again the need to step out of the comfort zone and get an all rounded experience. Today’s students need not be afraid to take some risks and be willing to fail.

Besides technical leadership, Sundar possesses an amazing quality; egoless-ness, so rare to find in Silicon Valley executive community. He said that he truly believes in empowering his team and letting them execute with full trust. This is easier said that done, based on my experience at IBM and Oracle. Large organizations suffer from ego-driven leadership causing great amount of friction and anguish. Sunder’s rise at Google was due to his amazing ability to get teams to work very effectively. From Search, he went to manage Chrome, then he was given Android. His ability to work thru the complexities of products, fiefdoms, and internal rivalries was so evident that he was elevated to the CEO position so quickly. Humility is his hallmark combined with clarity of vision and efficient execution.

He made an interesting comment about the vision at Google. Larry Page said that the moonshot projects are worthwhile because the bar is so high (no competition). Even if you fail, you are still ahead with your knowledge and experience.

It was fun listening to Sundar’s simple and honest answers & remarks.

The resurgence of AI/ML/DL

We have been seeing a sudden rise in the deployment of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). It looks like the long “AI winter” is finally over.

  • According to IDC, AI-related hardware, software and services business will jump from $8B this year to $47B by 2020.
  • I have also read comments like, “AI is like the Internet in the mid 1990s and it will be pervasive this time”.
  • According to Andrew Ng, chief scientist at Baidu, “AI is the new electricity. Just as 100 years ago electricity transformed industry after industry, AI will now do the same.”
  • Peter Lee, co-head at Microsoft Research said,  “Sales teams are using neural nets to recommend which prospects to contact next or what kind of products to recommend.”
  • IBM Watson used AI in 2011, not DL. Now all 30 components are augmented by DL (investment from $500M – $6B in 2020).
  • Google had 2 DL projects in 2012, now it is more than 1000 (Search, Android, Gmail, Translation, Maps, YouTube, Self-driving cars,..).

It is interesting to note that AI was mentioned by Alan Turing in a paper he wrote back in 1950 to suggest that there is possibility to build machines with true intelligence. Then in 1956, John McCarthy organized a conference at Dartmouth and coined the phrase Artificial Intelligence. Much of the next three decades did not see much activity and hence the phrase “AI Winter” was coined. Around 1997, IBM’s Deep Blue won the chess match against Kasparov. During the last few years, we saw deployments such as Apple’s Siri, Microsoft’s Cortana, and IBM’s Watson (beating Jeopardy game show champions in 2011). In 2014, DeepMind team used a deep learning algorithm to create a program to win Atari games.

During last 2 years, use of this technology has accelerated greatly. The key players pushing AI/ML/DL are – Nvidia, Baidu, Google, IBM, Apple, Microsoft, Facebook, Twitter, Amazon, Yahoo, etc. Many new players have appeared – DeepMind, Numenta, Nervana, MetaMind, AlchemyAPI, Sentient, OpenAI, SkyMind, Cortica, etc. These companies are all targets of acquisition by the big ones. Sunder Pichai of Google says, “Machine learning is a core transformative way in which we are rethinking everything we are doing”. Google’s products deploying these technologies are – Visual Translation, RankBrain, Speech Recognition, Voicemail Transcription, Photo Search, Spam Filter, etc.

AI is the broadest term, applying to any technique that enables computers to mimic human intelligence, using logic, if-then rules, decision trees, and machine learning. The subset of AI that includes abstruse statistical techniques that enable machines to improve at tasks with experience is machine learning. A subset of machine learning called deep learning is composed of algorithms that permit software to train itself to perform tasks, like speech and image recognition, by exposing multi-layered neural networks to vast amounts of data.

I think the resurgence is a result of the confluence of several factors, like advanced chip technology such as Nvidia Pascal GPU architecture or IBM TrueNorth (brain-inspired computer chip), software architectures like microservice containers, ML libraries, and data analytics tool kits. Well known academia are heavily being recruited by companies – Geoffrey Hinton of University of Toronto (Google), Yann LeCun of New York University (Facebook), Andrew Ng of Stanford (Baidu), Yoshua Bengio of University of Montreal, etc.

The outlook of AI/ML/DL is very bright and we will see some real benefits in every business sector.

Linux & Cloud Computing

While reading the latest issue of the Economist, I was reminded that August 25th. marks an important anniversary for two key events:  25 years back, on August 25, 1991, Linus Torvalds launched a new operating system called Linux and on the same day in 2006, Amazon under the leadership of Andy Jesse launched the beta version of Elastic Computing Cloud (EC2), the central piece of Amazon Web Services (AWS).

The two are very interlinked. Linux became the world’s most used piece of software of its type. Of course Linux usage soared due to backers like HP, Oracle, and IBM to combat the Windows force. Without open-source programs like Linux, cloud computing would not have happened. Currently 1500 developers contribute to each new version of Linux. AWS servers deploy Linux heavily. Being first to succeed on a large scale allowed both Linux and AWS to take advantage of the network effect, which makes popular products even more entrenched.

Here are some facts about AWS. It’s launch back in 2006 was extremely timely, just one year before the smartphones came about. Apple launched its iPhone in 2007 which ushered the app economy. AWS became the haven for start-ups making up nearly two-third of its customer base (estimated at 1 million). According to Gartner Group, the cloud computing market is at $205B in 2016, which is 6% of the world’s IT budget of $3.4 trillion. This number will grow to $240B next year. No wonder, Amazon is reaping the benefits – over past 12 months, AWS revenue reached $11B with a margin of over 50%. During the last quarter, AWS sales were 3 times more than the nearest competitor, Microsoft Azure. AWS has ten times more computing capacity than the next 14 cloud providers combined. We also saw the fate of Rackspace last week (acquired by a private equity firm). Other cloud computing providers like Microsoft Azure, Google Cloud, and IBM (acquired SoftLayer in 2013) are struggling to keep up with AWS.

The latest battleground in cloud computing is data. AWS offers Aurora and Redshift in that space. It also started a new services called Snowball, a suitcase-sized box of digital memory which can store mountains of data in the AWS cloud (interesting challenge to Box and Dropbox). IBM bought Truven Health Analytics which keeps data on 215m patients in the healthcare industry.

The Economist article said, “AWS could end up dominating the IT industry just as IBM’s System/360, a family of mainframe computers did until the 1980s.”       I hope it’s not so and we need serious competition to AWS for customer’s benefits. Who wants a single-vendor “lock-in”? Microsoft’s Azure seems to be moving fast. Let us hope IBM, Google, and Oracle move very aggressively offering equivalent or better alternatives to Amazon cloud services.