Category Archives: New Technology

Fast Data

During the 1980s and 1990s, online transaction processing (OLTP) was critical for banks, airlines, and telcos for core business functions. This was a big step up from the batch systems of the early days. We learned the importance of sub-second response time and continuous availability, with the goal of five-nines (99.999%) uptime, which allows roughly five minutes of outage per year. During my days at IBM, we faced the fire from a bank in Japan that had an hour-long outage, resulting in long queues in front of the ATMs (unlike here, the Japanese stood patiently until the system came back after what felt like an eternity). The bank was using IBM’s IMS Fast Path software, and the blame was first put on that software; the cause subsequently turned out to be a different issue.

Advance the clock to today. Everything is real-time, and one cannot talk about real-time without discussing the need for “fast data” – data that must travel very fast to support real-time decision making. Here are some reasons for fast data:

  • These days, it is important for businesses to be able to quickly sense and respond to events that are affecting their markets, customers, employees, facilities, or internal operations. Fast data enables decision makers and administrators to monitor, track, and address events as they occur.
  • Leverage the Internet of Things – for example, an engine manufacturer will embed sensors within its products, which then provide continuous feeds back to the manufacturer to help spot issues and better understand usage patterns.
  • An important advantage that fast data offers is enhanced operational efficiency: events that could negatively affect processes (such as inventory shortages or production bottlenecks) can not only be detected and reported, but remedial action can be immediately prescribed or even launched. Real-time analytical data can be measured against known patterns to predict problems, and systems can respond with appropriate alerts or automated fixes (see the monitoring sketch after this list).
  • Assure greater business continuity – fast data plays a role in bringing systems, and all data still in the pipeline, back up and running quickly after a failure, before the business suffers catastrophic consequences.
  • Fast data is critical for supporting Artificial Intelligence and machine learning. As a matter of fact, data is the fuel for machine learning (recommendation engines, fraud detection systems, bidding systems, automatic decision making systems, chatbots, and many more).
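
To make the monitoring point concrete, here is a minimal Python sketch of the kind of logic a fast-data pipeline might run over a sensor stream: it keeps a sliding window of recent values and raises an alert when a new reading deviates sharply from the recent average. The window size, threshold, and sample data are illustrative assumptions, not taken from any particular product.

```python
from collections import deque

def monitor(readings, window=20, threshold=3.0):
    """Yield alerts for readings that deviate sharply from the recent moving average."""
    recent = deque(maxlen=window)
    for t, value in readings:
        if len(recent) == recent.maxlen:
            mean = sum(recent) / len(recent)
            std = (sum((x - mean) ** 2 for x in recent) / len(recent)) ** 0.5 or 1e-9
            if abs(value - mean) / std > threshold:
                yield (t, value, mean)  # candidate anomaly: value far from recent mean
        recent.append(value)

# Usage: a stream of (timestamp, sensor_value) pairs with one injected spike
stream = [(i, 50.0 + (i % 3) * 0.1) for i in range(30)] + [(30, 95.0)]
for alert in monitor(stream):
    print("alert:", alert)
```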

Now let us look at the constellation of technologies enabling fast data management and analytics. Fast data is data that moves almost instantaneously from source to processing to analysis to action, courtesy of frameworks and pipelines such as Apache Spark, Apache Storm, Apache Kafka, Apache Kudu, Apache Cassandra, and in-memory data grids. Here is a brief outline of each.

Apache Spark – an open source toolset now supported by most major database vendors. It offers streaming and SQL libraries to deliver real-time data processing. Spark Streaming processes data as it is created, enabling analysis for critical areas like real-time analytics and fraud detection. Its Structured Streaming API opens up this capability to enterprises of all sizes.
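
As a rough illustration (not tied to any particular deployment), here is a minimal PySpark Structured Streaming job that counts words arriving on a local socket; the host and port are assumptions for the demo.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split

spark = SparkSession.builder.appName("FastDataDemo").getOrCreate()

# Read a stream of text lines from a socket (e.g., run `nc -lk 9999` to feed it)
lines = (spark.readStream.format("socket")
         .option("host", "localhost").option("port", 9999).load())

# Split lines into words and maintain a running count per word
words = lines.select(explode(split(lines.value, " ")).alias("word"))
counts = words.groupBy("word").count()

# Continuously print updated counts to the console as data arrives
query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```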

Apache Storm is an open source distributed real-time computation system designed to enable processing of data streams.

Apache Cassandra is an open source distributed NoSQL database designed for low-latency reads and writes, with built-in data replication across nodes.
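
For a flavor of Cassandra’s low-latency access path, here is a minimal sketch using the DataStax Python driver (cassandra-driver); the contact point, keyspace, and table are illustrative assumptions.

```python
from cassandra.cluster import Cluster

# Connect to a local Cassandra node (contact point is an assumption for the demo)
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.readings (
        sensor_id text, ts timestamp, value double,
        PRIMARY KEY (sensor_id, ts)
    )
""")

# Low-latency write and read, keyed by sensor and timestamp
session.execute(
    "INSERT INTO demo.readings (sensor_id, ts, value) VALUES (%s, toTimestamp(now()), %s)",
    ("engine-7", 98.6),
)
for row in session.execute("SELECT * FROM demo.readings WHERE sensor_id = %s", ("engine-7",)):
    print(row.sensor_id, row.ts, row.value)
```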

Apache Kafka is an open source toolset designed for real-time data streaming, employed for data pipelines and streaming apps. The Kafka Connect API helps connect it to other environments. It originated at LinkedIn.
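
Here is a minimal producer/consumer sketch using the community kafka-python client; the broker address and topic name are assumptions for illustration.

```python
from kafka import KafkaProducer, KafkaConsumer

# Publish a few events to a topic (broker address is an assumption)
producer = KafkaProducer(bootstrap_servers="localhost:9092")
for i in range(3):
    producer.send("sensor-events", f"reading-{i}".encode("utf-8"))
producer.flush()

# Read the events back from the beginning of the topic
consumer = KafkaConsumer(
    "sensor-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating when no new messages arrive
)
for message in consumer:
    print(message.offset, message.value.decode("utf-8"))
```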

Apache Kudu is an open source storage engine to support real-time analytics on commodity hardware.

In addition to these powerful open source tools and frameworks, there are in-memory data grids, which keep data in RAM across a cluster of machines to deliver the blazing speeds demanded by IoT management, deployment of AI and machine learning, and responding to events in real time.
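
As one concrete example, Apache Ignite is an in-memory data grid with a Python thin client (pyignite). The sketch below assumes a local Ignite node listening on the default thin-client port; the cache name and values are illustrative.

```python
from pyignite import Client

# Connect to a local Ignite node (default thin-client port 10800 is an assumption)
client = Client()
client.connect("127.0.0.1", 10800)

# Reads and writes go against RAM replicated across the grid, not disk
cache = client.get_or_create_cache("latest_readings")
cache.put("engine-7", 98.6)
print(cache.get("engine-7"))  # 98.6

client.close()
```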

Yes, we have come a long way from those OLTP days! Fast data management and analytics is becoming a key area for businesses to survive and grow.


When to think of Blockchain?

Blockchain is going through its hype cycle. It is not magic, nor is it a solution looking for a problem. It is important to know what it can and cannot do. So let us revisit the definition: blockchain is a distributed ledger shared by untrusted participants, with strong guarantees about accuracy and consistency. Now let us dissect the highlighted words.

  • Ledger: Manual ledgers go back to the 19th century, when accountants entered transactions by hand. They are lists of transactions: items sold or purchased, price, date, and so on. Those transactions are dated (timestamped). Ledgers are strictly append-only: transactions can be added, but old entries can neither be deleted nor modified. Blockchain ledger entries can be significantly more complex, but the concept is the same.
  • Shared: Anyone with the appropriate software can put entries into a pool of entries that will eventually be checked for consistency and added to the ledger.
  • Distributed: Blockchains are not centralized. There is no central administration to decide who has access and what rules to follow, hence no single point of control and no single point of failure. Many participants in the blockchain have copies of the entire ledger, which get updated whenever blocks are added. This disintermediation was fundamental when the Bitcoin movement started back in 2008.
  • Untrusted Participants: This is the most radical idea of blockchain. In enterprise applications, requiring a certain amount of trust allows some important optimizations, but the concept of “untrusted participants” is fundamental to a blockchain. Anyone can add entries. The protocol that produces agreement among untrusted partners is called BFT (Byzantine Fault Tolerance) or byzantine agreement.
  • Accuracy & Consistency: Despite untrusted participants, blockchain makes strong guarantees about ledger’s accuracy. The replicated copies are not always in agreement, but disagreements are quickly resolved automatically via algorithms and voting.

Blockchain is often shorthand for “how Bitcoin is implemented”, but its scope is much broader: Bitcoin was merely the first application on blockchain, much as email was the first application on the Internet. Blockchain introduces the era of “exchange of values/assets”, whereas the Internet gave us the era of “exchange of information”.

If you are building applications that span enterprises and that need to keep accurate records in the presence of untrusted partners, you should be thinking about blockchains.

Blockchain in Healthcare

The application of blockchain technology in the healthcare industry will bring great benefits, the most important being accuracy of data and lower costs.

Just as a reminder, blockchain technology provides these key facets:

  • a low-cost, decentralized ledger approach to managing information (replicated at each node without any central hub),
  • simultaneous access for all parties to a single body of strongly encrypted data (making it very hard for hackers to get at the data),
  • an audit trail created each time data is changed, helping to ensure the integrity and authenticity of the information, and
  • patient control: patients can see their own data and authorize other parties (doctors, hospitals, insurers) to access it.

The current problems in the healthcare industry largely stem from multiple sources of data about each patient, and hence incorrect information, which adds to cost. The various entities (hospital, doctor’s office, insurer) all maintain their own databases; synchronization becomes a real issue and often causes errors. Blockchain is a real solution to these ills. Several examples of applying blockchain are in the development stage.

  • Change Healthcare, a Nashville-based health network, has introduced a blockchain system for processing insurance claims. While not all providers in the system are using it yet, the shared ledger of encrypted data represents a “single source of truth”: all involved parties can see the same accurate information about a claim in real time (rather than sending data back and forth). This relieves a patient from having to call multiple parties to verify information (a practice we are all familiar with). Each time the data is changed, a record of it appears on the digital ledger identifying the responsible party, and any change also requires verification by each party involved, again enforcing the record’s accuracy.
  • Last April, a group of companies including Humana Inc., Multiplan Inc., Quest Diagnostics Inc., and UnitedHealth Group’s Optum announced a pilot project using blockchain to maintain online directories of doctors and healthcare providers. Typically, doctor groups, hospitals, insurers, and diagnostic companies each maintain their own online listings of contacts, practices, and biographical details. Not only is this expensive, but they have to continually check and verify the accuracy of these directories. Using blockchain, substantial savings (almost 75%) are expected. The goal of the pilot program is for providers to update their information themselves in the blockchain, where all parties in the network can view it.
  • The MIT Media Lab is developing a system called MedRec based on blockchain. Patients can manage their own records and give permission to doctors and providers to access and update them. The success of this system, or any such system, will depend on a large number of providers and doctors opting in to the program.

Most of the early efforts are in the “proof of concept” stage. But the potential of blockchain to help lower healthcare costs and provide timely, accurate information is very promising.

Crypto Hype vs. Blockchain

There is a lot of crypto hype these days, from cryptocurrencies like Bitcoin to fundraising efforts like ICOs (Initial Coin Offerings), similar to IPOs. All this noise has obscured the real benefits of the underlying technology: blockchain. The Internet brought us the “exchange of information” over the last three decades. Blockchain will give us the new era of “exchange of values” or “exchange of assets” without an intermediary, via highly secure transactions in a peer-to-peer network. New ways of transferring real estate titles, managing cargo on shipping vehicles, guaranteeing the safety of the food we eat, and many more mundane activities will be enabled by blockchain. An article in today’s WSJ by Christopher Mims covers this in more detail.

Briefly, blockchain is essentially a secure database (or ledger) spread across multiple computers. Everybody has the same record of all transactions, so tampering with one instance of it is meaningless. “Crypto” describes the cryptography that underlies it, which allows agents to interact securely (e.g., transfer assets) while also guaranteeing that once a transaction has been made, the blockchain keeps an immutable record of it. This technology is well suited to transactions that require trust and a permanent record for traceability, and that involve the cooperation of many different parties. Here are some examples of actual deployments of blockchain so far:

  • At Walmart, 1.1 million items are on a blockchain, helping the company trace each item’s journey from manufacturer to store shelf. Global shipping company Maersk is tracking shipping containers, making it faster and easier to transfer them and get them through customs. Other companies using blockchain technology for tracking are Kroger, Nestle, Tyson Foods, and Unilever. In all these cases, IBM is providing the blockchain technology.
  • CartaSense, an Israeli company, uses a blockchain database for its customers to track every stage of the journey of a package, pallet, or shipping container.
  • Everledger, a company started in 2014, maintains a blockchain-based registry of every certified diamond in the world (already 2.2 million in its registry). By recording 40 different measures of each stone, it is able to trace the journey of a stone from its source to the final sale to a customer.
  • Dubai has declared its goal of making itself the first blockchain-powered government in the world by 2020. It wants to streamline real estate transactions for faster and easier transfer of property titles. Other records like birth and death certificates, passports, and visas can also be managed at lower cost and with better efficiency.

It is a bit early to claim that blockchain will revolutionize every industry, including government, but it has that potential. It also poses a tremendous challenge for hackers to break into. It can impact how we vote, whom we connect with, and what we buy.

The New AI Economy

The convergence of technology leaps, social transformation, and genuine economic needs is catapulting AI (Artificial Intelligence) from its academic roots and decades of inertia to the forefront of business and industry. There has been growing noise over the last couple of years about how AI and its key subsets, Machine Learning and Deep Learning, will affect all walks of life. Another phrase, “Pervasive AI”, is becoming part of our tech lexicon after the popularity of Amazon Echo and Google Home devices.

So what are the key factors pushing this renaissance of AI? We can quickly list them here:

  • The rise of Data Science from the basement to the boardroom of companies. Everyone saw the 3V’s of Big Data (volume, velocity, and variety). Data is called by many names – oxygen, the new oil, the new gold, or the new currency.
  • Open source software such as Hadoop sparked a revolution in analytics over large volumes of unstructured data. The shift from retroactive to more predictive and prescriptive analytics, yielding actionable business insights, is growing. Real-time BI is also taking a front seat.
  • Arrival of practical frameworks for handling big data revived AI (Machine Learning and Deep Learning) which fed happily on big data.
  • Existing CPUs were not powerful enough for the fast processing needs of AI, so GPUs (Graphics Processing Units) offered faster, more parallel chips. NVIDIA has been a positive force in this area; its ability to provide a full range of components (systems, servers, devices, software, and architecture) is making NVIDIA an essential player in the emerging AI economy. IBM’s neuromorphic computing project has also provided notable success in the areas of perception, speech, and image recognition.

Leading software vendors such as Google have numerous AI projects, ranging from speech and image recognition to language translation and varieties of pattern matching. Facebook, Amazon, Uber, Netflix, and many others are racing to deploy AI in their products.

Paul Allen, co-founder of Microsoft, is pumping $125M into his research lab, the Allen Institute for AI. The focus is to digitize common sense. Let me quote from today’s New York Times: “Today, machines can recognize nearby objects, identify spoken words, translate one language into another and mimic other human tasks with an accuracy that was not possible just a few years ago. These talents are readily apparent in the new wave of autonomous vehicles, warehouse robotics, smartphones and digital assistants. But these machines struggle with other basic tasks. Though Amazon’s Alexa does a good job of recognizing what you say, it cannot respond to anything more than basic commands and questions. When confronted with heavy traffic or unexpected situations, driverless cars just sit there.” Paul Allen added, “To make real progress in A.I., we have to overcome the big challenges in the area of common sense”.

Welcome to the new AI economy!

Vitalik Buterin & Ethereum

Many of you may not have heard of this 23-year-old Russian-Canadian, Vitalik Buterin. He is one of those geniuses who started loving computing and math from an early age. His parents emigrated from Russia to Canada when he was 3 years old. After attending a private high school in Toronto, he joined the University of Waterloo (my alma mater) but dropped out after receiving the $100K Peter Thiel fellowship to pursue his entrepreneurial work in cryptocurrency.

After failing to persuade the Bitcoin community to adopt a general scripting language, he decided to start a new platform that could handle cryptocurrency plus any asset via smart contracts. His seminal paper in 2013 laid the foundation, and the same year he proposed building a new platform called Ethereum with a general scripting language. In early 2014, a Swiss company called Ethereum Switzerland GmbH developed the first Ethereum software project. Finally, in July-August of 2014, Ethereum launched a pre-sale of Ether tokens (its own cryptocurrency) to the public and raised $14M. Ethereum belongs to the same family as the cryptocurrency Bitcoin, whose value has increased more than 1,000 percent in just the past year. Ethereum has its own currencies, most notably Ether, but the platform has a wider scope than just money.

You can think of my Ethereum address as having elements of a bank account, an email address and a Social Security number. For now, it exists only on my computer as an inert string of nonsense, but the second I try to perform any kind of transaction — say, contributing to a crowdfunding campaign or voting in an online referendum — that address is broadcast out to an improvised worldwide network of computers that tries to verify the transaction. The results of that verification are then broadcast to the wider network again, where more machines enter into a kind of competition to perform complex mathematical calculations, the winner of which gets to record that transaction in the single, canonical record of every transaction ever made in the history of Ethereum. Because those transactions are registered in a sequence of “blocks” of data, that record is called the blockchain. Many Bitcoin exchanges use the Ethereum platform.
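
For a sense of how programs talk to that network, here is a minimal sketch using the web3.py library (assuming version 6 and a local node’s JSON-RPC endpoint; the zero address is used purely as a placeholder).

```python
from web3 import Web3

# Connect to an Ethereum node's JSON-RPC endpoint (local endpoint is an assumption)
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
print(w3.is_connected())

# Query on-chain state: the latest block number and an address balance
print(w3.eth.block_number)
address = "0x0000000000000000000000000000000000000000"  # placeholder address
balance_wei = w3.eth.get_balance(address)
print(Web3.from_wei(balance_wei, "ether"), "ETH")
```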

A New York Times article in January said, “The true believers behind blockchain platforms like Ethereum argue that a network of distributed trust is one of those advances in software architecture that will prove, in the long run, to have historic significance. That promise has helped fuel the huge jump in cryptocurrency valuations. But in a way, the Bitcoin bubble may ultimately turn out to be a distraction from the true significance of the blockchain. The real promise of these new technologies, many of their evangelists believe, lies not in displacing our currencies but in replacing much of what we now think of as the internet, while at the same time returning the online world to a more decentralized and egalitarian system. If you believe the evangelists, the blockchain is the future. But it is also a way of getting back to the internet’s roots”.

Vitalik wrote up the idea of Ethereum at age 19. He is a new-age Linus Torvalds, who fathered Linux, the de facto operating system for Internet developers.

IBM’s Neuromorphic Computing Project

The Neuromorphic Computing Project at IBM is a pioneer in next-generation chip technology. The project has received ~$70 million in research funding from DARPA (under the SyNAPSE program), the US Department of Defense, the US Department of Energy, and commercial customers. The ground-breaking project is multi-disciplinary, multi-institutional, and multi-national, with worldwide scientific impact. The resulting architecture, technology, and ecosystem break with the prevailing von Neumann architecture and constitute a foundation for energy-efficient, scalable neuromorphic systems. The project is headed by Dr. Dharmendra Modha, IBM Fellow and chief scientist for brain-inspired computing.

So why is the von Neumann architecture inadequate for brain-inspired computing? The von Neumann model goes back to 1946 and deals with three things: the CPU, memory, and a bus. Data moves to and from memory over the bus that connects it to the CPU. The bus becomes the bottleneck, and it also sequentializes computation: even to flip a single bit, you must read that bit from memory and write it back.

The new architecture is radically different. The IBM project takes inspiration from the structure, dynamics, and behavior of the brain to see if it can optimize the time, speed, and energy of computation. Co-locate memory and computation and closely intertwine communication, just as the brain does, and you can minimize the energy spent moving bits between memory and computation. You get event-driven rather than clock-driven computation, computing only when information changes.

The von Neumann paradigm is, by definition, a sequence of instructions interspersed with occasional if-then-else statements. Compare that to a neural network, where a neuron can reach out to up to 10,000 neighbors. A neuron on TrueNorth (IBM’s new chip) can reach out to up to 256; the disparity exists because it is silicon rather than organic technology. Even so, that is a very high fan-out, and high fan-out is difficult to implement in a sequential architecture. An AI system IBM developed last year for Lawrence Livermore National Lab had 16 TrueNorth chips tiled in a 4-by-4 array. The chips are designed to be tiled, so scalability is built in as a design principle rather than as an afterthought.

In summary, the design points of the IBM project are as follows:

  • The von Neumann architecture will not be able to provide the massively parallel, fault-tolerant, power-efficient systems that will be needed to embed intelligence into silicon. Instead, IBM had to rethink processor design.
  • You can’t throw the baby out with the bathwater: even if you rethink the underlying hardware design, you need to implement sufficiently abstracted software libraries to reduce the pain for software developers so that they can program your chip.
  • You can achieve power efficiency by changing the way you build software and hardware to become active only when an event occurs; rather than tying computation to a series of sequential operations, you make it a massively parallel job that runs only when the underlying system changes (a small event-driven sketch follows this list).
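
To illustrate the event-driven idea in ordinary software terms (a toy sketch, not IBM’s TrueNorth programming model), here is a leaky integrate-and-fire neuron in Python that does work only when an input spike arrives, applying the decay lazily instead of on every clock tick.

```python
def run_neuron(events, threshold=1.0, decay=0.9):
    """events: list of (time, input_weight) spikes, sorted by time."""
    potential, last_t, spikes = 0.0, 0, []
    for t, weight in events:                # compute only when an event arrives
        potential *= decay ** (t - last_t)  # apply the leak lazily for the elapsed time
        potential += weight
        last_t = t
        if potential >= threshold:          # fire and reset
            spikes.append(t)
            potential = 0.0
    return spikes

# Two closely spaced spikes fire the neuron; widely spaced ones decay away
print(run_neuron([(1, 0.6), (2, 0.5), (10, 0.4), (30, 0.4)]))  # [2]
```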

AI is achieving notable success in the area of perception, such as speech and image recognition. In reinforcement learning and deep learning, the human brain is the primary inspiration. Hence IBM’s neuromorphic chip design becomes a significant foundational technology.