AIOps – Between early adopters and skeptical mainstream

First, let’s get on the same page.

AIOps is a term coined by Gartner in an article on the evolution of IT operations with the recent addition of machine learning capabilities and artificial intelligence. Gartner defines AIOps as, “systems that combine big data and AI or machine learning functionality to enhance and partially replace a broad range of IT operations processes and tasks, including availability and performance monitoring, event correlation and analysis, IT service management, and automation.”

So what does AIOps mean in practice?

The term “AIOps” sounds very sci-fi and futuristic. You half-expect to be talking to cyborgs about it. Look beyond the word and you will see the natural next step for IT operations. With big data becoming an important part of each function within every business, IT managers felt the need to tap into this vast source of information. Once they did, they found huge quantities of useful information and high volumes of (apparently) useless data.

Enter machine learning. It looks at the data the business generates and uses algorithms to discover anomalies and patterns, which can then feed automation jobs.

So we’re working with two types of information: one that is unpredictable (anomalies) and one that is predictable (patterns). Automated tasks based on anomalies are more difficult to implement because anomalies are, by nature, unpredictable. Instead, someone first has to analyze the anomaly and then, based on the findings, apply a fix.

Patterns, on the other hand, are often problems, failures and issues that arise time and time again. Patterns are where automation comes into play. Automation is what rounds out AIOps into a solution promising enough to be embraced by early adopters. Because patterns recur, we often address them with repetitive, and often manual, tasks. And because these tasks are repetitive, we can automate them.
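To make the anomaly/pattern split a bit more concrete, here is a minimal sketch in Python. It is a stand-in, not how any particular AIOps product works: it counts how often an event signature recurs rather than using real machine learning. The event strings, the `triage` function and the `min_repeats` threshold are all made up for illustration.

```python
from collections import Counter


def triage(events, min_repeats=3):
    """Split event signatures into recurring patterns and one-off anomalies.

    `min_repeats` is an assumed cut-off: a signature seen at least this many
    times is treated as a pattern (a candidate for automation); anything
    rarer is treated as an anomaly that a human should look at first.
    """
    counts = Counter(events)
    patterns = sorted(sig for sig, n in counts.items() if n >= min_repeats)
    anomalies = sorted(sig for sig, n in counts.items() if n < min_repeats)
    return patterns, anomalies


# Hypothetical event signatures collected from monitoring.
events = [
    "disk_full:/var",
    "disk_full:/var",
    "service_down:nginx",
    "disk_full:/var",
    "cpu_spike:db01",
    "service_down:nginx",
    "service_down:nginx",
]

patterns, anomalies = triage(events)
print("automate:", patterns)
print("send to a human:", anomalies)
```

In this toy data set the disk and nginx events each recur three times, so they land in the "automate" bucket, while the single CPU spike is left for someone to investigate.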





A simplified diagram of the AIOps process

Let’s take a real-life scenario. We have already established how important data is and, as a result, how important storage has become in recent years. I have faced many situations in which storage became a problem because of over-provisioning or under-provisioning, which led to SQL jobs being stopped, servers running slow, applications stopping and so on. The worst part is that storage is not always the first thing that crosses your mind when this happens.

AIOps solves this issue in a very elegant way. A machine learning job is created that analyzes storage usage over a certain period of time (the longer the better). Then it creates a prediction based on that analysis. Nothing fancy so far, right? Well, using those predictions we can set triggers that start an automated task to extend disk sizes before the need actually arises. That went from 0 to 100 really fast, didn’t it?
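The storage scenario can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "machine learning job" is reduced to a straight-line least-squares fit over daily usage samples, and the function names, the sample numbers and the 14-day `lead_days` window are hypothetical. A real setup would use whatever forecasting and automation tooling you already have.

```python
def fit_trend(usage_gb):
    """Least-squares line through daily usage samples (day 0, 1, 2, ...)."""
    n = len(usage_gb)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(usage_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage_gb))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_full(usage_gb, capacity_gb):
    """Predicted days from the last sample until the disk is full.

    Returns None when usage is flat or shrinking, since the trend line
    never reaches capacity in that case.
    """
    slope, intercept = fit_trend(usage_gb)
    if slope <= 0:
        return None
    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - (len(usage_gb) - 1))


def should_extend(usage_gb, capacity_gb, lead_days=14):
    """Trigger: extend the disk if it is predicted to fill within lead_days."""
    remaining = days_until_full(usage_gb, capacity_gb)
    return remaining is not None and remaining <= lead_days


# Hypothetical week of daily samples, growing by about 10 GB per day.
usage = [400, 410, 420, 430, 440, 450, 460]

if should_extend(usage, capacity_gb=500):
    # In a real pipeline this is where the automated resize job would start.
    print("trigger: extend disk before it fills up")
```

The longer the history you feed in, the less a single noisy day moves the fitted line, which is the point behind "the longer the better" above.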

Let’s recap what AIOps does:

  • Collects, stores, manages, and enhances data
  • Applies Machine Learning
  • Improves a workflow through automation
  • Offers the tools for deep analysis
  • Creates alerts for humans to look at
  • In the long run, implements auto-remediation

What AIOps isn’t

First of all, it’s not a universal solution for everything and anything. AIOps is still young and needs time to mature: it is at the point where we can talk about it and start gaining experience, but it is not yet a technology ready to be applied to all aspects of IT operations. So why do we talk about it?

Because it’s where IT operations is heading. It will become mainstream thanks to its potential to increase IT performance while lowering the costs and resources needed to achieve it. In the same article, Gartner expects an 8-fold increase in market share for AIOps, from 5% currently to 40% by 2022. This is a massive evolution and speaks volumes about the huge opportunities to transform IT operations.

What’s with the early adopters and skeptics?

When talking to people about AIOps, we generally find two groups: enthusiasts/early adopters and the more skeptical, cautious mainstream, just as with any new, emerging technology.

Early adopters love the idea: they can already see themselves driving (sorry, riding) in a driverless car while their business runs itself from a computer they control from their phone. They may think that AIOps is the universal solution to almost everything technology-oriented. Realistically, we have to be honest with our enthusiasts and tone down their expectations, for now. AIOps is not at the point of running a business without human intervention, and it might never get there. But it is this kind of forward thinking that starts ideas like AI, machine learning or AIOps, and for that, enthusiasts, we salute you!

For the skeptical mainstream, AIOps is just the latest buzzword. Businesses don’t stand still waiting for you to test and fine-tune your algorithms, and there is a clear unease at the idea of computers making possibly business-critical decisions. And again, we shouldn’t be too hard on our skeptics, because they are the ones who make sure that things run smoothly while we run around looking to invent the Matrix. For that, we thank them!

Where next?

AIOps is the latest entry in a long list of things to look out for in the years to come, but the first proof points are here. While we should be patient and understand that every technology needs to “break in” before becoming part of the IT essentials toolkit, the potential is definitely there and is starting to show its first results. Successful AIOps adoptions have led to reduced time to resolution for infrastructure issues, complete infrastructure visibility across vendors and technologies, and reduced downtime for business-critical applications, to name just a few.