What is event triage – meaning and uses
Alert triage is the first process that takes place when an incident happens. Traditionally, specialized analysts review each event to determine whether it should be escalated for further investigation or dismissed as a false positive. Because of the large number of events, triage puts significant strain on analysts and cannot be done efficiently by hand.
Alert Triage is one of the main modules of our AI platform. It analyzes security or infrastructure alerts and determines which should be escalated as genuine issues and which should be dropped as false positives.
The module is based on a neural network that is trained on several data sources in order to detect false positives. It is also designed to take user feedback into account: users can change the status of an alert, which gives them control over the outcome and feeds back into the model to improve it. Depending on the data and feedback provided by the users, the neural network can reduce noise by up to 99%.
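The feedback loop described above can be illustrated with a minimal sketch. It is not the platform's implementation: the `Alert` class, the `apply_feedback` and `training_labels` helpers, and the "escalate"/"drop" status names are all assumptions made for illustration. The idea shown is only that a user's correction takes precedence over the model's prediction and becomes the label used for the next round of training.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

VALID_STATUSES = {"escalate", "drop"}  # hypothetical status names


@dataclass
class Alert:
    alert_id: str
    text: str
    model_status: str                   # status predicted by the model
    user_status: Optional[str] = None   # set only when a user overrides it

    @property
    def final_status(self) -> str:
        # A user's correction always wins over the model's prediction.
        return self.user_status or self.model_status


def apply_feedback(alerts: List[Alert], alert_id: str, new_status: str) -> None:
    """Record a user's correction for one alert."""
    if new_status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {new_status!r}")
    for alert in alerts:
        if alert.alert_id == alert_id:
            alert.user_status = new_status
            return
    raise KeyError(alert_id)


def training_labels(alerts: List[Alert]) -> List[Tuple[str, int]]:
    """Build (text, label) pairs for retraining; label 1 means true positive."""
    return [(a.text, 1 if a.final_status == "escalate" else 0) for a in alerts]
```

For example, if the model escalated a routine "backup completed" alert and a user changes its status to "drop", `training_labels` will return that alert with label 0 the next time the model is retrained.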
Using Siscale AIOps for Alert Triage
Alert triage is one of the biggest problems in today’s IT and security operations because of the sheer number of sensors, devices and applications that need to be monitored and that continuously generate alerts. Traditional monitoring tools forward every alert as soon as it is raised, and NOC and SOC engineers have to address each one to make sure that everything is in order.
Siscale offers an alternative to this approach. Using neural networks, we built a model that analyzes incoming alerts and classifies them as either false positives or true positives. Alerts identified as true positives are escalated: a ticket is created and assigned to the appropriate team automatically.
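As a rough sketch of the escalation flow just described, the snippet below drops false positives and turns true positives into tickets routed to a team. Everything in it is hypothetical: the `Ticket` class, the `TEAM_BY_SOURCE` routing table, the default team and the shape of the alert dictionary are assumptions, and the classifier is passed in as a plain function rather than being the platform's neural network.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Optional


@dataclass
class Ticket:
    ticket_id: int
    team: str
    summary: str


# Hypothetical routing table: alert source -> owning team.
TEAM_BY_SOURCE = {"firewall": "SOC", "ids": "SOC", "router": "NOC", "server": "NOC"}

_ticket_ids = count(1)  # simple in-memory ticket number generator


def handle_alert(alert: dict, is_true_positive: Callable[[dict], bool]) -> Optional[Ticket]:
    """Create and route a ticket for a true positive; return None for a false positive."""
    if not is_true_positive(alert):
        return None  # false positive: dropped, nobody is paged
    # Falling back to the NOC for unknown sources is an assumption of this sketch.
    team = TEAM_BY_SOURCE.get(alert["source"], "NOC")
    return Ticket(next(_ticket_ids), team, alert["message"])
```

Any classifier can be plugged in as `is_true_positive`. A stand-in such as `lambda a: a["severity"] >= 3` is enough to exercise the routing logic.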
How is AIOps the best answer for Alert Triage
AIOps is a term used to describe the use of machine learning together with automation in order to address several shortcomings in today’s IT operations. AIOps collects data from across the infrastructure in a centralized tool. This offers increased visibility as well as the ability to correlate events and apply machine learning such as anomaly detection. This is an enabler for alert triage while at the same time bringing additional benefits such as reducing mean-time-to-resolution or improving team performance.
Gather data from multiple sources and environments
AIOps enables alarm noise reduction by bringing all the required data into a single place, where it can be processed and analyzed to automate alert triage across growing IT infrastructures. To accommodate this, we built our platform to be domain-agnostic, meaning that it is capable of ingesting data from any source and in any format.
At the same time the Alert Triage module of the platform is tuned to be applicable to data from any industry not just IT infrastructure data. Alert noise can be reduced by the same algorithm using data from healthcare, manufacturing, retail or any other domain.
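To make the idea of ingesting alerts in different formats more concrete, here is a small sketch that maps two common shapes, a JSON document and a key=value log line, onto one shared schema. The field names (`severity`, `level`, `message`, `msg`) and the output schema are assumptions chosen for the example; they do not describe the platform's actual data model.

```python
import json
import shlex


def normalize(raw: str, source: str) -> dict:
    """Parse a JSON or key=value alert into a common {source, severity, message} record."""
    raw = raw.strip()
    if raw.startswith("{"):
        fields = json.loads(raw)
    else:
        # shlex.split keeps quoted values such as msg="link down" together.
        fields = dict(pair.split("=", 1) for pair in shlex.split(raw) if "=" in pair)
    severity = fields.get("severity", fields.get("level", "unknown"))
    message = fields.get("message", fields.get("msg", raw))
    return {"source": source, "severity": str(severity).lower(), "message": str(message)}
```

Once every alert has the same shape, the same triage model can be applied regardless of which tool or industry the alert came from.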
Minimize the noise with Machine Learning
SOC or NOC alert triage is a difficult process because of the sheer number of alerts and the volume of data collected from across an organization’s ecosystem; it is not something a human can keep track of easily or efficiently. A better way to analyze this data and make informed decisions is to use machine learning algorithms. This reduces the probability of human error and gives SOC and NOC engineers the information and time they need to focus on what’s really important: keeping everything running smoothly.
Alerts triage into false-positives/true-positives
The algorithm used to remove noise from day-to-day operations is a combination of neural networks and natural language processing over the text body of the alerts.
Using this analysis, the alerts are then passed through a binary classification algorithm to determine which of them are true positives that should be escalated and which are false positives that can be dropped.
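The general technique, text features fed into a binary classifier, can be sketched in a few lines of standard-library Python. This is a deliberately simplified stand-in and not the platform's model: it uses bag-of-words token counts in place of real natural language processing and a single logistic neuron in place of a full neural network. The training alerts, function names and the 0.5 threshold are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the alert body and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def _logit(weights, bias, counts):
    # Weighted sum of token counts plus bias.
    return bias + sum(weights.get(tok, 0.0) * n for tok, n in counts.items())


def train(samples, epochs=200, lr=0.5):
    """Fit a single logistic neuron on bag-of-words features with SGD.

    samples: list of (alert_text, label), where label 1 = true positive.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in samples:
            counts = Counter(tokenize(text))
            p = 1.0 / (1.0 + math.exp(-_logit(weights, bias, counts)))
            err = label - p  # gradient of the log-likelihood
            bias += lr * err
            for tok, n in counts.items():
                weights[tok] = weights.get(tok, 0.0) + lr * err * n
    return weights, bias


def score(model, text):
    """Return the estimated probability that the alert is a true positive."""
    weights, bias = model
    return 1.0 / (1.0 + math.exp(-_logit(weights, bias, Counter(tokenize(text)))))


def triage(model, text, threshold=0.5):
    """Binary decision: escalate true positives, drop false positives."""
    return "escalate" if score(model, text) >= threshold else "drop"


# Made-up training data for illustration only.
TRAINING = [
    ("multiple failed ssh logins from unknown host", 1),
    ("malware signature detected on file server", 1),
    ("disk failure on database node", 1),
    ("scheduled backup completed successfully", 0),
    ("cpu usage back to normal", 0),
    ("heartbeat check passed", 0),
]
MODEL = train(TRAINING)
```

With a training set this small the model only memorizes vocabulary; a realistic deployment would need far more labeled alerts and a richer text representation.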
Alert Triage with Integrated intelligent automation: from alert storms to actionable events
Reducing the noise generated by alerts is just a first step in making the life of NOC and SOC engineers easier. Even with a reduced number of alerts, they still need to thoroughly investigate the ones that were escalated, and this is also time consuming. Our approach to this problem is to add intelligent automation in the form of process automation.
Correlate and suppress alerts
Process automation follows the same steps an operator would take when investigating an alert manually, but without the need for human intervention. All the checks and information collection that were previously prone to human error are now performed automatically on the escalated alerts, giving NOC and SOC engineers the information they need to address issues once triage is complete.
The result of this process automation is that NOC and SOC engineers have fewer alerts to analyze and that the alerts that they need to investigate are correlated with additional information that enables them to efficiently address them.
Supercharge your teams: take the right action
Let’s take the example of a security alert. When a security alert is identified, a SOC engineer has to create a ticket, analyze the alert, collect information about the attack and then address the issue. While very important, the steps of ticket creation, alert analysis and information collection are time consuming, and during these steps the issue itself is not being addressed.
Our Alert Triage module not only reduces the amount of alert noise but also automates several checks, information collection and ticket creation. When the algorithm decides to escalate an alert, it also automatically creates a ticket and assigns it to the appropriate team. Additionally, several checks are made against threat intelligence databases, and any information about the alert found in these databases is added to the ticket, so that your SOC engineer only has to focus on addressing the issue.
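The enrichment step can be pictured with the following sketch, which extracts IP addresses from an alert and attaches whatever a threat intelligence lookup knows about them. It is illustrative only: the in-memory `THREAT_INTEL` dictionary stands in for a real threat intelligence database, its entries are invented and use reserved documentation IP ranges, and the `enrich_ticket` function and its output fields are hypothetical.

```python
import re

# Stand-in for a threat intelligence database (entries are invented).
THREAT_INTEL = {
    "203.0.113.7": "listed as brute-force source",
    "198.51.100.23": "known command-and-control host",
}

# Loose IPv4 matcher; it does not validate that each octet is <= 255.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def enrich_ticket(alert_text: str) -> dict:
    """Build ticket data with indicators and any matching threat intelligence."""
    indicators = IP_PATTERN.findall(alert_text)
    findings = {ip: THREAT_INTEL[ip] for ip in indicators if ip in THREAT_INTEL}
    return {"summary": alert_text, "indicators": indicators, "threat_intel": findings}
```

The engineer who picks up the ticket then sees the suspicious address and its reputation together, instead of having to look them up by hand.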
The advantages of Siscale Autonomous Operations Platform
Siscale AIOps Platform is an AI-driven problem-solving platform that can help operations teams across all industries reduce noise, gain observability across their environments, correlate events and automatically determine the root cause of incidents.
The platform is highly flexible: it can collect data from any source and in any format, and it integrates with several alerting, automation and management tools to streamline processes and give your engineers the time and information they need to keep everything running smoothly.