In an increasingly complex IT environment, making sure that everything stays within normal parameters is crucial. Traditional monitoring tools rely on static thresholds to decide which behavior triggers an alert. Typically such a threshold is set to a very high value, such as 80%-90%, which often means that the system or application is already in a critical state by the time the alert fires. Furthermore, static thresholds do not take into consideration drops in metrics, such as an unusually low value for RAM or CPU usage, which can also indicate an issue within the infrastructure.
By applying machine learning on top of infrastructure data to analyze how metrics evolve, we can start to build what is known as a dynamic threshold. A dynamic threshold adapts to the evolution of your metrics over time, adjusting the boundary between what is normal and what is abnormal. If the CPU usage of your web server, for example, usually sits around 40% and spikes to 65% within a day, a static threshold would normally not raise a flag. With a dynamic threshold, however, this would be considered an anomaly and alerted as such.
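To make the idea concrete, here is a minimal Python sketch. It is not a machine learning model: it uses a simple rolling mean and standard deviation as a stand-in for a learned baseline. The function name `find_anomalies`, the parameters `window`, `k` and `min_std`, the 85% static threshold, and the sample CPU readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def find_anomalies(values, window=12, k=3.0, min_std=1.0):
    """Flag points that fall outside a band built from the preceding window.

    Hypothetical helper: the band is rolling mean +/- k standard deviations,
    a simple statistical substitute for a trained model.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = mean(history)
        # Floor the spread so a perfectly flat history does not flag tiny changes.
        spread = max(stdev(history), min_std)
        lower, upper = center - k * spread, center + k * spread
        if not lower <= values[i] <= upper:
            anomalies.append((i, values[i], lower, upper))
    return anomalies

# Illustrative CPU usage (%): steady around 40, one spike to 65, one drop to 5.
cpu = [40, 41, 39, 42, 40, 38, 41, 40, 39, 42, 41, 40, 65, 41, 40, 5]

STATIC_THRESHOLD = 85.0  # assumed value in the 80%-90% range mentioned above
static_alerts = [i for i, v in enumerate(cpu) if v > STATIC_THRESHOLD]
dynamic_alerts = [i for i, *_ in find_anomalies(cpu)]

print("static alerts at indices:", static_alerts)
print("dynamic alerts at indices:", dynamic_alerts)
```

With this sample data the static threshold never fires, while the rolling band flags both the spike to 65% and the drop to 5%. Note that the spike temporarily widens the band for the following readings; a production system would likely account for this and for daily or weekly seasonality.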
A dynamic threshold allows your IT teams to avoid downtime and user experience degradation by informing them of anomalies before they have a business impact.