One of our colleagues, Ben, has recently been using one of our application platforms (Splunk) to demonstrate a potential use of Machine Learning: building (and rebuilding) a predictive model from a large dataset. He is a mathematician by training and knows a good deal about the algorithms now routinely used for this purpose. He pointed out that most of the techniques he used in his demo (see below) are actually fairly old. The Random Forest algorithm, for example, has been around since the mid-1990s. What is relatively recent, however, is the convergence of several phenomena that have made those techniques rather more useful than they once were.
The first is the cost and accessibility of raw computing power. Many of the algorithms categorised as ML are compute-intensive, requiring many iterations of complex calculations. Until relatively recently, hiring a time slot on a supercomputer was the way to get the job done, and supercomputers have always been a scarce resource. Moore’s Law has broadly held, and computing power has increased by a factor of roughly 2^10 (about a thousandfold) in the past 20 years. This has made the basic building blocks of computing, PC-type processors and GPUs, less expensive, which inevitably led to an explosion in clustering them to achieve supercomputer-like performance. You can now rent an Amazon Web Services EC2 instance (other cloud providers are available) with 128 virtual CPUs and 2 TB of memory for less than $20 per hour with no setup fees. On demand.
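To make the arithmetic behind that growth figure concrete, here is a minimal sketch. It assumes the common reading of Moore's Law as one doubling every two years; the function name and the two-year default are illustrative assumptions, not figures from Ben's demo.

```python
def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor if capacity doubles once every doubling_period_years.

    The two-year doubling period is an assumption; other readings of
    Moore's Law use 18 months, which gives a larger factor.
    """
    return 2 ** (years / doubling_period_years)


factor = moores_law_factor(20)  # ten doublings in twenty years
print(f"Growth over 20 years: {factor:,.0f}x")  # prints "Growth over 20 years: 1,024x"
```

Ten doublings give 2^10 = 1,024, which is where the "about a thousandfold" figure comes from.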
Access to low-cost computing power is one part of the problem solved, but you do have to know what to do with it. It is one thing to have a resident maths genius in house to do this kind of work, but another altogether for a Marketing Department to crunch these numbers without access to one. That requires a solution that investigates the data, decides on the most appropriate approach, collates the dataset, cleans it up and then runs a preconfigured ML algorithm, over and over, to create a continually improving model. With an application like Splunk, all but the choice of approach can be made to work with minimal effort.
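The "clean it up, then rerun the algorithm over and over" loop can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how Splunk does it: a simple least-squares line stands in for the preconfigured ML algorithm, and the names `clean`, `fit_line` and `rebuild_models`, along with the sample data, are hypothetical.

```python
def clean(records):
    """Drop records with a missing feature or target value."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept.

    Stand-in for a preconfigured ML algorithm (an assumption for this sketch).
    Needs at least two points with different x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rebuild_models(batches):
    """Refit on all cleaned data seen so far each time a new batch arrives."""
    seen = []
    models = []
    for batch in batches:
        seen.extend(clean(batch))
        models.append(fit_line(seen))
    return models


# Made-up sample data: two batches, each containing one incomplete record.
batches = [
    [(1.0, 2.1), (2.0, None), (3.0, 6.2)],
    [(4.0, 7.9), (None, 3.0), (5.0, 10.1)],
]
for i, (slope, intercept) in enumerate(rebuild_models(batches), start=1):
    print(f"after batch {i}: y = {slope:.2f}x + {intercept:.2f}")
```

Each new batch is cleaned, added to what has already been seen, and the model is rebuilt from scratch. The choice of approach (here, a straight line) is still made by a person, which is the one step the paragraph above says cannot be automated away.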