By Jay Jacobs
Let’s get two things out there up front.
First, when a vulnerability like CVE-2021-44228 (Log4Shell) comes along, there are two reasons people will check a scoring system: either to point out how dumb it is for scoring too low, or to find more support for what everyone already knows, namely that this vulnerability is bad. In other words, for a vulnerability like CVE-2021-44228, there is little benefit from a scoring system like EPSS. Clearly, the vulnerability required (and still requires) immediate response regardless of what any scoring system said. EPSS is more valuable when you don’t know whether something is actively being exploited.
Second, CVE-2021-44228 is in rarefied air among vulnerabilities. Holding up a vulnerability like this isn’t going to represent the other tens of thousands of vulnerabilities that didn’t generate the news articles and tweet storms. Most vulnerabilities do not receive this level of dissection and scrutiny. Vulnerabilities over the last few years have an average of 2 references (URLs); CVE-2021-44228 has over forty. As a test of a prioritization strategy, this vulnerability shows how an outlier is handled, not how a representative example is.
Having set that up, we are now going to (kind of) violate the second point above. We want to use the Log4Shell vulnerability as a use case, but not to gauge how well or poorly EPSS did on it. Instead, we want it as a use case precisely because it was so heavily dissected and is therefore memorable. Many people have intimate knowledge of how reality unfolded in early December of 2021, and they can compare their own knowledge to what EPSS saw and how EPSS handled the information coming in about CVE-2021-44228. Hopefully this will peel back the covers a bit and help build your understanding of what EPSS is and what it is not.
The Exploit Prediction Scoring System (EPSS) is a fully automated vulnerability scoring system that estimates the probability of exploitation activity in the next 30 days. And even though EPSS may be labeled by some as “advanced artificial intelligence”, it may appear at times to be neither advanced nor intelligent. It only knows about what is in the data it’s consuming and parsing on a daily basis. It won’t pick up all the same clues human analysts would find, but it will go through millions of data points faster than any analyst could, and it will discover complex patterns and make connections with similar events more consistently than any human could.
Within EPSS, the value of any single point of information isn’t set in stone. In other words, the variables will have different importance depending on the context and specifics of each vulnerability. There are interaction effects between variables, and the “weight” of a variable depends on other variables. The importance of some variables can change with time, or be influenced by the presence of other attributes. Additionally, the importance of any single value can change over its range. For example, as we will see in a bit, the number of references added to the CVE has a heavy influence early on for this vulnerability, but as more and more references are added, the effect wanes and ends up having no noticeable impact on the score, even though references are still being added.
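To make the idea of interaction effects and waning influence concrete, here is a minimal toy sketch. It is not the EPSS model: the function `toy_score`, the `exploit_code_public` flag, and every constant in it are hypothetical values picked only to illustrate the two behaviors described above.

```python
import math


def toy_score(n_refs: int, exploit_code_public: bool) -> float:
    """Toy probability in (0, 1). Illustration only, NOT the EPSS model."""
    # Saturating signal: each additional reference adds less than the last one,
    # so the curve flattens out as the reference count grows.
    ref_signal = 1.0 - math.exp(-n_refs / 10.0)

    # Interaction: the weight given to the reference signal depends on another
    # variable (a made-up "exploit code is public" flag).
    ref_weight = 3.0 if exploit_code_public else 1.5

    # Hypothetical baseline and flag contribution on the log-odds scale.
    logit = -4.0 + ref_weight * ref_signal
    if exploit_code_public:
        logit += 2.0

    # Logistic function maps log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    for refs in (0, 2, 10, 40, 50):
        print(refs, round(toy_score(refs, False), 4), round(toy_score(refs, True), 4))
```

In this sketch, going from 0 to 10 references moves the score far more than going from 40 to 50, and the same 10 references matter more when the hypothetical flag is set. The real model learns relationships like these from data rather than having them written by hand.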
This is the summary of what EPSS observed and how it reacted to Log4Shell:
Some of the details are missing from that plot, so this is the daily breakdown of what was known about this vulnerability and when:
There is a lot to unpack there, but let’s walk through some key points.
Hopefully this write-up shines more light on what EPSS is and how it works. It’s an automated system that is watching and learning about vulnerabilities and is put into action by the publication of a CVE on the official CVE List. It also has shortcomings and is not intended to be a replacement for human intelligence. As we said, when it comes to a well-researched vulnerability like CVE-2021-44228, there is little benefit from a scoring system like EPSS. Instead, the strength of EPSS lies in the thousands of vulnerabilities that haven’t made the headlines or that may not show up in any top 10 list. It’s collecting and compiling data daily and scoring over 170,000 vulnerabilities with speed and measurable accuracy.
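If you want to look at the daily scores yourself, the sketch below shows one way to pull them with the Python standard library. It assumes the public FIRST endpoint at `https://api.first.org/data/v1/epss` with a `cve` query parameter, and a JSON response whose `data` list holds records with `cve` and `epss` fields; check the current API documentation before relying on either assumption. The helper names `build_url`, `parse_scores`, and `fetch_scores` are ours, not part of any library.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint; verify against the current FIRST API documentation.
EPSS_API = "https://api.first.org/data/v1/epss"


def build_url(cve_ids):
    """Build a query URL for one or more CVE IDs (comma-separated)."""
    query = urllib.parse.urlencode({"cve": ",".join(cve_ids)}, safe=",")
    return f"{EPSS_API}?{query}"


def parse_scores(payload):
    """Map CVE ID -> score, assuming each record has 'cve' and 'epss' fields."""
    return {row["cve"]: float(row["epss"]) for row in payload.get("data", [])}


def fetch_scores(cve_ids, timeout=10.0):
    """Fetch and parse scores over the network (needs internet access)."""
    with urllib.request.urlopen(build_url(cve_ids), timeout=timeout) as resp:
        return parse_scores(json.load(resp))
```

Calling `fetch_scores(["CVE-2021-44228"])` should return a dictionary keyed by CVE ID if the assumptions above hold. Parsing is kept separate from the network call so it can be checked offline.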