Exploit Prediction Scoring System (EPSS)

EPSS vs Log4Shell

By Jay Jacobs

Let’s get two things out there up front.

First, when a vulnerability like CVE-2021-44228 (Log4Shell) comes along, people check a scoring system for one of two reasons: either to point out how dumb it is for scoring too low, or to find more support for what everyone already knows, namely that this vulnerability is bad. In other words, for a vulnerability like CVE-2021-44228, there is little benefit from a scoring system like EPSS. Clearly, the vulnerability required (and still requires) immediate response regardless of what any scoring system said. EPSS is more valuable when you don’t know whether something is actively being exploited.

Second, CVE-2021-44228 is in rarefied air among vulnerabilities. Holding up a vulnerability like this isn’t going to represent the other tens of thousands of vulnerabilities that didn’t generate news articles and tweet storms. Most vulnerabilities do not receive this level of dissection and scrutiny. Vulnerabilities over the last few years have an average of 2 references (URLs); CVE-2021-44228 has over forty. As a test of a prioritization strategy, this vulnerability is about handling an outlier, not a representative example.

Having set that up, we are now going to (kind of) violate the second point above. We want to use the Log4Shell vulnerability as a use case, but not to gauge how well or poorly EPSS did on it. Instead, we want it as a use case precisely because it was so heavily dissected and thus memorable. Many people have intimate knowledge of how reality unfolded in early December of 2021, and they can compare their own knowledge to what EPSS saw and how EPSS handled the information coming in about CVE-2021-44228. Hopefully this will peel back the covers a bit and help build your understanding of what EPSS is and what it is not.

How EPSS works

The Exploit Prediction Scoring System (EPSS) is a fully automated vulnerability scoring system that estimates the probability of exploitation activity in the next 30 days. Even though EPSS may be labeled by some as “advanced artificial intelligence”, it may appear at times to be neither advanced nor intelligent. It only knows about what is in the data it consumes and parses on a daily basis. It won’t pick up all the clues a human analyst would find, but it will go through millions of data points faster than any analyst could, and it will discover complex patterns and make connections with similar events more consistently than any human could.

Within EPSS, the value of any single point of information isn’t set in stone. In other words, the variables will have different importance depending on the context and specifics of each vulnerability. There are interaction effects between variables, so the “weight” of a variable depends on other variables. The importance of some variables can change with time or be influenced by the presence of other attributes. Additionally, the importance of any single variable can change over its range. For example, as we will see in a bit, the number of references added to the CVE has heavy influence early on for this vulnerability, but as more and more references are added the effect wanes, and in this case it ends up with no noticeable impact on the score.
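To make the idea of a context-dependent weight concrete, here is a minimal Python sketch. It is not the EPSS model: the function toy_score, its coefficients and the saturation constant are all made up for illustration. It only shows the two behaviours described above, a variable whose effect flattens over its range and a weight that depends on another variable.

```python
import math


def toy_score(num_references: int, exploit_code_public: bool) -> float:
    """Toy probability-like score. Hypothetical numbers, NOT the EPSS model."""
    # Saturating term: each extra reference adds less than the one before it.
    reference_signal = 1.0 - math.exp(-num_references / 8.0)

    # Interaction: the weight given to references depends on another variable.
    reference_weight = 1.5 if exploit_code_public else 0.5

    # Public exploit code also shifts the baseline upward.
    baseline = 0.0 if exploit_code_public else -2.0

    logit = baseline + reference_weight * reference_signal
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    for refs in (2, 10, 32, 40):
        print(refs, round(toy_score(refs, False), 4), round(toy_score(refs, True), 4))
```

With these made-up numbers, going from 2 to 10 references moves the toy score far more than going from 32 to 40, and the same eight references count for more once exploit code is public.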

The EPSS timeline of CVE-2021-44228

This is the summary of what EPSS observed and how it reacted to Log4Shell:

The score for Log4Shell varied over time

Some of the details are missing from that plot, so this is the daily breakdown of what was known about this vulnerability and when:

Early December:

There was a lot of activity around this vulnerability, but EPSS was completely unaware of it. The official CVE, CVE-2021-44228, was published on December 10th, 2021. EPSS published its daily scores on the morning of December 10th, before the CVE appeared later that day, so those scores contain no reference to it.

December 11, 2021: EPSS 0.355

CVE published with 8 URLs listed as references in the original CVE.
At least one exploit was already identified on GitHub
The scanner Intrigue has a module to scan for this CVE
Based on the descriptive text, it’s tagged as “Remote”, “Code Execution” and “Configuration”

December 12, 2021: EPSS 0.355

The scanner Nuclei added a module to scan for this CVE

December 13, 2021: EPSS 0.384

Two more URLs added as references in CVE (10 total)

December 14, 2021: EPSS 0.30

Nine more URLs added as references in CVE (19 total)
CVE description is modified and tag of “remote” is dropped (this is probably why the score dropped on this day)
CVSS added to NVD, with base_score of 10: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H
Apache added as vendor in CPE information

December 15, 2021: EPSS 0.633

Exploit Code published to ExploitDB
Six more URLs added as references in CVE (25 total)

December 16, 2021: EPSS 0.633

Three more URLs added as references in CVE (28 total)

December 17, 2021: EPSS 0.944

Metasploit added module, “auxiliary/scanner/http/log4shell_scanner”
One more URL added as reference in CVE (29 total)
NVD added Cisco, Debian, Fedora project, Intel, Netapp, Siemens and Sonicwall in CPE information

December 18, 2021: EPSS 0.633

Metasploit module removed
URL references are being added into the CVE without effect on EPSS score

January 12, 2022: EPSS 0.944

Metasploit added module, “exploit/multi/http/log4shell_header_injection”
The score has held at this level for several weeks since
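For readers who want to see the size of each move, the timeline above can be tabulated and the day-over-day changes computed. This is only a sketch: the scores and events are transcribed by hand from the list above, and the names TIMELINE, score_changes and largest_increases are chosen here for illustration.

```python
from datetime import date

# (date, EPSS score, notable events), transcribed from the timeline above.
TIMELINE = [
    (date(2021, 12, 11), 0.355, "CVE published; GitHub exploit; Intrigue module"),
    (date(2021, 12, 12), 0.355, "Nuclei module added"),
    (date(2021, 12, 13), 0.384, "2 references added"),
    (date(2021, 12, 14), 0.300, "'remote' tag dropped; CVSS 10 added; 9 references added"),
    (date(2021, 12, 15), 0.633, "ExploitDB exploit code; 6 references added"),
    (date(2021, 12, 16), 0.633, "3 references added"),
    (date(2021, 12, 17), 0.944, "Metasploit scanner module added"),
    (date(2021, 12, 18), 0.633, "Metasploit module removed"),
    (date(2022, 1, 12), 0.944, "Metasploit exploit module added"),
]


def score_changes(timeline):
    """Return (date, change in score, events) for every entry after the first."""
    changes = []
    for (_, prev_score, _), (day, score, events) in zip(timeline, timeline[1:]):
        changes.append((day, round(score - prev_score, 3), events))
    return changes


def largest_increases(timeline, n=3):
    """Return the n entries with the biggest score increase, largest first."""
    return sorted(score_changes(timeline), key=lambda c: c[1], reverse=True)[:n]


if __name__ == "__main__":
    for day, change, events in score_changes(TIMELINE):
        print(f"{day}  {change:+.3f}  {events}")
```

Sorting by change makes it easy to check which events line up with the biggest increases and which days saw no movement at all.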

There is a lot to unpack there, but let’s walk through some key points.

First, notice something that isn’t in the chart or the list above. There was activity before the CVE was officially published, and none of it registered with EPSS. EPSS scoring is only triggered by CVE publication. This isn’t to suggest that only published CVEs are important. Pre-CVE and non-CVE vulnerabilities matter too, but we are tackling one challenge at a time and iterating from there, and hopefully we can release more and better features in the future.
The publication and sharing of exploit code greatly increases the probability of observing exploitation. All of the big increases in the score follow exploit code being made available. Keep in mind, that isn’t subjective weighting. The EPSS algorithm learns from exploitation activity, and it clearly found a strong relationship between observed exploitation activity and exploit code being published on GitHub, ExploitDB and Metasploit. This also aligns with recent research.
As mentioned above, variable importance isn’t static. Look at the chart below of the number of references in the published CVE over time. But first, understand that overall the typical CVE has about three references listed with it (in the official CVE list), and in the last few years that has dropped to about two. This CVE starts out with 8 references on the first day, and within a week it was up to 19 references. The point here is that each reference doesn’t affect the score by an equal amount. The first few additions have much more influence; after maybe two dozen references, further additions don’t really matter any more. Another example is the EPSS score jump from 0.633 to 0.944 when code is published in Metasploit. Other CVEs with a Metasploit module may not see the same jump of 0.311 points. In this case, exploit code had already been published on both GitHub and ExploitDB, which most likely influenced the weight of appearing in Metasploit. But just keep in mind that the jumps and increases

Growth of references listed in the CVE
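The uneven effect of references can also be read off the timeline itself by isolating the days on which the only recorded change was the reference count. The sketch below assumes the timeline in this post is complete; the "other changes" flag is our reading of that list, and the names OBSERVATIONS and reference_only_days are illustrative.

```python
# (day in December 2021, total references, EPSS score, other changes recorded that day)
OBSERVATIONS = [
    (11, 8, 0.355, True),    # CVE published, GitHub exploit, Intrigue module
    (12, 8, 0.355, True),    # Nuclei module added
    (13, 10, 0.384, False),  # only references changed
    (14, 19, 0.300, True),   # description and tag change, CVSS, CPE vendor
    (15, 25, 0.633, True),   # exploit code on ExploitDB
    (16, 28, 0.633, False),  # only references changed
    (17, 29, 0.944, True),   # Metasploit module, more CPE vendors
]


def reference_only_days(observations):
    """Return (day, references added, score change) for days with no other change."""
    results = []
    for prev, cur in zip(observations, observations[1:]):
        day, refs, score, other_changes = cur
        if not other_changes:
            results.append((day, refs - prev[1], round(score - prev[2], 3)))
    return results


if __name__ == "__main__":
    for day, added, change in reference_only_days(OBSERVATIONS):
        print(f"Dec {day}: +{added} references, score change {change:+.3f}")
```

Only two days qualify, so this is an anecdote rather than a measurement, but it shows the pattern: two references added at a total of 10 coincide with a small increase, while three added at a total of 28 coincide with no change.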

Conclusion

Hopefully this write-up shines more light on what EPSS is and how it works. It’s an automated system that watches and learns about vulnerabilities, and it is put into action by the publication of a CVE on the official CVE List. It also has shortcomings and is not intended to be a replacement for human intelligence. As we said, when it comes to a well-researched vulnerability like CVE-2021-44228, there is little benefit from a scoring system like EPSS. Instead, the strength of EPSS lies in the thousands of vulnerabilities that haven’t made the headlines or that may not show up in any top 10 list. It’s collecting and compiling data daily and scoring over 170,000 vulnerabilities with speed and measurable accuracy.