A recent RAND report, released just two days after Wikileaks published Vault 7, which detailed the CIA’s entire stockpile of vulnerabilities and its suite of cyber tools (also referred to as exploits), seeks to establish a protocol for state intelligence agencies maintaining classified vulnerability stockpiles and to set out the advantages of doing so. However, the report is let down by a poor data set and flawed assumptions. This matters because the strategic competition amongst Russia, China and the United States, as well as other major powers, continues unabated, with major consequences. If US agencies deliberately keep vulnerabilities open, they expose all of us to exploitation by other states and non-state actors.
These exploits vary in use and consequence, and developers are engaged in an active arms race with the hackers who try to undermine the integrity of their software. The consequences range from someone’s system being locked by a third party (ransomware) to the complete failure of critical systems, such as the power grid or the major financial institutions of Wall Street, which could affect millions of people and disrupt a country’s normal functioning.
The RAND report aims to consider when and how the US intelligence community should disclose the software vulnerabilities it has discovered and stockpiled. It targets policy makers and non-experts and aims to provide a baseline that could justify the stockpiling of zero day vulnerabilities (bugs in software that allow some form of manipulation; zero day means they are not known to the public and hence are “alive”). It also considers how long an agency should stockpile them, and provides an initial cost/benefit analysis covering the cost of developing exploits for such vulnerabilities and the lifetime of those exploits.
The effect of the Vault 7 release, along with the news that the Russians and Chinese had probably already had full access to the CIA’s database of vulnerabilities, was to make many people doubt the CIA’s ability to carry out such a cost-benefit analysis in the interests of the vulnerable public.
The RAND report itself fails on many fronts: the premise, the data and its analysis.
The authors of the report follow four avenues of discussion:
The life status of the vulnerability (if it is “alive” and therefore exploitable and unknown to the public, or “dead” therefore known and/or not exploitable).
The longevity of the vulnerabilities (how long they last on average before being found or patched).
Collision rate (how often other researchers find the same vulnerability).
Cost of developing an exploit compared to the cost to counter/patch a vulnerability.
This list misses one critical aspect:
What might the consequences be of leaving vulnerabilities live?
This leaves the discussion incomplete, and the cost-benefit analysis meaningless.
In addition, some of their chosen points of discussion have problems in their definition and use. In particular, they define the life-span of an exploit as the time between a vulnerability being discovered (its ‘birth’) and its being killed. It would be more meaningful to consider the time between the creation of an exploit (the point at which the vulnerability reaches maturity) and its being remedied. This is because a vulnerability can be left unexploited for a long period because a) it is not exploitable, b) it is too costly to exploit or c) there is no interest in exploiting it at the moment.
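The difference between the two definitions can be made concrete with a minimal sketch. The record type, its field names and the dates below are hypothetical, chosen only for illustration; they are not taken from the report.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VulnerabilityRecord:
    """Hypothetical record of the three dates discussed in the text."""
    discovered: date      # "birth": the vulnerability is first found
    exploit_ready: date   # "maturity": a working exploit exists
    died: date            # patched or publicly disclosed

    def lifespan_from_birth_years(self) -> float:
        # Birth-to-death definition, as the article says the report uses.
        return (self.died - self.discovered).days / 365.25

    def lifespan_from_maturity_years(self) -> float:
        # Maturity-to-death definition, as the article proposes.
        return (self.died - self.exploit_ready).days / 365.25


# Invented dates: found in 2010, exploit built in 2013, patched in 2017.
record = VulnerabilityRecord(
    discovered=date(2010, 1, 1),
    exploit_ready=date(2013, 1, 1),
    died=date(2017, 1, 1),
)
birth_years = record.lifespan_from_birth_years()
maturity_years = record.lifespan_from_maturity_years()
print(f"from birth: {birth_years:.1f} years, from maturity: {maturity_years:.1f} years")
```

In this invented case the vulnerability spent three years unexploited, so measuring from birth makes the usable lifetime of the exploit look considerably longer than it was.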
The RAND report calculates the collision rate based only on the overlap between the vulnerabilities known by the data providers and the vulnerabilities known to the public. This fails to account for opposing agencies’ rate of collision (Russian and Chinese secret services for example), which removes the most important players from the picture. The authors themselves admit that their use of public groups as a proxy for hostile agencies is “weak” (page 12 of the report) but do not seem to account for those agencies anywhere in their analysis. Without knowing this collision rate, it is my opinion that the discussion about vulnerability disclosure should be focused on the damage that these vulnerabilities can cause to us if used by other state and non-state actors.
The dataset used in this report is too small (n=207) for safe conclusions, leading to huge standard deviations (which indicate how far the datapoints lie from the mean). Several of the datapoints (roughly 40%) were removed for various reasons, leaving a mere 127 cases for analysis. The data is also inconsistent with regard to time: some of the temporal data was based on recollection. This may be what lies behind the evident discrepancies between their results and expert testimony on the average lifespan of an exploit.
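To illustrate why a small, skewed sample is a shaky basis for an average, here is a minimal sketch using Python's standard statistics module. The lifetimes are invented for illustration and are not the report's data.

```python
import statistics

# Hypothetical exploit lifetimes in years (NOT the report's data).
# Most values are short; two long-lived cases skew the sample.
lifetimes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 12.0, 20.0]

mean = statistics.mean(lifetimes)
median = statistics.median(lifetimes)
spread = statistics.stdev(lifetimes)  # sample standard deviation

# Drop the two longest-lived cases to see how much they drive the mean.
trimmed_mean = statistics.mean(sorted(lifetimes)[:-2])

print(f"mean={mean:.2f} median={median:.2f} stdev={spread:.2f}")
print(f"mean without the two longest cases={trimmed_mean:.2f}")
```

In this invented sample the standard deviation is larger than the mean itself, and removing just two cases cuts the mean by more than half, which is the kind of fragility a dataset of this size invites.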
The collision rate per year of 5.7% appears too low, and fails to consider important third parties working on the same platforms. The average lifetime of an exploit is given as 6.9 years. This seems like a lot until you consider a few things. First, the median (a better measure here, since it is less affected by outlying values) is 5.07 years. Secondly, they consider the lifetime of an exploit from birth to death, rather than from maturity (when an exploit might be used). And finally, some of their temporal data is based on individual recollection, which can skew the results.
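A rough sketch shows how much the picture changes once a second group of discoverers is added to the 5.7% annual figure. It assumes a constant annual rate and independence between years and between groups, which is a simplification; the rate assigned to hostile agencies is purely hypothetical, since no such figure is available.

```python
def cumulative_collision_probability(annual_rate: float, years: float) -> float:
    """Chance of at least one independent rediscovery within `years`,
    assuming a constant annual rate and independence between years."""
    return 1.0 - (1.0 - annual_rate) ** years


REPORTED_ANNUAL_RATE = 0.057    # annual collision rate quoted from the report
AVERAGE_LIFETIME_YEARS = 6.9    # average exploit lifetime quoted from the report

# Collision with public researchers only, over the average lifetime.
public_only = cumulative_collision_probability(
    REPORTED_ANNUAL_RATE, AVERAGE_LIFETIME_YEARS
)

# HYPOTHETICAL: hostile agencies rediscover at 10% per year. This number
# is invented for illustration; the true rate is unknown.
HYPOTHETICAL_AGENCY_RATE = 0.10
combined_annual = 1.0 - (1.0 - REPORTED_ANNUAL_RATE) * (1.0 - HYPOTHETICAL_AGENCY_RATE)
with_agencies = cumulative_collision_probability(
    combined_annual, AVERAGE_LIFETIME_YEARS
)

print(f"public only: {public_only:.0%}, with hypothetical agencies: {with_agencies:.0%}")
```

Even under these simplified assumptions, a modest annual rate compounds to a substantial chance of rediscovery over several years, and any additional group of discoverers raises it further.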
This report does not fulfil the scientific specifications that RAND Corp set for themselves. It lacks a critical approach to the data and is built on assumptions that are inadequate for the issue in question. This approach could easily misinform its target audience of non-experts and policy-makers, cause millions in damage and perpetuate the current lack of understanding.
The report, coming two days after the Vault 7 release, claims that, “No knowledge of leaks from websites like WikiLeaks are responsible for killing vulnerabilities”. And yet the CIA has just had its entire vulnerability stockpile released to the public, a treasure trove of a dataset, and one that could have a huge impact on the vulnerability of critical infrastructure and services. Wikileaks claims that the CIA had lost control of its exploits for some time, which means that the collision rate with the Russians and Chinese has been close to 100% for some time now and that ALL the CIA vulnerabilities contained in the Wikileaks release are now dead.