Society of Information Risk Analysts


  • 2012-03-04 01:22 | Marcin Antkiewicz

    This blog entry was originally written by Jay Jacobs (@jayjacobs), I am just migrating the post to the new SIRA site. 

    With this year’s RSA conference still close in the rear view mirror, I felt I had to write about something that stuck in my mind as I went through the week.  I found repeated confirmation of something Doug Hubbard wrote, “I have never seen a single objection to using risk analysis in any profession that wasn’t based on ignorance of what risk analysis is and what it can do.”  Keep in mind, ignorance isn't meant to be derogatory or insulting.  It simply means a lack of knowledge, or being uninformed.  Many of the people I heard object to risk analysis were incredibly smart in many ways, yet they were presenting objections from a position uninformed about risk analysis.   It makes me wonder if we have to reduce ignorance about this field before we are able to successfully reduce uncertainty around our exposure to loss.  

    One such objection I encountered was during a conversation I had on the first night I was in San Francisco.  After some lively back-n-forth with a colleague on the efficacy of risk management and some healthy skepticism from this person, I received this challenge: “suppose I go into a casino with $100,000, what am I going to come out with?”  The assumption on his part was that analysis would provide a single number (perhaps an average loss) and whatever the answer was, it’d be wrong.  But my response was simply, “what if I produce a distribution?”  And so, for this challenge I have whipped out the first graph.  It is based on some very specific assumptions though.  I made the assumption that the gambler would choose a (single) game for their visit and that they were consistent in their gambling.  Not that it mirrors reality, it just makes this example much easier (and this is just an example).

    Here are the other assumptions:

    • American Roulette (with the “0” and “00”)
    • Bets were one of the red/black, even/odd options (which win 18 times out of 38, about 47.4%, and pay 1 to 1)
    • The gambler played for about 4 hours at a lively pace of 60+ games per hour (for a total of 250 rounds)
    • The gambler consistently wagered $5 per round

    I know these would probably not match reality, but that’s okay.  The assumptions are clear and the analysis could be updated as the assumptions are updated.   Point being, it only matters that the analysis matches the assumptions and that we can update the assumptions (and the model).  The analysis could be redone with the odds from any game, or the analysis could combine multiple games played during a casino visit.  I’m just trying to keep this simple.

    My answer to the original question would start out with “given the assumptions…” and say something like this:

    • The gambler would leave with less money than they started with around 78% of the time
    • About half the time, the gambler would lose more than $70
    • 10% of the time, the gambler would lose more than $170
    • 10% of the time, the gambler would win more than $40

    Since we have to answer similar questions regardless of methods used (everyone makes risk-based statements regardless of what they call it), I challenged my skeptical colleague to answer the question in his way and his answer was simple.  He would advise the gambler to look at the casino itself, because it logically means they take more money than they give out.   While true, that statement carries a large amount of uncertainty and lacks any feedback or ability to learn over time.  We can do better.

    A Better Gambling Story

    Pure games of chance (like roulette) have loads of variability and almost zero uncertainty since it’s in the house’s interest to make the games as unbiased as possible.  This makes them ripe for some simple models and allows us to create a better story.   Plus, telling a gambler that they’ll lose more than they win isn’t very helpful, and may erode (or fail to build) trust, especially when the gambler walks away with money. 

    My solution is to model visits to the casino repeatedly and see how the gambler does (a method known as Monte Carlo).  I set up the model to play Roulette 250 times per visit, betting on the 1 to 1 payout options, and record the offset in cash for the gambler over the visit, then I set the model to run 10,000 times.  Finally, I made a pretty picture (with red, yellow and green of course) that showed the trends over the 250 iterations (left to right).  By looking at this we can get a sense of the individual stories here.  For example, there’s a red line that hovers around $50 early in the visit (around rounds 40-70) and then ends up dipping down for a loss around $200 (sound familiar?).  Overall though, this should help inform the simple roulette gambler.
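
    For readers who want to try this themselves, here is a minimal sketch of the simulation in Python with NumPy. It is not the original model, just one way to implement the stated assumptions (18/38 chance of winning, 250 rounds, $5 per round, 10,000 visits). The variable names and the random seed are my own choices.

```python
import numpy as np

# Assumptions taken from the post: American roulette, even-money bets,
# 250 rounds per visit, $5 per round, 10,000 simulated visits.
P_WIN = 18 / 38      # 18 winning pockets out of 38
ROUNDS = 250
BET = 5
VISITS = 10_000

rng = np.random.default_rng(seed=2012)  # arbitrary seed, only for repeatability

# +BET for a win, -BET for a loss; one row per simulated visit.
outcomes = np.where(rng.random((VISITS, ROUNDS)) < P_WIN, BET, -BET)
paths = outcomes.cumsum(axis=1)   # running cash offset during each visit
final = paths[:, -1]              # cash offset when the gambler leaves

share_losing = (final < 0).mean()
p10, p50, p90 = np.percentile(final, [10, 50, 90])

print(f"Visits ending with a loss: {share_losing:.1%}")
print(f"Median result:             {p50:+.0f}")
print(f"10th / 90th percentile:    {p10:+.0f} / {p90:+.0f}")
```

    Each row of `paths` is one of the "individual stories", so plotting a handful of rows against the round number gives a picture of the kind described above.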

    So, can I tell my colleague exactly how much he’ll walk out of the casino with?  Absolutely not, nobody can.  But given the correct assumptions we can make some statements of probability that reduce the gambler’s uncertainty better than other methods (including the de facto unaided intuition).  This is an important point: it’s not that statistics and math are going to convert lead into gold, but they will be better and more consistent than alternatives.  We cannot lose sight of that.  It will always be possible to poke the models (and some really deserve to be poked), but we should not tear down a solution just to replace it with something worse. 

  • 2011-12-13 01:29 | Marcin Antkiewicz

    This blog entry was originally written by Patrick Florer, I am just migrating the post to the new SIRA site. 

    (This is the first of three posts)

    Most of the people in SIRA have heard of the PERT and BetaPERT distributions.  Many of us use them on a daily basis in our modeling and risk analysis.  For this reason, I think it is important that we understand as much as we can about where these distributions came from, what some of their limitations are, and how they match up to actual data.

    The PERT Distribution:

    The PERT approach was developed more than forty years ago to address project scheduling needs that arose during the development of the Polaris missile system.  With regard to its use in scheduling, we can agree that the passage of time has a linear, understandable nature (leaving quantum mechanics out of the discussion, please) that might be appropriate for estimates of task completion times.  The Polaris missile program probably wasn’t the first big project these people had done (DoD and Booz Allen Hamilton), so we can also assume that the originators of PERT had both experience and data to guide them when they constructed the function and created the math that they did. 

    The BetaPERT distribution was developed by David Vose in order to provide more flexibility in allowing for certainty or lack of certainty in PERT estimates.  Vose added a fourth parameter, called gamma, that impacts the sampling around the most likely value and consequently controls the height of the probability density curve.  As gamma increases, the height of the curve increases, and uncertainty decreases.  As gamma decreases, the opposite happens.  Some people use gamma as a proxy for confidence in the most likely estimate.
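
    To make the effect of gamma concrete, here is a small sketch in Python. It assumes the commonly published modified-PERT parameterization, in which the estimates are mapped onto a Beta distribution with alpha = 1 + gamma x (most likely - min) / (max - min) and beta = 1 + gamma x (max - most likely) / (max - min). The function names and the example estimates (10, 30, 100) are hypothetical and chosen only for illustration.

```python
import math

def betapert_shape(minimum, most_likely, maximum, gamma=4.0):
    """Shape parameters (alpha, beta) of the underlying Beta distribution."""
    span = maximum - minimum
    alpha = 1 + gamma * (most_likely - minimum) / span
    beta = 1 + gamma * (maximum - most_likely) / span
    return alpha, beta

def betapert_mean_sd(minimum, most_likely, maximum, gamma=4.0):
    """Analytic mean and standard deviation of the rescaled Beta distribution."""
    alpha, beta = betapert_shape(minimum, most_likely, maximum, gamma)
    span = maximum - minimum
    mean = minimum + span * alpha / (alpha + beta)
    variance = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, span * math.sqrt(variance)

# Hypothetical estimates: min = 10, most likely = 30, max = 100.
for g in (2, 4, 8, 16):
    mean, sd = betapert_mean_sd(10, 30, 100, gamma=g)
    print(f"gamma={g:>2}: mean={mean:6.2f}, sd={sd:5.2f}")
```

    With gamma = 4 the mean reduces to the classic PERT formula (min + 4 x most likely + max) / 6, and the standard deviation shrinks as gamma grows, which is the "more peaked, less uncertain" behavior described above.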

    For additional information about the PERT and BetaPERT distributions and how to use them, please see the excellent SIRA blog post that Kevin Thompson wrote a few weeks ago.

    (In order to keep things simple, from this point forward, unless there is a reason to make the distinction, I will use PERT to mean both PERT and modified/BetaPERT.)

    What’s the problem?

    Most of us have been taught that PERT distributions are appropriate tools for taking estimates from subject matter experts (SME) and turning them into probability distributions using Monte Carlo simulation.  As many of you know, this is very easy to do.  The graphics and tables look very nice, informative, and even a bit intimidating.

    But how do we really know that the distributions we create have any validity?

    Just because they may have worked in project scheduling, why should we believe that the distribution of loss magnitude, frequency, or anything else actually corresponds to the probability distribution that a PERT function can create?  Even if these distributions are useful and informative, might there be circumstances where we would be better served by not using them?  Or by using other distributions instead?

    I will address three of these issues below.

    In case you don’t feel like reading the whole post, I will tell you right now that:

    1. Yes, there are circumstances where PERT distributions do not yield good information.
    2. In a series of tests with a small data set, PERT distributions DID seem to correspond to reality – closely enough, in my opinion, to be useful, informative, and even predictive.
    3. Depending upon what we are trying to model, there are other distributions that might be even more useful than PERT.
    4. It’s a continual learning process – I want to encourage everyone to keep studying, experimenting, and sharing when possible.

    When the PERT distribution doesn’t work

    One of the assumptions of the PERT approach is that the standard deviation should represent approximately 1/6th of the spread between the minimum and maximum estimates.  When I look at the PERT and BetaPERT math, I can see this at work (for a full explanation, see …).  I have also read, and have demonstrated in my own experience, that PERT doesn’t return useful results when the minimum or maximum are very large multiples of the most likely value.

    For example, try this with OpenPERT or any other tool you like:

    Min = 1

    Most Likely = 100

    Max = 100,000,000

    gamma/lambda = 4

    run for 1,000,000 iterations of Monte Carlo simulation, just to be fanatical about sampling the tail values.

    (BTW, this is not a theoretical example – these data were supplied by a SME as part of a very large risk analysis project I was involved with two years ago – some list members who were involved in that project may remember this scenario.)

    I think that you will find that the mode (if your program calculates one), the median, and the mean are all so much greater than the initial most likely estimate as to be useless.  In addition, I think that you will find that the maximum from the MC simulation is quite a bit lower than the initial maximum estimate.

    Here are my results:

    (Please note – if you do this yourself, you won’t get exactly the same results, but, if you run 1,000,000 iterations, your results should be close)


    Min = 10

    Mode = 40,058,620

    (This is very interesting – in a distribution this skewed, you would expect the mode < median < mean: maybe this is an example of why Vose considers the mode to be uninformative?)

    Median = 12,960,195

    Mean = 16,675,377

    Max = 94,255,562



    Min = 6

    Mode =    ModelRISK doesn’t calculate a Mode – see Vose’s book for the explanation of why not

    Median = 12,923,895

    Mean = 16,654,354

    Max = 93,479,781


    What’s the takeaway here? 

    That there are some sets of estimates that PERT distributions don’t handle very well.  When we encounter large ranges that are highly skewed, we may need to re-think our approach, or ask the SME additional questions.
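
    If you don't have OpenPERT or ModelRISK at hand, the experiment can be approximated with a few lines of Python. This is a sketch, not a reimplementation of either tool: it assumes the usual modified-PERT mapping onto a Beta distribution, and the helper name `sample_betapert` and the seed are my own.

```python
import numpy as np

def sample_betapert(rng, minimum, most_likely, maximum, gamma=4.0, size=1):
    """Draw samples from a modified PERT via a rescaled Beta distribution."""
    span = maximum - minimum
    alpha = 1 + gamma * (most_likely - minimum) / span
    beta = 1 + gamma * (maximum - most_likely) / span
    return minimum + span * rng.beta(alpha, beta, size=size)

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for repeatability

# The SME estimates from the post: min = 1, most likely = 100, max = 100,000,000.
draws = sample_betapert(rng, 1, 100, 100_000_000, gamma=4, size=1_000_000)

print(f"min    = {draws.min():>14,.0f}")
print(f"median = {np.median(draws):>14,.0f}")
print(f"mean   = {draws.mean():>14,.0f}")
print(f"max    = {draws.max():>14,.0f}")
```

    With these inputs alpha is almost exactly 1 and beta almost exactly 5, so the most likely value of 100 has essentially no influence on the shape. The analytic mean is (1 + 4 x 100 + 100,000,000) / 6, roughly 16.7 million, which is in line with the simulated means reported above.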


    To be continued …

  • 2011-12-13 01:27 | Marcin Antkiewicz

    This blog entry was originally written by Patrick Florer, I am just migrating the post to the new SIRA site. 

    (this is the second of three posts)

    Does the shape of PERT distribution match up to an actual data set?

    At the beginning of this post (Part 1), I raised the question about the validity of the PERT function and whether the distributions it creates correspond to anything.

    What follows are the results of an attempt to answer this question using a small data set extracted from a Ponemon Institute report called “Compliance Cost Associated with the Storage of Unstructured Information”, sponsored by Novell and published in May, 2011.  I selected this report because, starting on page 14, all of the raw data are presented in tabular format.  As an aside, this is the first report I have come across that publishes the raw data - please take note, Verizon, if you are reading this!

    Here is a histogram of the 94 actual observations, created using the standard functionality in Excel (Data\Data Analysis\Histogram) and tweaked a bit to show probability instead of frequency.

    As you can see, the histogram is suggestive of a positively-skewed distribution - with some exceptions – there are several peaks and valleys.  What these peaks and valleys mean is unclear – it could simply be observations that are missing – the study size was small:  N = 94 organizations.  Or they could be real – only more observations would tell us.

    At this point, I asked myself – what if the Ponemon study had captured and had published minimum, maximum, and most likely values instead of single point estimates?  If it had, then we could have constructed a more informative histogram.

    In an attempt to simulate what things might have looked like, I took the Ponemon study raw data, computed minimum and maximum values for each of the 94 data points, and then ran a Monte Carlo simulation, using the following parameters:

    Most Likely = the actual reported cost estimate provided by the report.

    Min = Most Likely  x  a random number between 0 and 1

    Max = Most Likely  x  (1 + a random number between 0 and 1)

    gamma/lambda was set to 4 for all.

    Since true minimum and maximum values were not reported by the study, I decided that using a random number as a multiplier to calculate both the minimum and the maximum values seemed as defensible as anything else for the purpose of my simulation.

    I then ran 10,000 iterations of Monte Carlo simulation for each of the 94 BetaPERT functions, which resulted in 940,000 total estimates.  Using 940,000 data points, standard functionality in Excel (Data\Data Analysis\Histogram), and a tweak to show probability instead of frequency, I created the following histogram:
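
    The procedure can be sketched in Python as follows. Two things here are assumptions on my part: the `reported` array holds hypothetical stand-in values rather than the study's 94 observations, and the maximum is computed as Most Likely x (1 + a random number between 0 and 1). The BetaPERT mapping onto a Beta distribution is the commonly published one, not ModelRISK's own code.

```python
import numpy as np

rng = np.random.default_rng(seed=94)  # arbitrary seed, only for repeatability

# Hypothetical stand-ins for reported costs; the study's actual 94 values
# are NOT reproduced here.
reported = np.array([400_000, 900_000, 1_500_000, 2_000_000,
                     2_600_000, 4_800_000], dtype=float)

GAMMA = 4.0
ITERATIONS = 10_000

most_likely = reported
minimum = most_likely * rng.random(reported.size)        # ML x U(0, 1)
maximum = most_likely * (1 + rng.random(reported.size))  # ML x (1 + U(0, 1))

span = maximum - minimum
alpha = 1 + GAMMA * (most_likely - minimum) / span
beta = 1 + GAMMA * (maximum - most_likely) / span

# One row per iteration, one column per observation.
draws = minimum + span * rng.beta(alpha, beta, size=(ITERATIONS, reported.size))

# Pool every draw and turn histogram counts into probabilities.
pooled = draws.ravel()
counts, edges = np.histogram(pooled, bins=50)
probability = counts / counts.sum()

print(f"{pooled.size:,} pooled estimates")
print(f"most probable bin starts at {edges[probability.argmax()]:,.0f}")
```

    With all 94 observations in `reported`, `pooled` would hold the 940,000 estimates described above.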

    This histogram is even more suggestive of a positively-skewed distribution.

    But the same questions remain:  Are the dips and valleys representative of missing observations, or are they real?  And, how well would a BetaPERT function predict the shape of this histogram?  How well would any other probability function perform, for that matter?  And, perhaps most importantly, what, if anything, can we extrapolate about other compliance cost data sets from this one?

    So, it was time for another experiment or two!

    To be continued …

  • 2011-12-13 01:25 | Marcin Antkiewicz

    This blog entry was originally written by Patrick Florer, I am just migrating the post to the new SIRA site. 

    (this is the third post of three)

    Experiment #1 – how well does BetaPERT predict the actual data?

    Using the overall minimum and maximum for the 94 observations, I ran another Monte Carlo simulation using these parameters:

    Min = 0

    Most Likely = 2,000,000 (derived as described below)

    Max = 7,500,000

    gamma/lambda = 4

    Monte Carlo iterations = 100,000

    Excel could not calculate a mode because all 94 values in the Ponemon study were unique.  Using a value for the bins = 76 (0 through 7,500,000 binned by 100,000), I obtained a value of 2,000,000 from the histogram, which I used as the mode/most likely estimate.
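
    The same binning trick is easy to reproduce outside Excel. The sketch below uses a handful of hypothetical values (not the study data) and mimics Excel's histogram convention, where each bin value is an upper bound and a value is counted in the first bin that is greater than or equal to it.

```python
import numpy as np

# Hypothetical cost values, for illustration only.
values = np.array([378_000, 1_250_000, 1_930_000, 1_960_000,
                   1_995_000, 2_400_000, 3_100_000, 7_450_000])

# 76 bin values: 0 through 7,500,000 in steps of 100,000.
edges = np.arange(0, 7_500_001, 100_000)

# Index of the first bin value >= each observation (Excel-style upper bounds).
bin_index = np.searchsorted(edges, values, side="left")
counts = np.bincount(bin_index, minlength=edges.size)

# The bin with the most observations serves as the most likely estimate.
modal_bin = edges[counts.argmax()]
print(f"{edges.size} bins, modal bin = {modal_bin:,}")
```

    The result depends on the bin width, so a different width can move the estimated mode; that is worth keeping in mind when a histogram is the only source for the most likely value.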

    Here is the histogram that was created by ModelRISK, using the VoseModPERT function:

    As you can see, the shape of the histogram created by the BetaPERT function is similar to the histogram for the actual data.  But is it similar enough to be believable?  A comparison of values at various percentiles tells a better story:

    With the exception of the minimum estimate, where the variance is due to using 0 as a minimum estimate instead of 378,000, and the estimates at the 5th and 10th percentiles, the remaining variances are within +/-20%.   In fact, all of the variances between the 1st and 99th percentiles are positive, which means that, up to the 99th percentile, the BetaPERT function has over-estimated the values.

    Is this close enough, especially the 8% underestimate at the 99th percentile?  For me, probably so, because we already know that we are going to have trouble in the tails with any kind of estimate.  But for you, maybe not - you be the judge.

    Experiment #2 – is there another distribution that predicts the actual data more closely than BetaPERT?

    The ModelRISK software has a “distribution fit” function that allows you to select a data set for input and then fit various distributions to the data.  Using the 94 compliance cost values from the Ponemon study as input, I let ModelRISK attempt to fit a variety of distributions to the data.

    The best overall fit was a Gamma distribution, using alpha = 3.668591 and beta = 588093.8.  The software calculated these values – I didn’t do it and would not have known where to start.

    Here is the histogram:

    And here is a comparison based upon percentiles:

    This gamma distribution is an even better fit than the BetaPERT distribution.  Except at the extreme tails, the variances fall within +/- 11% - closer than the +/- 20% of the BetaPERT.

    From a theoretical point of view, this is interesting.  But from a practical point of view, it is problematic.  It’s one thing to fit a distribution to a data set and derive the parameters for a probability distribution.  But it’s quite another matter to know, in advance, which distribution might best predict a data set and what the parameters should be for that distribution.  In addition, the Gamma distribution is typically used for creating distributions that describe the random occurrence of events during a time-frame.  I am not sure how appropriate its use might be to describe a distribution of loss magnitudes or costs – I plan to find out!
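
    To get a feel for what the fitted distribution implies, it can be sampled directly. This sketch uses the alpha and beta values reported above and assumes that beta is a scale parameter (not a rate), which is the reading that puts the mean near 2.16 million and inside the observed cost range. The seed and sample size are arbitrary.

```python
import numpy as np

SHAPE = 3.668591   # alpha reported by the distribution fit
SCALE = 588093.8   # beta reported by the fit, ASSUMED to be a scale parameter

rng = np.random.default_rng(seed=7)  # arbitrary seed, only for repeatability
draws = rng.gamma(shape=SHAPE, scale=SCALE, size=100_000)

print(f"analytic mean  = {SHAPE * SCALE:,.0f}")
print(f"simulated mean = {draws.mean():,.0f}")
for pct in (5, 25, 50, 75, 95, 99):
    print(f"{pct:>2}th percentile = {np.percentile(draws, pct):,.0f}")
```

    Comparing these percentiles against the raw data, or against a BetaPERT built from expert estimates, is one way to check whether the two approaches tell a similar story.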


    Concluding remarks:

    While tests on a single, small dataset do not provide conclusive proof for the ability of the PERT and other distributions to match up to actual data, they do provide encouragement and motivation for further testing.

    It would be useful to perform tests like these on larger datasets.  Perhaps one of you has access to such data?  If so, how about doing some tests and writing a blog post?

    It might also be possible to use the “Total Affected”/Records exposed data from datalossdb to test the ability of BetaPERT to model reality.  I would invite anyone interested to give it a try.

    As we build our experience fitting parametric distributions to different data sets, our knowledge of which distributions to try in which circumstances will surely grow, and lead, hopefully, to more useful and believable risk analyses.

  • 2011-10-20 02:30 | Marcin Antkiewicz

    This blog entry was originally written by Kevin Thompson (@bfist), I am just migrating the post to the new SIRA site. 

    The other day I was thinking about how one might go about expressing the aggregate information risk of an organization. Lately I find myself favoring a balanced scorecard approach to expressing risk, but even after breaking risk scenarios into several groups, you still need to aggregate them in some way. So how would one go about expressing the combined distribution of loss from several loss risk scenarios?

    Let’s make up a story to illustrate. Let’s say that you are going to go to your favorite amusement park in Minnesota. Your mom is going to her favorite amusement park in California and your dear sweet grandmother is going to her favorite amusement park in Florida. Each of you is going to bring $100 with you. You all have an equal risk of being robbed, but your mother and grandmother are so paranoid that they won’t go unless you promise to reimburse them for any robbery they suffer. If any of you are robbed you know that the robber will get at least $1 and no more than $100. So you could express each risk scenario as a uniform distribution with a minimum of $1 and a maximum of $100. But what is your total risk exposure?

    Your total risk exposure is $300 because you have three risks with a maximum loss of $100. But we usually do not want to express what is possible, we want to express what is probable. You might want to express risk out to the 95th percentile or the 99th percentile. The total exposure number is actually quite rare. How rare? There is a 1% chance that one of you will be robbed for $100. Since these events are independent, the probability of two of you being robbed for $100 is 1% of 1% or .01%. The probability of all three of you being robbed for $100 is .0001% or once every million trips to the amusement park. And even that is assuming that all three of you will certainly be robbed!

    The same effect happens on the minimum side of the aggregate risk distribution for the same reason. I ran a Monte Carlo simulation with these three scenarios with 5000 samples. Even after 5000 samples I couldn’t get a minimum value less than 11. In fact, only 10% of the simulations had less than $85 for the aggregate loss. What you’re seeing here is the Central Limit Theorem at work. When we combine three uniform distributions, we get a result that starts to look more like a normal distribution. The graphic here shows the outcome of adding these three loss distributions together. The effect becomes even more pronounced when the distributions aren’t uniform. If we had used a PERT distribution to describe each of the losses with a most likely loss of $30 then the effect is even greater. I never saw a loss less than $22 or more than $228. If you wanted to have enough cash to cover your losses for 95% of the trips to the amusement park, you would only need to have $164.

    Aggregate of three uniform distributions

    So if you’re going to be aggregating the loss from several risk simulations then the tool of choice is not a calculator, but another simulation. The aggregate risk becomes a simulation where each of the inputs is the output of a risk scenario. That way you can more accurately express what is probable rather than what is possible.
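
    Here is a rough sketch of that aggregation in Python, for anyone who wants to repeat it. It follows the story above (three independent losses between $1 and $100, 5000 samples) and, for the PERT variant, assumes the common Beta-based PERT parameterization with a most likely loss of $30. The seed and variable names are my own, so your numbers will differ a little from mine.

```python
import numpy as np

rng = np.random.default_rng(seed=3)  # arbitrary seed, only for repeatability
SAMPLES = 5_000
LOW, LIKELY, HIGH = 1, 30, 100

# Three independent uniform losses per sample, added together.
uniform_total = rng.uniform(LOW, HIGH, size=(SAMPLES, 3)).sum(axis=1)

# Same three losses, each described by a PERT distribution (gamma = 4),
# sampled as a rescaled Beta distribution.
span = HIGH - LOW
alpha = 1 + 4 * (LIKELY - LOW) / span
beta = 1 + 4 * (HIGH - LIKELY) / span
pert_total = (LOW + span * rng.beta(alpha, beta, size=(SAMPLES, 3))).sum(axis=1)

print(f"uniform: min={uniform_total.min():.0f}, "
      f"10th pct={np.percentile(uniform_total, 10):.0f}, "
      f"max={uniform_total.max():.0f}")
print(f"PERT:    min={pert_total.min():.0f}, "
      f"95th pct={np.percentile(pert_total, 95):.0f}, "
      f"max={pert_total.max():.0f}")
```

    The 95th percentile of `pert_total` is the amount of cash that would cover the combined loss on 95% of trips, which is the kind of statement a calculator working only from the three maximums cannot give you.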

  • 2011-04-18 03:31 | Marcin Antkiewicz

    This blog entry was originally written by Bob Rudis (@hrbrmstr), I am just migrating the post to the new SIRA site. 

    All the following newly-minted risk assessment types have been inspired by actual situations. Hopefully you get to stick to just the proper OCTAVE/FAIR/NIST/etc. ones where you practice.

    • HARA :: Half-Assed Risk Assessment — When you are not provided any semblance of potential impact data and a woefully incomplete list of assets, but are still expected to return a valid risk rating.
    • CRA :: Cardassian Risk Assessment — When you are provided the resultant risk rating prior to beginning your risk assessment. (It's a Star Trek reference for those with actual lives)
      "We're going to do x anyway because we don't believe it's a high risk, but go ahead and do your assessment since the Policy mandates that you do one."
    • IRA :: Immediate Risk Assessment — This one has been showcased well by our own Mr. DBIR himself on the SIRA podcasts. A risk assessment question by a senior executive who wants an answer *now* (dammit)! It is often phrased as "Which is more secure, x or y?" or "We need to do z. What's the worst that can happen?". You literally have no time to research and - if you don't know the answer - then "Security" must not be very smart.
    • IRAVA :: In Reality, A Vulnerability Assessment — When you're asked to determine risk when what they are *really* asking for is what the vulnerabilities are in a particular system/app. Think Tenable/Qualys scan results vs FAIR or OCTAVE.
    • IOCAL :: I Only Care About Likelihood — This is when the requester is absolutely fixated on likelihood and believes wholeheartedly that a low likelihood immediately means low risk. Any answer you give is also followed up with "have we ever had anything like x happen in the past?" and/or "have our competitors been hit with y yet?"
    • AD3RA :: Architecture Design Document Disguised As A Risk Assessment — When you are given all (and decent) inputs necessary to complete a pretty comprehensive risk assessment but are then asked to include a full architecture design document on how to mitigate them all. The sad truth is, the project team couldn't get the enterprise architects (EA) to the table for the first project commit stage, but since you know enough about the technologies in play to fix the major problems, why not just make you do the EA dept's job while you are just wasting time cranking out the mandatory risk assessment.
    • WDRA :: Wikipedia Deflected Risk Assessment — When you perform a risk assessment, but a manager or senior manager finds data on Wikipedia that they use to negate your findings. (Since - as we all know - Wikipedia is the sum of all correct human knowledge).

    If you are also coerced into performing an insane risk assessment that doesn't fit these models, feel free to share them in the comments.

  • 2011-01-14 01:32 | Marcin Antkiewicz

    This blog entry was originally written by Jay Jacobs (@jayjacobs), I am just migrating the post to the new SIRA site. 

    I am hoping that we, the participants, onlookers and critics of SIRA, will do something.

    I’m not expecting anything spectacular, flashy or impressive but I would like something – because something is missing. The problem is, I can’t identify what exactly that something should be because I can’t fathom it yet. I look around at how information risk management is being done now and I see people struggling — not just struggling to implement good security, but also struggling to prioritize or worse, not recognizing when to stop.

    There is a tipping point in information security when we will tip from not spending enough resources to spending too much. From what I can see, we have no idea when that occurs and we may be flip-flopping between the two already. Because the problem is, we struggle to define what effective security looks like. Take a handful of security experts, and ask them individually about the efficacy of say, anti-virus, network segmentation or passwords and listen to the words. Just soak in the variety of opinions and then ask each one a critical follow up question: “and how do you know?” The point isn’t who’s right or wrong, but just to acknowledge the variety of what constitutes efficacious security (yeah, that phrase just happened). Now multiply that variety by the number of people in an infosec role and we will see some really interesting emerging trends.

    If we lived in a world lacking an evidence-based approach, we’d get a bunch of frameworks from really impressive-sounding groups saying they’ve got the answer. We’d see a flood of similar-but-unique “best” practices and standards that claim to work everywhere. To top it off, we would be overwhelmed with failures (and successes) in all shapes and sizes and we’d struggle to recognize their significance. Meaning, if we lived in a world lacking an evidence-based approach, we would be slaves to the loudest voice, the scariest story, or the catchiest magazine article because those shape our perception and information security is currently based entirely on conventional wisdom.

    Now, I’m not looking for SIRA to solve the problem of efficacious security (that was the last time, promise). Don’t get me wrong, I’d take it, but I’ve seen too many attempted leaps to the end goal end up dying on a forgotten network share. That is not what I’m hoping for. What I am hoping for is to simply move things forward, even a little. Perhaps we can improve communication techniques, apply some relevant statistical method in a new way or figure out some way to inch this profession forward.

    In short, I want us to create something to build on.

    To that end, SIRA will be trying to create spaces for building: monthly conference calls, a mailing list for discussions, an open blog where thoughts can be formalized and a journal where ideas can be researched and communicated. And oh yeah, we’re doing a podcast because Alex Hutton says some funny stuff. All I’d ask of folks is to show up and to not be afraid to speak up, ask questions and even look silly from time to time (and show respect for those looking silly). The important thing here is to have those conversations and do something, even if that something turns out to be finding all the ways not to proceed… because we can build on that.


©2010-2023 Society of Information Risk Analysts, a 501(c)(3) non-profit organization.
