Blog

  • 2011-12-13 01:25 | Marcin Antkiewicz (Administrator)

    This blog entry was originally written by Patrick Florer; I am just migrating the post to the new SIRA site.

    (this is the third post of three)

    Experiment #1 – how well does BetaPERT predict the actual data?

    Using the overall minimum and maximum for the 94 observations, I ran another Monte Carlo simulation using these parameters:

    Min = 0

    Most Likely = 2,000,000 (derived as described below)

    Max = 7,500,000

    gamma/lambda = 4

    Monte Carlo iterations = 100,000

    Excel could not calculate a mode because all 94 values in the Ponemon study were unique.  Binning the data into 76 bins (0 through 7,500,000 in steps of 100,000), I obtained a value of 2,000,000 from the histogram, which I used as the mode/most likely estimate.
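
    For readers without ModelRISK, the simulation above can be approximated with a short Python sketch. It assumes the common modified-PERT parameterization (a Beta distribution rescaled to [min, max], with gamma/lambda as the shape weight); whether VoseModPERT uses exactly this form is an assumption here, and the function names are made up for illustration. The actual 94 Ponemon values are not reproduced, so the sketch only generates the simulated side of the comparison.

```python
import random
import statistics


def pert_samples(minimum, most_likely, maximum, lam=4.0, n=100_000, seed=1):
    """Draw n samples from a BetaPERT (modified PERT) distribution.

    Assumed parameterization: Beta(alpha, beta) on [0, 1], rescaled to
    [minimum, maximum], with
        alpha = 1 + lam * (most_likely - minimum) / (maximum - minimum)
        beta  = 1 + lam * (maximum - most_likely) / (maximum - minimum)
    """
    rng = random.Random(seed)
    span = maximum - minimum
    alpha = 1.0 + lam * (most_likely - minimum) / span
    beta = 1.0 + lam * (maximum - most_likely) / span
    return [minimum + span * rng.betavariate(alpha, beta) for _ in range(n)]


def percentiles(samples, wanted=(1, 5, 10, 25, 50, 75, 90, 95, 99)):
    """Return {percentile: value} from the 99 cut points of the sample."""
    cuts = statistics.quantiles(samples, n=100)
    return {p: cuts[p - 1] for p in wanted}


# Parameters taken from the post: min 0, most likely 2,000,000,
# max 7,500,000, gamma/lambda 4, 100,000 iterations.
samples = pert_samples(0, 2_000_000, 7_500_000, lam=4, n=100_000)
for p, value in percentiles(samples).items():
    print(f"{p:>2}th percentile: {value:>12,.0f}")
```

    The percentile table this prints can then be set beside the percentiles of whatever observed data you have, which is the comparison the experiment is built around.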

    Here is the histogram that was created by ModelRISK, using the VoseModPERT function:

    As you can see, the shape of the histogram created by the BetaPERT function is similar to the histogram for the actual data.  But is it similar enough to be believable?  A comparison of values at various percentiles tells a better story:

    With the exception of the minimum estimate, where the variance is due to using 0 as a minimum estimate instead of 378,000, and the estimates at the 5th and 10th percentiles, the remaining variances are within +/-20%.   In fact, all of the variances between the 1st and 99th percentiles are positive, which means that, up to the 99th percentile, the BetaPERT function has over-estimated the values.

    Is this close enough, especially the 8% underestimate at the 99th percentile?  For me, probably so, because we already know that we are going to have trouble in the tails with any kind of estimate.  But for you, maybe not - you be the judge.

    Experiment #2 – is there another distribution that predicts the actual data more closely than BetaPERT?

    The ModelRISK software has a “distribution fit” function that allows you to select a data set for input and then fit various distributions to the data.  Using the 94 compliance cost values from the Ponemon study as input, I let ModelRISK attempt to fit a variety of distributions to the data.

    The best overall fit was a Gamma distribution, using alpha = 3.668591 and beta = 588093.8.  The software calculated these values – I didn’t do it and would not have known where to start.
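
    As a rough illustration of what the fitted distribution implies, the sketch below samples a Gamma distribution with the reported parameters using only the Python standard library. It assumes that beta is the scale parameter (so the mean is alpha times beta), which is how Python's random.gammavariate interprets it; whether ModelRISK reports beta the same way is an assumption. It does not reproduce the fitting step itself, since the 94 source values are not included in the post.

```python
import random
import statistics

ALPHA = 3.668591  # shape, as reported in the post
BETA = 588093.8   # assumed to be the scale parameter


def gamma_samples(alpha, beta, n=100_000, seed=1):
    """Draw n samples from Gamma(shape=alpha, scale=beta)."""
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, beta) for _ in range(n)]


samples = gamma_samples(ALPHA, BETA)
cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points

# Closed-form mean and mode of a Gamma distribution with shape > 1.
print(f"theoretical mean: {ALPHA * BETA:,.0f}")
print(f"theoretical mode: {(ALPHA - 1) * BETA:,.0f}")
for p in (1, 5, 10, 25, 50, 75, 90, 95, 99):
    print(f"{p:>2}th percentile: {cuts[p - 1]:>12,.0f}")
```

    Note that a Gamma distribution has no upper bound, so some simulated values will exceed the 7,500,000 maximum used for the BetaPERT run.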

    Here is the histogram:

    And here is a comparison based upon percentiles:

    This gamma distribution is an even better fit than the BetaPERT distribution.  Except at the extreme tails, the variances fall within +/- 11% - closer than the +/- 20% of the BetaPERT.

    From a theoretical point of view, this is interesting.  But from a practical point of view, it is problematic.  It’s one thing to fit a distribution to a data set and derive the parameters for a probability distribution.  But it’s quite another matter to know, in advance, which distribution might best predict a data set and what the parameters should be for that distribution.  In addition, the Gamma distribution is typically used for creating distributions that describe the random occurrence of events during a time-frame.  I am not sure how appropriate its use might be to describe a distribution of loss magnitudes or costs – I plan to find out!

     

    Concluding remarks:

    While tests on a single, small dataset do not provide conclusive proof for the ability of the PERT and other distributions to match up to actual data, they do provide encouragement and motivation for further testing.

    It would be useful to perform tests like these on larger datasets.  Perhaps one of you has access to such data?  If so, how about doing some tests and writing a blog post?

    It might also be possible to use the “Total Affected”/Records exposed data from datalossdb to test the ability of BetaPERT to model reality.  I would invite anyone interested to give it a try.

    As we build our experience fitting parametric distributions to different data sets, our knowledge of which distributions to try in which circumstances will surely grow, and lead, hopefully, to more useful and believable risk analyses.


  • 2011-10-20 02:30 | Marcin Antkiewicz (Administrator)

    This blog entry was originally written by Kevin Thompson (@bfist); I am just migrating the post to the new SIRA site.

    The other day I was thinking about how one might go about expressing the aggregate information risk of an organization. Lately I find myself favoring a balanced scorecard approach to expressing risk, but even after breaking risk scenarios into several groups, you still need to aggregate them in some way. So how would one go about expressing the combined distribution of loss from several loss risk scenarios?

    Let’s make up a story to illustrate. Let’s say that you are going to go to your favorite amusement park in Minnesota. Your mom is going to her favorite amusement park in California and your dear sweet grandmother is going to her favorite amusement park in Florida. Each of you is going to bring $100 with you. You all have an equal risk of being robbed, but your mother and grandmother are so paranoid that they won’t go unless you promise to reimburse them for any robbery they suffer. If any of you are robbed you know that the robber will get at least $1 and no more than $100. So you could express each risk scenario as a uniform distribution with a minimum of $1 and a maximum of $100. But what is your total risk exposure?

    Your total risk exposure is $300 because you have three risks with a maximum loss of $100 each. But we usually do not want to express what is possible; we want to express what is probable. You might want to express risk out to the 95th percentile or the 99th percentile. The total exposure number is actually quite rare. How rare? If losses come in whole dollars, a robbery takes the full $100 just 1% of the time. Since these events are independent, the probability of two of you being robbed for $100 is 1% of 1%, or 0.01%. The probability of all three of you being robbed for $100 is 0.0001%, or once in every million trips to the amusement park. And even that assumes that all three of you will certainly be robbed!
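
    The arithmetic in this paragraph can be checked exactly with a few lines of Python. The sketch assumes whole-dollar losses from $1 to $100, each equally likely, and that every person is certain to be robbed; the function name is made up for illustration.

```python
from fractions import Fraction

# Chance that a single robbery takes exactly $100, assuming whole-dollar
# losses from $1 to $100 that are all equally likely.
P_MAX_LOSS = Fraction(1, 100)


def prob_all_lose_max(people):
    """Probability that `people` independent robberies each take the full $100."""
    return P_MAX_LOSS ** people


for people in (1, 2, 3):
    p = prob_all_lose_max(people)
    print(f"{people} robbed of $100: {float(p):.4%} (1 in {p.denominator:,})")
```

    Using Fraction keeps the result exact, so the "one in a million" figure falls straight out of the denominator rather than from floating-point rounding.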

    The same effect happens on the minimum side of the aggregate risk distribution for the same reason. I ran a Monte Carlo simulation of these three scenarios with 5,000 samples. Even after 5,000 samples I couldn’t get a minimum value less than $11. In fact, only 10% of the simulations had less than $85 for the aggregate loss. What you’re seeing here is the Central Limit Theorem at work. When we combine three uniform distributions, we get a result that starts to look more like a normal distribution. The graphic here shows the outcome of adding these three loss distributions together. The effect becomes even more pronounced when the distributions aren’t uniform. If we had used a PERT distribution to describe each of the losses, with a most likely loss of $30, the effect would be even greater: I never saw a loss less than $22 or more than $228. If you wanted to have enough cash to cover your losses for 95% of the trips to the amusement park, you would only need to have $164.

    Aggregate of three uniform distributions

    So if you’re going to be aggregating the loss from several risk simulations then the tool of choice is not a calculator, but another simulation. The aggregate risk becomes a simulation where each of the inputs is the output of a risk scenario. That way you can more accurately express what is probable rather than what is possible.
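
    Here is a minimal sketch of that kind of aggregate simulation, using only the Python standard library. It assumes whole-dollar uniform losses for the first run and, for the second, a BetaPERT with minimum $1, most likely $30, maximum $100 and the conventional shape weight of 4 (the post does not state the weight, so that value is an assumption). The function names are made up for illustration, and exact figures will vary with the random seed.

```python
import random
import statistics


def simulate_aggregate(draw_loss, scenarios=3, n=5000, seed=1):
    """Sum `scenarios` independent losses per trial and return the n totals."""
    rng = random.Random(seed)
    return [sum(draw_loss(rng) for _ in range(scenarios)) for _ in range(n)]


def uniform_loss(rng):
    # Whole-dollar loss from $1 to $100, each amount equally likely.
    return rng.randint(1, 100)


def pert_loss(rng, low=1.0, mode=30.0, high=100.0, lam=4.0):
    # BetaPERT as a rescaled Beta draw; lam=4 is an assumed default.
    alpha = 1.0 + lam * (mode - low) / (high - low)
    beta = 1.0 + lam * (high - mode) / (high - low)
    return low + (high - low) * rng.betavariate(alpha, beta)


def summarize(label, totals):
    cuts = statistics.quantiles(totals, n=100)  # 99 percentile cut points
    print(f"{label}: min={min(totals):.0f} p10={cuts[9]:.0f} "
          f"p95={cuts[94]:.0f} max={max(totals):.0f}")


uniform_totals = simulate_aggregate(uniform_loss)
pert_totals = simulate_aggregate(pert_loss)
summarize("uniform", uniform_totals)
summarize("PERT", pert_totals)
```

    Each trial draws one loss per scenario and adds them, so the output of the individual scenarios becomes the input of the aggregate, and the percentiles of the totals describe what is probable rather than what is merely possible.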


  • 2011-04-18 03:31 | Marcin Antkiewicz (Administrator)

    This blog entry was originally written by Bob Rudis (@hrbrmstr); I am just migrating the post to the new SIRA site.

    All the following newly-minted risk assessment types have been inspired by actual situations. Hopefully you get to stick to just the proper OCTAVE/FAIR/NIST/etc. ones where you practice.

    • HARA :: Half-Assed Risk Assessment — When you are not provided any semblance of potential impact data and a woefully incomplete list of assets, but are still expected to return a valid risk rating.
    • CRA :: Cardassian Risk Assessment — When you are provided the resultant risk rating prior to beginning your risk assessment. (It's a Star Trek reference for those with actual lives)
      "We're going to do x anyway because we don't believe it's a high risk, but go ahead and do your assessment since the Policy mandates that you do one."
    • IRA :: Immediate Risk Assessment — This one has been showcased well by our own Mr. DBIR himself on the SIRA podcasts. A risk assessment question by a senior executive who wants an answer *now* (dammit)! It is often phrased as "Which is more secure, x or y?" or "We need to do z. What's the worst that can happen?". You literally have no time to research and - if you don't know the answer - then "Security" must not be very smart.
    • IRAVA :: In Reality, A Vulnerability Assessment — When you're asked to determine risk when what they are *really* asking for is what the vulnerabilities are in a particular system/app. Think Tenable/Qualys scan results vs FAIR or OCTAVE.
    • IOCAL :: I Only Care About Likelihood — This is when the requester is absolutely fixated on likelihood and believes wholeheartedly that a low likelihood immediately means low risk. Any answer you give is also followed up with "have we ever had anything like x happen in the past?" and/or "have our competitors been hit with y yet?"
    • AD3RA :: Architecture Design Document Disguised As A Risk Assessment — When you are given all (and decent) inputs necessary to complete a pretty comprehensive risk assessment but are then asked to include a full architecture design document on how to mitigate them all. The sad truth is, the project team couldn't get the enterprise architects (EA) to the table for the first project commit stage, but since you know enough about the technologies in play to fix the major problems, why not just make you do the EA dept's job while you are just wasting time cranking out the mandatory risk assessment.
    • WDRA :: Wikipedia Deflected Risk Assessment — When you perform a risk assessment, but a manager or senior manager finds data on Wikipedia that they use to negate your findings. (Since - as we all know - Wikipedia is the sum of all correct human knowledge).

    If you are also coerced into performing an insane risk assessment that doesn't fit these models, feel free to share it in the comments.


  • 2011-01-14 01:32 | Marcin Antkiewicz (Administrator)

    This blog entry was originally written by Jay Jacobs (@jayjacobs); I am just migrating the post to the new SIRA site.

    I am hoping that we, the participants, onlookers and critics of SIRA, will do something.

    I’m not expecting anything spectacular, flashy or impressive but I would like something – because something is missing. The problem is, I can’t identify what exactly that something should be because I can’t fathom it yet. I look around at how information risk management is being done now and I see people struggling — not just struggling to implement good security, but also struggling to prioritize or worse, not recognizing when to stop.

    There is a tipping point in information security when we will tip from not spending enough resources to spending too much. From what I can see, we have no idea when that occurs and we may be flip-flopping between the two already. Because the problem is, we struggle to define what effective security looks like. Take a handful of security experts, and ask them individually about the efficacy of say, anti-virus, network segmentation or passwords and listen to the words. Just soak in the variety of opinions and then ask each one a critical follow-up question: “and how do you know?” The point isn’t who’s right or wrong, but just to acknowledge the variety of what constitutes efficacious security (yeah, that phrase just happened). Now multiply that variety by the number of people in an infosec role and we will see some really interesting emerging trends.

    If we lived in a world lacking an evidence-based approach, we’d get a bunch of frameworks from really impressive-sounding groups saying they’ve got the answer. We’d see a flood of similar-but-unique “best” practices and standards that claim to work everywhere. To top it off, we would be overwhelmed with failures (and successes) in all shapes and sizes and we’d struggle to recognize their significance. Meaning, if we lived in a world lacking an evidence-based approach, we would be slaves to the loudest voice, the scariest story, or the catchiest magazine article because those shape our perception and information security is currently based entirely on conventional wisdom.

    Now, I’m not looking for SIRA to solve the problem of efficacious security (that was the last time, promise). Don’t get me wrong, I’d take it, but I’ve seen too many attempted leaps to the end goal end up dying on a forgotten network share. That is not what I’m hoping for. What I am hoping for is to simply move things forward, even a little. Perhaps we can improve communication techniques, apply some relevant statistical method in a new way or figure out some way to inch this profession forward.

    In short, I want us to create something to build on.

    To that end, SIRA will be trying to create spaces for building: monthly conference calls, a mailing list for discussions, an open blog where thoughts can be formalized and a journal where ideas can be researched and communicated. And oh yeah, we’re doing a podcast because Alex Hutton says some funny stuff. All I’d ask of folks is to show up and to not be afraid to speak up, ask questions and even look silly from time to time (and show respect for those looking silly). The important thing here is to have those conversations and do something, even if that something turns out to be finding all the ways not to proceed… because we can build on that.



© Society of Information Risk Analysts 2018, a 501(c)6 non-profit organization.
