
Should We Trust your Statistics?

And, not to pick on you, but speaking of definitions, “collage” according to Webster is: “an artistic composition made of various materials (such as paper, cloth, or wood) glued on a surface.” “College” is: “an independent institution of higher learning offering a course of general studies leading to a bachelor’s degree,” or “an organized body of persons engaged in a common pursuit or having common interests or duties.” I hope you’re not paying a lot of tuition to teach your kids how to glue pictures to cardboard (I’d buy “it’s a typo” except you used it twice, and that impacts the forcefulness of the point you’re making; if you misused “collage” when you meant “college,” does that also call into question the accuracy of your statistics?).
Elrod

Yes, it does call into question the accuracy of my statistics.

I do make mistakes. This is one of the reasons I often put citations in my articles. That is so you can check my work.

It is also why I put my math in documents. I don’t just give you a number. I tell you how I got to that number. I.e. I show my work.

$P(A) = \frac{f}{N}$

Where P(A) is the probability of an event (A) occurring, f the frequency of the event, and N is the total number of occurrences.

So if there is a 1 in 5 chance, the probability is $\frac{1}{5} = 0.200$.

The probability that events A and B both happen is P(A and B)=P(A)×P(B), provided the two events are independent.

Using De Morgan’s Law, we know that NOT (A or B) is equal to (NOT A) and (NOT B). When addressing the question of rape, we are looking for the probability of a woman NOT being raped in year 1 AND not being raped in year 2, and so forth. Thus, if the probability of being raped while in college is 1 in 4, that means that we have P(NOT (rape(y1) or rape(y2) or rape(y3) or rape(y4))) = 3/4 = 0.75. Here y1 through y4 represent the years at college. We are assuming a four-year college.

P(rape(yN)) is fixed at some value, for the sake of argument and ease of calculation.

$P(\text{not rape}(Y))^4 = 0.75$

$P(\text{not rape}(Y)) = \sqrt[4]{0.75}$

$P(\text{not rape}(Y)) = 0.930604859$

Now we know the probability of a woman not being raped, per year, while in college, and we can restate it as the probability of a woman being raped. That is simply $1 - 0.930604859$, or 0.06939514. Converting to a percentage, that gives us a 6.94% chance of a woman being raped per year at college.

We want to convert this to a per capita rate per 100K. This is simply multiplying the per-year probability (0.06939514) by 100,000, which gives us 6939 per 100,000 women attending college.
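If you want to rerun that arithmetic yourself, here is a minimal Python sketch of the steps above. The function and variable names are made up for illustration, and the sketch leans on the same simplifying assumptions as the text: every year carries the same risk, and the years are independent of each other.

```python
def annual_probability(cumulative_probability: float, years: int) -> float:
    """Per-year probability of an event, given its probability over several years.

    Assumes the per-year probability is the same every year and that
    the years are independent (the same assumptions used in the text).
    """
    # Probability the event never happens in any of the years.
    p_never = 1.0 - cumulative_probability
    # P(never) = P(not in one year) ** years, so take the years-th root.
    p_not_per_year = p_never ** (1.0 / years)
    # Flip it back to the probability of the event in a single year.
    return 1.0 - p_not_per_year


# The "1 in 4 over four years of college" claim.
p_year = annual_probability(0.25, 4)
per_100k = p_year * 100_000

print(f"per-year probability: {p_year:.8f}")
print(f"per 100,000 women:    {per_100k:.1f}")
```

If the printed values do not match the figures in the paragraphs above, one of us made a mistake, and that is exactly the kind of thing worth checking.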

You can verify the formulas used at How To Calculate Probability: Formula, Examples and Steps, Indeed Career Guide, (last visited Aug. 4, 2024).

So what about the other direction? I used two sources. One was found by searching for “rapes per capita by state” and the other for “rapes per capita by country”. The by-state source gave a value for the US of 40 per 100k. The by-country source gave 41.77 per 100k. That is close enough to 40 that I chose to use 40 per 100k as “good enough”.
Rape Statistics by Country 2024, (last visited Aug. 4, 2024)

Using 40/100000 gives us P(rape(Y))=0.0004. This gives the probability of not being raped as 0.9996. Using our formula for multiple occurrences and using a 50-year span, we get $0.9996^{50} = 0.9802$. This means that the probability of a woman being raped over the course of 50 years is 0.0198 or 1.98%.
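The same check works in this direction. Below is a small Python sketch, again with illustrative names only, that takes a per-year rate and compounds it over a number of years. It assumes the 40 per 100k rate applies unchanged and independently in each of the 50 years, which is a simplification made for ease of calculation.

```python
def cumulative_probability(annual_probability: float, years: int) -> float:
    """Probability an event happens at least once over a span of years.

    Assumes the same per-year probability every year and independent years.
    """
    # Probability the event does not happen in any single year, compounded.
    p_never = (1.0 - annual_probability) ** years
    # At least once is the complement of never.
    return 1.0 - p_never


annual_rate = 40 / 100_000          # 40 per 100k, as a probability
p_50_years = cumulative_probability(annual_rate, 50)

print(f"probability over 50 years: {p_50_years:.4f}")
print(f"as a percentage:           {p_50_years * 100:.2f}%")
```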

As Elrod stated, this all depends on your definition of rape. Definitions matter. As an example, in some countries, like the UK, it is not a murder unless the person is convicted of murder. So, again as Elrod said, a man with 6 bullet holes in the back of his head is just a dead person, not a murder victim, until and unless a person is convicted of the crime.

Rape is much the same. Different places have different definitions. In particular, the US statistics I used were “forcible rape”. This has a better definition than just the word “rape”.

All of the above is just to get to the following paragraph.

I struggle with dyslexia. The result of this is that once I type a word, it always looks correct to me. Or almost always. Spell checkers go a long way to fixing simple misspellings. I have to work to misspell a word.

I also pay for a plugin called LanguageTool. This does grammar analysis as well. Unfortunately, if the word I am using is grammatically correct, LanguageTool often does not catch my errors.

In the course of an article, I will expect between 10 and 100 error corrections. I apologize for those that get through.

Here is a pair of words that I hope you do not struggle with: sweet and sweat. One of those words means a nice thing to eat, filled with yummy, sugar-like flavor. The other is what happens when you exercise.

I don’t think you want me to give you a sweat tart on Halloween.

I believe I have that correct, but I would have to look up the word in a dictionary in order to double-check it.

So please, if I make a mistake, call me on it. If I don’t give you the references, it is likely because I didn’t bother to click the buttons to make a citation; I was lazy. Call me on it.


Comments

6 responses to “Should We Trust your Statistics?”

  1. pobodys nerfect….. as in the other blog we all followed, if I get the gist of what is being said I don’t nitpick over spelling n such. the world is coming apart at the seams. people are 1 billionth of an inch from goin full on phsycho at the slightest thing… misspelling of words is the least of our worries..

  2. The phrase ” Lies, damn lies and statistics” exists for good reason.

  3. Elrod

    I didn’t mean to cause such angst, but there’s “accuracy” and then there’s ACCURACY, and I’ve been known to be pedantic over stuff, word choice and mathematical computations in particular, that some think is just not that important.

    The Left, in particular, has so thoroughly corrupted language – quite deliberately, in fact – that it’s difficult to accept nearly anything that comes without very clear specifications because that corruption of language has percolated through the entire spectrum. It’s impossible to read what one might suppose is a “correct and true” account when even the non-Left adopts squishy words and their attendant inaccurate definitions.

    Where I come from the phrase “I do not have those data” carries a great deal of weight, and applies to more than just not knowing, it also applies to not knowing enough; .03937 is inches per millimeter to five decimal places, and when dealing with a machinist that is nearly always more than sufficient accuracy because the 7 is followed by two zeroes; in my former world ten decimal places – .0393700787 – wasn’t enough because reversing the math gives you .9999999999…. millimeter, not 1.0. When dealing with measurements in fractional microns – one millionth of a meter – it’s better to do everything in metric rather than trying to convert because the conversion will be less than perfectly accurate and the opportunity for an error is too great, which is why when the conversion formula used is omitted it behooves the recipient to redo the computation to be sure (I am thankful that I retired when we were still working mostly in microns and before the complete migration to nanometers occurred; my “old” digital micrometers that displayed .00001 increments – one hundred thousandth – are, today, about in the same category as Home Depot C clamps because the finer dimensions of today cannot be measured mechanically, they must be measured optically).

    In carpentry there’s “leave the line” and “take the line” referring to the fineness of the cut, which is the width of the pencil line marking it which is usually about 1/32 of an inch, or a bit more than 1/64 if the pencil is very sharp. That’s excessively accurate for framing but falls short in cabinet and furniture work, which is why those artisans “cut to the line” and plane for the fit (precision automation helps a great deal in achieving that fit but a very critical eye can easily tell the difference between that and fine hand work).

    Language, especially spoken language, is riddled with inconsistencies and opportunities for error; “I love you” can be reassuring if whispered in someone’s ear, or if forced between clenched teeth, a dire warning, and nearly everyone is aware of email’s limitations in conveying tone that is easily expressed verbally. English, in particular, has built-in sloppiness; “I have read the red book but didn’t read it completely while I sat among the reeds.” Written, it requires knowing the difference, spoken, the difference is inconsequential.

    A couple days ago I was reading an article online and partway through had to go back to the beginning because it suddenly struck me that I wasn’t sure what the writer was trying to convey – he misused “straight” and “strait” (“straight” from Webster is “free from curves, bends, angles, or irregularities,” or “lying along or holding to a direct or proper course or method” and today, also refers to sexual proclivities; “strait” is “a comparatively narrow passageway connecting two large bodies of water”). The writer had used “straight” when he actually meant “strait.” A “strait” can also be “straight,” geographically speaking, but the particular manner of the writer’s misuse called into question the veracity of his entire point; did he really fully understand what he was trying to convey, or was he “doing a Kamala?”

    In comments (above) curby decries obsessing over mispellings, and most of the time he’s right, it’s no big deal (it does, however, convey information about the writer), but the difference between a line that doesn’t curve and a water passage between lakes is one letter – the “g” – and if one is navigating with map and compass (or chart and compass) that minute difference could turn out to be important.

    1. Chris

      No real angst. I happen to agree with you.

      If the writer is making errors then we should be verifying what they are saying. Even if they are not, trust but verify.

      Though I would like to clarify, in your example, would a minute of arc really make a difference?

      I really do want you (singular) to continue to hold me to the highest standards. I want you (plural) to feel free to question my opinions, sources, analysis.

      Thank you

  4. Elrod

    “Would a minute of arc really make a difference?”

    Well, it made one Hell of a difference to Donald Trump and the country on June 13th.

    But, I’m not sure to exactly what you’re referring to RE: minute of arc. If it’s a “straight strait” then probably not, but they are two different words with different definitions.

  5. Elrod

    And, speaking of mistakes….I wrote “June 13th” above when the event occurred on July 13 (once it gets hot here in the summer all the months look alike….)

    Mea culpa.