## Sunday, 12 November 2017

### Lies, damned lies and,,,,,,,

Many years ago, as a snot-nosed student, I undertook a course in statistics. It was a compulsory course designed to help in my further research as a professional biologist. I'm sure you have heard the expression: 'lies, damned lies, and statistics'. But statistics is just a tool and can be used for good or ill. Where patterns need to be divined from large, complex data sets, or where we need to decide whether a particular finding is significant or just random 'noise', then stats can be a wonderful, nay beautiful tool.

During the whole of my professional career, I have never had to calculate any statistical method. I simply input my data into a stats programme and press a couple of buttons. If I need advice concerning the statistical tool required in any given situation then I consult the department's statistician for expert guidance. As an aside, our statistician is a strange old cove. He sports long hair and a beard and wanders about the building barefoot. What his designated health and safety officer has to say about the matter, I have no idea. In essence, the statistician is an ageing hippie and a caricature. He has a Che Guevara poster on the wall next to a 'ban the bomb' sign.

The point I'm trying to make in my characteristic and rambling way is that I don't have to be a statistical 'wunderkind' to apply stats. Although, perhaps I should have been more assiduous in my studies. My son's girlfriend has a stats degree and now works in a big bank crunching numbers for investment portfolios. At 28 she earns twice as much as the Flaxen Haired One and receives an annual bonus. I'm starting to digress.

This rather lengthy introduction is just a means to set the background for today's topic: Bayes' theorem. Bayesian statistics has applications in a wide set of disciplines and is even used intuitively by everyone in everyday decisions.

Prosaically stated: Bayes' theorem describes how an estimate of risk/probability starts from a base of existing knowledge/data and is then revised as new information is layered on, which may lower or raise the risk. From a prior premise, additional information can be added sequentially, thus altering the odds or final outcome. In a way, this is how we make decisions in real-life scenarios, although not as mathematically precise. For those of a mathematical inclination, I've placed Bayes' formula below.

Bayes' theorem is stated mathematically as the following equation:

$$P(A\mid B)=\frac{P(B\mid A)\,P(A)}{P(B)},$$

where $A$ and $B$ are events and $P(B)\neq 0$.
• $P(A)$ and $P(B)$ are the probabilities of observing $A$ and $B$ without regard to each other.
• $P(A\mid B)$, a conditional probability, is the probability of observing event $A$ given that $B$ is true.
• $P(B\mid A)$ is the probability of observing event $B$ given that $A$ is true.
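For readers who prefer buttons to algebra, here is a minimal sketch of the formula in Python. The function name `posterior` and all of the numbers fed into it are my own invention, chosen purely for illustration and not drawn from any real data set. It assumes that $P(B)$ can be expanded as $P(B\mid A)\,P(A) + P(B\mid \text{not } A)\,P(\text{not } A)$, and it shows the sequential layering of information by feeding the first answer back in as the new prior.

```python
from fractions import Fraction


def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) using Bayes' theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * (1 - P(A)).
    """
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * prior / p_b


# Hypothetical inputs, for illustration only.
prior = Fraction(1, 10)            # P(A) before any evidence
p_b_given_a = Fraction(4, 5)       # P(B|A)
p_b_given_not_a = Fraction(1, 5)   # P(B|not A)

# First piece of evidence B arrives.
first = posterior(prior, p_b_given_a, p_b_given_not_a)

# A second, independent piece of the same kind of evidence arrives:
# yesterday's posterior becomes today's prior.
second = posterior(first, p_b_given_a, p_b_given_not_a)

print(first)   # 4/13
print(second)  # 16/25
```

Using `Fraction` rather than floating point keeps the odds exact, so the result reads as '4 in 13' rather than a string of decimals.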

I'll just outline a practical example which has relevance to my line of work.
The carrier frequency for the disorder cystic fibrosis (CF) is 1 in 20 in the North Western European population. Carriers are just that and are free from the disease. The chance of two carriers coming together to produce issue is 1/20 x 1/20, which equals 1 in 400. As CF is a recessive condition, the chance of any such couple having a CF-affected child is 1/4, so the overall risk is 1/4 x 1/400, which equals 1 in 1,600. This is a risk based solely on population data. How could this risk be modified? Perhaps in a particular instance, we ascertain that a relative, say a grandparent, was affected by CF. Now we can factor in this information to produce a modified risk for this individual. If we do the maths we will see that the risk will be higher than the population risk. I will not reveal the mathematical reasoning here; however, it will be interesting to see if any of my readers can be arsed to work it out. As my audience demographic leans (sometimes totters) toward the smarter end of the spectrum, it will be interesting to see whether someone will rise to the challenge.
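The population-only arithmetic can be checked with a few lines of Python. This is just a sketch of the baseline figure, using the 1 in 20 carrier frequency quoted above and assuming that the two parents' carrier statuses are independent; the variable names are mine. The grandparent adjustment is deliberately left out, as that remains the reader's homework.

```python
from fractions import Fraction

# Carrier frequency quoted in the text for the North Western European population.
carrier_frequency = Fraction(1, 20)

# Chance that both parents are carriers, assuming their statuses are independent.
both_carriers = carrier_frequency * carrier_frequency

# For a recessive condition, two carriers have a 1 in 4 chance of an affected child.
affected_given_both_carriers = Fraction(1, 4)

population_risk = both_carriers * affected_given_both_carriers

print(both_carriers)    # 1/400
print(population_risk)  # 1/1600
```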

In our everyday life, we use Bayesian stats to further refine our judgement; as previously stated, not in any precise mathematical way, but in a subliminal and perhaps invisible way. But still, it is there. If we are prudent in our decisions, we base them on incoming data. We then modify our response accordingly. This is the essence of Bayes' theorem. If we are logical in our thought processes, which consistently we are not, then maybe we can make sense, sometimes at least, of our insistent and chaotic world. Please do not hold your breath.
