Statistical Distributions: A Brief Introduction


What is a Probability Distribution?

A distribution is a collection of probabilities associated with the possible outcomes of a random experiment. A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence. Distributions are broadly classified as discrete or continuous, depending on the type of variable they describe.
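
As a small illustration of the "table" form of a probability distribution, the sketch below stores outcomes and their probabilities in a dictionary. The fair six-sided die is an assumption chosen only for this example; it does not come from the article.

```python
from fractions import Fraction

# A discrete probability distribution as a table: outcome -> probability.
# The fair six-sided die is an illustrative assumption.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all outcomes must sum to 1.
total = sum(die.values())

# The probability of an event (a set of outcomes) is the sum of the
# probabilities of the outcomes it contains, e.g. rolling an even number.
p_even = sum(p for face, p in die.items() if face % 2 == 0)

print(total)   # 1
print(p_even)  # 1/2
```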


What is a Binomial Distribution?

Binomial Distribution is used when the number of trials is fixed and the trials are independent of each other. Each trial has only two possible outcomes (success or failure), and the probability of success is the same in every trial. It is used in defect analysis in industry. The underlying sequence of trials is also known as a Bernoulli process.


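The mathematical expression for the same is given below:

$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \dots, n$$

Here n is the number of trials, p is the probability of success in each trial, mean = np and standard deviation = sqrt(np(1 - p)).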
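
A minimal sketch of the binomial probability, mean and standard deviation, using only the Python standard library. The function name `binomial_pmf` and the inspection scenario (10 items, 10% defect rate) are hypothetical values chosen for illustration.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical scenario: inspect 10 items, each defective with probability 0.1.
n, p = 10, 0.1

mean = n * p                          # np
std_dev = math.sqrt(n * p * (1 - p))  # sqrt(np(1 - p))

# Probability of finding exactly 2 defective items.
p_two_defects = binomial_pmf(2, n, p)

# Sanity check: the probabilities over x = 0..n sum to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

print(f"mean = {mean:.2f}, std dev = {std_dev:.4f}")
print(f"P(X = 2) = {p_two_defects:.4f}")
```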

What is a Poisson Distribution?

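Poisson Distribution is a discrete distribution. The number of trials here is not fixed. Instead, it gives the probability of a given number of events occurring in a fixed interval of time or space, when events occur independently at a constant average rate. It is used in defect analysis. Eg: Number of cars arriving on a highway in a given time period, number of defective pixels in an image.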


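The mathematical expression for the same is given below:

$$P(X = x) = \frac{e^{-m} m^{x}}{x!}, \quad x = 0, 1, 2, \dots$$

Here m is the average number of events per interval, mean = m and standard deviation = sqrt(m).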
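
A minimal sketch of the Poisson probability P(X = x) = e^(-m) m^x / x!, using only the Python standard library. The function name `poisson_pmf` and the rate of 3 cars per minute are hypothetical values chosen for illustration.

```python
import math

def poisson_pmf(x: int, m: float) -> float:
    """P(X = x) = e^(-m) * m^x / x! for X ~ Poisson(m)."""
    return math.exp(-m) * m**x / math.factorial(x)

# Hypothetical scenario: on average 3 cars arrive per minute.
m = 3.0

p_zero = poisson_pmf(0, m)                              # no cars in a minute
p_at_most_5 = sum(poisson_pmf(x, m) for x in range(6))  # 5 cars or fewer

# Numerical check that mean and variance both equal m
# (terms beyond x = 60 are negligible for m = 3).
mean = sum(x * poisson_pmf(x, m) for x in range(60))
variance = sum((x - mean) ** 2 * poisson_pmf(x, m) for x in range(60))

print(f"P(X = 0) = {p_zero:.4f}")
print(f"P(X <= 5) = {p_at_most_5:.4f}")
print(f"mean = {mean:.4f}, std dev = {math.sqrt(variance):.4f}")
```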

What is a Normal Distribution?

It is the most widely used distribution in the field of statistics and data analysis. It is also known as a Gaussian Distribution. It is a continuous distribution and has a bell-shaped curve.
Here the mean, median and mode coincide. The distribution is symmetric about its mean. The tails of the distribution approach the X-axis on both sides without ever touching it.


Here approximately 68% of the data lies within 1 standard deviation of the mean, 95% lies within 2 standard deviations of the mean, and 99.7% lies within 3 standard deviations of the mean.
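
The 68%, 95% and 99.7% figures can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `within` is a hypothetical name introduced for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def within(k: float) -> float:
    """Probability that a normal value lies within k std dev of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} std dev: {within(k):.4f}")
# within 1 std dev: 0.6827
# within 2 std dev: 0.9545
# within 3 std dev: 0.9973
```

Because any normal distribution can be rescaled to the standard one, the same proportions hold for every mean and standard deviation.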

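The mathematical expression for the same is given below:

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$$

Here \mu is the mean and \sigma is the standard deviation. The expression is called the Normal Probability Density Function. The Normal Distribution owes much of its importance to the central limit theorem, which states that the mean of a large number of independent, identically distributed observations with finite variance is approximately normally distributed.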
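
A minimal sketch that evaluates the Normal probability density function directly from its formula. The function name `normal_pdf` and the sample parameters are hypothetical choices for illustration.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma**2))

# Illustrative parameters: standard normal (mean 0, std dev 1).
mu, sigma = 0.0, 1.0

peak = normal_pdf(mu, mu, sigma)           # the curve is highest at the mean
left = normal_pdf(mu - 1.5, mu, sigma)     # symmetric about the mean:
right = normal_pdf(mu + 1.5, mu, sigma)    # left and right values match
far_tail = normal_pdf(mu + 6.0, mu, sigma) # tiny but never exactly zero

print(f"f(mean) = {peak:.4f}")
print(f"f(mean - 1.5) = {left:.4f}, f(mean + 1.5) = {right:.4f}")
```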

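To compare values from different normal distributions, a value is standardized so that it follows the Standard Normal Distribution, which has mean 0 and standard deviation 1. The z score formula used for this is given below:

$$z = \frac{x - \mu}{\sigma}$$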
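
A minimal sketch of standardization with the z score z = (x - mean) / std dev, using `statistics.NormalDist` from the Python standard library. The function name `z_score` and the exam-score numbers are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical scenario: exam scores with mean 70 and std dev 10.
mu, sigma = 70.0, 10.0

z = z_score(85.0, mu, sigma)

# Proportion of scores below 85, looked up on the standard normal curve.
p_below = NormalDist().cdf(z)

# The same probability computed on the original (unstandardized) scale.
p_below_direct = NormalDist(mu, sigma).cdf(85.0)

print(f"z = {z:.2f}")                  # z = 1.50
print(f"P(X < 85) = {p_below:.4f}")    # P(X < 85) = 0.9332
```

Both routes give the same probability, which is why a single standard normal table is enough for any mean and standard deviation.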

Summary!

This article focused on three important distributions that are widely used in the field of data science: Binomial, Poisson and Normal. Each distribution applies under different conditions. Of the three, the Normal Distribution is the most widely used in practice.





