Getting Started With Statistics
Why Statistics?
Business statistics is the science of good decision making, and it can be
applied across many disciplines and work areas. In today’s world, data has
become an essential part of a company’s growth. There is a great deal of
valuable data that can be processed into meaningful insights, and there is also
large data storage capacity, with data that can be processed using cloud and
parallel computing.
What are the Methods in Statistics?
There are three methods in Statistics:
- Classification- This method is used when data is classified into different buckets or units
- Pattern Recognition- This method looks for shapes and data forms that commonly recur, and is used to separate pattern from noise. Eg: Histogram, Boxplot, Scatter plot etc.
- Association- This method is used to measure correlation and to see the relationship between variables
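As an illustration of the Association method, here is a minimal sketch that computes Pearson's correlation coefficient between two variables. The function name `pearson_correlation` and the ad spend and sales figures are made up for this example; they are not from any real data set.

```python
import math


def pearson_correlation(xs, ys):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: sales here are exactly 2 * ad_spend + 5
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson_correlation(ad_spend, sales)
print(f"Pearson r = {r:.3f}")
```

A value of r close to +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. Correlation on its own does not show that one variable causes the other.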
What are the Four Pillars of Statistics?
There are basically two types, Descriptive and Inferential. Inferential is
further divided into Diagnostic, Predictive & Prescriptive, which gives four pillars in all:
- Descriptive Statistics- Statistics that tell the story and give the as-is picture of the data
- Diagnostic Statistics- Statistics that explain why the numbers and data look the way they do
- Predictive Statistics- Statistics that describe what the data will look like in the future
- Prescriptive Statistics- Statistics that determine what should be recommended after analysis
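To make the first pillar concrete, the sketch below builds a small descriptive summary with Python's standard `statistics` module. The `daily_visits` numbers are hypothetical, and the other three pillars would need further modelling that is not shown here.

```python
import statistics

# Hypothetical data: website visits on seven consecutive days
daily_visits = [120, 135, 150, 110, 160, 145, 130]

# Descriptive statistics: the "as-is" picture of the data
summary = {
    "count": len(daily_visits),
    "mean": statistics.mean(daily_visits),
    "median": statistics.median(daily_visits),
    "stdev": statistics.stdev(daily_visits),  # sample standard deviation
    "min": min(daily_visits),
    "max": max(daily_visits),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```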
Population- It is the universe of possible data for a specified object, which
is not observed in full. Eg: People who have visited or will visit a website
Parameter- It is a numerical value associated with a population, which is not
observed. Eg: Average amount of time people spend on a website
Sample- It is a selection of observations from a population, which is
observed. Eg: People who have visited a website on a specific day
Statistic- It is a numerical value associated with a sample, computed from the
observed values. Eg: Average amount of time people spend on a website on a
specific day.
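The relationship between these four terms can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is invented (time on site drawn from an exponential distribution with a mean of 5 minutes), and in practice the population and its parameter are not observed, which is why we estimate them from a sample.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: minutes spent on a website by 10,000 visitors.
# In real work this full list is not available to us.
population = [random.expovariate(1 / 5.0) for _ in range(10_000)]

# Parameter: a numerical value describing the whole population
parameter = statistics.mean(population)

# Sample: the visitors we actually observed (200 drawn without replacement)
sample = random.sample(population, 200)

# Statistic: the same numerical value computed from the sample only
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f} minutes")
print(f"sample mean (statistic):     {statistic:.2f} minutes")
```

The statistic will usually be close to the parameter but not equal to it, and it changes from sample to sample.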
What are the Data Sources?
There are two major sources:
- Primary Data- Data that is collected first-hand, often continuously, and is therefore fresh
- Secondary Data- Data that already exists, archived or published
What are the types of Data?
Qualitative Data- Data which cannot be measured but can be represented using
labels or codes. Eg: Gender, Religion, Place of Birth. It is of three types:
- Nominal- Names of things. Eg: Hair color, zip codes
- Binary- Nominal data with only two states (0 and 1). It is further divided into Symmetric Binary (both outcomes equally important, eg: Gender) and Asymmetric Binary (outcomes not equally important, eg: Medical Test)
- Ordinal- Values have a meaningful order, but the magnitude between successive values is not known. Eg: Grades, Army Rankings
Quantitative Data- Data that can be measured. It is further divided into
Discrete and Continuous, and its values are measured on an Interval or Ratio scale
- Discrete- Data that can take only certain separated values, with gaps between them. It arises from a counting process. Eg: No of rooms in a hotel
- Interval- Data measured on a scale of equal-sized units, with values that have order but no true zero point. Eg: Temperature in °C or °F, Calendar Date
- Ratio- Data that has an inherent zero point, so we can speak of one value as a multiple of another. Eg: Temperature in K, Monetary Quantities
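The difference between the Interval and Ratio scales can be seen with temperature. The sketch below assumes Celsius as the interval-scale example (the helper `celsius_to_kelvin` and the two readings are made up for illustration): differences agree on both scales, but only the Kelvin ratio is meaningful, because Kelvin has an inherent zero point.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


# Hypothetical readings
c_low, c_high = 10.0, 20.0
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)

# Differences are meaningful on both scales and agree with each other
diff_c = c_high - c_low
diff_k = k_high - k_low

# Ratios: 20 °C is NOT "twice as hot" as 10 °C, since 0 °C is an arbitrary zero
ratio_c = c_high / c_low  # 2.0, but not physically meaningful
ratio_k = k_high / k_low  # about 1.035, the meaningful ratio

print(f"difference: {diff_c:.2f} °C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```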
What are the types of Data Sets?
There are 4 types of data sets:
- Record- Eg: Matrix, Transaction Data, Relational Records, Cross tabs
- Graph and Network- Eg: Molecular Structures
- Ordered- Eg: Video data, Time Series
- Spatial, Image & Multimedia