Statistics
Data
=> Sample: the data collected
=> Population: all possible data
Types of data
=> Quantitative
   · Continuous (temperature, height)
   · Discrete (number of students)
=> Qualitative
Statistics
=> Descriptive: describe a sample or population.
=> Inferential: estimate from a sample to a larger population.
Descriptive measures
=> Population (parameters): Mean ($\mu$), Variance ($\sigma^2$), Standard deviation ($\sigma$)
=> Sample (estimators): Mean ($\bar{x}$), Variance ($s^2$), Standard deviation ($s$)
How data comes
1) Raw
2) Stem and leaf
3) Frequency distribution
4) Grouped, cumulative frequency distribution
Some Definitions
↳ Variable
↳ Frequency
↳ Mean ($\bar{x}$)
Variables
=> Categorical (qualitative)
↳ Nominal: no inherent rank; values can only be compared for equality
↳ Ordinal: with an inherent rank (mild/moderate/severe) or (greater/less)
=> Quantitative
↳ Interval: ordered with meaningful differences; no true zero; valid operations (+, -); e.g. calendar dates
↳ Ratio: interval properties + true zero; valid operations (+, -, ×, /); e.g. age, weight, height, temp in K
Data presentation
↳ Graphical: Bar diagram, Histogram, Pie chart, Box-plot
↳ Numerical: stem and leaf
Measures of central tendency
↳ Mean - median - mode
1) Mean ($\bar{x}$)
   $\bar{x} = \frac{\sum X_i}{n}$
   For a frequency distribution: $\bar{x} = \frac{\sum f X_i}{n}$
   * Sum of deviations of the values from the mean is zero
   * Sensitive to extreme values
   * The best average for symmetrical frequency distributions that have a single peak
   * Use it unless the data is badly skewed
   * Needs interval or ratio data
2) Median
   * The middle value after we order the numbers
   * Ex: 2 3 3 4 5, the median is 3
   * Not sensitive to extreme values
   * Used mostly with skewed (nonsymmetrical) data; can be used for ordinal data
3) Mode: The value that is observed most frequently
   ↳ For grouped frequency distribution: (lower term + upper term) / 2 of the interval with the highest frequency
Percentiles
   $P_k$: the value that k% of the data lie below
   Position of $P_k$ = (k/100) × n
   ↳ Answer is a fraction: take the next integer
   ↳ Answer is a whole number: add 0.5 (average of that value and the next one)
   For grouped frequency (when you have intervals):
   1- find C.F (cumulative frequency) and URB (upper real boundary)
   2- if (k/100) × n matches a C.F in the table: use its URB as it is
   3- if the C.F you got doesn't have a URB: use similar triangles (interpolate between the two nearest boundaries)
   Quartiles: $Q_1 = P_{25}$, $Q_2 = P_{50}$ = median, $Q_3 = P_{75}$
Range: $X_{max} - X_{min}$ (biggest value - smallest value)
   * Rough measure of spread
   * Sensitive to extreme values
   * Influenced by sample size: larger sample, larger range
Interquartile range (IQR) = $Q_3 - Q_1$
   * Covers the middle 50% of the data
   * Not sensitive to extreme values
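The position rule for percentiles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `percentile` is hypothetical, k is taken to be an integer with 0 < k < 100, the data is non-empty, and "add 0.5" is read as averaging the value at the whole-number position with the next one. Other textbooks and libraries use different quartile conventions, so results may differ from theirs.

```python
import statistics

def percentile(data, k):
    """k-th percentile (integer 0 < k < 100) using the position rule above."""
    values = sorted(data)
    n = len(values)
    # position = (k/100) * n, kept as integer part and remainder to avoid float error
    whole, remainder = divmod(k * n, 100)
    if remainder:
        # fraction: take the next integer position (1-based whole + 1 -> index whole)
        return values[whole]
    # whole number: average the value at that position and the next one
    return (values[whole - 1] + values[whole]) / 2

data = [2, 3, 3, 4, 5]  # the median example used in the notes
q1, q2, q3 = (percentile(data, k) for k in (25, 50, 75))

print(statistics.mean(data), statistics.median(data), statistics.mode(data))
print(q1, q2, q3, q3 - q1, max(data) - min(data))  # -> 3 3 4 1 3
```

Here n = 5, so the positions 1.25, 2.5 and 3.75 are all fractions and each one moves up to the next integer position. With four values such as 1, 2, 3, 4, the position for the median is exactly 2, so the sketch averages the 2nd and 3rd values.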
Outliers
- Any value less than $Q_1 - 1.5 \times IQR$
OR
- More than $Q_3 + 1.5 \times IQR$
where $IQR = Q_3 - Q_1$
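A small sketch of the outlier rule, assuming the quartiles have already been found by whatever method the course uses. The function name `find_outliers` and the sample values are made up for illustration.

```python
def find_outliers(data, q1, q3):
    """Return the values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Illustrative data only; Q1 = 10 and Q3 = 14 are supplied by hand
sample = [1, 10, 11, 12, 13, 14, 30]
print(find_outliers(sample, q1=10, q3=14))  # -> [1, 30]
```

With IQR = 4 the fences sit at 4 and 20, so 1 and 30 fall outside them.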
[Box plot: Min, 25th percentile, median, Max, with fences $[Q_1 - 1.5\,IQR,\ Q_3 + 1.5\,IQR]$]
Measures of Variability
Variance: Average of the squares of the deviations of the observations from their mean
$s^2 = \frac{(x_1 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$
OR
$s^2 = \frac{\sum x^2}{n - 1} - \frac{(\sum x)^2}{n(n - 1)}$
* Sensitive to extreme values
* Best with symmetrical data
Standard Deviation: square root of the variance
$s = \sqrt{s^2}$
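To check that the definition and the shortcut form of the sample variance agree, here is a minimal sketch. The function names and the data are illustrative assumptions, not part of the notes.

```python
import math

def variance_definition(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """Shortcut form: sum(x^2)/(n-1) - (sum x)^2 / (n(n-1))."""
    n = len(xs)
    return sum(x * x for x in xs) / (n - 1) - sum(xs) ** 2 / (n * (n - 1))

data = [2, 3, 3, 4, 5]
s2 = variance_definition(data)

# Both forms should match up to floating-point rounding
assert math.isclose(s2, variance_shortcut(data))
print(s2, math.sqrt(s2))  # variance and standard deviation
```

For this data the sum of squared deviations is 5.2, so the variance is about 1.3 and the standard deviation is its square root.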
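Variance for
① Grouped frequency distribution
② Frequency distribution
Steps:
1- midpoint $m = \frac{\text{lower} + \text{upper}}{2}$ (for an ungrouped frequency distribution use the value $x$ itself)
2- mean $\bar{x} = \frac{\sum f x}{n}$
3- deviation value: $(x - \bar{x})$, then square it: $(x - \bar{x})^2$
4- multiply by frequency: $f (x - \bar{x})^2$
5- sum all deviations: $s^2 = \frac{\sum f (x - \bar{x})^2}{n - 1}$
Shortcut formula:
$s^2 = \frac{\sum f x^2}{n - 1} - \frac{(\sum f x)^2}{n(n - 1)}$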
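The grouped-data steps can be sketched as follows, assuming each class is given as a (lower, upper) pair with its frequency and that the class midpoint stands in for every value in the class. The function names and the example table are hypothetical.

```python
import math

def grouped_variance(intervals, freqs):
    """Variance of grouped data using class midpoints and the n - 1 divisor."""
    mids = [(lower + upper) / 2 for lower, upper in intervals]  # step 1
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n          # step 2
    # steps 3 to 5: deviation, square, multiply by frequency, sum
    total = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids))
    return total / (n - 1)

def grouped_variance_shortcut(intervals, freqs):
    """Shortcut form: sum(f x^2)/(n-1) - (sum f x)^2 / (n(n-1))."""
    mids = [(lower + upper) / 2 for lower, upper in intervals]
    n = sum(freqs)
    sum_fx = sum(f * m for f, m in zip(freqs, mids))
    sum_fx2 = sum(f * m * m for f, m in zip(freqs, mids))
    return sum_fx2 / (n - 1) - sum_fx ** 2 / (n * (n - 1))

# Illustrative table: three classes with frequencies 2, 3, 5 (n = 10)
intervals = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]

s2 = grouped_variance(intervals, freqs)
assert math.isclose(s2, grouped_variance_shortcut(intervals, freqs))
print(s2)  # 610 / 9, roughly 67.78
```

The midpoints are 5, 15 and 25, the mean is 18, and the weighted squared deviations add up to 610.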
Deciles and Percentiles
Deciles: if data is ordered and divided into 10 parts, then the cut points are called deciles.
Percentiles: if data is ordered and divided into 100 parts, then the cut points are called percentiles.
Coefficient of variation = $\frac{s}{\bar{x}} \times 100$
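As a quick illustration of the coefficient of variation, the sketch below uses Python's standard `statistics` module. The helper name `coefficient_of_variation` and the data are assumptions for the example.

```python
import statistics

def coefficient_of_variation(xs):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(xs) / statistics.mean(xs) * 100

data = [2, 3, 3, 4, 5]
print(round(coefficient_of_variation(data), 1))  # CV in percent
```

Because the result is a percentage with no units, it can be used to compare the spread of data sets measured on different scales.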
Five Number Summary
↳ min, $Q_1$, $Q_2$, $Q_3$, max
Linear transform "Coding"
Let $y = ax + b$
(Range, IQR, $s$, $\bar{x}$)
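The notes do not spell out how coding changes each measure, so the sketch below only checks the standard results numerically: the mean becomes a·x̄ + b, while the standard deviation and the range are multiplied by |a| and ignore b. The values of `a`, `b` and the data are arbitrary choices for illustration.

```python
import math
import statistics

xs = [2, 3, 3, 4, 5]
a, b = 2, 10                   # arbitrary coding constants
ys = [a * x + b for x in xs]   # y = ax + b

# The mean follows the transform exactly
assert math.isclose(statistics.mean(ys), a * statistics.mean(xs) + b)

# Spread measures are scaled by |a| and are unaffected by b
assert math.isclose(statistics.stdev(ys), abs(a) * statistics.stdev(xs))
assert max(ys) - min(ys) == abs(a) * (max(xs) - min(xs))

print(statistics.mean(ys), statistics.stdev(ys), max(ys) - min(ys))
```

The same scaling by |a| applies to the IQR, since quartiles are themselves data values that move with the transform.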
Shape of the Distribution
Normal (symmetrical)
- Mean = Median = Mode ($\bar{x} = Q_2 = \text{Mode}$)
Skewed
- The mean is not at the center
- Skew to right: $\bar{x} > Q_2 > \text{Mode}$
- Skew to left: $\bar{x} < Q_2 < \text{Mode}$
* Pearson's skewness coefficient = $\frac{\text{mean} - \text{mode}}{SD}$ or $\frac{3(\text{mean} - \text{median})}{SD}$
* Value > 0.2 or < -0.2 indicates severe skewness
* Two sets of data can have the same mean and SD but different skewness
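Both forms of Pearson's skewness coefficient can be computed with a short sketch. The function names and the right-skewed sample are assumptions for illustration, and the sample standard deviation is used as the SD.

```python
import statistics

def pearson_mode_skewness(xs):
    """(mean - mode) / SD."""
    return (statistics.mean(xs) - statistics.mode(xs)) / statistics.stdev(xs)

def pearson_median_skewness(xs):
    """3 * (mean - median) / SD."""
    return 3 * (statistics.mean(xs) - statistics.median(xs)) / statistics.stdev(xs)

# One large value pulls the mean above the median and mode: skew to right
data = [1, 2, 2, 3, 12]
print(pearson_mode_skewness(data), pearson_median_skewness(data))
```

Here the mean is 4 while the median and mode are both 2, so both coefficients come out positive, matching the skew-to-right ordering mean > median > mode.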
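Chebyshev's theorem
* For any set of data and any $k > 1$, at least $1 - \frac{1}{k^2}$ of the data lie between $\bar{x} - k s$ and $\bar{x} + k s$, i.e. in $(\bar{x} - k s,\ \bar{x} + k s)$
* At most $\frac{1}{k^2}$ of the data lie outside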
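A numerical check of Chebyshev's bound on a small data set might look like this. The helper names and the data are illustrative, and the sample standard deviation is assumed for s.

```python
import statistics

def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2

def fraction_within(xs, k):
    """Observed fraction of values between mean - k*s and mean + k*s."""
    mean = statistics.mean(xs)
    s = statistics.stdev(xs)
    inside = [x for x in xs if mean - k * s <= x <= mean + k * s]
    return len(inside) / len(xs)

data = [1, 2, 2, 3, 12]
for k in (1.5, 2, 3):
    # the observed fraction should never fall below the bound
    print(k, chebyshev_bound(k), fraction_within(data, k))
```

For k = 2 the bound says at least 75% of the values lie within two standard deviations of the mean, whatever the shape of the distribution. The observed fraction is usually much higher than the bound.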