

Key Formulas
From Larson/Farber Elementary Statistics: Picturing the World, Seventh Edition
© 2019 Pearson
CHAPTER 2

Class Width = (Range of data)/(Number of classes)  (round up to next convenient number)

Midpoint = [(Lower class limit) + (Upper class limit)]/2

Relative Frequency = (Class frequency)/(Sample size) = $\dfrac{f}{n}$

Population Mean: $\mu = \dfrac{\sum x}{N}$

Sample Mean: $\bar{x} = \dfrac{\sum x}{n}$

Weighted Mean: $\bar{x} = \dfrac{\sum xw}{\sum w}$

Mean of a Frequency Distribution: $\bar{x} = \dfrac{\sum xf}{n}$

Range = (Maximum entry) - (Minimum entry)

Population Variance: $\sigma^2 = \dfrac{\sum(x - \mu)^2}{N}$

Population Standard Deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\sum(x - \mu)^2}{N}}$

Sample Variance: $s^2 = \dfrac{\sum(x - \bar{x})^2}{n - 1}$

Sample Standard Deviation: $s = \sqrt{s^2} = \sqrt{\dfrac{\sum(x - \bar{x})^2}{n - 1}}$

Sample Standard Deviation of a Frequency Distribution: $s = \sqrt{\dfrac{\sum(x - \bar{x})^2 f}{n - 1}}$

Standard Score: $z = \dfrac{x - \mu}{\sigma} = \dfrac{\text{Value} - \text{Mean}}{\text{Standard deviation}}$

Empirical Rule (or 68-95-99.7 Rule) For data sets with distributions that are approximately symmetric and bell-shaped:
1. About 68% of the data lie within one standard deviation of the mean.
2. About 95% of the data lie within two standard deviations of the mean.
3. About 99.7% of the data lie within three standard deviations of the mean.

Chebychev's Theorem The portion of any data set lying within k standard deviations (k > 1) of the mean is at least $1 - \dfrac{1}{k^2}$.
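The Chapter 2 sample formulas can be checked with a few lines of code. Below is a minimal Python sketch (standard library only) that applies the sample mean, sample variance, sample standard deviation, standard score, and the Chebychev bound to a small made-up data set; all numbers are purely illustrative.

```python
# A minimal sketch of the Chapter 2 sample formulas on hypothetical data.
from math import sqrt

data = [12, 15, 9, 18, 15, 11, 14]                    # hypothetical sample
n = len(data)

x_bar = sum(data) / n                                  # sample mean: x̄ = Σx / n
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)     # sample variance
s = sqrt(s2)                                           # sample standard deviation

z = (18 - x_bar) / s                                   # standard score of the entry 18
chebychev = 1 - 1 / 2 ** 2                             # Chebychev bound for k = 2 (at least 75%)

print(x_bar, s2, s, z, chebychev)
```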
CHAPTER 3

Classical (or Theoretical) Probability: $P(E) = \dfrac{\text{Number of outcomes in event } E}{\text{Total number of outcomes in sample space}}$

Empirical (or Statistical) Probability: $P(E) = \dfrac{\text{Frequency of event } E}{\text{Total frequency}} = \dfrac{f}{n}$

Probability of a Complement: $P(E') = 1 - P(E)$

Probability of occurrence of both events A and B: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$; $P(A \text{ and } B) = P(A) \cdot P(B)$ if A and B are independent.

Probability of occurrence of either A or B: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$; $P(A \text{ or } B) = P(A) + P(B)$ if A and B are mutually exclusive.

Permutations of n objects taken r at a time: ${}_{n}P_{r} = \dfrac{n!}{(n - r)!}$, where $r \le n$

Distinguishable Permutations: $n_1$ alike, $n_2$ alike, ..., $n_k$ alike: $\dfrac{n!}{n_1! \cdot n_2! \cdot n_3! \cdots n_k!}$, where $n_1 + n_2 + n_3 + \dots + n_k = n$

Combinations of n objects taken r at a time: ${}_{n}C_{r} = \dfrac{n!}{(n - r)!\, r!}$, where $r \le n$
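The counting formulas are easy to verify in Python, since the standard library's math.perm, math.comb, and math.factorial implement exactly these definitions. The example values below (8 objects taken 3 at a time, and the letters of MISSISSIPPI) are illustrative choices, not from the text.

```python
# A quick check of nPr, nCr, and distinguishable permutations.
from math import factorial, comb, perm

n, r = 8, 3
nPr = perm(n, r)          # permutations: 8!/(8-3)! = 336
nCr = comb(n, r)          # combinations: 8!/(5!·3!) = 56

# Distinguishable permutations of the letters in "MISSISSIPPI":
# 11 letters with 1 M, 4 I's, 4 S's, 2 P's.
distinct = factorial(11) // (factorial(1) * factorial(4) * factorial(4) * factorial(2))

print(nPr, nCr, distinct)  # 336 56 34650
```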
CHAPTER 4

Mean of a Discrete Random Variable: $\mu = \sum xP(x)$

Variance of a Discrete Random Variable: $\sigma^2 = \sum (x - \mu)^2 P(x)$

Standard Deviation of a Discrete Random Variable: $\sigma = \sqrt{\sigma^2} = \sqrt{\sum (x - \mu)^2 P(x)}$

Expected Value: $E(x) = \mu = \sum xP(x)$

Binomial Probability of x successes in n trials: $P(x) = {}_{n}C_{x}\, p^x q^{n - x} = \dfrac{n!}{(n - x)!\, x!}\, p^x q^{n - x}$

Population Parameters of a Binomial Distribution:
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard Deviation: $\sigma = \sqrt{npq}$

Geometric Distribution: The probability that the first success will occur on trial number x is $P(x) = pq^{x - 1}$, where $q = 1 - p$.

Poisson Distribution: The probability of exactly x occurrences in an interval is $P(x) = \dfrac{\mu^x e^{-\mu}}{x!}$, where $e \approx 2.71828$ and $\mu$ is the mean number of occurrences per interval unit.
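As an illustration of the Chapter 4 distributions, the short Python sketch below evaluates the binomial, geometric, and Poisson formulas directly; the parameters (n = 10, p = 0.3, x = 4, and a Poisson mean of 2.5) are hypothetical.

```python
# A minimal sketch of the Chapter 4 probability formulas (standard library only).
from math import comb, exp, factorial, sqrt

n, p, x = 10, 0.3, 4
q = 1 - p

binomial = comb(n, x) * p**x * q**(n - x)     # P(x) = nCx p^x q^(n-x)
mean, var = n * p, n * p * q                  # μ = np, σ² = npq
sd = sqrt(var)                                # σ = √(npq)

geometric = p * q**(x - 1)                    # first success on trial number x
mu = 2.5                                      # assumed mean occurrences per interval
poisson = mu**x * exp(-mu) / factorial(x)     # P(x) = μ^x e^(-μ) / x!

print(binomial, mean, sd, geometric, poisson)
```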
CHAPTER 5

Standard Score, or z-Score: $z = \dfrac{x - \mu}{\sigma} = \dfrac{\text{Value} - \text{Mean}}{\text{Standard deviation}}$

Transforming a z-Score to an x-Value: $x = \mu + z\sigma$

Central Limit Theorem ($n \ge 30$ or population is normally distributed):

Mean of the Sampling Distribution: $\mu_{\bar{x}} = \mu$

Variance of the Sampling Distribution: $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$

Standard Deviation of the Sampling Distribution (Standard Error): $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

z-Score: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\text{Value} - \text{Mean}}{\text{Standard Error}}$
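A quick numerical illustration of the Central Limit Theorem formulas: the sketch below computes the standard error and the z-score of a sample mean, then a tail probability from the standard normal. The population values (μ = 100, σ = 15) and the sample (n = 36, x̄ = 104) are made up.

```python
# Standard error and z-score of a sample mean under the CLT (hypothetical values).
from math import sqrt
from statistics import NormalDist

mu, sigma, n, x_bar = 100, 15, 36, 104

se = sigma / sqrt(n)                  # standard error: σ_x̄ = σ/√n
z = (x_bar - mu) / se                 # z = (x̄ - μ)/(σ/√n)
p_tail = 1 - NormalDist().cdf(z)      # P(sample mean exceeds 104)

print(se, z, p_tail)                  # 2.5, 1.6, ≈0.0548
```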
CHAPTER 6

c-Confidence Interval for $\mu$: $\bar{x} - E < \mu < \bar{x} + E$, where $E = z_c \dfrac{\sigma}{\sqrt{n}}$ when $\sigma$ is known, the sample is random, and either the population is normally distributed or $n \ge 30$; or $E = t_c \dfrac{s}{\sqrt{n}}$ when $\sigma$ is unknown, the sample is random, and either the population is normally distributed or $n \ge 30$.

Minimum Sample Size to Estimate $\mu$: $n = \left(\dfrac{z_c \sigma}{E}\right)^2$

Point Estimate for p, the population proportion of successes: $\hat{p} = \dfrac{x}{n}$

c-Confidence Interval for Population Proportion p (when $np \ge 5$ and $nq \ge 5$): $\hat{p} - E < p < \hat{p} + E$, where $E = z_c \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Minimum Sample Size to Estimate p: $n = \hat{p}\hat{q}\left(\dfrac{z_c}{E}\right)^2$

c-Confidence Interval for Population Variance $\sigma^2$: $\dfrac{(n - 1)s^2}{\chi_R^2} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi_L^2}$

c-Confidence Interval for Population Standard Deviation $\sigma$: $\sqrt{\dfrac{(n - 1)s^2}{\chi_R^2}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi_L^2}}$
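The sketch below builds a 95% confidence interval for μ (σ known, so z_c comes from the standard normal), a minimum sample size, and a confidence interval for a proportion. All sample values are hypothetical; when σ is unknown, t_c from the t-table would replace z_c.

```python
# Confidence intervals and minimum sample size, σ-known case (hypothetical data).
from math import sqrt, ceil
from statistics import NormalDist

x_bar, sigma, n, c = 52.3, 4.0, 40, 0.95
z_c = NormalDist().inv_cdf(0.5 + c / 2)       # 95% critical value ≈ 1.96

E = z_c * sigma / sqrt(n)                     # margin of error: E = z_c σ/√n
ci_mu = (x_bar - E, x_bar + E)                # x̄ - E < μ < x̄ + E

n_min = ceil((z_c * sigma / 1) ** 2)          # minimum n to estimate μ within E = 1

# Confidence interval for a proportion: 30 successes in 80 trials (assumed).
p_hat = 30 / 80
q_hat = 1 - p_hat
E_p = z_c * sqrt(p_hat * q_hat / 80)          # E = z_c √(p̂q̂/n)
ci_p = (p_hat - E_p, p_hat + E_p)

print(ci_mu, n_min, ci_p)
```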
CHAPTER 7

z-Test for a Mean $\mu$: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, when $\sigma$ is known, the sample is random, and either the population is normally distributed or $n \ge 30$.

t-Test for a Mean $\mu$: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, when $\sigma$ is unknown, the sample is random, and either the population is normally distributed or $n \ge 30$. (d.f. = $n - 1$)

z-Test for a Proportion p (when $np \ge 5$ and $nq \ge 5$): $z = \dfrac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$

Chi-Square Test for a Variance $\sigma^2$ or Standard Deviation $\sigma$: $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$ (d.f. = $n - 1$)
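The Chapter 7 test statistics are straightforward to compute by hand or in code. The sketch below evaluates the t statistic for a mean and the z statistic for a proportion on hypothetical data; only the statistics are computed, and the critical values would still come from the t and standard normal tables.

```python
# t-test statistic for a mean and z-test statistic for a proportion (hypothetical data).
from math import sqrt
from statistics import mean, stdev

sample = [52.1, 49.8, 53.4, 51.0, 50.6, 52.9, 48.7, 51.8]
mu0 = 50                                   # claimed mean (assumed)
n = len(sample)

t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))   # d.f. = n - 1

# z-test for a proportion: 58 successes in 120 trials against p0 = 0.40 (assumed)
p0, x, m = 0.40, 58, 120
p_hat = x / m
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / m)

print(t, n - 1, z)
```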
CHAPTER 8

Two-Sample z-Test for the Difference Between Means ($\sigma_1$ and $\sigma_2$ are known, the samples are random and independent, and either the populations are normally distributed or both $n_1 \ge 30$ and $n_2 \ge 30$):

$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$, where $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Two-Sample t-Test for the Difference Between Means ($\sigma_1$ and $\sigma_2$ are unknown, the samples are random and independent, and either the populations are normally distributed or both $n_1 \ge 30$ and $n_2 \ge 30$):

$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$

If population variances are equal, d.f. = $n_1 + n_2 - 2$ and $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \cdot \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$.

If population variances are not equal, d.f. is the smaller of $n_1 - 1$ or $n_2 - 1$ and $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.

t-Test for the Difference Between Means (the samples are random and dependent, and either the populations are normally distributed or $n \ge 30$):

$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$, where $\bar{d} = \dfrac{\sum d}{n}$, $s_d = \sqrt{\dfrac{\sum(d - \bar{d})^2}{n - 1}}$, and d.f. = $n - 1$.

Two-Sample z-Test for the Difference Between Proportions (the samples are random and independent, and $n_1\bar{p}$, $n_1\bar{q}$, $n_2\bar{p}$, and $n_2\bar{q}$ are at least 5):

$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$.
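The equal-variance (pooled) two-sample t statistic is the most involved formula in this chapter, so the sketch below computes it step by step on two made-up samples.

```python
# Pooled two-sample t statistic for the difference between means (hypothetical data).
from math import sqrt
from statistics import mean, variance

x1 = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5]
x2 = [13.1, 12.8, 14.0, 13.5, 12.9, 13.7, 14.2]
n1, n2 = len(x1), len(x2)

# s_{x̄1-x̄2} = √(((n1-1)s1² + (n2-1)s2²)/(n1+n2-2)) · √(1/n1 + 1/n2)
s_pooled = sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2))
s_diff = s_pooled * sqrt(1 / n1 + 1 / n2)

t = (mean(x1) - mean(x2)) / s_diff      # tests H0: μ1 - μ2 = 0
df = n1 + n2 - 2

print(t, df)
```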
CHAPTER 9

Correlation Coefficient: $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$

t-Test for the Correlation Coefficient: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$ (d.f. = $n - 2$)

Equation of a Regression Line: $\hat{y} = mx + b$, where $m = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$ and $b = \bar{y} - m\bar{x} = \dfrac{\sum y}{n} - m\dfrac{\sum x}{n}$.

Coefficient of Determination: $r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = \dfrac{\sum(\hat{y}_i - \bar{y})^2}{\sum(y_i - \bar{y})^2}$

Standard Error of Estimate: $s_e = \sqrt{\dfrac{\sum(y_i - \hat{y}_i)^2}{n - 2}}$

c-Prediction Interval for y: $\hat{y} - E < y < \hat{y} + E$, where $E = t_c s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\sum x^2 - (\sum x)^2}}$ (d.f. = $n - 2$)
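The correlation and regression formulas are all built from the same few sums, as the Python sketch below shows for a small hypothetical data set.

```python
# Correlation coefficient, regression line, and r² computed from the sums (hypothetical pairs).
from math import sqrt

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]
n = len(xs)

Sx, Sy = sum(xs), sum(ys)
Sxy = sum(x * y for x, y in zip(xs, ys))
Sxx, Syy = sum(x * x for x in xs), sum(y * y for y in ys)

r = (n * Sxy - Sx * Sy) / (sqrt(n * Sxx - Sx**2) * sqrt(n * Syy - Sy**2))
m = (n * Sxy - Sx * Sy) / (n * Sxx - Sx**2)    # regression slope
b = Sy / n - m * Sx / n                        # intercept: b = ȳ - m x̄
r2 = r ** 2                                    # coefficient of determination

print(r, m, b, r2)
```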
CHAPTER 10

Chi-Square: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

Goodness-of-Fit Test: d.f. = $k - 1$

Independence Test: d.f. = (no. of rows - 1)(no. of columns - 1)

Two-Sample F-Test for Variances: $F = \dfrac{s_1^2}{s_2^2}$, where $s_1^2 \ge s_2^2$, d.f.N = $n_1 - 1$, and d.f.D = $n_2 - 1$.

One-Way Analysis of Variance Test: $F = \dfrac{\text{MS}_B}{\text{MS}_W}$, where $\text{MS}_B = \dfrac{\text{SS}_B}{\text{d.f.}_N} = \dfrac{\sum n_i(\bar{x}_i - \bar{x})^2}{k - 1}$ and $\text{MS}_W = \dfrac{\text{SS}_W}{\text{d.f.}_D} = \dfrac{\sum(n_i - 1)s_i^2}{N - k}$. (d.f.N = $k - 1$, d.f.D = $N - k$)
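The chi-square and F statistics reduce to a couple of lines of code; the observed counts and samples below are hypothetical (a 60-roll die-fairness check and two small measurement samples), and the resulting statistics would still be compared against the χ² and F tables.

```python
# Chi-square goodness-of-fit and two-sample F statistics (hypothetical data).
from statistics import variance

# Chi-square: Σ(O - E)²/E, goodness-of-fit d.f. = k - 1
observed = [8, 12, 9, 11, 13, 7]
expected = [10] * 6
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Two-sample F statistic: larger sample variance in the numerator
a = [4.1, 5.0, 4.7, 5.3, 4.4]
b = [4.8, 4.9, 5.1, 4.7, 5.0, 4.9]
F = max(variance(a), variance(b)) / min(variance(a), variance(b))

print(chi_sq, df, F)
```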