1.
The following table shows the cost in AUD of seven paperback books chosen at random, together
with the number of pages in each book.
Book                     1      2      3      4      5      6      7
Number of pages (x)     50    120    200    330    400    450    630
Cost (y AUD)          6.00   5.40   7.20   4.60   7.60   5.80   5.20
(a)
Plot these pairs of values on a scatter diagram. Use a scale of 1 cm to represent 50 pages on
the horizontal axis and 1 cm to represent 1 AUD on the vertical axis.
(3)
(b)
Write down the linear correlation coefficient, r, for the data.
(2)
(c)
Stephen wishes to buy a paperback book which has 350 pages in it. He plans to draw a line
of best fit to determine the price. State whether or not this is an appropriate method in this
case and justify your answer.
(2)
(Total 7 marks)
2.
200 people of different ages were asked to choose their favourite type of music from the choices
Popular, Country and Western and Heavy Metal. The results are shown in the table below.
Age / Music choice   Popular   Country and Western   Heavy Metal   Totals
11–25                   35              5                 50          90
26–40                   30             10                 20          60
41–60                   20             25                  5          50
Totals                  85             40                 75         200
It was decided to perform a chi-squared test for independence at the 5% level on the data.
(a)
Write down the null hypothesis.
(1)
(b)
Write down the number of degrees of freedom.
(1)
(c)
Write down the chi-squared value.
(2)
(d)
State whether or not you will reject the null hypothesis, giving a clear reason for your
answer.
(2)
(Total 6 marks)
3.
Manuel conducts a survey on a random sample of 751 people to see which television programme
type they watch most from the following: Drama, Comedy, Film, News. The results are as
follows.
                       Drama   Comedy   Film   News
Males under 25           22      65      90     35
Males 25 and over        36      54      67     17
Females under 25         22      59      82     15
Females 25 and over      64      39      38     46
Manuel decides to ignore the ages and to test at the 5% level of significance whether the most
watched programme type is independent of gender.
(a)
Draw a table with 2 rows and 4 columns of data so that Manuel can perform a chi-squared
test.
(3)
(b)
State Manuel’s null hypothesis and alternative hypothesis.
(1)
(c)
Find the expected frequency for the number of females who had “Comedy” as their
most-watched programme type. Give your answer to the nearest whole number.
(2)
(d)
Using your graphic display calculator, or otherwise, find the chi-squared statistic for
Manuel’s data.
(3)
(e)
(i)
State the number of degrees of freedom available for this calculation.
(ii)
State the critical value for Manuel’s test.
(iii)
State his conclusion.
(3)
(Total 12 marks)
4.
Tania wishes to see whether there is any correlation between a person’s age and the number of
objects on a tray which could be remembered after looking at them for a certain time.
She obtains the following table of results.
Age (x years)                       15   21   36   40   44   55
Number of objects remembered (y)    17   20   15   16   17   12
(a)
Use your graphic display calculator to find the equation of the regression line of y on x.
(2)
(b)
Use your equation to estimate the number of objects remembered by a person aged 28
years.
(1)
(c)
Use your graphic display calculator to find the correlation coefficient r.
(1)
(d)
Comment on your value for r.
(2)
(Total 6 marks)
5.
The local park is used for walking dogs. The sizes of the dogs are observed at different times of
the day. The table below shows the numbers of dogs present, classified by size, at three different
times last Sunday.
             Small   Medium   Large
Morning        9       18       21
Afternoon     11        6       13
Evening        7        8        9
(a)
Write a suitable null hypothesis for a χ2 test on this data.
(b)
Write down the value of χ2 for this data.
(c)
The number of degrees of freedom is 4. Show how this value is calculated.
The critical value, at the 5% level of significance, is 9.488.
(d)
What conclusion can be drawn from this test? Give a reason for your answer.
(Total 6 marks)
6.
(a)
For his Mathematical Studies project, Marty set out to discover if stress was related to the
amount of time that students spent travelling to or from school. The results of one of his
surveys are shown in the table below.
                        Number of students
Travel time (t mins)    high stress   moderate stress   low stress
t ≤ 15                       9               5              18
15 < t ≤ 30                 17               8              28
30 < t                      18               6               7
He used a χ2 test at the 5% level of significance to find out if there was any relationship
between student stress and travel time.
(i)
Write down the null and alternative hypotheses for this test.
(2)
(ii)
Write down the table of expected values. Give values to the nearest integer.
(3)
(iii)
Show that there are 4 degrees of freedom.
(1)
(iv)
Calculate the χ2 statistic for this data.
(2)
The χ2 critical value for 4 degrees of freedom at the 5% level of significance is 9.488.
(v)
What conclusion can Marty draw from this test? Give a reason for your answer.
(2)
(b)
Marty asked some of his classmates to rate their level of stress out of 10, with 10 being very
high. He also asked them to measure the number of minutes it took them to get from home
to school. A random selection of his results is listed below.
Travel time (x)
13
24
22
18
36
16
14
20
6
12
Stress rating (y)
3
7
5
4
8
8
4
8
2
6
(i)
Write down the value of the (linear) coefficient of correlation for
this information.
(1)
(ii)
Explain what a positive value for the coefficient of correlation indicates.
(1)
(iii)
Write down the linear regression equation of y on x in the form y = ax + b
(2)
(iv)
Use your equation in part (iii) to determine the stress rating for a student who takes
three quarters of an hour to travel to school.
(2)
(v)
Can your answer in part (iv) be considered reliable? Give a reason for your answer.
(2)
(Total 18 marks)
7.
Several candy bars were purchased and the following table shows the weight and the cost of each
bar.
               Yummy   Chox   Marz   Twin   Chunx   Lite   BigC   Bite
Weight (g)       60     85     80     65     95      50    100     45
Cost (Euros)    1.10   1.50   1.40   1.20   1.80    1.00   1.70   0.90
(a)
Given that sx = 19.2, sy = 0.307 and sxy = 5.81, find the correlation coefficient, r, giving
your answer correct to 3 decimal places.
(2)
(b)
Describe the correlation between the weight of a candy bar and its cost.
(1)
(c)
Calculate the equation of the regression line for y on x.
(3)
(d)
Use your equation to estimate the cost of a candy bar weighing 109 g.
(2)
(Total 8 marks)
8.
In a competition the number of males and females taking part in different swimming races is
given in the table of observed values below.
          Backstroke   Freestyle   Butterfly   Breaststroke   Relay
          (100 m)      (100 m)     (100 m)     (100 m)        (4 × 100 m)
Male         30           90          31           29             20
Female       28           63          20           37             12
The Swimming Committee decides to perform a χ2 test at the 5% significance level in order to test
if the number of entries for the various strokes is related to gender.
(a)
State the null hypothesis.
(1)
(b)
Write down the number of degrees of freedom.
(1)
(c)
Write down the critical value of χ2.
(1)
The expected values are given in the table below:
          Backstroke   Freestyle   Butterfly   Breaststroke   Relay
          (100 m)      (100 m)     (100 m)     (100 m)        (4 × 100 m)
Male         32            a          28           37             18
Female       26           68          23            b             14
(d)
Calculate the values of a and b.
(2)
(e)
Calculate the χ2 value.
(3)
(f)
State whether or not you accept your null hypothesis and give a reason for your answer.
(2)
(Total 10 marks)
9.
It is thought that the breaststroke time for 200 m depends on the length of the arm of the swimmer.
Eight students swim 200 m breaststroke. Their times (y) in seconds and arm lengths (x) in cm are
shown in the table below.
                             1       2       3       4       5       6       7       8
Length of arm, x cm         79      74      72      70      77      73      64      69
Breaststroke, y seconds   135.1   135.7   139.3   141.0   132.8   137.0   152.9   144.0
(a)
Calculate the mean and standard deviation of x and y.
(4)
(b)
Given that sxy = –24.82, calculate the correlation coefficient, r.
(2)
(c)
Comment on your value for r.
(2)
(d)
Calculate the equation of the regression line of y on x.
(3)
(e)
Using your regression line, estimate how many seconds it will take a student with an arm
length of 75 cm to swim the 200 m breaststroke.
(1)
(Total 12 marks)
1.
(a)
(A1)(A1)(A1) 3
Notes: (A1) for label and scales, (A2) for all points correct, (A1) for 5 or 6 correct.
Award a maximum of (A2) if points are joined.
(b)
r = −0.141
(G2) 2
Note: If negative sign is missing award (G1)(G0).
(c)
“The coefficient of correlation is too low, (very) weak (linear) relationship”.
(R1)
Not a sensible thing to do, accept “no”.
(A1) 2
Note: Do not award (R0)(A1). The correlation coefficient has to be mentioned in
their reasoning.
[7]
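The value of r in part (b) is expected from a GDC. As a cross-check, the following is a minimal sketch that computes the product-moment correlation coefficient directly from the table in question 1. The function name `pearson_r` is illustrative and not part of the mark scheme.

```python
import math

# Data from question 1: number of pages (x) and cost in AUD (y).
pages = [50, 120, 200, 330, 400, 450, 630]
cost = [6.00, 5.40, 7.20, 4.60, 7.60, 5.80, 5.20]


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(pages, cost)
print(f"r = {r:.4f}")
```

The result is negative and close to zero, in line with the −0.141 quoted above, which is why a line of best fit is not a sensible way to predict the price of a 350-page book.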
2.
(a)
Choice of music is independent of age.
(A1)(C1)
(b)
(3 – 1)(3 – 1) = 4
(A1)(C1)
(c)
χ2 = 51.6
(A2)(C2)
Note: 52 is an accuracy penalty (A1)(A0)(AP).
(d)
p-value < 0.05 for 5% level of significance
(R1)(ft)
or 51.6 > χ2 crit
(R1)(ft)
Reject the null hypothesis (do not accept the null hypothesis).
(A1)(ft)(C2)
Note: Do not award (R0)(A1).
[6]
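Candidates are expected to read the χ2 value from a GDC. The sketch below shows, under the usual definition of the test statistic (sum of (observed − expected)^2 / expected, with expected = row total × column total / grand total), how the value in part (c) and the degrees of freedom in part (b) can be reproduced by hand. The helper `chi_squared` is a hypothetical name used only for this illustration.

```python
# Observed frequencies from question 2 (rows: age groups, columns: music choice).
observed = [
    [35, 5, 50],   # 11-25
    [30, 10, 20],  # 26-40
    [20, 25, 5],   # 41-60
]


def chi_squared(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected frequency under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_squared(observed)
print(f"chi-squared = {stat:.2f}, df = {df}")
```

With 4 degrees of freedom the 5% critical value is 9.488, so a statistic of about 51.6 leads to rejecting the null hypothesis.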
3.
(a)
           Drama   Comedy   Film   News
Males        58     119     157     52
Females      86      98     120     61
(M1)(M1)(A1)
(b)
H0: favourite TV programme is independent of gender or no association between favourite
TV programme and gender
H1: favourite TV programme is dependent on gender (must have both)
(A1) 1
(c)
(365 × 217) / 751
(M1)
= 105
(A1)(ft)(G2) 2
(d)
12.6 (accept 12.558)
(G3) 3
(e)
(i)
3
(A1)
(ii)
7.815 (accept 7.82) ((ft) from their (i))
(A1)(ft)
(iii)
reject H0 or equivalent statement (eg accept H1)
(A1)(ft)
[12]
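Parts (a) and (c) involve collapsing the four survey rows into two gender rows and then computing one expected frequency. The following sketch illustrates those two steps with the survey data from question 3; the dictionary layout and variable names are assumptions made for the example.

```python
columns = ["Drama", "Comedy", "Film", "News"]

# Survey counts from question 3, keyed by (gender, age group).
survey = {
    ("Male", "under 25"): [22, 65, 90, 35],
    ("Male", "25 and over"): [36, 54, 67, 17],
    ("Female", "under 25"): [22, 59, 82, 15],
    ("Female", "25 and over"): [64, 39, 38, 46],
}

# Part (a): ignore age by adding the two rows for each gender.
by_gender = {"Male": [0, 0, 0, 0], "Female": [0, 0, 0, 0]}
for (gender, _age), counts in survey.items():
    by_gender[gender] = [a + b for a, b in zip(by_gender[gender], counts)]

# Part (c): expected frequency = row total x column total / grand total.
grand_total = sum(sum(row) for row in by_gender.values())
comedy = columns.index("Comedy")
comedy_total = sum(row[comedy] for row in by_gender.values())
female_total = sum(by_gender["Female"])
expected_female_comedy = female_total * comedy_total / grand_total

print(by_gender)
print(round(expected_female_comedy))
```

The collapsed rows match the table in part (a), and the expected frequency rounds to the 105 given in part (c).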
4.
(a)
a = –0.134, b = 20.9
(A1)
y = 20.9 – 0.134x
(A1) (C2)
(b)
17 objects
(A1)(ft)
(C1)
Note: Accept only 17
(c)
r = –0.756
(A1) (C1)
(d)
negative and moderately strong
(A1)(ft)(A1)(ft)
[6]
5.
(a)
H0: The size of dog is independent of the time of day, (or equivalent)
(A1) (C1)
Note: Award (A0) for ‘no correlation’
(b)
χ2 = 4.33 (accept 4.328)
(M1)(A1) (C2)
Note: GDC use is anticipated but candidates might calculate this
by hand. (M1) can be awarded for a reasonable attempt to use
the formula.
(c)
(3–1)(3–1) = 4
(A1) (C1)
Note: Award mark for left hand side seen.
(d)
The hypothesis should not be rejected, (allow ‘accept H0’)
OR
The size of dog is independent of the time of day
(A1)(ft)
4.33 < 9.488 or 0.363 > 0.05
(R1)(ft) (C2)
Notes: Allow χ2calc < χ2crit only if a value for χ2calc is seen
somewhere.
Award (R1)(ft) for comparing the values and (A1)(ft) if the
conclusion is valid according to the comparison given. If no
reason is given, or if the reason is wrong both marks are lost.
Note that (A0)(R1)(ft) can be awarded but (A1)(R0) cannot.
[6]
6.
(a)
(i)
H0 : level of stress is independent of travel time
(A1)
H1 : level of stress is not independent of travel time
(A1)(ft)
(or reasonable equivalents)
2
(ii)
12.1   5.24   14.6
20.1   8.68   24.2
11.8   5.08   14.2
(M1)(A1)(G2)
Note: (M1) for attempting to calculate expected values by hand,
eg (44 × 32) / 116 = 12.1 etc.
Nearest integers
12   5   15
20   9   24
12   5   14
(A1)(G3)
3
(iii)
df = (r – 1)(c – 1) = (3 – 1)(3 – 1) = 4
(M1)(AG)
1
(iv)
χ2 = 9.83(1)
(G2)
OR χ2 = 9.277 ..... if calculated from integer values
(M1)(A1)
2
(v)
For χ2 = 9.83 Do not accept H0 :
(A1)(ft)
(Level of stress is not independent of travel time or reasonable equivalent)
because χ2calc > χ2crit or p-value < 0.05
(R1)(ft)
OR
For χ2 = 9.278 Accept H0 :
(A1)(ft)
because χ2calc < χ2crit or p-value > 0.05
(R1)(ft)
Note: a correct reason must be given for the (A1) to be awarded.
2
(b)
(i)
r = 0.667
(A1)
1
(ii)
Stress rating increases as travel time increases
(or reasonable equivalent eg y increases as x increases).
(R1)
1
Note: Do not accept “positive correlation”
(iii)
y = 0.181x + 2.22
(A1) for 0.181x and (A1) for 2.22
2
Note: For y = 2.22x + 0.181, award (A0)(A1)(ft)
(iv)
Putting x = 45
(M1)
0.181 × 45 + 2.22 = 10.365 (10.4)
(A1)(ft)(G2)
2
Notes: Allow 10 or 11 only if the method is shown and is correct.
Allow follow through only if method shown.
(v)
not reliable …
(A1)
Because result is outside the data range or because the
correlation coefficient not high or the sample is small or
responses are subjective.
(R1)
2
Note: Award (R1) for any of the above. A correct reason must be given
to award the (A1).
[18]
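The reliability argument in part (b)(v) can be illustrated numerically. The sketch below fits the least-squares line of y on x to the data in question 6(b), evaluates it at 45 minutes, and checks whether 45 lies inside the observed range of travel times. The names `fit_line` and `is_extrapolation` are hypothetical and chosen only for this example.

```python
# Data from question 6(b): travel time in minutes (x) and stress rating (y).
travel_time = [13, 24, 22, 18, 36, 16, 14, 20, 6, 12]
stress = [3, 7, 5, 4, 8, 8, 4, 8, 2, 6]


def fit_line(xs, ys):
    """Least-squares regression of y on x, returned as (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


a, b = fit_line(travel_time, stress)

x_new = 45  # three quarters of an hour
prediction = a * x_new + b

# A prediction is an extrapolation when x_new is outside the observed x values.
is_extrapolation = not (min(travel_time) <= x_new <= max(travel_time))

print(f"y = {a:.3f}x + {b:.2f}")
print(f"prediction at x = {x_new}: {prediction:.1f}")
print(f"extrapolation: {is_extrapolation}")
```

The observed travel times run from 6 to 36 minutes, so 45 minutes is an extrapolation. The predicted rating also comes out above 10, the top of the rating scale described in the question.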
7.
(a)
r = sxy / (sx sy) = 5.81 / (19.2 × 0.307)
(M1)
= 0.986
(A1)
2
Note: Award (G2) for 0.985 from GDC.
(b)
Strong, positive correlation
(A1)
1
(c)
y = 0.182 + 0.0158x
(G3)
OR
y – 1.325 = (5.81 / 19.2^2)(x – 72.5)
(M1)(A1)
y = 0.0158x + 0.182
(A1)
3
(d)
y = 0.0158 × 109 + 0.182
(M1)
= 1.90 euros.
(A1)
2
[8]
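Parts (a), (c) and (d) all follow from the summary statistics given in the question. As a rough check, the sketch below applies r = sxy / (sx sy) and the regression line through the point of means with gradient sxy / sx^2, using the weights and costs from the table in question 7. Variable names are illustrative.

```python
# Summary statistics given in question 7(a).
s_x, s_y, s_xy = 19.2, 0.307, 5.81

# Raw data from the table, needed only for the two means.
weights = [60, 85, 80, 65, 95, 50, 100, 45]
costs = [1.10, 1.50, 1.40, 1.20, 1.80, 1.00, 1.70, 0.90]

# Part (a): correlation coefficient from covariance and standard deviations.
r = s_xy / (s_x * s_y)

# Part (c): the line passes through (mean_x, mean_y) with gradient s_xy / s_x^2.
mean_x = sum(weights) / len(weights)
mean_y = sum(costs) / len(costs)
slope = s_xy / s_x ** 2
intercept = mean_y - slope * mean_x

# Part (d): estimated cost of a 109 g bar.
estimate = slope * 109 + intercept

print(f"r = {r:.3f}")
print(f"y = {slope:.4f}x + {intercept:.3f}")
print(f"estimate = {estimate:.2f} euros")
```

Note that 109 g lies just above the heaviest bar in the table (100 g), so the estimate in part (d) is a mild extrapolation.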
8.
(a)
H0 : number of entries is independent of gender.
(A1)
1
(b)
4
(A1)
1
(c)
9.488
(A1)
1
(d)
a = 85, b = 29
(A1)(A1)
(e)
(30 − 32)^2 / 32 + ...
(M1)(A1)
= 6.10 (using given values)
(A1)
OR
5.80 (from calculator)
(G3)
3
(f)
Do not reject the null hypothesis as the χ2 value is less than the critical value.
So, gender and stroke are independent.
(A1)(R1)
(Also allow “accept”).
[10]
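The values of a and b in part (d) come from the standard expected-frequency rule, row total × column total / grand total, applied to the observed table in question 8. A minimal sketch follows; the helper `expected` is a hypothetical name for this illustration.

```python
events = ["Backstroke", "Freestyle", "Butterfly", "Breaststroke", "Relay"]

# Observed entries from question 8.
observed = {
    "Male": [30, 90, 31, 29, 20],
    "Female": [28, 63, 20, 37, 12],
}

grand_total = sum(sum(row) for row in observed.values())


def expected(gender, event):
    """Expected frequency under independence for one cell of the table."""
    j = events.index(event)
    column_total = sum(row[j] for row in observed.values())
    row_total = sum(observed[gender])
    return row_total * column_total / grand_total


a = expected("Male", "Freestyle")
b = expected("Female", "Breaststroke")
print(round(a), round(b))
```

The unrounded value of b is about 29.3, which is why the χ2 value computed from the rounded table (6.10) differs slightly from the calculator value (5.80).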
9.
(a)
mean of x = 72.25
(A1)
sd of x = 4.41
(A1)
mean of y = 139.7 (140)
(A1)
sd of y = 5.99
(A1)
4
(b)
r = – 0.940
(G2)
2
OR
r = −24.82 / (4.41 × 5.99) = −0.9396 (= – 0.94)
(M1)(A1)
(c)
strong, negative correlation
(A2)
Note: Award (A1) for negative, (A1) for strong.
(d)
y = 232 – 1.28x
(G3)
OR
(y − 139.7) = −(24.82 / 4.41^2)(x − 72.25)
(M1)(A1)
y = –1.28x + 232
(A1)
(e)
y = 232 – 1.28 × 75 = 136 seconds
(A1)
[12]
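The figures in parts (a) and (b) can be reproduced from the raw data in question 9. The sketch below assumes the standard deviations and covariance are the population versions (dividing by n rather than n − 1), since that is the convention that agrees with the 4.41, 5.99 and −24.82 quoted here. Function names are illustrative.

```python
import math

# Data from question 9: arm length in cm (x) and 200 m time in seconds (y).
arm = [79, 74, 72, 70, 77, 73, 64, 69]
swim_time = [135.1, 135.7, 139.3, 141.0, 132.8, 137.0, 152.9, 144.0]


def mean(values):
    return sum(values) / len(values)


def pop_sd(values):
    """Population standard deviation (divides by n; an assumption, see above)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pop_cov(xs, ys):
    """Population covariance s_xy (divides by n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


s_xy = pop_cov(arm, swim_time)
r = s_xy / (pop_sd(arm) * pop_sd(swim_time))

print(f"mean x = {mean(arm):.2f}, sd x = {pop_sd(arm):.2f}")
print(f"mean y = {mean(swim_time):.1f}, sd y = {pop_sd(swim_time):.2f}")
print(f"s_xy = {s_xy:.2f}, r = {r:.3f}")
```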