Uploaded by Md Farhan Rafiq

180051235 CEE4655 ASS04

ISLAMIC UNIVERSITY OF TECHNOLOGY (IUT)
Organisation of Islamic Cooperation (OIC)
Board Bazar, Gazipur, Bangladesh.
Department of Civil and Environmental Engineering (CEE)
COURSE TITLE: Civil Engineering Data Analysis
COURSE CODE: CEE 4655
ASSIGNMENT NAME: Statistical Problem Solving Using Software
STUDENT NAME: PRETOM MD. TAHMIDUR RAHMAN
STUDENT ID: 180051235
DATE OF SUBMISSION: 10/03/2022
DATE OF PERFORMANCE: 29/04/2022
SUBMITTED TO: Dr. Shakil Mohammad Rifaat, Professor.
Bayes Theorem
Question: a) Three machines A, B, and C produce (XX+5)%, (XX-10)%, and XX% of the total number of items of a factory. The percentages of defective output of these machines are 2%, 4%, and 5%, respectively. XX denotes the last two digits of the student ID.
(i) Find the probability that a randomly selected item is defective.
(ii) If a randomly selected item is defective, what is the probability that machine A produced it?
Hand Calculation
Calculation using Software (Microsoft Excel)
P(Item produced by X1) = 0.4
P(Item produced by X2) = 0.25
P(Item produced by X3) = 0.35
P(defective/Item produced by X1) = 0.02
P(defective/Item produced by X2) = 0.04
P(defective/Item produced by X3) = 0.05
P(Item produced by A/defective) = 0.225352
Screenshot:
Comparison: The hand calculation and Microsoft Excel give the same probability.
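The Excel result can be cross-checked with a few lines of Python. This is a minimal sketch, assuming XX = 35 (so the shares are 0.40, 0.25 and 0.35, as in the sheet above) and treating X1, X2, X3 as machines A, B, C; the variable names are illustrative only.

```python
# Production shares and defect rates, taken from the Excel sheet (XX = 35 assumed).
priors = {"A": 0.40, "B": 0.25, "C": 0.35}
defect_rates = {"A": 0.02, "B": 0.04, "C": 0.05}

# (i) Law of total probability: P(defective) = sum of P(machine) * P(defective | machine).
p_defective = sum(priors[m] * defect_rates[m] for m in priors)

# (ii) Bayes' theorem: P(A | defective) = P(A) * P(defective | A) / P(defective).
p_a_given_defective = priors["A"] * defect_rates["A"] / p_defective

print(f"P(defective)     = {p_defective:.4f}")
print(f"P(A | defective) = {p_a_given_defective:.6f}")
```

The second printed value should agree with the 0.225352 reported by Excel.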
Poisson distribution
Q: b) In the district of Bogura, on average 1 in 1000 houses catches fire during a year. If there are XX00 houses in Bogura, find the probability that the following number of houses catch fire during the year:
(i) Exactly 5 houses
(ii) No house
(iii) At most 1 house
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
p = 0.001
n = 3500
𝜆 = 3.5
exactly 5: P(X=5) = 0.132169
none: P(X=0) = 0.030197
not more than 1: P(X≤1) = 0.135888
Screenshot:
Comparison: The values obtained from Excel and from the hand calculation are equal.
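The three Poisson probabilities can be reproduced with the standard library. The sketch below assumes XX = 35, so n = 3500 and λ = np = 3.5 as in the sheet; `poisson_pmf` is a helper name introduced here for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


p = 0.001          # probability that a single house catches fire in a year
n = 3500           # number of houses (XX00 with XX = 35, an assumption)
lam = n * p        # Poisson mean, 3.5

p_exactly_5 = poisson_pmf(5, lam)
p_none = poisson_pmf(0, lam)
p_at_most_1 = poisson_pmf(0, lam) + poisson_pmf(1, lam)

print(f"P(X=5)  = {p_exactly_5:.6f}")
print(f"P(X=0)  = {p_none:.6f}")
print(f"P(X<=1) = {p_at_most_1:.6f}")
```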
Binomial Distribution
Question: c) The probability that a student will secure A+ in Data Analysis is 0.XX. Find the probability that, out of 5 students, the following number secure A+:
(i) No student
(ii) 1 student
(iii) At least 1 student
(iv) All students
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
p = 0.35
n = 5
none: P(X=0) = 0.116029
exactly 1: P(X=1) = 0.312386
at least 1: P(X≥1) = 0.883971
all: P(X=5) = 0.005252
Screenshot:
Comparison: The values obtained from Excel are the same as those obtained by hand calculation.
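As a cross-check, the binomial probabilities follow directly from the probability mass function. This is a small sketch assuming p = 0.35 and n = 5 as in the sheet; `binom_pmf` is an illustrative helper, not part of the original workbook.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = 0.35   # probability of securing A+ (0.XX with XX = 35, an assumption)
n = 5      # number of students

p_none = binom_pmf(0, n, p)
p_exactly_1 = binom_pmf(1, n, p)
p_at_least_1 = 1 - p_none          # complement of "no student"
p_all = binom_pmf(n, n, p)

for label, value in [("P(X=0)", p_none), ("P(X=1)", p_exactly_1),
                     ("P(X>=1)", p_at_least_1), ("P(X=5)", p_all)]:
    print(f"{label:8s} = {value:.6f}")
```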
Normal Distribution
Question: d) The mean daily salary of a laborer is Tk. (130+ XX) and the standard
deviation is Tk. XX. If a laborer is selected at random, find the probability that the
laborer earns:
(i) Between Tk. 165 and Tk. 200 per day
(ii) Above Tk. 200 per day
(iii) Below Tk. 150 per day
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
Parameters
Mean = 165
Std Dev = 35
(i) P(165≤x≤200) = 0.341344746
(ii) P(x>200) = 0.158655254
(iii) P(x<150) = 0.334117571
Screenshot:
Comparison: The values obtained from the hand calculation and from Excel are equal.
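The same probabilities can be obtained from the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes mean 165 and standard deviation 35 (XX = 35), as in the sheet.

```python
from statistics import NormalDist

# Daily salary model: mean Tk. 165, standard deviation Tk. 35 (XX = 35 assumed).
salary = NormalDist(mu=165, sigma=35)

p_between = salary.cdf(200) - salary.cdf(165)   # (i)   P(165 <= x <= 200)
p_above = 1 - salary.cdf(200)                   # (ii)  P(x > 200)
p_below = salary.cdf(150)                       # (iii) P(x < 150)

print(f"P(165<=x<=200) = {p_between:.9f}")
print(f"P(x>200)       = {p_above:.9f}")
print(f"P(x<150)       = {p_below:.9f}")
```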
ANOVA
Question: e) An experiment was done to find the effect of the flow rate of Hexafluoroethane (C2F6) on the uniformity of etch on a silicon wafer for the manufacturing of integrated circuits. The percentage uniformity results for six replicates at each of three flow rates are as follows:
Observation   C2F6 Flow (SCCM)
              125       160       200
1             2.7       4.9       4.6
2             4.6       4.6       3.4
3             2.6       5         2.9
4             X.X-0.5   X.X+0.7   X.X
5             X.X-0.3   X.X       X.X+0.6
6             X.X       X.X+0.7   X.X+1.6
Does the flow rate of C2F6 affect etch uniformity? Use a significance level of 0.05.
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
Anova: Single Factor
SUMMARY
Groups   Count   Sum    Average    Variance
125      6       19.6   3.266667   0.534667
160      6       26.4   4.4        0.308
200      6       23.6   3.933333   0.674667
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        3.893333   2    1.946667   3.848858   0.044753   3.68232
Within Groups         7.586667   15   0.505778
Total                 11.48      17
Screenshot of analysis:
Comparison: The hand calculation gives a calculated F value of 3.85 and Excel gives 3.848858. The P-value from Excel is 0.044753, and the hand calculation likewise shows that the P-value is below 0.05. The two results are therefore almost equal.
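The ANOVA table can be rebuilt from the raw data with plain Python. This sketch assumes X.X = 3.5, which reproduces the group sums in the Excel summary; the critical value 3.68232 is copied from the Excel output rather than computed, since the standard library has no F distribution.

```python
# Etch uniformity (%) per C2F6 flow rate, with X.X = 3.5 substituted (an assumption).
groups = {
    125: [2.7, 4.6, 2.6, 3.0, 3.2, 3.5],
    160: [4.9, 4.6, 5.0, 4.2, 3.5, 4.2],
    200: [4.6, 3.4, 2.9, 3.5, 4.1, 5.1],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = sum(all_values) / len(all_values)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

F_CRIT = 3.68232                 # F(0.05; 2, 15), copied from the Excel output
reject_h0 = f_stat > F_CRIT      # True means flow rate affects etch uniformity

print(f"SS between = {ss_between:.6f}, SS within = {ss_within:.6f}")
print(f"F = {f_stat:.6f}, reject H0 at 0.05: {reject_h0}")
```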
Contingency Table
Question: f) Three medicine companies, Beximco, ACME and Square, marketed three different medicines for cold, namely Fexo, Brodil and Deslo respectively. A survey was conducted on their effectiveness in 2000, 2010 and 2020 on patients in a certain hospital. Following are some data from surveys of these three medicines:
Year   Fexo    Brodil   Deslo
2000   12      XX       94
2010   XX-13   XX+5     52
2020   97      25       XX-17
Does the data seem to be independent of year? Test the hypothesis with a significance
level of α=0.05 and also find the P-value of the test statistic.
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
Observed Data
Year    Fexo   Brodil   Deslo   Total
2000    12     35       94      141
2010    22     40       52      114
2020    97     25       18      140
Total   131    100      164     395
n = 395
u1 = 0.356962, u2 = 0.288608, u3 = 0.35443 (row totals divided by n)
v1 = 0.331646, v2 = 0.253165, v3 = 0.41519 (column totals divided by n)
Expected Frequency
Year    Fexo       Brodil     Deslo      Total
2000    46.7620    35.6962    58.54177   141
2010    37.8075    28.86076   47.33165   114
2020    46.43037   35.44304   58.12658   140
Total   131        100        164        395
χ₀² Calculation: χ₀² = ∑[(Oi-Ei)²/Ei]
Year    Fexo          Brodil     Deslo      Total
2000    25.84144711   0.013578   21.47673   47.33176
2010    6.609255577   4.299356   0.460443   11.36905
2020    55.07787157   3.076967   27.70062   85.85546
Total   87.52857426   7.389901   49.6378    144.5563
Here,
χ₀² = 144.5563
df = 4
P-Value = 2.98523E-30
Screenshot of Analysis:
Comparison: The hand calculation gives χ₀² = 146.16 and Excel gives χ₀² = 144.5563. The P-values from the hand calculation and Excel are 2 x 10^-30 and 2.98523E-30 respectively. So the values are almost equal to each other.
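The test statistic and P-value can be recomputed from the observed counts. The sketch below assumes XX = 35 (matching the observed table above). Because the standard library has no chi-square distribution, the P-value uses the closed form of the upper tail that holds specifically for 4 degrees of freedom, P(χ² > x) = exp(-x/2)(1 + x/2); a different table size would need a different formula or a statistics package.

```python
from math import exp

# Observed counts (rows: 2000, 2010, 2020; columns: Fexo, Brodil, Deslo), XX = 35 assumed.
observed = [
    [12, 35, 94],
    [22, 40, 52],
    [97, 25, 18],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum over cells of (O - E)^2 / E with E = row total * column total / n.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form is valid only for df = 4.
p_value = exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi2 = {chi2:.4f}, df = {df}, P-value = {p_value:.5e}")
```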
Paired T test
Question: g) Ten sprinters ran for 10 seconds before and after an exercise program. The distances covered in 10 seconds before and after exercise are:
Before    After
195       187
2XX-22    195
2XX       2XX-14
201       190
187       175
2XX-25    197
215       199
246       2XX-14
294       2XX+43
310       285
Find out whether the exercise program was effective or not. Use a significance
level of 0.05.
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
                               Before     After
Mean                           230.6      214.8
Variance                       1733.6     1436.622
Observations                   10         10
Pearson Correlation            0.994433
Hypothesized Mean Difference   0
df                             9
t Stat                         8.900722
P(T<=t) one-tail               4.67E-06
t Critical one-tail            1.833113
P(T<=t) two-tail               9.35E-06
t Critical two-tail            2.262157
Screenshot of Analysis:
Comparison: The hand calculation and Microsoft Excel give the same t statistic, 8.9. The P-values are also almost equal.
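The paired t statistic is the mean of the pairwise differences divided by its standard error. A minimal sketch, assuming 2XX = 235 (which reproduces the means 230.6 and 214.8 in the Excel output); the critical value is copied from Excel rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Distances covered in 10 seconds, with 2XX = 235 substituted (an assumption).
before = [195, 213, 235, 201, 187, 210, 215, 246, 294, 310]
after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]

diffs = [b - a for b, a in zip(before, after)]   # before minus after, per sprinter
n = len(diffs)

# t = mean difference / (sample standard deviation of differences / sqrt(n)).
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
df = n - 1

T_CRIT_TWO_TAIL = 2.262157       # t(0.025; 9), copied from the Excel output
significant = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"mean difference = {mean(diffs):.1f}, t = {t_stat:.6f}, df = {df}")
print(f"significant at 0.05 (two-tailed): {significant}")
```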
Unpaired T Test (Equal Variances)
Question: h-i) Wet chemical etching is often used for the removal of silicon from the backs of wafers prior to metallization while manufacturing semiconductors. The etch rate is an important characteristic in this process and follows a Normal Distribution. Two different etching solutions have been compared using two different random samples of 5 wafers for each of the solutions. The observed etch rates are as follows (in mils per minute):
Solution 1   Solution 2
X.X+7.1      X.X+6.5
X.X+6.8      X.X+6.7
X.X+6.5      X.X+7.2
X.X+6.8      X.X+6.9
X.X+6.6      X.X+6.8
Can you conclude that the mean etch rate is the same for both solutions? Use α=0.05 and assume the population variances are equal for the two solutions.
X.X denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
                               Solution 1   Solution 2
Mean                           10.26        10.32
Variance                       0.053        0.067
Observations                   5            5
Pooled Variance                0.06
Hypothesized Mean Difference   0
df                             8
t Stat                         -0.3873
P(T<=t) one-tail               0.354318
t Critical one-tail            1.859548
P(T<=t) two-tail               0.708635
t Critical two-tail            2.306004
Screenshot of Analysis:
Comparison: Microsoft Excel gives a t statistic of -0.3873 and a two-tailed P-value of 0.708635. The hand calculation gives a t statistic of -0.387 and a P-value between 0.5 and 0.8, i.e. greater than α=0.05. So the hand calculation and the software give the same result.
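The pooled-variance t statistic can be checked as follows. This sketch assumes X.X = 3.5, which reproduces the sample means 10.26 and 10.32; the critical value is copied from the Excel output.

```python
from math import sqrt
from statistics import mean, variance

# Etch rates in mils per minute, with X.X = 3.5 substituted (an assumption).
solution_1 = [10.6, 10.3, 10.0, 10.3, 10.1]
solution_2 = [10.0, 10.2, 10.7, 10.4, 10.3]

n1, n2 = len(solution_1), len(solution_2)

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * variance(solution_1) + (n2 - 1) * variance(solution_2)) / (n1 + n2 - 2)

t_stat = (mean(solution_1) - mean(solution_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

T_CRIT_TWO_TAIL = 2.306004       # t(0.025; 8), copied from the Excel output
reject_h0 = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.4f}, df = {df}")
print(f"reject equal means at 0.05: {reject_h0}")
```

With these inputs the statistic stays inside the critical region boundaries, so the hypothesis of equal mean etch rates is not rejected.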
Unpaired T-Test (Unequal Variances)
Question: h-ii) The BOD levels in the lakes of IUT and JU on 10 random days have been measured and the results are as follows (in ppm):
IUT      JU
346.55   56.73
2XX      52.34
65.48    51.26
50       44.44
49       37.25
43.48    36.79
42.46    34.18
39.97    30.29
33.5     29.4
32.9     28.65
Draw a conclusion based on the level of BOD of these two lakes with a significance
level of 0.05. Assume that the population variances are not equal.
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
                               IUT        JU
Mean                           93.834     40.133
Variance                       11550.88   107.2998233
Observations                   10         10
Hypothesized Mean Difference   0
df                             9
t Stat                         1.572776
P(T<=t) one-tail               0.075111
t Critical one-tail            1.833113
P(T<=t) two-tail               0.150221
t Critical two-tail            2.262157
Screenshot of Analysis:
Comparison: Microsoft Excel gives a t statistic of 1.572776 and a two-tailed P-value of 0.150221. The hand calculation gives a t statistic of 1.57 and a P-value between 0.1 and 0.2, i.e. greater than α=0.05. So the hand calculation and the software give the same result.
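For unequal variances, the statistic uses the separate standard errors and the Welch-Satterthwaite approximation for the degrees of freedom. A minimal sketch, assuming 2XX = 235 (which reproduces the IUT mean of 93.834); the critical value is copied from the Excel output.

```python
from math import sqrt
from statistics import mean, variance

# BOD levels in ppm, with 2XX = 235 substituted (an assumption).
iut = [346.55, 235, 65.48, 50, 49, 43.48, 42.46, 39.97, 33.5, 32.9]
ju = [56.73, 52.34, 51.26, 44.44, 37.25, 36.79, 34.18, 30.29, 29.4, 28.65]

n1, n2 = len(iut), len(ju)
se2_iut = variance(iut) / n1     # squared standard error of each sample mean
se2_ju = variance(ju) / n2

t_stat = (mean(iut) - mean(ju)) / sqrt(se2_iut + se2_ju)

# Welch-Satterthwaite degrees of freedom (Excel reports the rounded value).
df_welch = (se2_iut + se2_ju) ** 2 / (se2_iut**2 / (n1 - 1) + se2_ju**2 / (n2 - 1))

T_CRIT_TWO_TAIL = 2.262157       # t(0.025; 9), copied from the Excel output
reject_h0 = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"t = {t_stat:.6f}, df = {df_welch:.3f} (rounded: {round(df_welch)})")
print(f"reject equal means at 0.05: {reject_h0}")
```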
Multiple Linear Regression
Question: i) The data given below show stack loss from a plant oxidizing ammonia to nitric acid with respect to air flow and temperature:
Air Flow   Temperature   Stack Loss
XX+45      27            42
80         27            XX
75         25            XX
62         XX-11         28
XX+27      22            18
62         23            18
62         24            19
62         24            XX-15
XX+23      23            15
58         18            14
58         18            14
58         17            13
58         18            11
XX+23      19            12
50         18            8
50         18            7
XX+15      19            8
50         19            8
50         20            9
56         XX-15         15
70         20            15
i) Find a linear regression equation for the model.
ii) Calculate R².
iii) Calculate R.
iv) Find the adjusted R².
v) Conduct a global test of hypothesis to test whether any of the regression coefficients are not equal to zero. Use α=0.05.
XX denotes the last two digits of student ID.
Hand Calculation
Calculation using Software (Microsoft Excel)
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.952011
R Square            0.906325
Adjusted R Square   0.895916
Standard Error      3.161565
Observations        21
ANOVA
             df   SS         MS         F          Significance F
Regression   2    1740.748   870.3739   87.07661   5.55423E-10
Residual     18   179.9189   9.995496
Total        20   1920.667
            Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   -48.0197       5.016082         -9.57315   1.74E-08   -58.55811159   -37.4813    -58.5581      -37.4813
x1          0.634746       0.123677         5.132287   6.98E-05   0.374909949    0.894581    0.37491       0.894581
x2          1.279733       0.358743         3.567275   0.002202   0.526043168    2.033424    0.526043      2.033424
RESIDUAL OUTPUT
Observation   Predicted y   Residuals
1             37.31273      4.687268
2             37.31273      -2.31273
3             31.57954      3.420463
4             22.04811      5.951889
5             19.48864      -1.48864
6             20.76838      -2.76838
7             22.04811      -3.04811
8             22.04811      -2.04811
9             18.2294       -3.2294
10            11.83073      2.169271
11            11.83073      2.169271
12            10.551        2.449005
13            11.83073      -0.83073
14            13.11046      -1.11046
15            6.752764      1.247236
16            6.752764      0.247236
17            8.032498      -0.0325
18            8.032498      -0.0325
19            9.312231      -0.31223
20            13.1207       1.879295
21            22.00714      -7.00714
Screenshot of Analysis:
Comparison: All the values obtained from Excel and from the hand calculation, including the coefficients, SSE, SST, SSR, MSR, MSE, R², adjusted R², and the F-value, are almost equal to each other.
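The regression output can be reproduced without a statistics package by solving the normal equations on mean-centred data. This sketch assumes XX = 35 and takes x1 as air flow and x2 as temperature, a reading that reproduces the predicted values listed in the residual output; the variable names are illustrative.

```python
# (air flow, temperature, stack loss) with XX = 35 substituted (an assumption).
data = [
    (80, 27, 42), (80, 27, 35), (75, 25, 35), (62, 24, 28), (62, 22, 18),
    (62, 23, 18), (62, 24, 19), (62, 24, 20), (58, 23, 15), (58, 18, 14),
    (58, 18, 14), (58, 17, 13), (58, 18, 11), (58, 19, 12), (50, 18, 8),
    (50, 18, 7), (50, 19, 8), (50, 19, 8), (50, 20, 9), (56, 20, 15),
    (70, 20, 15),
]
n = len(data)
k = 2  # number of predictors

m1 = sum(x1 for x1, _, _ in data) / n
m2 = sum(x2 for _, x2, _ in data) / n
my = sum(y for _, _, y in data) / n

# Centred sums of squares and cross-products.
s11 = sum((x1 - m1) ** 2 for x1, _, _ in data)
s22 = sum((x2 - m2) ** 2 for _, x2, _ in data)
s12 = sum((x1 - m1) * (x2 - m2) for x1, x2, _ in data)
s1y = sum((x1 - m1) * (y - my) for x1, _, y in data)
s2y = sum((x2 - m2) * (y - my) for _, x2, y in data)
sst = sum((y - my) ** 2 for _, _, y in data)

# Solve the 2x2 normal equations by Cramer's rule, then recover the intercept.
det = s11 * s22 - s12**2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = my - b1 * m1 - b2 * m2

ssr = b1 * s1y + b2 * s2y        # regression sum of squares
sse = sst - ssr                  # residual sum of squares
r2 = ssr / sst
r = r2**0.5
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat = (ssr / k) / (sse / (n - k - 1))   # global F test statistic

print(f"y = {b0:.4f} + {b1:.6f} x1 + {b2:.6f} x2")
print(f"R = {r:.6f}, R2 = {r2:.6f}, adjusted R2 = {r2_adj:.6f}, F = {f_stat:.5f}")
```

Centring the predictors first keeps the linear system small and numerically well behaved; a matrix library would be the more usual choice for more than two predictors.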
Non Parametric Statistics (Sign Test)
Question: j) The arsenic level (in ppm) is routinely measured in a certain chemical product. The experiment provided the following data:
OBSERVATION   ARSENIC LEVEL
1             2.XX+0.35
2             2.XX+0.15
3             1.72
4             1.6
5             1.9
6             2.XX+0.25
7             1.3
8             1.81
9             2.XX-0.25
10            2.7
11            2.5
12            2.36
13            2.XX-0.35
14            1.75
15            1.42
16            1.81
17            2.XX-0.35
18            1.9
19            2.XX
20            1.93
21            2.39
22            1.61
Can it be claimed that the median arsenic level is below 2.5 ppm? State and test the appropriate hypothesis using the sign test with α=0.05 and also find the P-value.
Hand Calculation
Calculation using Software (Microsoft Excel)
Observation   Arsenic Level (xi)   xi-2.5   Sign
1             2.7                  0.2      1
2             2.5                  0        0
3             1.72                 -0.78    -1
4             1.6                  -0.9     -1
5             1.9                  -0.6     -1
6             2.6                  0.1      1
7             1.3                  -1.2     -1
8             1.8                  -0.7     -1
9             2.1                  -0.4     -1
10            2.7                  0.2      1
11            2.5                  0        0
12            2.36                 -0.14    -1
13            2                    -0.5     -1
14            1.75                 -0.75    -1
15            1.42                 -1.08    -1
16            1.81                 -0.69    -1
17            2                    -0.5     -1
18            1.9                  -0.6     -1
19            2.35                 -0.15    -1
20            1.93                 -0.57    -1
21            2.39                 -0.11    -1
22            1.61                 -0.89    -1
Median = 2.5
Number of Positive Signs = 3
Number of Negative Signs = 17
Total number of (+)ve and (-)ve signs = 20
Minimum between (+)ve and (-)ve signs = 3
P-Value = 0.001288414
Screenshot of Analysis:
Comparison: The P-values obtained from Microsoft Excel and from the hand calculation are 0.001288414 and 0.001288 respectively, which are approximately equal.
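The sign-test P-value is a binomial tail probability with p = 0.5 over the non-zero signs. The sketch below uses the arsenic levels as entered in the Excel table (2.XX = 2.35 assumed) and a one-sided alternative that the median is below 2.5 ppm; ties with the hypothesised median are dropped.

```python
from math import comb

# Arsenic levels (ppm) as entered in the Excel sheet, with 2.XX = 2.35 (an assumption).
levels = [2.7, 2.5, 1.72, 1.6, 1.9, 2.6, 1.3, 1.8, 2.1, 2.7, 2.5,
          2.36, 2.0, 1.75, 1.42, 1.81, 2.0, 1.9, 2.35, 1.93, 2.39, 1.61]
MEDIAN_0 = 2.5   # hypothesised median

n_plus = sum(1 for x in levels if x > MEDIAN_0)
n_minus = sum(1 for x in levels if x < MEDIAN_0)
n = n_plus + n_minus             # observations equal to the median are discarded

# H1: median < 2.5, so few positive signs are evidence against H0.
# One-sided P-value = P(R+ <= n_plus) where R+ ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(n_plus + 1)) / 2**n

print(f"positive = {n_plus}, negative = {n_minus}, n = {n}")
print(f"P-value = {p_value:.9f}, reject H0 at 0.05: {p_value < 0.05}")
```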