Uploaded by ella xo

rstudio

# Given data
Car_Age <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Price <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
# Step 1: Construct a table with required calculations
X <- Car_Age
Y <- Price
XY <- X * Y
X_squared <- X^2
Y_squared <- Y^2
Y_hat <- predict(lm(Y ~ X))
Y_minus_Yhat <- Y - Y_hat
Y_minus_Yhat_squared <- (Y - Y_hat)^2
X_minus_mean_X_squared <- (X - mean(X))^2
# Combine everything into a data frame
data_table <- data.frame(X, Y, XY, X_squared, Y_squared, Y_hat, Y_minus_Yhat,
Y_minus_Yhat_squared, X_minus_mean_X_squared)
# Calculate sum and mean for each column
sum_and_mean <- data.frame(Sum = colSums(data_table), Mean =
colMeans(data_table))
print(sum_and_mean)
##                                 Sum         Mean
## X                      8.200000e+01 7.454545e+00
## Y                      4.500000e+04 4.090909e+03
## XY                     2.959000e+05 2.690000e+04
## X_squared              6.900000e+02 6.272727e+01
## Y_squared              2.058800e+08 1.871636e+07
## Y_hat                  4.500000e+04 4.090909e+03
## Y_minus_Yhat           2.728484e-12 2.480440e-13
## Y_minus_Yhat_squared   1.915901e+06 1.741728e+05
## X_minus_mean_X_squared 7.872727e+01 7.157025e+00
# Step 2: Make a scatter plot
plot(X, Y, main = "Scatter Plot of Car Age vs. Price",
     xlab = "Car Age (years)", ylab = "Price ($)")
# Step 3: Coefficient of Correlation
correlation <- cor(X, Y)
print(paste("Coefficient of Correlation:", correlation))
## [1] "Coefficient of Correlation: -0.955023898011197"
# Interpretation: The correlation coefficient measures the strength and
# direction of the linear relationship between car age and price. A value close
# to 1 indicates a strong positive linear relationship, a value close to -1 a
# strong negative one, and a value close to 0 little to no linear relationship.
# Here r is about -0.955, so price falls almost linearly as car age increases.
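# As a cross-check, the same coefficient can be computed by hand from the
# Step 1 column sums. This is a minimal sketch; the names S_xy, S_xx, S_yy and
# r_manual are illustrative and not part of the original analysis.

```r
# Data repeated so the block runs on its own
X <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Y <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
n <- length(X)

# Corrected sums of products and squares, built from the Step 1 sums
S_xy <- sum(X * Y) - sum(X) * sum(Y) / n
S_xx <- sum(X^2) - sum(X)^2 / n
S_yy <- sum(Y^2) - sum(Y)^2 / n

# Pearson correlation: r = S_xy / sqrt(S_xx * S_yy)
r_manual <- S_xy / sqrt(S_xx * S_yy)
print(r_manual)

# Should agree with the built-in cor()
print(all.equal(r_manual, cor(X, Y)))
```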
# Step 4: Test the significance of the correlation coefficient
cor.test(X, Y)
##
##  Pearson's product-moment correlation
##
## data:  X and Y
## t = -9.662, df = 9, p-value = 4.76e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9885586 -0.8315259
## sample estimates:
##        cor
## -0.9550239
# The p-value of the test indicates whether the correlation is statistically
# significant. If it is less than the chosen significance level (e.g., 0.05),
# the correlation coefficient is significantly different from zero. Here
# p = 4.76e-06 < 0.05, so the correlation is significant.
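# The test statistic reported by cor.test() can be reproduced from r alone,
# using t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom.
# A minimal sketch; t_stat and p_value are illustrative names.

```r
# Data repeated so the block runs on its own
X <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Y <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
n <- length(X)
r <- cor(X, Y)

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

print(c(t = t_stat, df = n - 2, p = p_value))
```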
# Step 5: Regression equation
model <- lm(Y ~ X)
summary(model)
##
## Call:
## lm(formula = Y ~ X)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -824.1 -166.9  180.7  329.5  473.4
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   7836.3      411.8  19.027 1.41e-08 ***
## X             -502.4       52.0  -9.662 4.76e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 461.4 on 9 degrees of freedom
## Multiple R-squared:  0.9121, Adjusted R-squared:  0.9023
## F-statistic: 93.35 on 1 and 9 DF,  p-value: 4.76e-06
# The coefficients in the summary output give the regression equation:
# Y_hat = 7836.3 - 502.4 * X. Each additional year of age lowers the predicted
# price by about $502.
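# The least-squares coefficients can also be obtained directly from the
# Step 1 quantities: slope = S_xy / S_xx and intercept = mean(Y) - slope * mean(X).
# A minimal sketch; b0 and b1 are illustrative names.

```r
# Data repeated so the block runs on its own
X <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Y <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
n <- length(X)

# Corrected sums of products and squares
S_xy <- sum(X * Y) - sum(X) * sum(Y) / n
S_xx <- sum(X^2) - sum(X)^2 / n

# Slope and intercept of the fitted line
b1 <- S_xy / S_xx
b0 <- mean(Y) - b1 * mean(X)
print(c(intercept = b0, slope = b1))

# Compare with the coefficients from lm()
print(coef(lm(Y ~ X)))
```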
# Step 6: Test the significance of the predictor variable
summary(model)$coefficients
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 7836.2587  411.84220 19.027333 1.408756e-08
## X           -502.4249   51.99992 -9.662034 4.760226e-06
# This gives the coefficients, standard errors, t-values, and p-values. If the
# p-value for X (car age) is below the chosen significance level (e.g., 0.05),
# X is a significant predictor of Y. Here p = 4.76e-06, so car age is a
# significant predictor of price.
# Step 7: Predict the price if the car ages are 6 and 10.5 years
new_data <- data.frame(X = c(6, 10.5))
predictions <- predict(model, newdata = new_data)
print(predictions)
##        1        2
## 4821.709 2560.797
# Interpretation: The predicted prices for car ages of 6 and 10.5 years are
# given by the model. For a car age of 6 years, the predicted price is
# approximately $4822, and for a car age of 10.5 years, the predicted price is
# approximately $2561.
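# The same predictions follow from plugging the ages into the fitted equation
# Y_hat = b0 + b1 * X. A minimal sketch; fit, ages and manual_pred are
# illustrative names.

```r
# Data repeated so the block runs on its own
X <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Y <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
fit <- lm(Y ~ X)

# Intercept and slope as plain numbers
b0 <- unname(coef(fit)["(Intercept)"])
b1 <- unname(coef(fit)["X"])

# Evaluate the regression line at the two ages of interest
ages <- c(6, 10.5)
manual_pred <- b0 + b1 * ages
print(manual_pred)
```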
# Step 8: Construct the 90% confidence interval of the average car price
conf_int <- predict(model, interval = "confidence", level = 0.90)
print(conf_int)
##         fit      lwr      upr
## 1  5826.559 5410.068 6243.049
## 2  5826.559 5410.068 6243.049
## 3  5324.134 4978.052 5670.216
## 4  5324.134 4978.052 5670.216
## 5  4319.284 4060.619 4577.949
## 6  4319.284 4060.619 4577.949
## 7  3816.859 3556.602 4077.116
## 8  3314.434 3019.931 3608.937
## 9  2812.009 2460.010 3164.008
## 10 2309.584 1886.209 2732.959
## 11 1807.159 1304.405 2309.914
# Interpretation: Because no newdata is supplied, each row is the 90%
# confidence interval for the mean car price at the corresponding observed car
# age. For example, row 7 (car age 8 years) shows that we are 90% confident the
# mean price of 8-year-old cars lies between approximately $3557 and $4077.
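# To get the 90% confidence interval of the mean price at specific ages rather
# than at every observed age, newdata can be passed to predict(). A minimal
# sketch; the ages 6, 8 and 10.5 and the names fit, ages_of_interest and
# ci_at_ages are chosen here for illustration.

```r
# Data repeated so the block runs on its own
X <- c(4, 4, 5, 5, 7, 7, 8, 9, 10, 11, 12)
Y <- c(6300, 5800, 5700, 4500, 4500, 4200, 4100, 3100, 2100, 2500, 2200)
fit <- lm(Y ~ X)

# Ages at which the mean price interval is wanted
ages_of_interest <- data.frame(X = c(6, 8, 10.5))

# interval = "confidence" gives the interval for the mean response;
# interval = "prediction" would give the wider interval for a single car
ci_at_ages <- predict(fit, newdata = ages_of_interest,
                      interval = "confidence", level = 0.90)
print(cbind(ages_of_interest, ci_at_ages))
```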