Assignment Data Exploration and Visualizations

Assignment Data Exploration and Visualizations #1
Jawad awawda
First, we import the dataset into the R project and use the (str) function to display the dataset
structure. With the option (stringsAsFactors = TRUE), text columns are read in as factors.
# Importing the dataset into R
marketing <- read.csv("marketing.csv", stringsAsFactors = TRUE)
str(marketing)
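To see what (stringsAsFactors = TRUE) changes, here is a minimal sketch. It assumes a tiny made-up CSV written to a temporary file, because marketing.csv itself is not reproduced here; the column values are hypothetical.

```r
# Hypothetical stand-in for marketing.csv (values are made up).
csv_text <- "google_adwords,pop_density
10.5,Low
20.1,High
15.3,Medium
12.0,Low"

tmp <- tempfile(fileext = ".csv")
writeLines(csv_text, tmp)

# Since R 4.0.0 the default is stringsAsFactors = FALSE, so text columns
# stay character unless factors are requested explicitly.
as_chr <- read.csv(tmp)
as_fct <- read.csv(tmp, stringsAsFactors = TRUE)

class(as_chr$pop_density)
class(as_fct$pop_density)

# Factor levels are sorted alphabetically by default.
levels(as_fct$pop_density)
str(as_fct)
```

The alphabetical level order is the reason the factor has to be re-levelled before it can be treated as ordered.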
Second, we order the factor (pop_density). To do so, we use the (factor) function with the option
(ordered = TRUE) and list the levels from lowest to highest.
# order factor for pop_density
marketing$pop_density <- factor(marketing$pop_density, ordered = TRUE,
                                levels = c("Low", "Medium", "High"))
str(marketing)
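The effect of ordering can be checked on a small vector. The sketch below uses made-up density labels (not the marketing data) and compares an unordered factor with an ordered one.

```r
# Made-up density labels, for illustration only.
density <- c("Medium", "Low", "High", "Low")

unordered <- factor(density)
ordered_f <- factor(density, ordered = TRUE,
                    levels = c("Low", "Medium", "High"))

levels(unordered)  # alphabetical order
levels(ordered_f)  # the order we specified

# Comparisons and max() are only defined for ordered factors.
below_high <- ordered_f < "High"
below_high
max(ordered_f)
```

With the unordered factor, the same comparison returns NA with a warning, which is why the (ordered = TRUE) option matters here.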
Third, we use the (summary) function for further analysis of the dataset column (google_adwords),
which will be our focus.
We notice that the (summary) function gives us six values in order (minimum, first
quartile, median, mean, third quartile, maximum)
#summary
summary(marketing$google_adwords)
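As a quick check of what (summary) returns for a numeric column, here is a sketch on a short made-up vector (not the google_adwords values).

```r
# Made-up numbers, for illustration only.
x <- c(2, 4, 4, 5, 7, 9, 11)

s <- summary(x)
s

# The result is a named vector with six entries.
names(s)
length(s)

# Individual entries can be picked out by name.
s[["Median"]]
s[["Mean"]]
```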
Fourth, to get a better understanding of the spread of the data we use the (sd, var) functions, where
the (sd) function gives the standard deviation and (var) gives the variance.
#spread of the data
sd(marketing$google_adwords)
var(marketing$google_adwords)
summary(marketing)
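The two spread measures are directly related: the standard deviation is the square root of the variance, and both use the sample (n - 1) denominator. A small sketch on made-up numbers (not the marketing data) confirms this.

```r
# Made-up numbers, for illustration only.
x <- c(2, 4, 4, 5, 7, 9, 11)
n <- length(x)

# Sample variance computed by hand with the (n - 1) denominator.
manual_var <- sum((x - mean(x))^2) / (n - 1)

manual_var
var(x)

# The standard deviation is the square root of the variance.
sqrt(var(x))
sd(x)
```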
Fifth, graphical visualisation of the (anscombe) dataset. This dataset ships with R, so we load it
into the R project with the (data) function and then inspect it.
To get descriptive statistics for each column, we apply the (mean, sd, var) functions to the
dataset using the function (sapply).
data("anscombe")
anscombe
sapply(anscombe, mean)
sapply(anscombe, sd)
sapply(anscombe, var)
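The summary statistics above are nearly the same for each x column and for each y column, so the differences between the four pairs only show up when they are plotted. The following is one possible sketch: it computes the correlation of each pair and draws the four scatter plots in a 2 x 2 grid. The names (cors) and (old_par) are our own.

```r
data("anscombe")

# Correlation of each pair: x1 with y1, x2 with y2, and so on.
cors <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
round(cors, 3)

# Draw the four pairs in a 2 x 2 grid, then restore the old layout.
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(old_par)
```

The correlations agree to about two decimal places, while the four plots look clearly different from each other.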
Sixth, data inspection using graphical techniques
1- using (plot) function:
plot(marketing$pop_density)
2- using (boxplot) function:
boxplot(marketing$google_adwords, ylab = "Expenditures")
3- using (hist) function on google_adwords:
hist(marketing$google_adwords, main = NULL)
4- using (hist) function on facebook:
hist(marketing$facebook, main = NULL)
5- using (hist) function on revenues:
hist(marketing$revenues, main = NULL)
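The numbers behind a box plot and a histogram can also be inspected without drawing anything. The sketch below uses the built-in (anscombe$y3) column instead of the marketing data, because marketing.csv is not reproduced here; the names (bs), (h) and (old_par) are our own.

```r
data("anscombe")
y <- anscombe$y3  # built-in data standing in for a marketing column

# Values used to draw the box plot: lower whisker, lower hinge,
# median, upper hinge, upper whisker, plus any points beyond the whiskers.
bs <- boxplot.stats(y)
bs$stats
bs$out

# Histogram bins computed without plotting.
h <- hist(y, plot = FALSE)
h$breaks
h$counts

# Box plot and histogram side by side, then restore the old layout.
old_par <- par(mfrow = c(1, 2))
boxplot(y, ylab = "y3")
hist(y, main = NULL, xlab = "y3")
par(old_par)
```

Looking at (bs$out) next to the box plot makes it easier to say which observations are drawn as separate points.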