Assignment 7

Overview
Below is a small dataset about the decision whether or not to go skiing. The decision depends on
the attributes snow, weather, season, and physical condition, as shown in the table below.
snow      weather   season   physical condition   go skiing
sticky    foggy     low      rested               no
fresh     sunny     low      injured              no
fresh     sunny     low      rested               yes
fresh     sunny     high     rested               yes
fresh     sunny     mid      rested               yes
frosted   windy     high     tired                no
sticky    sunny     low      rested               yes
frosted   foggy     mid      rested               no
fresh     windy     low      rested               yes
fresh     windy     low      rested               yes
fresh     foggy     low      rested               yes
fresh     foggy     low      rested               yes
sticky    sunny     mid      rested               yes
frosted   foggy     low      injured              no
Questions
1. Apply Naive Bayes as the probabilistic mining algorithm on the dataset above and create
a table with counts and probabilities. The following calculations based on the smaller
dataset below are provided as an example:
snow      weather   go skiing
fresh     foggy     no
sticky    windy     no
sticky    sunny     yes
fresh     windy     yes
fresh     foggy     yes
frosted   sunny     no
This small data table leads to the following tables with counts and probabilities:
Counts:

snow      yes   no        weather   yes   no        go skiing   yes   no
fresh     2     1         sunny     1     1                     3     3
sticky    1     1         foggy     1     1
frosted   0     1         windy     1     1

Probabilities:

snow      yes   no        weather   yes   no        go skiing   yes   no
fresh     2/3   1/3       sunny     1/3   1/3                   3/6   3/6
sticky    1/3   1/3       foggy     1/3   1/3
frosted   0/3   1/3       windy     1/3   1/3
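The count and probability tables for the small example can be reproduced with a short script. The following is only a minimal sketch: the names ROWS, ATTRIBUTES, count_tables, and conditional are illustrative and not part of the assignment, and the probabilities are plain relative frequencies with no smoothing, as in the tables.

```python
from collections import Counter
from fractions import Fraction

# The six rows of the small example dataset: (snow, weather, go skiing).
ROWS = [
    ("fresh", "foggy", "no"),
    ("sticky", "windy", "no"),
    ("sticky", "sunny", "yes"),
    ("fresh", "windy", "yes"),
    ("fresh", "foggy", "yes"),
    ("frosted", "sunny", "no"),
]
ATTRIBUTES = ("snow", "weather")  # names of the first two fields of each row


def count_tables(rows):
    """Count class labels and (attribute, value, label) combinations."""
    class_counts = Counter()
    value_counts = Counter()
    for *values, label in rows:
        class_counts[label] += 1
        for attribute, value in zip(ATTRIBUTES, values):
            value_counts[(attribute, value, label)] += 1
    return class_counts, value_counts


def conditional(value_counts, class_counts, attribute, value, label):
    """Relative frequency of attribute=value among the rows with this label."""
    return Fraction(value_counts[(attribute, value, label)], class_counts[label])


class_counts, value_counts = count_tables(ROWS)
print(class_counts["yes"], class_counts["no"])                           # 3 3
print(conditional(value_counts, class_counts, "snow", "fresh", "yes"))   # 2/3
print(conditional(value_counts, class_counts, "snow", "frosted", "yes")) # 0, i.e. 0/3
```

Using Fraction keeps the values exact, so they can be compared directly with the fractions in the tables.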
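To classify the new instance "snow=fresh and weather=sunny", we calculate the likelihood of each
class by multiplying the conditional probabilities of the attribute values by the class probability:
likelihood of yes = P(fresh|yes) * P(sunny|yes) * P(yes) = 2/3 * 1/3 * 3/6 = 6/54 = 1/9
likelihood of no = P(fresh|no) * P(sunny|no) * P(no) = 1/3 * 1/3 * 3/6 = 3/54 = 1/18
(we assume that all attributes are equally important and independent of each other given the class;
that is why Naive Bayes is called Naive)
Normalizing the two likelihoods so that they sum to 1 gives the probabilities:
probability of yes = (1/9)/((1/9)+(1/18)) = 2/3
probability of no = (1/18)/((1/18)+(1/9)) = 1/3
Therefore, based on the information given in this small example, the probability of going skiing is
about 67% and the probability of not going skiing is about 33%.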
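The calculation above can be checked with a few lines of Python. This is a sketch under stated assumptions: the dictionaries PRIOR and CONDITIONAL and the functions likelihood and posterior are illustrative names, and only the probabilities needed for the instance "snow=fresh and weather=sunny" are copied from the example tables.

```python
from fractions import Fraction

# Class probabilities from the "go skiing" table of the small example.
PRIOR = {"yes": Fraction(3, 6), "no": Fraction(3, 6)}

# Conditional probabilities from the example tables, only for the
# attribute values that occur in the new instance.
CONDITIONAL = {
    ("snow", "fresh"): {"yes": Fraction(2, 3), "no": Fraction(1, 3)},
    ("weather", "sunny"): {"yes": Fraction(1, 3), "no": Fraction(1, 3)},
}


def likelihood(instance, label):
    """Class probability times the product of the conditional probabilities."""
    result = PRIOR[label]
    for attribute, value in instance.items():
        result *= CONDITIONAL[(attribute, value)][label]
    return result


def posterior(instance):
    """Normalize the likelihoods so that they sum to 1 over the class labels."""
    scores = {label: likelihood(instance, label) for label in PRIOR}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


new_instance = {"snow": "fresh", "weather": "sunny"}
print(likelihood(new_instance, "yes"))  # 1/9
print(likelihood(new_instance, "no"))   # 1/18
print(posterior(new_instance))          # yes: 2/3, no: 1/3
```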
2. Explain in your own words the terms Naive Bayes Classifier and Bayesian Belief Network.
3. Draw the Bayesian Belief Network that represents the conditional independence assumptions
of the Naive Bayes Classifier for the skiing problem. Hint: The Naive Bayes conditional
independence assumption is P(a_1, a_2, …, a_n | v_i) = P(a_1 | v_i) * P(a_2 | v_i) * … * P(a_n | v_i),
where v_i is a class label and a_1, a_2, …, a_n are dataset attributes.
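The hint connects to the worked example in question 1. Combined with Bayes' rule, the independence assumption gives the product that the example calls the likelihood, and dividing by the sum over all class labels gives the probability. A sketch in the hint's notation, where V is an assumed symbol (not used in the assignment) for the set of class labels:

```latex
P(v_i \mid a_1, a_2, \ldots, a_n)
  = \frac{P(v_i) \prod_{j=1}^{n} P(a_j \mid v_i)}
         {\sum_{v \in V} P(v) \prod_{j=1}^{n} P(a_j \mid v)}
```

For the small example, the numerator for "yes" is 3/6 * 2/3 * 1/3 = 1/9 and the denominator is 1/9 + 1/18, which gives 2/3.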