The Sample Mode
Part of Mike's Big Data, Data Mining, and Analytics Tutorial
The sample mode is a statistic that identifies the value that occurs most frequently in the sample. It is a suitable measure of center for nominal data. It can also be used on higher-level data (ordinal and continuous).
Given the following 20 values generated between 1 and 3:
#Get 20 random integers between 1 and 3 by rounding uniform draws
#(rounding makes 2 about twice as likely as 1 or 3)
x<-round(runif(20,1,3))
#sort and display the values
x<-x[order(x)]
x
## [1] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3
These values can be summarized as frequencies of individual values (frequency referring to the number of times each individual value appears in the set):
x_freq<-table(x)
x_freq
## x
## 1 2 3
## 4 13 3
The mode can be determined by finding which table value has the highest frequency:
names(x_freq)[which(x_freq == max(x_freq))]
## [1] "2"
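The same frequencies can be drawn as a bar chart. The following is a minimal sketch using base R's barplot; the vector x is re-entered by hand from the output above so the block runs on its own, and the axis labels and title are illustrative choices rather than anything taken from the original figure.

```r
# Re-create the 20 sorted values shown above (entered by hand, not re-sampled)
x <- c(rep(1, 4), rep(2, 13), rep(3, 3))
x_freq <- table(x)

# Draw one bar per distinct value; bar height is that value's frequency
bar_mids <- barplot(x_freq,
                    xlab = "Value",
                    ylab = "Frequency",
                    main = "Frequency of each value in x")

# Name of the value with the highest count (first one only, if there is a tie)
tallest <- names(x_freq)[which.max(x_freq)]
tallest
```

Note that which.max reports only the first maximum, so it is a convenient shortcut here but would hide a second mode if two bars were equally tall.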
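Assessed graphically, the mode is the tallest bar in a bar chart of the frequencies.
R does not have a built-in function to find the statistical mode (the base function mode() reports an object's storage mode instead); however, it is easy to find using a combination of table and which: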
names(x_freq)[which(x_freq == max(x_freq))]
## [1] "2"
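Since the same expression is needed every time a mode is wanted, it could be wrapped in a small helper. This is only a sketch: sample_mode is a hypothetical name, not a base R function, and returning every tied value (rather than just the first) is an assumption about the desired behavior. The marbles vector is made-up illustration data.

```r
# Hypothetical helper (not part of base R): return the most frequent value(s)
sample_mode <- function(v) {
  freq <- table(v)                      # count occurrences of each distinct value
  names(freq)[which(freq == max(freq))] # keep every value tied for the top count
}

# The 20 values from the example, entered by hand
x <- c(rep(1, 4), rep(2, 13), rep(3, 3))
sample_mode(x)

# Made-up nominal data with a tie: "red" and "blue" both occur twice
marbles <- c("red", "blue", "red", "blue", "green")
sample_mode(marbles)
```

Because the result comes from names(), it is always a character vector, even for numeric input; wrapping the call in as.numeric() would convert it back when the data are numbers.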