Correlation

Read data; plot data; calculate correlation

1) Read data

Download the example dataset, saved as a tab-separated text file (.tsv).

# read dataset (tab-separated text file; "\t" is the tab character)

mydata <- read.table('Downloads/data_2x20.tsv', sep = "\t", header = TRUE)

# check dataset 'mydata'

head(mydata) # show top lines of dataset

         height width
sample01  6.576 3.644
sample02  6.379 3.110
sample03 10.542 4.213
sample04  4.543 2.954
sample05  6.092 3.248
sample06  8.804 3.907

dim(mydata) # get size of dataset (20 rows, 2 columns)

[1] 20 2
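
The sample names show up as row names rather than as a third column. This is presumably because the header line of the file has one field fewer than the data lines, in which case read.table uses the first column as row names. A minimal sketch of that behaviour, assuming a temporary file that holds only the six rows printed by head() above (the temporary file and the name `demo` are illustrative and not part of the original dataset):

```r
# Illustrative only: rebuild the six rows shown by head() in a temporary file.
tmp <- tempfile(fileext = ".tsv")
writeLines(c(
  "height\twidth",              # header: two fields
  "sample01\t6.576\t3.644",     # data lines: three fields each
  "sample02\t6.379\t3.110",
  "sample03\t10.542\t4.213",
  "sample04\t4.543\t2.954",
  "sample05\t6.092\t3.248",
  "sample06\t8.804\t3.907"
), tmp)

# Same call as in the tutorial, pointed at the temporary file.
demo <- read.table(tmp, sep = "\t", header = TRUE)

rownames(demo)  # sample names taken from the first column
dim(demo)       # number of rows and columns
str(demo)       # column types; both columns should be numeric
```

Running str() right after reading is a quick way to catch columns that were read as text instead of numbers.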

2) Plot height vs. width

plot(height ~ width, data = mydata, cex = 1.5, pch = 21, bg = 'blue')

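# height ~ width puts 'width' on the x-axis and 'height' on the y-axis
# cex = 1.5: symbols 1.5 times larger than default; pch = 21: fillable circle; bg = 'blue': blue fill color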
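
The same plot call can also write to a file instead of the screen. A minimal sketch, assuming only the six rows shown by head() in step 1 (the data frame `demo`, the output file, and the axis labels and title are illustrative choices, not part of the original tutorial):

```r
# Illustrative data: the six rows shown by head() in step 1.
demo <- data.frame(
  height = c(6.576, 6.379, 10.542, 4.543, 6.092, 8.804),
  width  = c(3.644, 3.110, 4.213, 2.954, 3.248, 3.907)
)

# Open a PDF graphics device, draw the scatter plot, then close the device.
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(height ~ width, data = demo,
     cex = 1.5, pch = 21, bg = "blue",   # same symbol settings as above
     xlab = "width", ylab = "height",    # explicit axis labels
     main = "height vs. width")
dev.off()

file.exists(out)  # check that the plot file was written
```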

3) Calculate correlation

cor( mydata$height , mydata$width )

[1] 0.9304076

Result: 'height' and 'width' are strongly positively correlated (Pearson correlation coefficient r = 0.93; Pearson is the default method of cor()).
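
To see what cor() computes, the Pearson coefficient can be worked out by hand: the sum of the products of the deviations from the means, divided by the square root of the product of the summed squared deviations. A minimal sketch, assuming only the six rows shown by head() in step 1, so the value differs from the 0.93 obtained on all 20 rows (the variable names are illustrative):

```r
# Illustrative data: the six rows shown by head() in step 1.
height <- c(6.576, 6.379, 10.542, 4.543, 6.092, 8.804)
width  <- c(3.644, 3.110, 4.213, 2.954, 3.248, 3.907)

# Deviations of each value from its column mean.
dh <- height - mean(height)
dw <- width - mean(width)

# Pearson correlation coefficient computed from its definition.
r_manual <- sum(dh * dw) / sqrt(sum(dh^2) * sum(dw^2))

# Built-in function; method = "pearson" is the default, spelled out here.
r_builtin <- cor(height, width, method = "pearson")

all.equal(r_manual, r_builtin)  # the two values agree up to rounding error
```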