More Data Exploration in R (part II)

Created by

Rischan Mafrur

Chonnam National University of South Korea

http://rischanlab.github.io

May 28, 2014

This page continues from the first part, which showed how to use R commands to explore data. The first part is available at http://rischanlab.github.io/R/ExploreData.html

Exploring multiple variables (iris data)

Compute the covariance between two variables

cov(iris$Sepal.Length,iris$Sepal.Width)
## [1] -0.04243
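
As a check on what cov() returns, the sample covariance can also be computed by hand. This is a minimal sketch, assuming the default Pearson method with the n - 1 denominator; the names x, y, n and manual_cov are only illustrative and are not part of the original tutorial.

```r
# Sample covariance computed directly from its definition
x <- iris$Sepal.Length
y <- iris$Sepal.Width
n <- length(x)

# Sum of products of deviations from the means, divided by n - 1
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Should agree with the built-in function up to floating-point error
all.equal(manual_cov, cov(x, y))
```

The negative sign means that, over the whole data set, longer sepals tend to go with slightly narrower ones.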

Compute the covariance matrix of all variables except Species

cov(iris[,1:4])
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length      0.68569    -0.04243       1.2743      0.5163
## Sepal.Width      -0.04243     0.18998      -0.3297     -0.1216
## Petal.Length      1.27432    -0.32966       3.1163      1.2956
## Petal.Width       0.51627    -0.12164       1.2956      0.5810

You can also compute the correlation matrix with cor()

cor(iris[,1:4])
##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length       1.0000     -0.1176       0.8718      0.8179
## Sepal.Width       -0.1176      1.0000      -0.4284     -0.3661
## Petal.Length       0.8718     -0.4284       1.0000      0.9629
## Petal.Width        0.8179     -0.3661       0.9629      1.0000
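
Correlation is covariance rescaled by the standard deviations of the two variables, so the two matrices above are directly related. The sketch below illustrates this with cov2cor() from the stats package; the names cov_matrix, cor_from_cov and r_manual are illustrative only.

```r
# Covariance matrix of the four numeric columns
cov_matrix <- cov(iris[, 1:4])

# cov2cor() rescales a covariance matrix into a correlation matrix
cor_from_cov <- cov2cor(cov_matrix)

# The same rescaling for a single pair of variables, written out
r_manual <- cov(iris$Petal.Length, iris$Petal.Width) /
  (sd(iris$Petal.Length) * sd(iris$Petal.Width))

# Both routes should match cor() up to floating-point error
all.equal(cor_from_cov, cor(iris[, 1:4]))
round(r_manual, 4)
```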

Aggregate one variable by group. For example, to compute summary statistics of Sepal.Length for each Species:

aggregate(iris$Sepal.Length ~ iris$Species, summary, data=iris)
##   iris$Species iris$Sepal.Length.Min. iris$Sepal.Length.1st Qu.
## 1       setosa                   4.30                      4.80
## 2   versicolor                   4.90                      5.60
## 3    virginica                   4.90                      6.22
##   iris$Sepal.Length.Median iris$Sepal.Length.Mean
## 1                     5.00                   5.01
## 2                     5.90                   5.94
## 3                     6.50                   6.59
##   iris$Sepal.Length.3rd Qu. iris$Sepal.Length.Max.
## 1                      5.20                   5.80
## 2                      6.30                   7.00
## 3                      6.90                   7.90
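
The formula can also refer to columns by name, since data = iris already tells aggregate() where to look, and cbind() on the left-hand side aggregates several variables at once. A small sketch, using mean instead of summary to keep the result compact; species_means is an illustrative name.

```r
# Refer to columns by name; data = iris supplies the lookup
# cbind() on the left-hand side aggregates two variables in one call
species_means <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                           data = iris, FUN = mean)
species_means
```

Written this way, the result columns are named Species, Sepal.Length and Sepal.Width rather than carrying the iris$ prefix.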

Draw a boxplot of Sepal.Length for each Species

boxplot(Sepal.Length ~ Species, data = iris)

[Plot: boxplot of Sepal.Length by Species]

Scatter plot of two numeric variables

with(iris, plot(Sepal.Length, Sepal.Width, col=Species, pch=as.numeric(Species)))

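[Plot: scatter plot of Sepal.Length against Sepal.Width, colored and marked by Species]

Jitter: add a small amount of noise so that overlapping points become visible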

plot(jitter(iris$Sepal.Length), jitter(iris$Sepal.Width))

[Plot: jittered scatter plot of Sepal.Length against Sepal.Width]
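
Because jitter() draws random noise, the plot changes slightly on every run. The sketch below fixes the random seed and uses the factor argument of jitter() to control how far points move; the seed value, the factor of 2, and the names jittered_length and max_shift are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed so the added noise is reproducible

# jitter() adds a small amount of uniform random noise to each value;
# a larger factor spreads the points further apart
jittered_length <- jitter(iris$Sepal.Length, factor = 2)

# Largest distance any value was moved from its original position
max_shift <- max(abs(jittered_length - iris$Sepal.Length))
max_shift

plot(jittered_length, jitter(iris$Sepal.Width, factor = 2),
     xlab = "Sepal.Length (jittered)", ylab = "Sepal.Width (jittered)")
```

The measurements in iris are recorded to one decimal place, so many flowers share exactly the same coordinates; the shift stays well below that 0.1 step and does not distort the overall pattern.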

Produce a matrix of scatter plots

pairs(iris)

[Plot: scatter plot matrix from pairs(iris)]

or

plot(iris)

[Plot: scatter plot matrix from plot(iris)]
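
Both calls include Species in the matrix, where the factor is shown as its integer codes. One possible variation, sketched here, restricts the matrix to the four numeric columns and uses Species for the point color instead; species_colors is an illustrative name and pch = 19 is an arbitrary choice of marker.

```r
# Integer codes 1, 2, 3 of the Species factor index the default palette
species_colors <- as.integer(iris$Species)

# Scatter plot matrix of the numeric columns only, colored by species
pairs(iris[, 1:4], col = species_colors, pch = 19)
```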