Sunday, April 24, 2016

R: How to import data

For CSV 

The utils package has a function that reads a file in CSV format and creates a data frame from it:
read.csv(file)

# Import swimming_pools.csv as pools
pools <- read.csv("swimming_pools.csv")

Be careful! To import strings as characters rather than factors, the stringsAsFactors argument must be set to FALSE. Set it to TRUE only when the strings you import represent categorical variables. (Before R 4.0.0 the default was TRUE; from R 4.0.0 on it is FALSE.)

pools <- read.csv("swimming_pools.csv", stringsAsFactors = TRUE)
str(pools)
'data.frame': 20 obs. of  4 variables:
 $ Name     : Factor w/ 20 levels "Acacia Ridge Leisure Centre",..: 1 2 3 4 5 6 19 7 8 9 ...
 $ Address  : Factor w/ 20 levels "1 Fairlead Crescent, Manly",..: 5 20 18 10 9 11 6 15 12 17 ...
 $ Latitude : num  -27.6 -27.6 -27.6 -27.5 -27.4 ...
 $ Longitude: num  153 153 153 153 153 ...

pools <- read.csv("swimming_pools.csv", stringsAsFactors = FALSE)
str(pools)
'data.frame': 20 obs. of  4 variables:
 $ Name     : chr  "Acacia Ridge Leisure Centre" "Bellbowrie Pool" "Carole Park" "Centenary Pool (inner City)" ...
 $ Address  : chr  "1391 Beaudesert Road, Acacia Ridge" "Sugarwood Street, Bellbowrie" "Cnr Boundary Road and Waterford Road Wacol" "400 Gregory Terrace, Spring Hill" ...
 $ Latitude : num  -27.6 -27.6 -27.6 -27.5 -27.4 ...
 $ Longitude: num  153 153 153 153 153 ...
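
To try both settings without the original file, here is a minimal sketch that writes a small made-up CSV to a temporary file. The file contents and the names tmp, as_factor and as_chr are assumptions for illustration only; they are not part of the pools data.

```r
# Write a tiny, made-up CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name,Size",
             "Pool A,large",
             "Pool B,small",
             "Pool C,large"), tmp)

# Strings are converted to factors
as_factor <- read.csv(tmp, stringsAsFactors = TRUE)
# Strings stay character vectors
as_chr <- read.csv(tmp, stringsAsFactors = FALSE)

class(as_factor$Size)   # "factor"
class(as_chr$Size)      # "character"

# A single column can still be converted to a factor after the import
as_chr$Size <- factor(as_chr$Size)
levels(as_chr$Size)     # "large" "small"

# Remove the temporary file
unlink(tmp)
```

Importing with stringsAsFactors = FALSE and converting only the categorical columns afterwards keeps free-text columns such as names and addresses as plain characters.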

For TXT

Tab-delimited text files are imported with a different function:
read.delim(file, header = TRUE, sep = "\t")

header = TRUE (the first row contains the field names)
sep = "\t" (fields in a record are delimited by tabs)

# Import hotdogs.txt as hotdogs
hotdogs <- read.delim("hotdogs.txt", header = FALSE, sep = "\t")

or, especially when dealing with more exotic file formats:
hotdogs <- read.table("hotdogs.txt", header = FALSE, sep = "\t")
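
As a rough sketch of what a more exotic format might look like, read.table() accepts other single-character separators through sep. The example below assumes a made-up file whose fields are separated by slashes; the data and the names tmp and dogs are illustrative only.

```r
# Write a small, made-up slash-separated file (illustrative data only)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Beef/186/495",
             "Meat/173/458",
             "Poultry/129/430"), tmp)

# sep = "/" tells read.table() how the fields are delimited;
# header = FALSE because the file has no row of field names
dogs <- read.table(tmp, header = FALSE, sep = "/",
                   stringsAsFactors = FALSE)

dim(dogs)     # 3 rows and 3 columns
names(dogs)   # "V1" "V2" "V3"

# Remove the temporary file
unlink(tmp)
```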

a) The column names can also be changed by adding col.names

> hotdogs <- read.delim("hotdogs.txt", header = FALSE)
> names(hotdogs)
[1] "V1" "V2" "V3"
> hotdogs <- read.delim("hotdogs.txt", header = FALSE, col.names = c("type", "calories", "sodium"))
> names(hotdogs)
[1] "type"     "calories" "sodium"  

b) The column types can also be changed by adding colClasses (a class of "NULL" skips that column)

hotdogs <- read.delim("hotdogs.txt", header = FALSE, col.names = c("type", "calories", "sodium"))
# Display structure of hotdogs 
str(hotdogs)
'data.frame': 54 obs. of  3 variables:
 $ type    : Factor w/ 3 levels "Beef","Meat",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ calories: int  186 181 176 149 184 190 158 139 175 148 ...
 $ sodium  : int  495 477 425 322 482 587 370 322 479 375 ...

hotdogs <- read.delim("hotdogs.txt", header = FALSE,
                      col.names = c("type", "calories", "sodium"),
                      colClasses = c("factor", "NULL", "numeric"))
# Display structure of hotdogs
str(hotdogs)
'data.frame': 54 obs. of  2 variables:
 $ type  : Factor w/ 3 levels "Beef","Meat",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ sodium: num  495 477 425 322 482 587 370 322 479 375 ...
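
The same two arguments can be tried without hotdogs.txt. This is a minimal sketch on a made-up tab-delimited file; the rows and the names tmp and dogs are assumptions for illustration, and "character" is used for the first column just to show a class other than "factor".

```r
# Write a small, made-up tab-delimited file (illustrative data only)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Beef\t186\t495",
             "Meat\t173\t458",
             "Poultry\t129\t430"), tmp)

# Name the three columns, keep the first as character,
# skip the second ("NULL") and read the third as numeric
dogs <- read.delim(tmp, header = FALSE,
                   col.names = c("type", "calories", "sodium"),
                   colClasses = c("character", "NULL", "numeric"))

names(dogs)          # "type" "sodium"
class(dogs$type)     # "character"
class(dogs$sodium)   # "numeric"

# Remove the temporary file
unlink(tmp)
```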



