Error in reading a CSV file with read.table()

Tags:

r

I am encountering an issue while loading a CSV data set in R. The data set can be taken from

https://data.baltimorecity.gov/City-Government/Baltimore-City-Employee-Salaries-FY2015/nsfe-bg53

I imported the data using read.csv as below and the dataset was imported correctly.

EmpSal <- read.csv('E:/Data/EmpSalaries.csv')

I tried reading the data using read.table and there were a lot of anomalies when looking at the dataset.

EmpSal1 <- read.table('E:/Data/EmpSalaries.csv',sep=',',header = T,fill = T)

With the code above, the data appeared to start from the 7th row, and only about 5K of the roughly 14K rows in the dataset were imported. On inspection, in some cases 15-20 rows had been combined into a single row, with all of their data appearing in a single column.

I can work on the dataset using read.csv but I am curious to know the reason why it didn't work with read.table.

mockash asked Sep 07 '25 22:09


1 Answer

read.csv is defined as:

function (file, header = TRUE, sep = ",", quote = "\"", dec = ".", 
    fill = TRUE, comment.char = "", ...) 
read.table(file = file, header = header, sep = sep, quote = quote, 
    dec = dec, fill = fill, comment.char = comment.char, ...)

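You need to add quote="\"". By default, read.table treats both double and single quotes as quoting characters (quote = "\"'"), whereas read.csv treats only double quotes that way. An apostrophe inside a field therefore opens a quoted string in read.table, and everything up to the next apostrophe, including line breaks, is read as a single value. That is why many rows end up merged into one column.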

EmpSal <- read.csv('Baltimore_City_Employee_Salaries_FY2015.csv')
EmpSal1 <- read.table('Baltimore_City_Employee_Salaries_FY2015.csv', sep=',', header = TRUE, fill = TRUE, quote="\"")
identical(EmpSal, EmpSal1)
# TRUE
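
To see the effect without downloading the file, here is a minimal sketch using a made-up four-line CSV (the names and agencies are hypothetical, not taken from the Baltimore data). It assumes the real file contains apostrophes inside some fields, which is the likely trigger but is not confirmed in the question.

```r
# Hypothetical data: the apostrophes in "O'Brien" and "State's" are
# ordinary characters, not CSV quoting.
csv_lines <- c(
  "Name,Agency",
  "O'Brien,Police",
  "Smith,Fire",
  "Jones,State's Attorney"
)

# read.table default: quote = "\"'", so ' is treated as a quote character
# and the text between the two apostrophes is read as one quoted string.
bad <- read.table(text = csv_lines, sep = ",", header = TRUE, fill = TRUE)

# Restricting quoting to double quotes, as read.csv does.
good <- read.table(text = csv_lines, sep = ",", header = TRUE, fill = TRUE,
                   quote = "\"")

nrow(bad)   # fewer rows than expected, because rows were merged
nrow(good)  # one row per data line
```

If fields never use quoting at all, quote = "" disables it entirely.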
lebatsnok answered Sep 10 '25 01:09