I am encountering an issue while loading a CSV data set in R. The data set can be taken from
https://data.baltimorecity.gov/City-Government/Baltimore-City-Employee-Salaries-FY2015/nsfe-bg53
I imported the data using read.csv as below, and the dataset was imported correctly.
EmpSal <- read.csv('E:/Data/EmpSalaries.csv')
I then tried reading the data using read.table, and the resulting dataset had a lot of anomalies.
EmpSal1 <- read.table('E:/Data/EmpSalaries.csv',sep=',',header = T,fill = T)
The above code started reading the data from the 7th row. The dataset actually contains ~14K rows, but only 5K rows were imported. When I looked at the dataset, in a few cases 15-20 rows were combined into a single row and the entire row's data appeared in a single column.
I can work on the dataset using read.csv, but I am curious to know why it didn't work with read.table.
read.csv is defined as:
function (file, header = TRUE, sep = ",", quote = "\"", dec = ".",
    fill = TRUE, comment.char = "", ...)
read.table(file = file, header = header, sep = sep, quote = quote,
    dec = dec, fill = fill, comment.char = comment.char, ...)
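You need to add quote="\"". By default read.table treats both double and single quotes as quoting characters (quote = "\"'"), whereas read.csv treats only double quotes as quoting characters. With the read.table default, an apostrophe inside a field opens a quoted string that runs until the next apostrophe, which can merge many rows into one. With the argument added, both calls give the same result: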
EmpSal <- read.csv('Baltimore_City_Employee_Salaries_FY2015.csv')
EmpSal1 <- read.table('Baltimore_City_Employee_Salaries_FY2015.csv', sep=',', header = TRUE, fill = TRUE, quote="\"")
identical(EmpSal, EmpSal1)
# TRUE
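To see the effect without downloading the Baltimore file, here is a minimal sketch on a made-up CSV. The column names and values are hypothetical and only chosen so that two fields contain an apostrophe. It first checks the two default quote settings, then reads the same file three ways.

```r
# Default quoting characters of the two functions
formals(read.table)$quote   # double quote and single quote
formals(read.csv)$quote     # double quote only

# Hypothetical sample data: two names contain an apostrophe
csv_lines <- c(
  "name,agency,salary",
  "O'Brien,Police,50000",
  "Smith,Fire,60000",
  "D'Angelo,Health,55000",
  "Jones,Library,40000"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Default quote set: the apostrophe in O'Brien opens a quoted string
# that is only closed by the apostrophe in D'Angelo
bad <- read.table(tmp, sep = ",", header = TRUE, fill = TRUE)

# Only the double quote is a quoting character, as in read.csv
good <- read.table(tmp, sep = ",", header = TRUE, fill = TRUE, quote = "\"")
ref  <- read.csv(tmp)

nrow(bad)             # expected to be smaller than nrow(good)
nrow(good)            # 4
identical(good, ref)  # TRUE
```

If a file reads strangely, comparing count.fields(file, sep = ",") with and without quote = "\"" is one way to spot the lines where the field count goes wrong.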