Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to save a CSV file with R with line breaks that Notepad will recognize?

Tags:

r

I'm sorry to bother you with probably an encoding question. Spending couple of hours without getting the solution I decided to post it here. I'm trying to write a simple table unsuccessfully using write.table, write.csv,write.csv2from Ubuntu 14.04. My data is kind of messy resulting from a cronjob:

ID <- c("",30,26,20,30,40,5,10,4)
b <- c("",2233,12,2,22,13,23,23,100)
c <- c("","","","","","","","","")
d <- c("","","","","","","","","")
e <- c("","","","","","800","","","")
f <- c("","","","","","","","","")
g <- c("","","","","","","","EA","")
h <- c("","","","","","","","","")
df <- data.frame(ID,b,c,d,e,f,g,h)

  # change columns to chr
for(i in c(1,2:ncol(df))) {
 df[,i] <- as.character(df[,i])
}

str(df)

# data.frame':  9 obs. of  8 variables:
#  $ ID: chr  "" "30" "26" "20" ...
#  $ b : chr  "" "2233" "12" "2" ...
#  $ c : chr  "" "" "" "" ...
#  $ d : chr  "" "" "" "" ...
#  $ e : chr  "" "" "" "" ...
#  $ f : chr  "" "" "" "" ...
# $ g :  chr  "" "" "" "" ...
#  $ h : chr  "" "" "" "" ...


head(df,n=9)

    ID    b c d   e   f  g  h
# 1                       
# 2 30 2233               
# 3 26   12               
# 4 20    2               
# 5 30   22               
# 6 40   13      800       
# 7  5   23               
# 8 10   23              EA  
# 9  4  100               

I have tried different combinations and suggestions found on SO, however nothing worked. The result is always somehow displaced instead of long its wide. In the current example ist just one long row.

I tried:

write.table(df,"df.csv",row.names = FALSE, dec=".",sep=";")
write.table(df,"df.csv",row.names = FALSE,dec=".",sep=";", col.names = T)
write.table(df,"df.csv",row.names = FALSE,sep=";",fileEncoding = "UTF-8")
write.table(df,"df.csv",row.names = FALSE,fileEncoding = "UTF-8")

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
[1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C                   LC_TIME=de_DE.UTF-8       
 [4] LC_COLLATE=de_DE.UTF-8     LC_MONETARY=de_DE.UTF-8        LC_MESSAGES=de_DE.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8   LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.4.3   DBI_0.4-1     RGA_0.4.2     RMySQL_0.11-3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5     lubridate_1.5.6 digest_0.6.9    assertthat_0.1    R6_2.1.2       
 [6] plyr_1.8.3      jsonlite_1.0    magrittr_1.5    httr_1.1.0         stringi_1.1.1  
[11] curl_0.9.7      tools_3.3.1     stringr_1.0.0   parallel_3.3.1 

Wrong output as pic:

enter image description here

Correct output results from the same data on : enter image description here

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
[![enter image description here][2]][2]
like image 377
Mamba Avatar asked Oct 17 '25 00:10

Mamba


1 Answers

The problem isn't R or Ubuntu it is notepad. Specifically, it expects "\r\n" for line breaks whereas most other text readers are happy with "\n" which is the default line break used by write.xxx.

If you add the parameter eol="\r\n" then you should be able to open in Notepad and see the expected line breaks.

For instance:

write.table(df,"df.csv",row.names = FALSE, dec=".",sep=";",eol="\r\n")
like image 89
Dean MacGregor Avatar answered Oct 19 '25 14:10

Dean MacGregor