I have a 370MB zip file and the content is a 4.2GB csv file.
I did:
unzip("year2015.zip", exdir = "csv_folder")
And I got this message:
1: In unzip("year2015.zip", exdir = "csv_folder") :
  possible truncation of >= 4GB file
Have you experienced that before? How did you solve it?
I agree with @Sixiang.Hu's answer, R's unzip() won't work reliably with files greater than 4GB.
To get at how did you solve it?: I've tried a few different tricks with it, and in my experience the result of anything using R's built-ins is (almost) invariably an incorrect identification of the end-of-file (EOF) marker before the actual end of the file.
I deal with this issue in a set of files I process on a nightly basis, and to deal with it consistently and in an automated fashion, I wrote the function below to wrap the UNIX unzip. This is basically what you're doing with system(unzip()), but gives you a bit more flexibility in its behavior, and allows you to check for errors more systematically.
decompress_file <- function(directory, file, .file_cache = FALSE) {
    if (.file_cache == TRUE) {
       print("decompression skipped")
    } else {
      # Set working directory for decompression
      # simplifies unzip directory location behavior
      wd <- getwd()
      setwd(directory)
      # Run decompression
      decompression <-
        system2("unzip",
                args = c("-o", # include override flag
                         file),
                stdout = TRUE)
      # uncomment to delete archive once decompressed
      # file.remove(file) 
      # Reset working directory
      setwd(wd); rm(wd)
      # Test for success criteria
      # change the search depending on 
      # your implementation
      if (grepl("Warning message", tail(decompression, 1))) {
        print(decompression)
      }
    }
}    
Notes:
The function does a few things, which I like and recommend:
system2 over system because the documentation says "system2 is a more portable and flexible interface than system"  directory and file arguments, and moves the working directory to the directory argument; depending on your system, unzip (or your choice of decompression tool) gets really finicky about decompressing archives outside the working directory 
.file_cache argument which allows you to skip decompression
if + grepl check at the end looks for warnings in the stdout, and prints the stdout if it finds that expressionChecking ?unzip, found the following comment in Note:
It does have some support for bzip2 compression and > 2GB zip files (but not >= 4GB files pre-compression contained in a zip file: like many builds of unzip it may truncate these, in R's case with a warning if possible).
You can try to unzip it outside of R (using 7-Zip for example).
To add to the list of possible solutions, in case you have Java (JDK) available on your machine, you can wrap jar xf into an R function similar to utils::unzip() in interface, a very simple example:
unzipLarge <- function(zipfile, exdir = getwd()) {
  oldWd <- getwd()
  on.exit(setwd(oldWd))
  setwd(exdir)
  system2("jar", args = c("xf", zipfile))
}
And then use:
unzipLarge("year2015.zip", exdir = "csv_folder")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With