
Read from SAS to R for only a subset of rows

Tags: r, sas

I have a very large dataset in SAS (> 6 million rows) that I'm trying to read into R. For this purpose, I'm using read_sas from the haven package.

However, because the dataset is so large, I'd like to split it into subsets (e.g., 12 subsets of 500,000 rows each) and then read each subset into R separately. Is there a way to do this? Any input is highly appreciated!

user16019699 asked Jan 23 '26 20:01



1 Answer

Is there any way you can split the data in SAS beforehand?

read_sas has skip and n_max arguments, so if your chunk size is N = 5e5, you should be able to read the i-th chunk of data with read_sas(..., skip = (i-1)*N, n_max = N). (There will presumably be some performance penalty for skipping rows, but I don't know how bad it will be.)
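The chunking idea above could be wrapped up roughly as follows. This is only a sketch: the helper names (chunk_skip, read_sas_chunk, for_each_chunk), the path argument, and the process callback are made up for illustration, and it assumes that read_sas returns a zero-row result once skip runs past the end of the file.

```r
# Sketch: read a SAS file in fixed-size chunks via haven::read_sas().
# All helper names and arguments here are illustrative assumptions.

# Number of rows to skip before the i-th chunk (i is 1-based).
chunk_skip <- function(i, chunk_size) {
  (i - 1) * chunk_size
}

# Read only the i-th chunk; the result has at most chunk_size rows.
read_sas_chunk <- function(path, i, chunk_size = 5e5) {
  haven::read_sas(path, skip = chunk_skip(i, chunk_size), n_max = chunk_size)
}

# Read chunk after chunk, passing each one to `process(chunk, i)`.
# Stops at the first empty or short chunk, so the total row count
# does not need to be known in advance.
for_each_chunk <- function(path, process, chunk_size = 5e5) {
  i <- 1
  repeat {
    chunk <- read_sas_chunk(path, i, chunk_size)
    if (nrow(chunk) == 0) break          # assumed behaviour past end of file
    process(chunk, i)
    if (nrow(chunk) < chunk_size) break  # last, partial chunk
    i <- i + 1
  }
  invisible(i)
}
```

For example, for_each_chunk("big.sas7bdat", function(chunk, i) saveRDS(chunk, sprintf("chunk_%02d.rds", i))) would write each chunk to its own file ("big.sas7bdat" is a placeholder name). Processing one chunk at a time, rather than collecting them all in a list, keeps only one chunk in memory at once.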

Ben Bolker answered Jan 25 '26 13:01



