
Read CSV data in batches and Process it

Tags: c#, linq

I have a CSV file that looks like this:

#DELTA,1#    
Risk1,10
Risk2,10
Risk3,10
Risk4,10
Risk5,10
#DELTA,1#    
Risk6,10
Risk7,10
Risk8,10
Risk9,10
Risk10,10

and so on. These are very large files (on the order of gigabytes).

What I want is to read them in batches: start a StreamReader on the CSV file and read from the first line up to just before the next #DELTA line, like this:

---Batch 1---
#DELTA,1
Risk1,10
Risk2,10
Risk3,10
Risk4,10
Risk5,10
--Batch 2-----
#DELTA,1
Risk6,10
Risk7,10
Risk8,10
Risk9,10
Risk10,10
----------------------

Once I have a batch, I want to hand that subset off for processing, then come back and build the next batch, and so on until the end of the file is reached.

I have tried using LINQ's Take and TakeWhile, but with my understanding of LINQ I am not getting far.

In summary, I have to stream data in batches based on a pattern in my stream. Maybe my brain cells are dead, or maybe it is too late in the evening. I would really appreciate anyone's help.

asked Oct 31 '25 02:10 by Ash


1 Answer

The easiest approach would be to use a TextReader and call ReadLine() in a loop.

For positioning, I would just leave the reader open between processing the batches. If that's not an option, save the position of the underlying stream and restore it later.
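As a rough sketch of the leave-the-reader-open idea, an iterator can pull lines with ReadLine() and yield one batch each time it meets the next header. The names DeltaBatchReader and ReadBatches are made up for illustration, and the sketch assumes that a header is any line starting with "#DELTA", that trailing whitespace can be trimmed, and that lines before the first header can be ignored.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DeltaBatchReader
{
    // Hypothetical helper: lazily yields one batch at a time. A batch is a
    // "#DELTA" header line plus every line up to just before the next header.
    public static IEnumerable<List<string>> ReadBatches(TextReader reader)
    {
        var batch = new List<string>();
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            bool isHeader = line.StartsWith("#DELTA", StringComparison.Ordinal);

            // A new header closes the batch collected so far.
            if (isHeader && batch.Count > 0)
            {
                yield return batch;
                batch = new List<string>();
            }

            // Assumption: lines before the first header are skipped.
            if (isHeader || batch.Count > 0)
                batch.Add(line.TrimEnd());
        }

        // The last batch has no header after it, so emit it at end of input.
        if (batch.Count > 0)
            yield return batch;
    }
}
```

Because the iterator is lazy, a caller that opens a StreamReader in a using block and runs a foreach over DeltaBatchReader.ReadBatches(reader) holds only one batch in memory at a time, and the reader stays open, and positioned, while each batch is processed.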

With a StreamReader, which buffers its input so the underlying stream's position runs ahead of the last line returned, if you have to close the file you'd have to keep a line count and read and skip from the beginning again. Not too attractive.
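For comparison, the read-and-skip fallback might look roughly like the following. ResumableBatchReader and ReadNextBatch are hypothetical names, and the sketch again assumes headers start with "#DELTA". Each call reopens the file and rescans it from the top, which is why this gets expensive on multi-gigabyte inputs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ResumableBatchReader
{
    // Hypothetical helper: reopens the file on every call, skips the lines
    // already handed out, and returns the next batch (empty at end of file).
    public static List<string> ReadNextBatch(string path, ref long linesConsumed)
    {
        var batch = new List<string>();
        long index = 0;
        foreach (var line in File.ReadLines(path))
        {
            // Read-and-skip everything returned by earlier calls.
            if (index++ < linesConsumed)
                continue;

            bool isHeader = line.StartsWith("#DELTA", StringComparison.Ordinal);

            // The next header ends this batch; leave it for the next call.
            if (isHeader && batch.Count > 0)
                break;

            batch.Add(line.TrimEnd());
            linesConsumed++;
        }
        return batch;
    }
}
```

The caller keeps the linesConsumed counter between calls and stops when an empty batch comes back.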

answered Nov 01 '25 18:11 by Henk Holterman


