
Storing text file in structured array

Tags: regex, io, matlab

I have an input text file which is well structured:

START_PARAMETERS
C:\Users\admin\Desktop\Bladed_wind_generator\_wind
C:\Users\admin\Desktop\Bladed_wind_generator\reference_v_4_2.$PJ
END_PARAMETERS
---------------------------------------------------------------------------
START_DLC1-2
4 6 8 10 12 14 16 18 20 22 24 26 28 29
6
8192
600
END_DLC1-2
---------------------------------------------------------------------------
START_DLC6-1
44.8
30
8192
600
END_DLC6-1
---------------------------------------------------------------------------
START_DLC6-4
3 31 33 35
6
8192
600
END_DLC6-4
---------------------------------------------------------------------------
START_DLC7-2
2 4 6 8 10 12 14 16 18 20 22 24 
6
8192
600
END_DLC7-2
---------------------------------------------------------------------------

At the moment I read it this way:

clc,clear all,close all

f = fopen('inp.txt','rt');  % Read Input File
C = textscan(f, '%s', 'Delimiter', '\r\n');
C = C{1}; % Store Input File in a Cell
fclose(f);

Then I use regexp to locate every START_DLC/END_DLC block:

startIndx = regexp(C,'START_DLC','match');
endIndx = regexp(C,'END_DLC','match');

The aim is to store the text between each START_DLC/END_DLC pair in a structured cell array (to be called store_DLCs). For example, the result for DLC1-2 should be:

DLC1-2
4 6 8 10 12 14 16 18 20 22 24 26 28 29
6
8192
600

and so on until DLC7-2.

Could you give me some hints on how to proceed?

Thank you all in advance.

BR, Francesco

fpe asked Dec 17 '25 03:12


1 Answer

Your code so far is okay. One thing though, I'd slightly alter your computation of startIndx and endIndx to the following:

startIndx = find(~cellfun(@isempty, regexp(C, 'START_DLC', 'match')));
endIndx = find(~cellfun(@isempty, regexp(C, 'END_DLC', 'match')));

so that you would get actual indices (I've transposed them here for visual convenience), like so:

startIndx =

     6    13    20    27


endIndx =

    11    18    25    32

I'd also add an assertion to check the integrity of the input:

assert(all(size(startIndx) == size(endIndx)))
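
If you want the integrity check to catch more than a count mismatch, it could be extended along the following lines. This is only a sketch: the small cell array C below is a made-up stand-in for the one read from inp.txt, and I use strncmp instead of regexp purely for brevity.

```matlab
% Hypothetical stand-in for the cell array C produced by textscan.
C = {'START_DLC1-2'; '4 6 8'; '6'; 'END_DLC1-2'; '-----'; ...
     'START_DLC6-1'; '44.8'; '30'; 'END_DLC6-1'; '-----'};

% Lines that begin with the START/END markers.
startIndx = find(strncmp(C, 'START_DLC', 9));
endIndx   = find(strncmp(C, 'END_DLC', 7));

% Same number of START and END markers.
assert(numel(startIndx) == numel(endIndx), 'Unequal number of START and END markers');
% Every START comes before its END.
assert(all(startIndx < endIndx), 'An END marker precedes its START marker');
% Blocks do not overlap: each END comes before the next START.
assert(all(endIndx(1:end-1) < startIndx(2:end)), 'Blocks overlap');

% The name after START_ must equal the name after END_.
startNames = strrep(C(startIndx), 'START_', '');
endNames   = strrep(C(endIndx), 'END_', '');
assert(isequal(startNames, endNames), 'START and END names do not match');
```

Whether the extra checks are worth it depends on how much you trust the program that writes the input file.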

Now, with all the indices computed as above, you can proceed to extract the data into cells:

extract_dlc = @(n)({C{startIndx(n):endIndx(n) - 1}});
store_DLCs = arrayfun(extract_dlc, 1:numel(startIndx), 'UniformOutput', false)

And to "fix" the names (which are first entries) of each cell, you can do:

fix_dlc_name = @(x){strrep(x{1}, 'START_', ''), x{2:end}};
store_DLCs = cellfun(fix_dlc_name, store_DLCs,  'UniformOutput', false);

Applied to your example input, this code yields a 1-by-4 cell array of cells:

store_DLCs =

    {'DLC1-2', '4 6 8 10 12 14 16 18 20 22 24 26 28 29', '6', '8192', '600'}  
    {'DLC6-1', '44.8', '30', '8192', '600'}   
    {'DLC6-4', '3 31 33 35', '6', '8192', '600'}   
    {'DLC7-2', '2 4 6 8 10 12 14 16 18 20 22 24', '6', '8192', '600'}
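
Since the title mentions a structured array, one possible next step is to turn each cell into a struct field holding numeric values. The sketch below is an assumption about what you might want, not part of the original request: the variable DLC is a name I made up, the two blocks are shortened sample data, and the '-' in each name is replaced by '_' because struct field names cannot contain a hyphen.

```matlab
% Hypothetical stand-in for store_DLCs as computed above (two blocks only).
store_DLCs = {{'DLC1-2', '4 6 8 10', '6', '8192', '600'}, ...
              {'DLC6-1', '44.8', '30', '8192', '600'}};

DLC = struct();
for k = 1:numel(store_DLCs)
    block = store_DLCs{k};
    % 'DLC1-2' is not a valid field name, so it becomes 'DLC1_2'.
    field = strrep(block{1}, '-', '_');
    % Convert each text line to a numeric row vector
    % (sscanf returns a column, hence the transpose).
    DLC.(field) = cellfun(@(t) sscanf(t, '%f').', block(2:end), ...
                          'UniformOutput', false);
end
```

With this layout, DLC.DLC1_2{1} holds the first line of that block as a numeric row vector, and the remaining lines follow in DLC.DLC1_2{2} onwards.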

Eitan T answered Dec 19 '25 20:12