AWK print all regex matches on every line

Question

I have the following text input:

lorem <a> ipsum <b> dolor <c> sit amet,
consectetur <d> adipiscing elit <e>, sed 
do eiusmod <f> tempor
incididunt ut

As the text shows, the number of <?> occurrences is not fixed: a tag can appear zero or more times on a line.

Using only awk, I need to output this:

<a> <b> <c>
<d> <e>
<f>

I tried this awk script:

awk '{
  match($0,/<[^>]+>/,a);           # fill array a with matches
  for (i in a) {
    if (match(i, /^[0-9]+$/) != 0) # ignore non-numeric indices
      print a[i]
  }
}' somefile.txt

but this only outputs the first match on every line:

<a>
<d>
<f>

Is there some way of doing this with match() or any other built-in function?

RavinderSingh13 · Accepted Answer

With GNU awk, you can use its built-in FPAT variable, which defines fields by what they contain rather than by what separates them. Try the following code:

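awk -v FPAT='<[^>]*>' '
NF{                            # skip lines with no matching fields
  val=""
  for(i=1;i<=NF;i++){          # each field is one <...> match
    val=(val?val OFS:"") $i    # join matches with OFS (a space by default)
  }
  print val
}
'  Input_file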
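
FPAT is specific to GNU awk. Where gawk is not available, a loop over the two-argument match() should give the same output, which also addresses the question's request for a match()-based approach. This is a minimal sketch rather than part of the original answer; the shell function name extract_tags is hypothetical and only wraps the awk program so it can be reused.

```shell
# extract_tags is a hypothetical wrapper name, not from the original answer.
extract_tags() {
  awk '{
    line = $0
    out = ""
    # match() sets RSTART and RLENGTH for the leftmost match in line
    while (match(line, /<[^>]+>/)) {
      out = (out == "" ? "" : out OFS) substr(line, RSTART, RLENGTH)
      line = substr(line, RSTART + RLENGTH)   # keep only the text after the match
    }
    if (out != "") print out                  # skip lines with no matches
  }' "$@"
}

# Usage with the sample input from the question
printf '%s\n' \
  'lorem <a> ipsum <b> dolor <c> sit amet,' \
  'consectetur <d> adipiscing elit <e>, sed ' \
  'do eiusmod <f> tempor' \
  'incididunt ut' | extract_tags
```

Each pass takes the leftmost match and then shortens the line to the text after it, so the loop ends as soon as no tag remains.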

glenn jackman · Answer

Assuming there are no stray angle brackets, use either < or > as a field separator and print every second field:

awk -F'[<>]' '{for (i=2; i <= NF; i += 2) {printf "<%s> ", $i}; print ""}' data
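
Note that the one-liner above leaves a trailing space after the last tag and prints an empty line for input lines that contain no tags. A variation on the same idea that builds the output string first avoids both; this is a sketch under the same no-stray-brackets assumption, and the function name tags_by_separator is hypothetical.

```shell
# tags_by_separator is a hypothetical wrapper name used for illustration.
tags_by_separator() {
  awk -F'[<>]' '{
    out = ""
    # with < and > as separators, every even-numbered field is a tag body
    for (i = 2; i <= NF; i += 2)
      out = (out == "" ? "" : out OFS) "<" $i ">"
    if (out != "") print out    # skip lines with no tags
  }' "$@"
}

# Usage with the sample input from the question
printf '%s\n' \
  'lorem <a> ipsum <b> dolor <c> sit amet,' \
  'consectetur <d> adipiscing elit <e>, sed ' \
  'do eiusmod <f> tempor' \
  'incididunt ut' | tags_by_separator
```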
