
How to extract variable from string in Bash using Regex?

The string looks like this:

str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"

So far, I split the string on commas by setting IFS=',':

IFS="," read -r -a final <<< "$str1"

Then I assign the values to variables:

var1="${final[0]#*var1=}"

How can I assign the variable values in the shortest way using a regex?

Expected Output

var1=test
var2=testing
var3=testing1
Kartik M asked Sep 05 '25 20:09


2 Answers

#!/usr/bin/env bash
str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
re='(^|[[:space:]])([[:alpha:]][[:alnum:]]*)=([^, ]+)([, ]|$)(.*)'

remaining=$str1
while [[ $remaining =~ $re ]]; do
  varname=${BASH_REMATCH[2]}
  value=${BASH_REMATCH[3]}
  remaining=${BASH_REMATCH[5]}
  printf -v "$varname" %s "$value"
done

# show current values to demonstrate that variables were really assigned
declare -p var1 var2 var3

This works because =~ stores each match group in your regex in a different position in the BASH_REMATCH variable, so we're able to pick out the groups with the names and values and perform an indirect assignment (printf -v varname %s "$value" stores value in varname).
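To see the two mechanisms in isolation, here is a minimal sketch on a single pair. The input string and the simplified regex are made up for illustration and are not part of the question's data.

```bash
#!/usr/bin/env bash
# Hypothetical sample input, not taken from the question.
pair="color=blue"
re='^([[:alpha:]]+)=(.*)$'

if [[ $pair =~ $re ]]; then
  name=${BASH_REMATCH[1]}          # first group: the text before '='
  value=${BASH_REMATCH[2]}         # second group: the text after '='
  printf -v "$name" %s "$value"    # indirect assignment, same effect as color=blue
fi

echo "whole match: ${BASH_REMATCH[0]}"   # index 0 holds the entire match
echo "color is now: $color"
```

Index 0 of BASH_REMATCH always holds the full match, which is why the groups in the script above start at index 1.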

The regex has a fair bit going on, so let's break it down piece-by-piece:

  • (^|[[:space:]]) ensures that we only match content at the beginning of the string or preceded by a space.
  • ([[:alpha:]][[:alnum:]]*)=([^, ]+) matches only assignments where the left-hand side is a valid variable name (a letter, optionally followed by characters that can be either letters or numbers). Because your sample data has commas following values, we know a comma can't be allowed in a value, so we disallow both commas and spaces from being considered part of a value.
  • ([, ]|$) allows a value to terminate either with a comma or space following, or at the end of input.
  • (.*) matches any remaining content we haven't yet processed, so that content can be run against the regex on the next cycle through the loop.
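The breakdown above can be checked by tracing the loop. The following sketch reuses the same string and regex but prints the relevant groups on each pass instead of assigning variables; the pass counter is added only for this illustration.

```bash
#!/usr/bin/env bash
str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
re='(^|[[:space:]])([[:alpha:]][[:alnum:]]*)=([^, ]+)([, ]|$)(.*)'

remaining=$str1
pass=0   # illustration only: counts loop iterations
while [[ $remaining =~ $re ]]; do
  pass=$((pass + 1))
  # Groups 2, 3 and 5 are the name, the value and the unprocessed tail.
  printf 'pass %d: name=[%s] value=[%s] rest=[%s]\n' \
    "$pass" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[5]}"
  remaining=${BASH_REMATCH[5]}
done
```

Each pass consumes one assignment, and the loop stops once the tail is empty and the regex no longer matches.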
Charles Duffy answered Sep 08 '25 10:09


This is actually a case where grep with -E (extended regex matching) and -o (print only the matched text) can help, e.g.

grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str"

Results in:

var1=test
var2=testing
var3=testing1

The [[:blank:]]* on either side of the '=' just allows for spaces on either side, if present. If there is never a chance of that, you can shorten it to grep -E -o 'var[0-9]*=[^,[:blank:]]+'.
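As a rough illustration of the difference between the two patterns, the sketch below runs both against a made-up string that has blanks around the '=' sign. The input and the variable names spaced, with_blank and without_blank are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical input with blanks around '=' (not from the question).
spaced="the value of var1 = test, the value of var2= testing"

# With [[:blank:]]* the pairs are found; note the blanks are part of each match.
with_blank=$(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$spaced")

# The shortened pattern finds nothing here; '|| true' ignores grep's non-zero exit.
without_blank=$(grep -E -o 'var[0-9]*=[^,[:blank:]]+' <<< "$spaced" || true)

printf 'blanks allowed:\n%s\n' "$with_blank"
printf 'shortened pattern: [%s]\n' "$without_blank"
```

Because the matched text keeps the blanks, a name or value split out of it later may need trimming.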

Edit Per-Comment

To store the output in a variable (here var1, which will then hold all three matches, one per line), simply:

var1=$(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str")

(Or better, store each combination in an array, or create an associative array from the variable names and values themselves.) For example, to store all of the var=val combinations in an associative array, you could do:

str="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
declare -A array
while read -r line; do
    array[${line%=*}]=${line#*=}
done < <(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str")
for i in "${!array[@]}"; do
    echo "$i => ${array[$i]}"
done

Example Output (associative arrays are unordered, so the order can vary)

var1 => test
var3 => testing1
var2 => testing
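The other option mentioned above, storing each combination in a plain indexed array, could look like the following sketch. It assumes bash 4 or later for mapfile and uses the shortened pattern; the array name pairs is chosen only for this example.

```bash
#!/usr/bin/env bash
str="the value of var1=test, the value of var2=testing, the final value of var3=testing1"

# mapfile (bash 4+) puts one grep match in each array element; -t strips the newline.
mapfile -t pairs < <(grep -E -o 'var[0-9]*=[^,[:blank:]]+' <<< "$str")

# Split each pair at the first '=' into name and value.
for pair in "${pairs[@]}"; do
  echo "${pair%%=*} => ${pair#*=}"
done
```

Unlike the associative array, an indexed array keeps the pairs in the order they appear in the string.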
David C. Rankin answered Sep 08 '25 11:09