I have a script that is trying to get blocks of information from gparted.
My data looks like this:
Disk /dev/sda: 42.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Number  Start   End     Size    Type     File system     Flags
 1      1049kB  316MB   315MB   primary  ext4            boot
 2      316MB   38.7GB  38.4GB  primary  ext4
 3      38.7GB  42.9GB  4228MB  primary  linux-swap(v1)
log4net.xml
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 42.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Number  Start   End     Size    Type     File system     Flags
 1      1049kB  316MB   315MB   primary  ext4            boot
 5      316MB   38.7GB  38.4GB  primary  ext4
 6      38.7GB  42.9GB  4228MB  primary  linux-swap(v1)
I use a regex to break this into two Disk blocks:
^Disk (/dev[\S]+):((?!Disk)[\s\S])*
This works with multiline on.
When I test this in a Bash script, I can't seem to match \s or \S. What am I doing wrong?
I am testing this through a script like:
data=`cat disks.txt`
morematches=1
x=0
regex="^Disk (/dev[\S]+):((?!Disk)[\s\S])*"
if [[ $data =~ $regex ]]; then
echo "Matched"
while [ $morematches == 1 ]
do
        x=$[x+1]
        if [[ ${BASH_REMATCH[x]} != "" ]]; then
                echo $x "matched" ${BASH_REMATCH[x]}
        else
                echo $x "Did not match"
                morematches=0;
        fi
done
fi
However, when I test the regex piece by piece, every part that uses \s or \S fails to match.
As far as I know, \s means any whitespace character and \S means any non-whitespace character (that is, [^\s]), so [\s\S] should logically be equivalent to a . that also matches newlines.
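For Bash's own =~ operator this can be checked directly. The sketch below assumes a Bash built against a POSIX regex library (glibc, for example), where the pattern is compiled without newline-sensitive matching; `text` and `dot_spans_newline` are names made up for the illustration. Under that assumption a plain . already crosses line breaks, which also implies that ^ anchors only at the very start of the string, not at each line.

```bash
#!/bin/bash
# Two lines joined by a real newline (hypothetical sample text).
text=$'line one\nline two'

# The "." sits exactly where the newline is. If the match succeeds,
# "." is not stopped by line breaks and [\s\S] is not needed.
if [[ $text =~ one.line ]]; then
    dot_spans_newline=yes
else
    dot_spans_newline=no
fi
echo "dot matches newline: $dot_spans_newline"
```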
Perhaps \s and \S are not supported, or perhaps they cannot be used inside a bracket expression [ ]. Try the following regex instead:
^Disk[[:space:]]+/dev[^[:space:]]+:[[:space:]]+[^[:space:]]+
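A quick way to try this pattern is to run it against a single line of the sample data. This is only a minimal sketch: `sample`, `regex` and `matched` are names picked for the illustration, and the pattern is kept in a variable and expanded unquoted so that Bash treats it as a regex and not as a literal string.

```bash
#!/bin/bash
# One header line copied from the sample data above.
sample='Disk /dev/sda: 42.9GB'

# POSIX character classes stand in for \s and \S.
regex='^Disk[[:space:]]+/dev[^[:space:]]+:[[:space:]]+[^[:space:]]+'

if [[ $sample =~ $regex ]]; then
    # BASH_REMATCH[0] holds the whole matched text.
    matched=${BASH_REMATCH[0]}
    echo "matched: $matched"
else
    matched=''
    echo "no match"
fi
```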
EDIT
It seems you actually want to extract the matching fields. I simplified the script to do that:
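#!/bin/bash
# Group 1 captures the device path; group 2 captures the rest of the line (the size).
regex='^Disk[[:space:]]+(/dev[^[:space:]]+):[[:space:]]+(.*)'
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do
    [[ $line =~ $regex ]] && echo "${BASH_REMATCH[1]} matches ${BASH_REMATCH[2]}."
done < disks.txt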
Produces:
/dev/sda matches 42.9GB.
/dev/sdb matches 42.9GB.
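If whole blocks are wanted, as in the question, one possible approach is to skip the lookahead entirely and collect lines until the next Disk header. The following is a rough sketch and not the only way to do it: the sample text is inlined and shortened so the script is self-contained, the names `data`, `header`, `blocks` and `current` are invented here, and the associative array assumes Bash 4 or later.

```bash
#!/bin/bash
# Shortened copy of the sample data; in practice this would be read from disks.txt.
data='Disk /dev/sda: 42.9GB
Partition Table: msdos
 1      1049kB  316MB   315MB   primary  ext4            boot
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 42.9GB
Partition Table: msdos
 5      316MB   38.7GB  38.4GB  primary  ext4'

# Header pattern: group 1 is the device path, without the trailing colon.
header='^Disk[[:space:]]+(/dev[^[:space:]:]+):'

declare -A blocks   # device path -> full text of its block
current=''

while IFS= read -r line; do
    if [[ $line =~ $header ]]; then
        # A "Disk" line starts a new block.
        current=${BASH_REMATCH[1]}
        blocks[$current]=$line
    elif [[ -n $current ]]; then
        # Every other line is appended to the block being collected.
        # Like the original regex, this leaves a "Model:" line attached
        # to the block that precedes it.
        blocks[$current]+=$'\n'$line
    fi
done <<< "$data"

# Key order of an associative array is unspecified.
for dev in "${!blocks[@]}"; do
    printf '== %s ==\n%s\n' "$dev" "${blocks[$dev]}"
done
```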
Because this is a common FAQ, let me list a few constructs which are not supported in Bash, and how to work around them, where there is a simple workaround.
There are multiple dialects of regular expressions in common use. The one supported by Bash is a variant of Extended Regular Expressions. This is different from e.g. what many online regex testers support, which is often the more modern Perl 5 / PCRE variant.
\d \D \s \S \w \W are not reliably supported. These can be replaced with the POSIX character class equivalents [[:digit:]], [^[:digit:]], [[:space:]], [^[:space:]], [_[:alnum:]], and [^_[:alnum:]], respectively. (Notice the last two cases, where the [:alnum:] POSIX character class is augmented with underscore to be exactly equivalent to the Perl \w shorthand.)
Non-greedy matching is not supported. You can sometimes replace a.*?b with something like a[^ab]*b to get a similar effect in practice, though the two are not exactly equivalent.
Non-capturing parentheses (?:...) are not supported. In the trivial case, just use capturing parentheses (...) instead; though of course, if you use capture groups and/or backreferences, this will renumber your capture groups.
Lookarounds like (?<=before) or (?!after) are not supported, and in fact anything with (? is a Perl extension. There is no simple general workaround for these, though you can often rephrase your problem into one where lookarounds can be avoided.
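To illustrate the non-greedy workaround, here is a small sketch comparing a greedy pattern with the bracket-expression substitute on an invented string. The variable names are hypothetical, and the comparison only shows the "similar effect in practice" mentioned above, not a general replacement for lazy quantifiers.

```bash
#!/bin/bash
text='a1b2b'

greedy='a.*b'        # greedy: extends to the last b in the string
bounded='a[^ab]*b'   # the middle cannot contain a or b, so it stops at the first b

[[ $text =~ $greedy ]]  && greedy_match=${BASH_REMATCH[0]}
[[ $text =~ $bounded ]] && bounded_match=${BASH_REMATCH[0]}

echo "greedy:  $greedy_match"    # a1b2b
echo "bounded: $bounded_match"   # a1b
```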