
How to extract everything before the first underscore in a string?

Tags: string, bash

I have files that are named like this:

MG-AB-110_S101_R2_001.fastq.gz, MG-AB-109_S100_R1_001.fastq.gz...

I am trying to extract everything before the first underscore so that I get: MG-AB-110, MG-AB-109...

I tried to do this:

name="MG-AB-110_S101_R2_001.fastq.gz"
base_name=${name%%.*}
echo "$base_name"
MG-AB-110_S101_R2_001

and this:

base_name=${name%%(.*?)_.* }
echo "$base_name"
MG-AB-110_S101_R2_001.fastq.gz

I need these base names to match base names in another folder, so the extraction above would be part of this loop:

#!/bin/bash

for name in test1/*.gz; do
    base_name=${name%%.*}

    if [ -f "test2/$base_name" ]; then
        cat "$name" "test2/$base_name" >"all_combined/$base_name"
    else
        printf 'No file in test2 corresponds to "%s"\n' "$name" >&2
    fi
done
newbash asked Sep 19 '25 12:09


1 Answer

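With bash and its parameter expansion. ${name%%_*} removes the longest suffix matching the pattern _*, that is, the first underscore and everything after it. The pattern is a shell glob, not a regular expression, which is why the (.*?)_.* attempt leaves the string unchanged: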

name="MG-AB-110_S101_R2_001.fastq.gz"
echo "${name%%_*}"

Output:

MG-AB-110
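
For the loop in the question, note that each name produced by test1/*.gz still carries the test1/ prefix, so the directory has to be stripped before the underscore expansion. The following is a minimal sketch, not a definitive solution: sample_name is a hypothetical helper introduced here for illustration, and the sketch assumes, as the question's loop does, that the file in test2 is named exactly like the extracted base name.

```bash
#!/bin/bash

# Hypothetical helper: print the part of a file name before the first underscore.
sample_name() {
    local file="${1##*/}"         # drop any leading directory, e.g. "test1/"
    printf '%s\n' "${file%%_*}"   # drop the first "_" and everything after it
}

for name in test1/*.gz; do
    [ -e "$name" ] || continue    # skip the literal pattern if nothing matched

    base_name=$(sample_name "$name")

    # Assumption taken from the question's loop: test2 holds a file
    # named exactly "$base_name".
    if [ -f "test2/$base_name" ]; then
        cat "$name" "test2/$base_name" >"all_combined/$base_name"
    else
        printf 'No file in test2 corresponds to "%s"\n' "$name" >&2
    fi
done
```

Files that share a prefix, such as an R1 and an R2 file for the same sample, map to the same base name, so with > the later file would overwrite the output of the earlier one.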
Cyrus answered Sep 21 '25 03:09