
Using join, split and map in Perl to create a new attribute

my $str = "<SampleElement oldattribs=\"sa1 sa2 sa3\">";

$str =~ s#<SampleElement[^>]*oldattribs="([^"]*)"#
          my $fulcnt=$&;
          my $afids=$1;
          my @affs = ();
          if($afids =~ m/\s+/) {
              @affs = split /\s/, $afids; 
              my $jnafs = join ",", map { $_=~s/[a-z]*//i, } @affs;
              ($fulcnt." newattribs=\"$jnafs\"");
          }
          else {
              ($fulcnt);
          }
         #eg;

My Output:

<SampleElement oldattribs="sa1 sa2 sa3" newattribs="1,1,1">

Expected Output:

<SampleElement oldattribs="sa1 sa2 sa3" newattribs="1,2,3">

Could someone point out where I am going wrong? Thanks in advance.

ssr1012 asked Jan 25 '26 05:01



1 Answer

Where you're going wrong is earlier than you think - you're parsing XML using regular expressions. XML is contextual, and regex isn't, so it's NEVER going to be better than a dirty hack.

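#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

# Create a twig and parse the XML that follows the __DATA__ marker.
my $twig = XML::Twig -> new -> parse ( \*DATA );

# First (index 0) SampleElement anywhere in the document.
my $sample_elt = $twig -> get_xpath('//SampleElement',0);

# Split the old attribute on whitespace, keep the digits of each token,
# and join them with commas to match the expected "1,2,3".
my @old_att = split ( ' ', $sample_elt -> att('oldattribs') );
$sample_elt -> set_att('newattribs', join ",", map { /(\d+)/ } @old_att);

$twig -> set_pretty_print ( 'indented_a' );
$twig -> print;


__DATA__
<XML>
    <SampleElement oldattribs="sa1 sa2 sa3">
    </SampleElement>
</XML>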

But to answer the core of your problem - you're misusing map as an iterator here.

map { $_=~s/[a-z]*//i, } @affs;

That iterates over every element of @affs and modifies each one in place, but map collects the value of the block's last expression. For s///, that value is the number of substitutions made, not the modified string, and it is 1 here because each substitution succeeded.
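To see both effects side by side, here is a minimal sketch. The variable name @returned is made up for illustration; the data is the same three values from the question.

```perl
use strict;
use warnings;

my @affs = ('sa1', 'sa2', 'sa3');

# s/// evaluates to the number of substitutions it made, not the new string,
# so map collects a 1 for every element.
my @returned = map { s/[a-z]*//i } @affs;

print "returned: @returned\n";   # returned: 1 1 1

# $_ is an alias for each element, so @affs itself was changed in place.
print "affs: @affs\n";           # affs: 1 2 3
```

So the values you wanted did end up in @affs; they just weren't what map handed to join.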

If you want to change @affs in place, you'd write:

s/[a-z]*//i for @affs; 

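But if you don't want to modify @affs, the easy answer is the /r (non-destructive) substitution flag, available since Perl 5.14, which returns the modified copy instead of the count: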

map { s/[a-z]*//ir } @affs;
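As a rough sketch of how that slots into the join from the question (the variable name $joined is made up for illustration):

```perl
use 5.014;      # /r needs Perl 5.14 or later
use strict;
use warnings;

my @affs = ('sa1', 'sa2', 'sa3');

# With /r, s/// returns the modified copy and leaves $_ untouched.
my $joined = join ",", map { s/[a-z]*//ir } @affs;

print "$joined\n";   # 1,2,3
print "@affs\n";     # sa1 sa2 sa3
```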

Or as I've done in my example:

map { /(\d+)/ } @affs; 

This matches and captures the numeric part of each string. In list context, a match with capture groups returns the captured text, so that is what map collects.
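One thing to be aware of with this form: a failed match returns an empty list in list context, so elements with no digits silently drop out of the result. A small sketch, where the extra 'none' element and the name @nums are made up for illustration:

```perl
use strict;
use warnings;

my @affs = ('sa1', 'sa2', 'sa3', 'none');

# Each successful match contributes its capture; 'none' contributes nothing.
my @nums = map { /(\d+)/ } @affs;

print join(",", @nums), "\n";   # 1,2,3
```

If you need a placeholder for non-matching entries instead, the /r substitution is probably the better fit.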

Sobrique answered Jan 26 '26 19:01
