I have a config file and need to remove the comments, which run from # to the end of the line, without touching any # that appears inside a double- or single-quoted value.
My input file:
# comment1
# comment2
#hbase_table_name=mytable # hbase table.
hbase_table_name=newtable # hbase table.
hbase_txn_family=txn
app_name= "cust#100" # Name of the application
app_user= 'all#50,all2#100' # users
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
The Perl command I'm trying:
perl -0777 -pe ' s/^\s*$//gms ; s/#.*?$//gm; s/^\s*$//gms;s/^$//gm' config.txt
The output I'm getting is
hbase_table_name=newtable
hbase_txn_family=txn
app_name= "cust
app_user= 'all
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
But the required output is
hbase_table_name=newtable
hbase_txn_family=txn
app_name= "cust#100"
app_user= 'all#50,all2#100'
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
I'm looking for a bash solution using any tools - awk or perl that can solve this.
A rarer case is a config entry whose comment itself contains quotes, like
app_user= 'all#50,all2#100' # users - "all" of them
and the result should be app_user= 'all#50,all2#100'
Here is a perl script:
#!/usr/bin/perl
use strict;
while (<DATA>){
if (m/^\h*#/) {next;};
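# A quoted value that starts before any #: keep the text up to its closing quote.
if (m/^[^#'"]*(['"]).*?\1/) {print substr($_, 0, $+[0]), "\n"; next; }
# Otherwise strip the comment, plus any spaces before it.
s/\h*#.*$//; print ;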
}
__DATA__
# comment1
# comment2
#hbase_table_name=mytable # hbase table.
hbase_table_name=newtable # hbase table.
hbase_txn_family=txn
app_name= "cust#100" # Name of the application
#app_name= "cust#100" # Name of the application
app_user= 'all#50,all2#100' # users
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
# from comments, other lines
hbase_table_name=newtable ## hbase table.
app_user= 'all#50,all2#100' # users - "all" of them
Output:
hbase_table_name=newtable
hbase_txn_family=txn
app_name= "cust#100"
app_user= 'all#50,all2#100'
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
hbase_table_name=newtable
app_user= 'all#50,all2#100'
To use it on a file, change <DATA> to <> and pass the file name as an argument.
Could you please try the following (written and tested with the shown samples).
awk '
/^#/{
next
}
/".*"|\047.*\047/{
match($0,/.*#/)
print substr($0,RSTART,RLENGTH-1)
next
}
{
sub(/#.*/,"")
}
1
' Input_file
Explanation: a line-by-line explanation of the code above.
awk ' ##Start the awk program.
/^#/{ ##If the line starts with #, it is a comment-only line.
next ##Skip it without printing and move on to the next line.
}
/".*"|\047.*\047/{ ##If the line contains a double-quoted or single-quoted (\047) string.
match($0,/.*#/) ##Greedily match everything up to and including the last # in the line.
print substr($0,RSTART,RLENGTH-1) ##Print the matched text minus its final character, which drops that last #.
next ##Skip the remaining rules for this line.
}
{
sub(/#.*/,"") ##Reached only by lines that do not start with # and contain no quoted string; remove everything from the first #.
}
1 ##1 is always true, so the (possibly edited) line is printed.
' Input_file ##Input_file is the file to process.
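One caveat: the program above cuts at the last # in a quoted line, so it relies on every quoted line also carrying a trailing comment. A line such as app_name= "cust#100" with no comment would be truncated inside the quotes. A possible variant, sketched here under the assumptions of at most one quoted value per line and no escaped quotes inside it, keeps everything up to the first closing quote instead. The function name strip_comments_awk and the variable q are made up for illustration; q only carries a single quote into awk.

```shell
# strip_comments_awk: hypothetical helper name; reads the named files, or stdin if none.
strip_comments_awk() {
  awk -v q="'" '
    BEGIN {
      # Prefix free of # and quotes, then one complete "..." or q...q string.
      re = "^[^#\"" q "]*(\"[^\"]*\"|" q "[^" q "]*" q ")"
    }
    /^[ \t]*(#|$)/ { next }               # skip comment-only and blank lines
    match($0, re) {                       # a quoted value starts before any #
      print substr($0, RSTART, RLENGTH)   # keep text up to the closing quote
      next
    }
    { sub(/[ \t]*#.*/, ""); print }       # unquoted line: drop the trailing comment
  ' "$@"
}

# Demo: quoted lines with a comment, without a comment, and with quotes in the comment.
strip_comments_awk <<'EOF'
#hbase_table_name=mytable # hbase table.
hbase_table_name=newtable ## hbase table.
app_name= "cust#100"
app_user= 'all#50,all2#100' # users - "all" of them
EOF
```

On a real file it would be called as strip_comments_awk Input_file. Building the regex as a string in BEGIN avoids putting a literal single quote inside the single-quoted awk program.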