
Parsing data from bash table with empty fields

Tags: python, pandas, awk

I am trying to parse data from tables printed in bash, and I found strange parsing behavior when a column is empty.

For example, I sometimes have data like this:

containerName    ipAddress          memoryMB  name       numberOfCpus  status
---------------  ---------------  ----------  -------  --------------  ----------
TEST_VM          192.168.150.111        8192  TEST_VM               4  POWERED_ON

and sometimes like this:

containerName    ipAddress      memoryMB  name                      numberOfCpus  status
---------------  -----------  ----------  ----------------------  --------------  -----------
TEST_VM_second                      3072  TEST_VM_second_renamed               1  POWERED_OFF

I tried with Python and with bash and got the same results. I need the "name" column. With bash, for example awk '{print $4}', the first table prints the expected result:

name
-------
TEST_VM

but for the second table it prints:

name
----------------------
1

I get the same results with Python:

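import pandas as pd
from io import StringIO

# table holds the text of one of the tables above
df_info = pd.read_table(StringIO(table), delim_whitespace=True)
df_info = df_info.drop(0)  # drop the row of dashes
pd.set_option('display.max_colwidth', None)
print(df_info['name'], df_info['containerName'])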

Output:


1    TEST_VM
Name: name, dtype: object 1    TEST_VM
Name: containerName, dtype: object

1    1
Name: name, dtype: object 1    TEST_VM_second
Name: containerName, dtype: object


Does anyone know how to handle the case where the ipAddress field is empty?

robotiaga asked Nov 01 '25 23:11



1 Answer

1st solution: With GNU awk, try the following with your shown samples. It also handles VMs that have spaces in their names. Note that it relies on the memoryMB value having at least four digits.

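awk '
FNR>2{                 # skip the header and the row of dashes
  NF-=2                # drop the last two fields (numberOfCpus, status)
  # capture everything after the last run of four digits (memoryMB)
  match($0,/.*[0-9]{4}[[:space:]]+(.*)$/,arr)
  print arr[1]
}
' Input_file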


2nd solution: With GNU grep, try the following with your shown samples. It uses the regex ^.*?[0-9]{4}\s+\K\S+, where \K (available with grep -P) discards everything matched so far, so only the text matched after \K is printed. This assumes the VM name contains no spaces.

grep -oP '^.*?[0-9]{4}\s+\K\S+'  Input_file
RavinderSingh13 answered Nov 03 '25 13:11
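
A Python alternative, offered as a sketch rather than as part of the answer above. The column shifts because awk's default field splitting and pandas with delim_whitespace=True both treat any run of whitespace as a single separator, so a blank ipAddress cell disappears and "name" becomes the third field. The row of dashes marks where each column starts and ends, so one option is to slice every line at those character offsets instead. The helper name parse_dashed_table and the TABLE constant are hypothetical, and the sketch assumes every value fits within the width of its run of dashes.

```python
import re

# Second table from the question, copied verbatim (the ipAddress cell is blank).
TABLE = """\
containerName    ipAddress      memoryMB  name                      numberOfCpus  status
---------------  -----------  ----------  ----------------------  --------------  -----------
TEST_VM_second                      3072  TEST_VM_second_renamed               1  POWERED_OFF
"""


def parse_dashed_table(text):
    """Parse a table whose second line is a row of dashes marking column widths."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, ruler, rows = lines[0], lines[1], lines[2:]
    # Each run of dashes gives the (start, end) character span of one column.
    spans = [m.span() for m in re.finditer(r"-+", ruler)]
    names = [header[start:end].strip() for start, end in spans]
    # Slice every data row at the same offsets; a blank cell becomes "".
    return [
        {name: row[start:end].strip() for name, (start, end) in zip(names, spans)}
        for row in rows
    ]


rows = parse_dashed_table(TABLE)
print(repr(rows[0]["ipAddress"]))  # the blank cell comes back as an empty string
print(rows[0]["name"])             # the name column, regardless of the blank cell
```

Because the slicing is positional, names containing spaces are kept intact as well. The same (start, end) spans could likely be passed as colspecs to pandas.read_fwf to get a DataFrame, though that variant is not tested here.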



