
How to extract a number without leading zeros from a string using PySpark

I have a column in my table with these values:

  |col_A|
  -------
  |00140|
  -------
  |00120|
  -------
  |00058|
  -------
  |00009|
  -------
  |00052|

I want to remove all the leading zeros. I use PySpark to build the DataFrame. Here is what I tried:

while tab.col_A.like('0%'):
        tab = tab.withColumn('tab_B', tab['col_A'][2:5])

When I run this code, I get this error:

Cannot convert column into bool

Please help.

mehdi asked Jan 27 '26 10:01


1 Answer

I tried this code:

from pyspark.sql import functions as F
tab = tab.withColumn("col_B", F.regexp_extract(tab['col_A'], '[1-9][0-9]*', 0))

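This solved the problem: the pattern matches the first non-zero digit and all the digits after it, so the leading zeros are dropped.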
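
For anyone who wants to check what that pattern picks out without starting a Spark session, here is a minimal sketch in plain Python using the standard `re` module on the sample values from the question. The helper name `strip_leading_zeros` is hypothetical and only imitates how `regexp_extract` with group index 0 behaves (first match, or an empty string when nothing matches); it is not part of PySpark.

```python
import re

# Same pattern as in the answer: one non-zero digit followed by any digits.
PATTERN = re.compile(r"[1-9][0-9]*")


def strip_leading_zeros(value):
    """Hypothetical helper imitating regexp_extract(col, pattern, 0):
    return the first match, or an empty string if there is no match."""
    match = PATTERN.search(value)
    return match.group(0) if match else ""


# Sample values taken from col_A in the question.
samples = ["00140", "00120", "00058", "00009", "00052"]
results = [strip_leading_zeros(s) for s in samples]
print(results)  # ['140', '120', '58', '9', '52']
```

One thing to keep in mind: a value made only of zeros, such as "00000", has no non-zero digit to match, so this approach gives back an empty string rather than "0".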

Thanks,

mehdi answered Jan 29 '26 00:01


