
Parsing Last Name from Name in Python

Tags: python, pandas

I'm trying to extract a single last name from each full name in a list.

names = ["John Smith", "D.J. Richies III", "AJ Hardie Jr.", "Shelia Jackson-Lee", "Bob O'Donnell"]

Desired Output

last_names = ['Smith', 'Richies', 'Hardie', 'Lee', 'ODonnell']

I'm hoping there is an existing library or snippet that can handle some of these rarer cases.

Thanks for your help!

nonegiven72 asked Dec 07 '25 08:12


2 Answers

Dealing with Names is Hard

Naive string-manipulation solutions will eventually fail. You start to realize this with suffixes (III, Jr.), but what about compound last names like de la Paz?
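
To make the failure concrete, here is a minimal sketch of the naive approach: take the last whitespace-separated token. The helper name naive_last_name and the sample name "Maria de la Paz" are made up for illustration and do not come from any library.

```python
def naive_last_name(full_name: str) -> str:
    """Return the last whitespace-separated token (hypothetical helper)."""
    return full_name.split()[-1]


# Works for the simple case
print(naive_last_name("John Smith"))        # Smith
# A suffix is mistaken for the last name
print(naive_last_name("D.J. Richies III"))  # III
# A compound last name is cut down to its final word
print(naive_last_name("Maria de la Paz"))   # Paz
```

Each special case you patch (suffixes, prefixes like "de la", titles) adds another branch, which is why a dedicated parser tends to be the better starting point.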

You want the Python Human Name Parser (the nameparser package):

>>> from nameparser import HumanName
>>> name = HumanName("Dr. Juan Q. Xavier de la Vega III")
>>> name.title
'Dr.'
>>> name["title"]
'Dr.'
>>> name.first
'Juan'
>>> name.middle
'Q. Xavier'
>>> name.last
'de la Vega'
>>> name.suffix
'III'
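
Applied to the list from the question, a sketch might look like the following. It assumes nameparser is installed (pip install nameparser). The helpers simplify_last and last_names_from are hypothetical names, not part of the library. As far as I know, HumanName leaves hyphens and apostrophes in the last name untouched, so reducing "Jackson-Lee" to "Lee" and "O'Donnell" to "ODonnell" is an extra step, done here with plain string methods.

```python
def simplify_last(last: str) -> str:
    """Keep only the part after the final hyphen and drop apostrophes.

    Hypothetical helper matching the output format asked for in the question.
    """
    return last.split("-")[-1].replace("'", "")


def last_names_from(names):
    """Parse each name with nameparser, then simplify its last name."""
    # Imported here so simplify_last stays usable without the package
    from nameparser import HumanName
    return [simplify_last(HumanName(n).last) for n in names]


print(simplify_last("Jackson-Lee"))  # Lee
print(simplify_last("O'Donnell"))    # ODonnell
```

Calling last_names_from(names) with the list from the question runs the whole pipeline. Whether dropping the first half of a hyphenated surname is actually desirable depends on your data, so treat simplify_last as optional.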
Adam Michael Wood answered Dec 08 '25 21:12


You can try this:

names = ["John Smith", "D.J. Richies III", "AJ Hardie Jr.", "Shelia Jackson-Lee", "Bob O'Donnell"]

# Generational suffixes to skip when looking for the last name
suffixes = {"II", "III", "Jr.", "Sr."}

last_names = []

for name in names:
    parts = name.split()
    # Drop a trailing suffix such as "III" or "Jr."
    if len(parts) > 1 and parts[-1] in suffixes:
        parts = parts[:-1]
    # Use the final word; for hyphenated names keep the part after the last hyphen
    last_names.append(parts[-1].split("-")[-1])

print(last_names)

This prints the last names (the apostrophe in O'Donnell is kept, unlike in the desired output):

['Smith', 'Richies', 'Hardie', 'Lee', "O'Donnell"]
Ajax1234 answered Dec 08 '25 22:12