Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excel - String Remove Duplicates

I am working with some UK address data which within an Excel cell is split into its constituent parts by a comma.

I have some VBA which I've taken from the web which has removed a number of exact duplicated entries but I am left with a large amount of data which has repeating segments some sequentially and some non sequentially.

Attached is an image highlighting what I am trying to achieve, the code I have used thus far which is not mine is included to show you the direction in which I have been looking. Anyone have any further thoughts on how this can be achieved?

Function stringOfUniques(inputString As String, delimiter As String)
Dim xVal As Variant
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")

For Each xVal In Split(inputString, delimiter)
dict(xVal) = xVal
Next xVal

stringOfUniques = Join(dict.Keys(), ",")
End Function

This did manage to get rid of a number of them but there is a huge population that I am working on so automating this would be incredible.

Ideal Outcome

like image 750
Mr Deans Avatar asked Feb 02 '26 13:02

Mr Deans


1 Answers

Possibly not the most elegant answer, but this does the trick. Here I use the Split command to split the string at each comma. The result returned from this is

bat ball banana

Code:

Option Explicit
Private Sub test()
 Dim Mystring As String
 Dim StrResult As String

 Mystring = "bat,ball,bat,ball,banana"
 StrResult = shed_duplicates(Mystring)
End Sub
Private Function shed_duplicates(ByRef Mystring As String) As String
 Dim MySplitz() As String
 Dim J As Integer
 Dim K As Integer
 Dim BooMatch As Boolean
 Dim StrTemp(10) As String ' assumes no more than 10 possible splits!
 Dim StrResult As String


 MySplitz = Split(Mystring, ",")
  For J = 0 To UBound(MySplitz)
     BooMatch = False
     For K = 0 To UBound(StrTemp)
         If MySplitz(J) = StrTemp(K) Then
            BooMatch = True
            Exit For
         End If
     Next K
    If Not BooMatch Then
       StrTemp(J) = MySplitz(J)
    End If
Next
For J = 0 To UBound(StrTemp)
   If Len(StrTemp(J)) > 0 Then ' ignore blank entries
      StrResult = StrResult + StrTemp(J) + " "
   End If
Next J
Debug.Print StrResult
End Function
like image 130
MiguelH Avatar answered Feb 05 '26 03:02

MiguelH