I created a document with just one "Thumb up" Emoji (Unicode code point U+1F44D) that I inserted via the standard Windows+; shortcut:
But I can't get its actual code point with VBA.
I get these values (debug):
text = 12
length = 2
arrBytes = { 49, 0, 50, 0 }
with the following Sub procedure:
Sub test()
Dim text As String
Dim length As Integer
Dim arrBytes() As Byte
text = ActiveDocument.Range.Characters(1).text
length = Len(ActiveDocument.Range.Characters(1).text)
arrBytes = ActiveDocument.Range.Characters(1).text
End Sub
But if I had inserted the same Emoji via the menu Insert > Symbol > Font "Segoe UI Emoji" > U+1F44D (Thumb up), the same Sub procedure gets the values I expect (in debug; ?? are not "real" characters, they are surrogate code points which individually mean nothing):
text = ??
length = 2
arrBytes = { 61, 216, 77, 220 }
(for information, this code decodes the two characters into 👍
)
How to determine the actual character if the Emoji is inserted using Windows+;? (asking the users to choose the workaround above is not part of my question)
ADDENDUM May 26th: solution by @Florent B. works on all of my 3 computers (ActiveDocument.Content.InsertXML ActiveDocument.Content.XML
). Reloading the XML may have impacts on VBA programs, for instance it renumbers the image "Shape IDs", but that's another story.
ADDENDUM May 22nd: for the symbol added with Windows+;, I can find the correct value (4 bytes { 61, 216, 77, 220 }) only in the XML
property of the document Range object, but it requires that I parse the whole XML and determine which XML characters correspond to which positions of the Range objects, unfortunately I feel it requires a lot of knowledge or assumptions. Here is the part of the XML where I can see the 4 bytes (<w:t>??</w:t>
where ?? correspond to the 4 bytes) :
<?xml version="1.0" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument ...>
... (around 23.000 characters)
<w:body>
<wx:sect>
<w:p wsp:rsidR="002703DB" wsp:rsidRDefault="003926FB">
<w:r>
<w:rPr>
<w:rFonts w:ascii="Segoe UI Emoji" w:h-ansi="Segoe UI Emoji"/>
<wx:font wx:val="Segoe UI Emoji"/>
</w:rPr>
<w:t>??</w:t>
</w:r>
</w:p>
<w:sectPr wsp:rsidR="002703DB" wsp:rsidSect="002849CD"><w:pgSz w:w="11906"
w:h="16838"/><w:pgMar w:top="1417" w:right="1417" w:bottom="1417"
w:left="1417" w:header="708" w:footer="708" w:gutter="0"/><w:cols
w:space="708"/><w:docGrid w:line-pitch="360"/></w:sectPr>
</wx:sect>
</w:body>
</w:wordDocument>
The XML is almost the same when I insert the Emoji as a symbol, there are 2 more "rFonts":
<w:body>
<wx:sect>
<w:p wsp:rsidR="00CD420D" wsp:rsidRDefault="00CD420D">
<w:r>
<w:rPr>
<w:rFonts w:ascii="Segoe UI Emoji" w:fareast="Segoe UI Emoji"
w:h-ansi="Segoe UI Emoji" w:cs="Segoe UI Emoji"/>
<wx:font wx:val="Segoe UI Emoji"/>
</w:rPr>
<w:t>??</w:t>
</w:r>
</w:p>
<w:sectPr wsp:rsidR="00CD420D" wsp:rsidSect="002849CD"><w:pgSz w:w="11906"
w:h="16838"/><w:pgMar w:top="1417" w:right="1417" w:bottom="1417"
w:left="1417" w:header="708" w:footer="708" w:gutter="0"/><w:cols
w:space="708"/><w:docGrid w:line-pitch="360"/></w:sectPr>
</wx:sect>
</w:body>
</w:wordDocument>
PS: computers/softwares where I could reproduce the problem:
I hope this helps: building upon @SandraRossi's comments above, it seems the input from the emoji panel is not correctly translated to its surrogate code point. If you save a document containing both symbols ( one from the Emoji Panel, and the other from via the menu, as you described) as an XML doc, you notice the difference:
Emoji Input:
<w:r w:rsidR="003814F5">
<w:rPr>
<mc:AlternateContent>
<mc:Choice Requires="w16se">
<w:rFonts w:ascii="Segoe UI Emoji" w:hAnsi="Segoe UI Emoji"/>
</mc:Choice>
<mc:Fallback>
<w:rFonts w:hint="eastAsia"/>
</mc:Fallback>
</mc:AlternateContent>
</w:rPr>
<mc:AlternateContent>
<mc:Choice Requires="w16se">
<w16se:symEx w16se:font="@SimHei" w16se:char="1F44D"/>
</mc:Choice>
<mc:Fallback>
<w:t>👍</w:t>
</mc:Fallback>
</mc:AlternateContent>
</w:r>
Menu (symbol) input:
<w:r w:rsidR="003814F5">
<w:rPr>
<w:rFonts w:ascii="Segoe UI Emoji" w:hAnsi="Segoe UI Emoji"/>
</w:rPr>
<w:t xml:space="preserve"> is not 👍</w:t>
</w:r>
The line <w16se:symEx w16se:font="@SimHei" w16se:char="1F44D"/>
is the key difference here. The normal (Menu -> Insert Symobol) emoji is used as a fallback.
It seems only Word has the issue. I tried the same emoji panel input on Excel (and PowerPoint), and I get the right values in debug ??
which translates to the Unicode code point U+1F44D
both in Excel and when copied back to Word.
Here are my final believes and findings.
It's probably a bug in MS Word VBA, based on the tests performed by AAA on Excel, Powerpoint and Word. Some people don't have this bug (cf comments).
The VBA objects give an invalid value for the emoji, but the XML property is correct. The XML is too much complex to be parsed easily, so the simplest workaround was found by Florent B. in the comments, which consists in "recreating the document from itself":
ActiveDocument.Content.InsertXML ActiveDocument.Content.XML
Unfortunately, in my personal case, it may have some collateral effects like shape IDs are renumbered.
So, I extended the code above to only correct the emoji characters in the original document, the rest remains intact, by:
Okay, the macros runs longer, but I couldn't find a better solution.
Here is my code, simplified (you may be surprised by the useless collection of Range objects, where each Range is one Character object, in fact I don't provide the original code for the function Split_Into_Ranges
, which is much bigger but faster, but it works and demonstrates well the solution in the sub correct_emojis
):
Sub test()
Dim text As String
Dim length As Integer
Dim arrBytes() As Byte
Dim zranges As Collection
Set zranges = Split_Into_Ranges(ActiveDocument)
Call correct_emojis(zranges) ' <=== here the important algorithm
text = ActiveDocument.Range.Characters(1).text
length = Len(ActiveDocument.Range.Characters(1).text)
arrBytes = ActiveDocument.Range.Characters(1).text
End Sub
Function Split_Into_Ranges(ioDocument As Document) As Collection
Dim zranges As Collection
Set zranges = New Collection
For i = 1 To ioDocument.Characters.Count
zranges.Add ioDocument.Characters(i)
Next
Set Split_Into_Ranges = zranges
End Function
Sub correct_emojis(zranges As Collection)
Dim current_emoji_zranges As Collection
Dim temp_zranges As Collection
Dim temp_emoji_zranges As Collection
Dim doc_current As Document
Dim doc_temp As Document
Dim arrBytes() As Byte
Set doc_current_zranges = get_emoji_zranges(zranges)
If doc_current_zranges.Count = 0 Then
Exit Sub
End If
Set doc_current = ActiveDocument
Set doc_temp = Documents.Add()
Call doc_temp.Content.InsertXML(doc_current.Content.XML)
Set temp_zranges = Split_Into_Ranges(doc_temp)
Set current_emoji_zranges = get_emoji_zranges(zranges)
Set temp_emoji_zranges = get_emoji_zranges(temp_zranges)
For i = 1 To current_emoji_zranges.Count
If 0 = 1 Then
arrBytes = current_emoji_zranges(i).Characters(1).text
arrBytes = temp_emoji_zranges(i).Characters(1).text
End If
current_emoji_zranges(i).Characters(1).text = temp_emoji_zranges(i).Characters(1).text
Next
Call doc_temp.Close(False)
End Sub
Function get_emoji_zranges(zranges As Collection) As Collection
Dim emoji_zranges As Collection
Set emoji_zranges = New Collection
For i = 1 To zranges.Count
If Len(zranges(i).text) > zranges(i).Characters.Count Then
For j = 1 To zranges(i).Characters.Count
If Len(zranges(i).Characters(j).text) > 1 Then
emoji_zranges.Add (zranges(i))
End If
Next
End If
Next
Set get_emoji_zranges = emoji_zranges
End Function
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With