Passing a string from C# to C++ with COM

I have a C# COM server that is consumed by a C++ client.

One of the C# methods returns a string.

In C++ the returned string appears to be encoded as Unicode (UTF-16), at least according to the memory view.

  1. Is this always the case with COM strings?
  2. Is there a way to use UTF-8 instead?
  3. I saw some code where strings were passed between C++ and C# as byte arrays. Is there any benefit in this?
Yaron Naveh asked Feb 25 '26 08:02


1 Answer

  1. Yes. The standard COM string type is BSTR. It is a Unicode string encoded in UTF-16, just like Windows' native string type.
  2. No. A COM method that expects a BSTR isn't going to understand a UTF-8 string; it will read the bytes as UTF-16 and produce garbage that tends to look like Chinese. UTF-8 is a good encoding for a text file, not for programs manipulating strings in memory. It requires anywhere between 1 and 4 bytes to encode a Unicode code point, which is very awkward for basic string manipulations like getting the size or indexing a character.
  3. C and C++ programs tend to use 8-bit encodings that are compatible with the "char" type. That's an old practice, dating back to an era before Unicode was around. There's nothing attractive about it, because there are many 8-bit encodings. The typical problem is that data entered as text can only be interpreted correctly if it is read by a program that uses the same 8-bit encoding. In other words, when the computers are less than 1000 miles apart. Less in Europe.
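
If the C++ client does need UTF-8 internally, one option is to keep the COM interface as BSTR and convert on the client side right after the call returns. On Windows this is normally done with WideCharToMultiByte and CP_UTF8. The sketch below is not that API; it is a hand-written, portable illustration of the same conversion, and the name utf16_to_utf8 and the choice to replace unpaired surrogates with U+FFFD are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts UTF-16 code units (the encoding a BSTR uses)
// into a UTF-8 byte string. Unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // high surrogate with no partner
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // low surrogate with no preceding high surrogate
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The variable width is visible in the result: utf16_to_utf8(u"\u00E9") takes one UTF-16 code unit and returns two bytes, so the size of the UTF-8 string no longer matches the number of characters.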
Hans Passant answered Feb 27 '26 20:02


