I am trying to get part of the text from an input element, depending on where the user clicked. If I click between the 1 and the 0, the result is wrong: I get "שלום 1", but it should be "שלום 0". I tried adding the CSS "unicode-bidi: plaintext;" to the input element, but it did not fix the problem.
var myInput = document.getElementById("myInput")
myInput.value = "שלום 10 hello";
alert("Needs to be: 'שלום 0' \n Result '"+myInput.value.substring( 0, 6 )+"'")
function mouseUp(){
alert("Result '"+myInput.value.substring( 0, myInput.selectionStart )+"'")
}
myInput.addEventListener("mouseup", mouseUp);
<input id="myInput" style="direction: rtl;unicode-bidi: plaintext;">
I found a solution thanks to Bergi's comment:
console.log("שלום 10 hello".split("").map(c =>
`'${c}': \\u${c.charCodeAt(0).toString(16).padStart(4,'0')}`
).join("\n")); might be helpful to see how the string is actually composed, without weird bidi rendering, which is what slice works on. I suspect you need to add some ltr/rtl marks.
After reading the page linked in the comment, I realized that Unicode offers two control characters, the LRM (left-to-right mark, U+200E) and the RLM (right-to-left mark, U+200F). Both have zero width when rendered, and their purpose is to steer the default direction flow of the characters around them.
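To make the "zero width" point concrete, here is a minimal sketch showing that the marks are invisible on screen but still count as characters in the string, so they shift every index after them. The helper name `stripBidiMarks` is my own and is only an illustration.

```javascript
// U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK.
const LRM = "\u200E";
const RLM = "\u200F";

const plain = "שלום 10 hello";
const marked = "שלום " + LRM + "10" + RLM + " hello";

// The marks render with zero width, but each one is a real UTF-16 code unit,
// so substring() and selectionStart both count them.
console.log(plain.length);  // 13
console.log(marked.length); // 15

// Hypothetical helper: drop the marks again, e.g. before saving the value.
function stripBidiMarks(text) {
  return text.replace(/[\u200E\u200F]/g, "");
}

console.log(stripBidiMarks(marked) === plain); // true
```

This is why the indexes passed to substring in the fixed version differ from the ones used on the unmarked string.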
When the first non-space character before a number is an RTL character, that RTL character interferes with the number's default LTR behavior. Once I added an LRM control character right before the number, the number no longer picked up the RTL direction flow that caused the problem.
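A rough sketch of how this could be automated instead of hard-coding the "10". The function name `addLrmBeforeNumbers` and the regular expression are my own assumptions: it only treats characters in the Hebrew and Arabic blocks as RTL and only ASCII digits as numbers, so it is not a complete bidi implementation.

```javascript
const LRM = "\u200E"; // LEFT-TO-RIGHT MARK

// Assumption: "RTL character" means anything in the Hebrew (U+0590-U+05FF)
// or Arabic (U+0600-U+06FF) blocks. Other RTL scripts are not covered.
const RTL_THEN_NUMBER = /([\u0590-\u05FF\u0600-\u06FF]\s*)(\d+)/g;

// Insert an LRM right before every digit run whose nearest preceding
// non-space character is an RTL character.
function addLrmBeforeNumbers(text) {
  return text.replace(RTL_THEN_NUMBER, "$1" + LRM + "$2");
}

const fixed = addLrmBeforeNumbers("שלום 10 hello");
console.log(fixed === "שלום " + LRM + "10 hello"); // true
console.log(addLrmBeforeNumbers("hello 10"));      // unchanged: "hello 10"
```

Numbers that follow an LTR word are left alone, since they do not have the problem described above.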
One thing to consider: an RLM control character might also be required right after the number if an LTR character follows it (ignoring white space, which is direction-neutral). If a sentence starts with an RTL word followed by space, LRM, number, space, LTR word, the LRM merges the number and the following LTR word into a single LTR block. That flips the display order of the number and the LTR word, because the number comes first inside the LTR block. The string described above would then be rendered as: number, space, LTR word, space, RTL word, rather than in the expected order. I added a code example to demonstrate it:
var myInput = document.getElementById("myInput"),
Normal = document.getElementById("Normal"),
Fixed = document.getElementById("Fixed");
myInput.value = "שלום 10 hello";
Normal.innerHTML = myInput.value.substring( 0, 6 );
Normal.innerHTML += '<div style="display: inline-block;"></div>';
Normal.innerHTML += myInput.value.substring( 6, 14 );
Fixed.innerHTML =
myInput.value.replace("10",String.fromCodePoint("0x200E")+"10").substring( 0, 7 );
Fixed.innerHTML += '<div style="display: inline-block;"></div>';
Fixed.innerHTML += myInput.value.replace("10",String.fromCodePoint("0x200E")+"10"+String.fromCodePoint("0x200F")).substring( 7, 16 );
<input id="myInput" style="direction: rtl;width: 79px;">
<h3>Before the fix</h3>
<div id="Normal" style="background: #ffa0a0;width: fit-content;direction: rtl;display: inline-block;"> </div>
<h3>After the fix</h3>
<div id="Fixed" style="background: #a2ffa0;width: fit-content;direction: rtl;display: inline-block;"> </div>
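The two replace calls in the example can also be folded into one helper that decides per number whether the trailing RLM is needed. This is only a sketch under my own assumptions: the name `isolateNumbers` is made up, RTL means the Hebrew and Arabic blocks, and an "LTR char" means an ASCII letter.

```javascript
const LRM = "\u200E"; // LEFT-TO-RIGHT MARK
const RLM = "\u200F"; // RIGHT-TO-LEFT MARK

// Group 1: an RTL letter plus any white space after it.
// Group 2: the digit run.
// Group 3 (optional): white space plus an ASCII letter right after the number.
const PATTERN = /([\u0590-\u05FF\u0600-\u06FF]\s*)(\d+)(\s*[A-Za-z])?/g;

// Put an LRM before the number, and an RLM after it only when an LTR
// letter follows, so the number does not merge with the LTR word.
function isolateNumbers(text) {
  return text.replace(PATTERN, (match, before, digits, after) =>
    before + LRM + digits + (after ? RLM + after : ""));
}

const value = isolateNumbers("שלום 10 hello");
console.log(value === "שלום " + LRM + "10" + RLM + " hello"); // true
console.log(isolateNumbers("שלום 10") === "שלום " + LRM + "10"); // true
```

The first result is the same string the example above builds by hand, so it could be assigned to the input's value directly.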