regexp for nvda to put spaces between all capital letters?

Question

I use NVDA, a free screen reader for the blind that many people use, together with a speech synthesizer. I am building a library of modified add-ons for it, along with dictionaries that can contain regular expressions accepted by Python as well as standard word replacements. My problem is that I do not know how to write a regular expression that places a space between consecutive capital letters, as in ANM, which the synth pronounces as one word rather than spelling it out as it should. I do not know enough Python to code an add-on for this by hand; I only use regexps for this kind of thing. I do know the basics of regular expressions, the general implementation you can find by googling "regular expressions in about 55 minutes". I want it to do something like this:

Input: ANM
Output: A N M

Also, because of the way this speech synth works, I may have to replace A with eh, which would give this:

Input: ANM
Output: Eh N M

Could any of you provide a regular expression to do this, if it is possible? And no, I don't think I can compile them in loops, because I didn't write the Python.

Galax · Accepted Answer

This should do the trick for the capital letters. It uses the lookahead (?=...) to check that the next character is also a capital letter without consuming it, so that letter is still available for the next match:

>>> import re
>>> re.sub("([A-Z])(?=[A-Z])", r"\1 ", "ABC thIs iS XYZ a Test")
'A B C thIs iS X Y Z a Test'
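Since the question asks for something that can live in a dictionary rather than in an add-on, here is a minimal regex-only sketch. It assumes the dictionary applies each entry as a pattern plus a replacement string, in order, much like repeated calls to re.sub; that ordering is an assumption and is not confirmed anywhere in this thread. The second entry and the names ENTRIES and apply_entries are hypothetical, added only for illustration.

```python
import re

# Each entry is (pattern, replacement), applied top to bottom.
# Assumption: the dictionary processes entries in this order.
ENTRIES = [
    # From the answer: add a space after a capital that is followed by a capital.
    (r"([A-Z])(?=[A-Z])", r"\1 "),
    # Hypothetical second entry: a standalone "A" followed by a space and
    # another standalone capital is treated as part of a spelled-out
    # abbreviation and becomes "Eh".
    (r"\bA(?= [A-Z]\b)", "Eh"),
]

def apply_entries(text):
    """Apply every (pattern, replacement) pair in order."""
    for pattern, replacement in ENTRIES:
        text = re.sub(pattern, replacement, text)
    return text

print(apply_entries("ANM"))                  # Eh N M
print(apply_entries("A Xylophone for ANM"))  # A Xylophone for Eh N M
```

One limitation of this sketch: an ordinary "A" that happens to be followed by a single capital letter, as in "A B test", would also become "Eh".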

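If you have a lot of replacements to make, it might be easiest to put them into a single list of (letter, replacement) pairs:

import re

replacements = [("A", "eh"), ("B", "bee"), ("X", "ex")]
# First put a space after every capital that is followed by another capital.
result = re.sub(r"([A-Z])(?=[A-Z])", r"\1 ", "ABC thIs iS XYZX. A Xylophone")
# Then replace each listed capital, but only when a non-word character follows it.
for source, dest in replacements:
    result = re.sub("(" + source + r")(?=\W)", dest, result)
print(result)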

Output:

eh bee C thIs iS ex Y Z ex. eh Xylophone

The regex built in the 'replacements' loop uses the lookahead (?=\W), so a capital is only replaced when a non-word character follows it. That leaves capitalised words such as 'Xylophone' alone and still catches a standalone capital at the end of a sentence. If you want to avoid replacing e.g. the standalone 'A' with 'eh', then the more advanced regex replacement function mentioned in @fjarri's answer is the way to go.
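One edge case in the loop above is worth a sketch: (?=\W) needs a character after the capital, so a capital at the very end of the text is not replaced. A possible variant, assuming word boundaries are acceptable for your text, uses \b on both sides instead. The names SPACING, REPLACEMENTS and speak_letters are hypothetical; the pattern and the replacement table are taken from the answer.

```python
import re

SPACING = re.compile(r"([A-Z])(?=[A-Z])")
REPLACEMENTS = [("A", "eh"), ("B", "bee"), ("X", "ex")]

def speak_letters(text):
    """Space out runs of capitals, then rename standalone capitals."""
    result = SPACING.sub(r"\1 ", text)
    for source, dest in REPLACEMENTS:
        # \b on both sides: the letter must stand alone, which also
        # covers a letter at the very end of the text.
        result = re.sub(r"\b" + re.escape(source) + r"\b", dest, result)
    return result

print(speak_letters("ABC thIs iS XYZX"))  # eh bee C thIs iS ex Y Z ex
```

Like the original loop, this still replaces a standalone 'A' used as an ordinary word.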

fjarri · Answer

While @Galax's solution certainly works, it may be easier to perform further processing of abbreviations if you use callbacks on matches (this way you won't replace any standalone capitals):

import re

s = "This is a normal sentence featuring an abbreviation ANM. One, two, three."

def process_abbreviation(match_object):
    spaced = ' '.join(match_object.group(1))
    return spaced.replace('A', 'Eh')

print(re.sub("([A-Z]{2,})", process_abbreviation, s))
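If several letters need special pronunciations, the callback approach can be extended with a lookup table. This is only a sketch: it assumes you are able to run Python code (for example inside an add-on), and the table LETTER_NAMES and the function names are illustrative, not part of NVDA or of the answer above.

```python
import re

# Illustrative pronunciations; extend as the synthesizer requires.
LETTER_NAMES = {"A": "Eh", "B": "Bee", "X": "Ex"}

def spell_out(match):
    """Join the letters of one abbreviation with spaces, renaming listed ones."""
    return " ".join(LETTER_NAMES.get(ch, ch) for ch in match.group(0))

def expand_abbreviations(text):
    # Only runs of two or more capitals are treated as abbreviations,
    # so a standalone capital such as the word "A" is left untouched.
    return re.sub(r"[A-Z]{2,}", spell_out, text)

print(expand_abbreviations("A fox read ANM and XYZ."))
# A fox read Eh N M and Ex Y Z.
```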

Tags: python, regex

Colton Hill
