I have an output.txt file which has around 1000 words that looks like this:
SESSIONDAYOFWEEK FILMTITLELONGALT tblTrans_Ticket. ADMITDETAILSALT2 MESSAGESTUB2ALT3 StartDayOfWeek Description MESSAGESTUB2ALT2 FILMTITLESHORTALT Applications TICKETTYPELONGALT
I need to filter that file, keeping only the words that consist entirely of UPPER CASE characters and discarding those that contain lower case characters.
I run this command in PowerShell:
Get-Content .\out.txt | ForEach-Object if ($_.IsUpper) {Write-Host $_}
and the shell goes through the words one by one and, for each word, prints:
ForEach-Object : Input name "if" cannot be resolved to a method.
At line:1 char:25
+ ... et-Content .\out.txt | ForEach-Object if ($_.IsUpper) {Write-Host $_}
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (TAIL:PSObject) [ForEach-Object], PSArgumentException
+ FullyQualifiedErrorId : MethodNotFound,Microsoft.PowerShell.Commands.ForEachObjectCommand
I don't understand what I am doing wrong.
Use the -cmatch operator for case-sensitive matching against a regex (regular expression):
Get-Content .\out.txt | Where-Object { $_ -cmatch '^\p{Lu}+$' }
-cmatch is the case-sensitive variant of the -match operator (whose alias is -imatch); given that -match is case-insensitive, -cmatch must be used in order to detect case distinctions.
\p{Lu} matches a single uppercase character - including accented non-ASCII characters such as Ü[1] - and adding + matches one or more in a row. Enclosing the expression in ^ (start of string) and $ (end of string) means that only lines entirely made up of uppercase characters are matched.
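A quick way to see the effect of the operator and of the anchors is to run the patterns against a few literal strings. The sample words below are chosen for illustration only (the non-ASCII one is built from a code point to sidestep script-encoding issues); treat this as a sketch rather than part of the original command.

```powershell
# Illustrative sample words (assumptions, not read from the real file).
$umlautWord = "$([char]0x00DC)BER"   # U+00DC is the uppercase letter Ü

$results = [ordered]@{
    # -match is case-insensitive, so [A-Z] also accepts lowercase input.
    InsensitiveRange  = 'filmtitle' -match '^[A-Z]+$'
    # -cmatch makes the same range case-sensitive.
    SensitiveRange    = 'filmtitle' -cmatch '^[A-Z]+$'
    # All-uppercase word: matches the anchored \p{Lu}+ pattern.
    AllUpper          = 'FILMTITLE' -cmatch '^\p{Lu}+$'
    # Mixed-case word: rejected because of the ^ and $ anchors.
    MixedAnchored     = 'StartDayOfWeek' -cmatch '^\p{Lu}+$'
    # Without anchors, a single uppercase letter anywhere is enough.
    MixedUnanchored   = 'StartDayOfWeek' -cmatch '\p{Lu}+'
    # [A-Z] covers ASCII letters only; \p{Lu} also covers Ü.
    UmlautAsciiRange  = $umlautWord -cmatch '^[A-Z]+$'
    UmlautUnicodeProp = $umlautWord -cmatch '^\p{Lu}+$'
    # Digits are not uppercase letters, so this word is rejected too.
    UpperWithDigits   = 'MESSAGESTUB2ALT3' -cmatch '^\p{Lu}+$'
}

$results
```

Note the last entry: words such as MESSAGESTUB2ALT3 contain digits and therefore do not pass the strict all-uppercase-letters test.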
Alternatively, you could use -cnotmatch '\p{Ll}' instead, which works slightly differently: it eliminates lines that contain at least one lowercase character, which means that lines are kept even if they (also) contain non-letter characters (as long as there's no lowercase letter).
An alternative with Select-String that may perform better:
Select-String -CaseSensitive '^\p{Lu}+$' .\out.txt | Select-Object -ExpandProperty Line
Select-String too is case-insensitive by default (as is PowerShell in general), so the -CaseSensitive switch is required here.
Note that, despite its name, Select-String as of PowerShell Core 6.1.0 doesn't support outputting the matched lines directly; instead, it outputs match-information objects whose .Line property contains the matched line, hence the need for Select-Object -ExpandProperty Line.
This GitHub issue proposes adding a new switch parameter to support direct output of the matched strings.
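As a sketch of the Select-String variant, the following writes a few sample words to a temporary file and reads back only the all-uppercase lines. The temporary file and its contents are assumptions made for illustration. In PowerShell 7 and later, Select-String also has a -Raw switch that outputs the matching strings directly; it is not used here so that the example also runs in Windows PowerShell.

```powershell
# Create a throwaway input file with one word per line (sample data only).
$tempFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tempFile -Value 'SESSIONDAYOFWEEK', 'StartDayOfWeek', 'Description', 'FILMTITLESHORTALT'

try {
    # Select-String emits match-information objects; .Line holds the matched text.
    $upperLines = Select-String -CaseSensitive -Pattern '^\p{Lu}+$' -Path $tempFile |
        Select-Object -ExpandProperty Line
}
finally {
    # Clean up the temporary file even if the search fails.
    Remove-Item -Path $tempFile
}

$upperLines
```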
As for what you tried:
The code to be executed by the ForEach-Object cmdlet must be passed as a script block - i.e., a piece of code enclosed in { ... }.
You neglected to do that, which caused the syntax error you saw.
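For comparison, here is a sketch of how the original command might look once its body is wrapped in a script block and the .IsUpper test is swapped for the -cmatch test shown earlier. An inline array stands in for Get-Content .\out.txt so the snippet is self-contained; the sample words are taken from the question.

```powershell
# Stand-in for: Get-Content .\out.txt
$words = 'SESSIONDAYOFWEEK', 'tblTrans_Ticket.', 'StartDayOfWeek', 'TICKETTYPELONGALT'

# The code run by ForEach-Object is passed as a script block: { ... }
$upperWords = $words | ForEach-Object {
    if ($_ -cmatch '^\p{Lu}+$') {
        $_   # a bare expression is output implicitly
    }
}

$upperWords
```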
Also, the [string] type (a .NET string) does not have an .IsUpper() method (and even if it did, you forgot the () after .IsUpper).
Only the [char] type has an .IsUpper() method, namely a static one, which you can call as follows: [char]::IsUpper('A') - but you'd have to call this method in a loop for each character in your input string:
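Get-Content .\out.txt | Where-Object {
  # Reject the line as soon as one character is not an uppercase letter.
  foreach ($c in $_.ToCharArray()) { if (-not [char]::IsUpper($c)) { return $False } }
  # Every character passed the test, so keep the line.
  $True
}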
Finally, don't use Write-Host to return results - Write-Host prints to the console only - you won't be able to capture or redirect such output[2]. Instead, use Write-Output or, better yet, rely on PowerShell's implicit output behavior: simply using $_ as a statement of its own will output it - any expression or command you neither capture nor redirect is automatically output (sent to the success output stream).
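The difference is easy to observe by trying to capture both kinds of output in a variable. This is a minimal sketch; the variable names and sample strings are made up for illustration.

```powershell
# Write-Host prints to the host (console); nothing is stored in the variable.
$viaHost = 'ALPHA', 'BETA' | ForEach-Object { Write-Host $_ }

# Implicit output goes to the success output stream and can be captured.
$viaOutput = 'ALPHA', 'BETA' | ForEach-Object { $_ }

"Write-Host captured nothing: $($null -eq $viaHost)"
"Implicit output captured $($viaOutput.Count) items"
```

The two words still appear on the console when the first pipeline runs, but only the second pipeline leaves anything behind for later commands to use.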
[1] By contrast, using character range expression [A-Z] would only recognize ASCII-range (English) uppercase characters.
[2] Never in PSv4-, but with additional effort you can in PSv5+ - but the point is that Write-Host is not meant for outputting results (data).
The easiest way to do this is probably with regex.
Get-Content .\out.txt | Where-Object { $_ -cmatch "\b[A-Z0-9_]+\b" }
Where-Object acts as a filter, allowing anything that matches through and discarding anything that doesn't match.
-cmatch performs a case-sensitive regex match.
Regex explanation:
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z a single character in the range between A (index 65) and Z (index 90)
0-9 a single character in the range between 0 (index 48) and 9 (index 57)
_ matches the character _ literally
\b assert position at a word boundary
You can remove 0-9 and _ if you don't want to allow words with those characters through the filter.
See: https://regex101.com/r/CfgEmU/1
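To check the pattern locally, it can be run against a few sample strings. The sketch below assumes one word per line; because the pattern is not anchored, a line containing several words passes as soon as any one of them qualifies, so an anchored variant (an addition for illustration, not part of the pattern above) is shown alongside. The last sample string is made up to demonstrate that case.

```powershell
$samples = 'MESSAGESTUB2ALT3', 'FILM_TITLE', 'StartDayOfWeek', 'tblTrans_Ticket.', 'Description FILMTITLE'

# Pattern from the answer: at least one whole word made of A-Z, 0-9 or _.
$matchedAnyWord = $samples | Where-Object { $_ -cmatch '\b[A-Z0-9_]+\b' }

# Anchored variant (assumption): the entire line must consist of those characters.
$matchedWholeLine = $samples | Where-Object { $_ -cmatch '^[A-Z0-9_]+$' }

"Unanchored: $($matchedAnyWord -join ' | ')"
"Anchored:   $($matchedWholeLine -join ' | ')"
```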