Context
You want to test whether a codepoint is valid in a specific character encoding.
Problem
.NET has no direct equivalent of Java's CharsetEncoder#canEncode().
Solution
To test whether a character is valid in an encoding, convert it to that encoding and back, then check whether the result equals the original character:
function Test-Character {
    Param(
        [Parameter(ValueFromPipeline=$true,Mandatory=$true)]
        [string] $Character,
        [Parameter(ValueFromPipeline=$false,Mandatory=$true)]
        $Encoding
    )
    begin {
        # PowerShell Core needs the code page provider registered before code pages such as 932 can be looked up.
        if ($PSVersionTable.PSEdition -eq "Core") {
            [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
        }
        $TestEncoding = [System.Text.Encoding]::GetEncoding($Encoding)
    }
    process {
        # Encode, decode, and compare: a character the encoding cannot represent does not survive the round trip.
        [String]::new($TestEncoding.GetChars($TestEncoding.GetBytes($Character))).Equals($Character)
    }
}
For example, Test-Character -Character "あ" -Encoding 932 returns $True, and Test-Character -Character "♩" -Encoding 932 returns $False.
This function is suitable for testing whether a character that is valid in Unicode is also valid in another encoding.
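To see why the round trip works as a validity test, the intermediate values can be inspected one step at a time. The following is a minimal sketch, assuming code page 932 is available on the system. The variable names ($ShiftJis, $Results) are illustrative and not part of the original function, and the characters are built from their Unicode values (U+3042 for "あ", U+2669 for "♩") so that the script does not depend on the file's own encoding.

```powershell
# Make code page encodings such as 932 available on PowerShell Core.
if ($PSVersionTable.PSEdition -eq "Core") {
    [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
}
$ShiftJis = [System.Text.Encoding]::GetEncoding(932)

# U+3042 is "あ" and U+2669 is "♩".
$Characters = [string][char]0x3042, [string][char]0x2669

$Results = foreach ($Character in $Characters) {
    # Step 1: encode. A character the code page cannot represent is
    # replaced by whatever the encoder's fallback produces.
    $Bytes = $ShiftJis.GetBytes($Character)
    # Step 2: decode the bytes back into a string.
    $RoundTrip = $ShiftJis.GetString($Bytes)
    # Step 3: the character is valid only if it came back unchanged.
    [PSCustomObject]@{
        Character = $Character
        Bytes     = ($Bytes | ForEach-Object { $_.ToString("X2") }) -join " "
        RoundTrip = $RoundTrip
        IsValid   = $RoundTrip.Equals($Character)
    }
}
$Results
```

In line with the examples above, IsValid should be $True for the first row and $False for the second. The Bytes column shows what the encoder actually produced for each character.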
You can also test codepoints by using Convert-CodePoint:
function Test-CodePoint {
    Param(
        [Parameter(ValueFromPipeline=$true,Mandatory=$true)]
        [string] $CodePoint,
        [Parameter(ValueFromPipeline=$false,Mandatory=$true)]
        $Encoding
    )
    begin {
        # Register the code page provider only on PowerShell Core, as in Test-Character.
        if ($PSVersionTable.PSEdition -eq "Core") {
            [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
        }
    }
    process {
        # Convert the codepoint to UTF-16BE and back; a valid codepoint comes back unchanged.
        $UnicodeCodePoint = (Convert-CodePoint -CodePoint $CodePoint -From $Encoding -To "utf-16BE")
        $ReverseCodePoint = (Convert-CodePoint -CodePoint $UnicodeCodePoint -From "utf-16BE" -To $Encoding)
        $CodePoint.Equals($ReverseCodePoint)
    }
}
For example, Test-CodePoint -CodePoint "84BE" -Encoding 932 returns $True, and Test-CodePoint -CodePoint "84BF" -Encoding 932 returns $False.
This function is suitable for testing whether a codepoint is valid in an encoding.
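The same codepoint round trip can be sketched without Convert-CodePoint by working on the bytes directly. This is a standalone illustration and not a description of how Convert-CodePoint is implemented. The function name Test-CodePointBytes and its -CodePage parameter are hypothetical, and the sketch assumes the codepoint is given as an even number of hexadecimal digits.

```powershell
# Hypothetical helper: byte-level round trip for a codepoint given as hex digits.
function Test-CodePointBytes {
    Param(
        [Parameter(ValueFromPipeline=$true, Mandatory=$true)]
        [string] $CodePoint,
        [Parameter(Mandatory=$true)]
        [int] $CodePage
    )
    begin {
        # Make code page encodings such as 932 available on PowerShell Core.
        if ($PSVersionTable.PSEdition -eq "Core") {
            [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
        }
        $TestEncoding = [System.Text.Encoding]::GetEncoding($CodePage)
    }
    process {
        # Parse the hex string two digits at a time (assumes an even length).
        $Bytes = [byte[]] @(
            for ($i = 0; $i -lt $CodePoint.Length; $i += 2) {
                [Convert]::ToByte($CodePoint.Substring($i, 2), 16)
            }
        )
        # Decode to a .NET string, then encode back into the same code page.
        $Decoded   = $TestEncoding.GetString($Bytes)
        $ReEncoded = $TestEncoding.GetBytes($Decoded)
        # Compare the byte sequences in a canonical text form such as "84-BE".
        [BitConverter]::ToString($ReEncoded) -eq [BitConverter]::ToString($Bytes)
    }
}

Test-CodePointBytes -CodePoint "84BE" -CodePage 932
Test-CodePointBytes -CodePoint "84BF" -CodePage 932
```

Comparing bytes rather than hex strings makes the check independent of letter case, so "84be" and "84BE" are treated the same.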