Lazy Diary @ Hatena Blog

PowerShell / Java / miscellaneous things about software development, Tips & Gotchas. CC BY-SA 4.0/Apache License 2.0

How to convert from a code point (U+xxxx) to a character

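function ConvertFrom-CodePoint {
  Param(
    # Hex digits of the encoded bytes, two digits per byte (e.g. "3042")
    [Parameter(ValueFromPipeline=$true,Mandatory=$true)]
    [string] $CodePoint,
    # Name or code page of the encoding used to decode the bytes (e.g. "utf-16BE")
    [Parameter(ValueFromPipeline=$false,Mandatory=$true)]
    $From
  )
  begin {
    # Make code-page encodings available where they are not registered by default (.NET Core)
    [System.Text.Encoding]::RegisterProvider([System.Text.CodePagesEncodingProvider]::Instance)
    $FromEncoding = [System.Text.Encoding]::GetEncoding($From)
    $HexNumber = [System.Globalization.NumberStyles]::HexNumber
  }
  process {
    # Split the input into 2-digit chunks, parse each chunk as a hex byte,
    # and collect the bytes in $InputCharBytes
    Select-String -InputObject $CodePoint -Pattern ".{2}" -AllMatches `
        | ForEach-Object { $_.Matches } `
        | ForEach-Object { [Byte]::Parse($_.Value, $HexNumber) } `
        | Set-Variable InputCharBytes
    # Decode the collected bytes with the requested encoding
    $FromEncoding.GetString($InputCharBytes)
  }
}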

For example, ConvertFrom-CodePoint -CodePoint "3042" -From "utf-16BE" returns the character “あ”.
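Because the input is really a string of encoded bytes, the same character can be reached from several encodings. The following is a minimal standalone sketch of that idea, not the function above: ConvertFrom-HexBytes and its parameter names are hypothetical, and it assumes the input has an even number of hex digits.

```powershell
# Hypothetical helper for illustration: decode a hex byte string with a named encoding.
function ConvertFrom-HexBytes {
  Param([string] $Hex, [string] $EncodingName)
  # Parse every two hex digits as one byte (assumes an even number of digits)
  $bytes = [byte[]] @(for ($i = 0; $i -lt $Hex.Length; $i += 2) {
    [Convert]::ToByte($Hex.Substring($i, 2), 16)
  })
  [System.Text.Encoding]::GetEncoding($EncodingName).GetString($bytes)
}

# U+3042 expressed as bytes in three different encodings
$fromUtf8    = ConvertFrom-HexBytes -Hex "E38182"   -EncodingName "utf-8"
$fromUtf16BE = ConvertFrom-HexBytes -Hex "3042"     -EncodingName "utf-16BE"
$fromUtf32BE = ConvertFrom-HexBytes -Hex "00003042" -EncodingName "utf-32BE"
```

All three variables should hold the same one-character string, “あ”.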

Note: With -From "utf-16BE", the input is read as raw UTF-16 bytes, so for code points outside the BMP you have to pass the surrogate pair (e.g. D867DE3D) instead of the Unicode code point itself (U+29E3D).
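If you only have the code point, the surrogate pair can be computed with the standard UTF-16 arithmetic. This is a rough sketch; the function name ConvertTo-SurrogatePairHex is made up for this example and is not part of the original script.

```powershell
# Hypothetical helper: turn a code point outside the BMP into its
# UTF-16 surrogate pair, written as 8 hex digits.
function ConvertTo-SurrogatePairHex {
  Param([Parameter(Mandatory=$true)][int] $CodePoint)
  if ($CodePoint -lt 0x10000 -or $CodePoint -gt 0x10FFFF) {
    throw "Code point must be in the range U+10000..U+10FFFF"
  }
  # Subtract 0x10000, then split the remaining 20 bits into two 10-bit halves
  $offset = $CodePoint - 0x10000
  $high = 0xD800 + ($offset -shr 10)     # high (leading) surrogate
  $low  = 0xDC00 + ($offset -band 0x3FF) # low (trailing) surrogate
  '{0:X4}{1:X4}' -f $high, $low
}

$pairHex = ConvertTo-SurrogatePairHex -CodePoint 0x29E3D   # D867DE3D
```

The resulting string is in the form that the -CodePoint parameter expects together with -From "utf-16BE".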

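Note: Unicode code points (U+xxxx) are written with the most significant digits first, which matches big-endian byte order. That is why the example uses "utf-16BE" rather than "utf-16", which is little-endian in .NET.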
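A small sketch of the byte-order difference, using only the built-in UTF-16 encodings. The variable names are arbitrary.

```powershell
# The two bytes in the same order as the hex digits of U+3042
$bytes = [byte[]] @(0x30, 0x42)

# Big-endian UTF-16 reads them as the code unit 0x3042
$bigEndian = [System.Text.Encoding]::BigEndianUnicode.GetString($bytes)
# Little-endian UTF-16 (the .NET encoding named "utf-16") reads them as 0x4230
$littleEndian = [System.Text.Encoding]::Unicode.GetString($bytes)

$bigEndianCodePoint    = 'U+{0:X4}' -f [int]($bigEndian[0])      # U+3042
$littleEndianCodePoint = 'U+{0:X4}' -f [int]($littleEndian[0])   # U+4230
```

With the little-endian encoding the same input silently decodes to a different character, so the byte order in -From matters.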

EDIT 2017-05-11

You can also use [char]::ConvertFromUtf32(0xXXXX) or [System.Activator]::CreateInstance([System.String], [char[]]@(0xXXXX, 0xXXXX)) for Unicode code points. ConvertFromUtf32() accepts the code point directly. CreateInstance() takes UTF-16 code units, so for code points at or above U+10000 you pass the surrogate pair.

PS > [char]::ConvertFromUtf32(0x29E3D)
𩸽
PS > [System.Activator]::CreateInstance([System.String], [char[]]@(0xD867, 0xDE3D))
𩸽
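To check that both routes give the same string, something like the following should work. It is only a sketch: it uses [string]::new(), which assumes PowerShell 5 or later, in place of the Activator call.

```powershell
$codePoint = 0x29E3D

# Route 1: straight from the code point
$fromCodePoint = [char]::ConvertFromUtf32($codePoint)
# Route 2: from the UTF-16 surrogate pair (assumes PowerShell 5+ for ::new)
$fromPair = [string]::new([char[]]@(0xD867, 0xDE3D))

$sameString = [string]::Equals($fromCodePoint, $fromPair)   # ordinal comparison
$length     = $fromCodePoint.Length                         # counts UTF-16 code units
$isPair     = [char]::IsSurrogatePair($fromCodePoint, 0)
$roundTrip  = [char]::ConvertToUtf32($fromCodePoint, 0)     # back to the code point
```

Note that Length reports 2 here even though the string shows a single character, because .NET strings count UTF-16 code units.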