Context:
- You want to write the result of ConvertTo-Csv in UTF-8 encoding without a BOM, e.g. you need a file that can be read by a Java program (the Java File API cannot handle a BOM in UTF-8 encoded files).
- UTF-8 output in PowerShell, e.g. ConvertTo-Csv | Out-File -Encoding utf8 or Export-Csv -Encoding UTF8, prepends a BOM to the file.
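To confirm whether a given file actually starts with a BOM, you can look at its first three bytes: the UTF-8 BOM is the byte sequence EF BB BF. The following is only a minimal sketch. The function name Test-Utf8Bom and the two temp-file names are hypothetical, made up for this illustration, and the sample files are written byte by byte so the check does not depend on any particular cmdlet's encoding behavior.

```powershell
# Hypothetical helper (not from the original post): returns $true when the
# file starts with the UTF-8 BOM bytes EF BB BF.
function Test-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path because .NET methods do not follow
    # PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}

# Two tiny sample files: the character "1" with and without a leading BOM.
$withBom    = Join-Path ([System.IO.Path]::GetTempPath()) 'with-bom.csv'
$withoutBom = Join-Path ([System.IO.Path]::GetTempPath()) 'without-bom.csv'
[System.IO.File]::WriteAllBytes($withBom,    [byte[]](0xEF, 0xBB, 0xBF, 0x31))
[System.IO.File]::WriteAllBytes($withoutBom, [byte[]](0x31))

Test-Utf8Bom -Path $withBom      # True
Test-Utf8Bom -Path $withoutBom   # False
```

Reading the whole file is fine for small CSV files. For large files, reading only the first three bytes from a stream would be the more economical choice.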
Problem:
There are some solutions that will not work as expected.
- Using WriteAllLines() returns an error like this:
    PS > "1" | ConvertTo-Csv | Set-Variable tmp
    PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)
    Cannot find an overload for "WriteAllLines" and the argument count: "3".
    At line:1 char:1
    + [System.IO.File]::WriteAllLines("/tmp/foobar.csv", $tmp, $UTF8woBomEn ...
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : NotSpecified: (:) [], MethodException
        + FullyQualifiedErrorId : MethodCountCouldNotFindBest
- WriteAllLines() with @() around the 2nd parameter also returns an error like this:
    PS > "1" | ConvertTo-Csv | Set-Variable tmp
    PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", @($tmp), $UTF8woBomEncoding)
    Cannot find an overload for "WriteAllLines" and the argument count: "3".
    At line:1 char:1
    + [System.IO.File]::WriteAllLines("/tmp/foobar.csv", @($tmp), $UTF8woBo ...
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : NotSpecified: (:) [], MethodException
        + FullyQualifiedErrorId : MethodCountCouldNotFindBest
- ConvertTo-Csv | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte or WriteAllText() doesn't prepend a BOM, but the line breaks are omitted and all lines are joined together into one line:
    PS > "1" | ConvertTo-Csv | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte -Path "./foobar.csv"
    PS > Get-Content ./foobar.csv
    #TYPE System.String"Length""1"

    PS > "1" | ConvertTo-Csv | Set-Variable tmp
    PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
    PS > [System.IO.File]::WriteAllText("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)
    PS > Get-Content /tmp/foobar.csv
    #TYPE System.String"Length""1"
Reason:
- WriteAllLines() expects string[] or IEnumerable[string]; by contrast, the result of ConvertTo-Csv, with or without @() around it, is of type Object[].
    PS > [System.IO.File]::WriteAllLines
    OverloadDefinitions
    -------------------
    static void WriteAllLines(string path, System.Collections.Generic.IEnumerable[string] contents)
    static void WriteAllLines(string path, System.Collections.Generic.IEnumerable[string] contents, System.Text.Encoding encoding)

    PS > "1" | ConvertTo-Csv | Set-Variable tmp
    PS > Get-Member -InputObject $tmp
       TypeName: System.Object[]
- The result of ConvertTo-Csv is a sequence of lines, none of which has a line break at its end, so writing them out one after another joins them into a single line.
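Both points can be checked directly in a session. The snippet below is a small sketch that uses plain assignment instead of Set-Variable (an assumption made for brevity; the collected value is the same) and inspects the type before and after a [string[]] cast, then counts the elements that end in a line terminator.

```powershell
# Collect the CSV lines; with more than one line this is an Object[].
$tmp = "1" | ConvertTo-Csv

$tmp.GetType().FullName               # System.Object[]
([string[]]$tmp).GetType().FullName   # System.String[]

# Count the elements that end with CR or LF: none of them do.
@($tmp | Where-Object { $_ -match '[\r\n]$' }).Count   # 0
```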
Solution:
You should cast the result of ConvertTo-Csv to [String[]] and use WriteAllLines() like this:
    PS > "1" | ConvertTo-Csv | Set-Variable tmp
    PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
    PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", [String[]]$tmp, $UTF8woBomEncoding)
    PS > Get-Content /tmp/foobar.csv
    #TYPE System.String
    "Length"
    "1"
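If you need this in more than one place, the cast and the WriteAllLines() call can be wrapped in a small function. This is only a sketch under stated assumptions: the name Export-CsvUtf8NoBom, its parameters, and the temp-file path in the usage line are hypothetical and do not come from PowerShell itself.

```powershell
# Hypothetical helper (name and parameters are illustrative only).
function Export-CsvUtf8NoBom {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]$InputObject,
        [Parameter(Mandatory)][string]$Path
    )
    begin   { $items = New-Object System.Collections.Generic.List[object] }
    process { $items.Add($InputObject) }
    end {
        # Cast to [string[]] so PowerShell can pick a WriteAllLines overload.
        $lines = [string[]]($items | ConvertTo-Csv)

        # $false = do not emit the UTF-8 BOM.
        $encoding = New-Object System.Text.UTF8Encoding $false

        # Resolve relative paths against PowerShell's current location,
        # not the process working directory that .NET would use.
        $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)

        [System.IO.File]::WriteAllLines($fullPath, $lines, $encoding)
    }
}

# Usage: write the same sample as above to a temp file.
$out = Join-Path ([System.IO.Path]::GetTempPath()) 'foobar.csv'
"1" | Export-CsvUtf8NoBom -Path $out
```

Collecting the input first and calling ConvertTo-Csv once in the end block keeps a single header row for the whole pipeline.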
Added 2017/05/23:
Or, with % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte or WriteAllText(), you can add a line break at the end of each line explicitly, e.g. by piping the result through Out-String first:
    PS > "1" | ConvertTo-Csv | Out-String | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte -Path "./foobar.csv"
    PS > Get-Content ./foobar.csv
    #TYPE System.String
    "Length"
    "1"

    PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
    PS > "1" | ConvertTo-Csv | Out-String | Set-Variable tmp
    PS > [System.IO.File]::WriteAllText("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)
    PS > Get-Content /tmp/foobar.csv
    #TYPE System.String
    "Length"
    "1"
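If you would rather choose the line terminator yourself than rely on Out-String, the lines can be joined with the -join operator before calling WriteAllText(). A minimal sketch, assuming CRLF is the terminator you want; the temp-file path is a placeholder chosen for this example.

```powershell
$tmp = "1" | ConvertTo-Csv
$UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'foobar-crlf.csv'

# Join the lines with CRLF and append one more CRLF so that the last
# line is terminated as well.
$text = ($tmp -join "`r`n") + "`r`n"

[System.IO.File]::WriteAllText($path, $text, $UTF8woBomEncoding)
```

Replacing "`r`n" with "`n" gives LF-only line endings if the consuming program expects those.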