Lazy Diary @ Hatena Blog

PowerShell / Java / miscellaneous things about software development, Tips & Gotchas. CC BY-SA 4.0/Apache License 2.0

How to write the result of ConvertTo-Csv to a file in UTF-8 without a BOM

Context:

  • You want to write the result of ConvertTo-Csv in UTF-8 encoding without a BOM, e.g. because the file will be read by a Java program (Java's standard readers do not strip the BOM from UTF-8 encoded files, so it shows up as a stray character at the start of the data).
  • Writing UTF-8 from PowerShell, e.g. ConvertTo-Csv | Out-File -Encoding utf8 or Export-Csv -Encoding UTF8, prepends a BOM to the file.
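To see whether a given file actually starts with a BOM, you can look at its first three bytes. The following is a minimal sketch; the function name Test-Utf8Bom is a hypothetical helper made up for illustration and is not part of PowerShell.

```powershell
# Hypothetical helper (not from the original post): reports whether a file
# starts with the UTF-8 BOM bytes EF BB BF.
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)] [string] $Path)

    # Resolve against PowerShell's current location; .NET methods would
    # otherwise resolve relative paths against the process working directory.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Reads the whole file, which is fine for small CSV files.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

Calling Test-Utf8Bom -Path /tmp/foobar.csv returns True when the file begins with a BOM and False otherwise.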

Problem:

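There are some solutions that do not work as expected.

  • WriteAllLines() with the result of ConvertTo-Csv in the 2nd parameter returns an error:
PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
PS > "1" | ConvertTo-Csv | Set-Variable tmp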
PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)                                   
Cannot find an overload for "WriteAllLines" and the argument count: "3".
At line:1 char:1
+ [System.IO.File]::WriteAllLines("/tmp/foobar.csv", $tmp, $UTF8woBomEn ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodException
    + FullyQualifiedErrorId : MethodCountCouldNotFindBest
  • WriteAllLines() with @() around the 2nd parameter returns the same error:
PS > "1" | ConvertTo-Csv | Set-Variable tmp
PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", @($tmp), $UTF8woBomEncoding)                         
Cannot find an overload for "WriteAllLines" and the argument count: "3".
At line:1 char:1
+ [System.IO.File]::WriteAllLines("/tmp/foobar.csv", @($tmp), $UTF8woBo ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodException
    + FullyQualifiedErrorId : MethodCountCouldNotFindBest
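  • Converting each line to bytes and writing them with Set-Content -Encoding Byte creates the file, but without line breaks:
PS > "1" | ConvertTo-Csv | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte -Path "./foobar.csv"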
PS > Get-Content ./foobar.csv                                                                           
#TYPE System.String"Length""1"
  • WriteAllText() also creates the file without line breaks:
PS > "1" | ConvertTo-Csv | Set-Variable tmp
PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
PS > [System.IO.File]::WriteAllText("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)
PS > Get-Content /tmp/foobar.csv
#TYPE System.String"Length""1"

Reason:

  • WriteAllLines() expects string[] or IEnumerable[string], whereas the result of ConvertTo-Csv, with or without @(), is Object[].
PS > [System.IO.File]::WriteAllLines                                                                    

OverloadDefinitions                                                                                                    
-------------------                                                                                                    
static void WriteAllLines(string path, System.Collections.Generic.IEnumerable[string] contents)                        
static void WriteAllLines(string path, System.Collections.Generic.IEnumerable[string] contents, System.Text.Encoding encoding)

PS > "1" | ConvertTo-Csv | Set-Variable tmp
PS > Get-Member -InputObject $tmp

   TypeName: System.Object[]
  • The strings returned by ConvertTo-Csv do not end with line breaks, so WriteAllText() and the byte-wise Set-Content write everything on one line.
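The type mismatch can be checked directly. The sketch below uses plain assignment instead of Set-Variable, and the variable name $lines is made up for illustration; the comments show the type names each expression returns.

```powershell
$tmp = "1" | ConvertTo-Csv

# The array that the pipeline collects is a generic Object[] ...
$tmp.GetType().FullName       # System.Object[]
# ... although every element in it is a String.
$tmp[0].GetType().FullName    # System.String

# An explicit cast produces an array type that matches IEnumerable[string].
$lines = [String[]]$tmp
$lines.GetType().FullName     # System.String[]
```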

Solution:

Cast the result of ConvertTo-Csv to [String[]] and pass it to WriteAllLines() like this:

PS > "1" | ConvertTo-Csv | Set-Variable tmp
PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
PS > [System.IO.File]::WriteAllLines("/tmp/foobar.csv", [String[]]$tmp, $UTF8woBomEncoding)
PS > Get-Content /tmp/foobar.csv
#TYPE System.String
"Length"
"1"
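If you need this in more than one place, the same steps can be wrapped in a function. This is only a sketch: the name Export-CsvUtf8NoBom and its parameters are hypothetical and do not exist in PowerShell itself.

```powershell
# Hypothetical helper (name and parameters are not from the original post).
function Export-CsvUtf8NoBom {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)] $InputObject
    )
    begin {
        # Collect all pipeline input so ConvertTo-Csv emits the header once.
        $items = New-Object 'System.Collections.Generic.List[object]'
    }
    process {
        $items.Add($InputObject)
    }
    end {
        # $False means: do not emit a BOM.
        $encoding = New-Object System.Text.UTF8Encoding $False
        # The cast to String[] is what lets WriteAllLines() find an overload.
        [String[]] $lines = $items | ConvertTo-Csv
        [System.IO.File]::WriteAllLines($Path, $lines, $encoding)
    }
}
```

Usage would look like "1" | Export-CsvUtf8NoBom -Path /tmp/foobar.csv. Pass an absolute path, because .NET methods such as WriteAllLines() resolve relative paths against the process working directory, which is not necessarily PowerShell's current location.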

Added 2017/05/23:

Alternatively, you can keep % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte or WriteAllText() and add a line break at the end of each line by piping the result through Out-String first:

PS > "1" | ConvertTo-Csv | Out-String | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte -Path "./foobar.csv"
PS > Get-Content ./foobar.csv
#TYPE System.String
"Length"
"1"
PS > $UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
PS > "1" | ConvertTo-Csv | Out-String | Set-Variable tmp
PS > [System.IO.File]::WriteAllText("/tmp/foobar.csv", $tmp, $UTF8woBomEncoding)
PS > Get-Content /tmp/foobar.csv
#TYPE System.String
"Length"
"1"
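If the exact line ending matters, you can also build the text yourself. The following sketch swaps Out-String for the -join operator so that CRLF is written explicitly; it is a variation on the examples above, not something the post ran, and the variable names $path, $lines and $text are made up for illustration.

```powershell
$path = "/tmp/foobar.csv"
$UTF8woBomEncoding = New-Object System.Text.UTF8Encoding $False
$lines = "1" | ConvertTo-Csv

# Join the lines with an explicit CRLF and terminate the last line as well.
$text = ($lines -join "`r`n") + "`r`n"

[System.IO.File]::WriteAllText($path, $text, $UTF8woBomEncoding)
```

Replace "`r`n" with "`n" if the consumer expects LF only.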