Lazy Diary @ Hatena Blog

PowerShell / Java / miscellaneous things about software development, Tips & Gotchas. CC BY-SA 4.0/Apache License 2.0

How to get the #text of an XML element even if the tag has no attributes

Background:

In PowerShell, you can get the body of a tag (its text content) with the '#text' property. This property comes from PowerShell's XML adapter; in C# or VB.NET you would read InnerText instead.

> $xml = New-Object System.Xml.XmlDocument
> $xml.LoadXml('<a><b id="1">foo</b></a>')
> $xml.SelectNodes('//a').b.'#text'
foo

Problem:

If a tag has no attributes, you cannot get its text content with the '#text' property.

> $xml = New-Object System.Xml.XmlDocument
> $xml.LoadXml('<a><b>foo</b></a>')
> $xml.SelectNodes('//a').b.'#text'
(nothing shown)

Reason:

The type of $xml.SelectNodes('//a').b is XmlElement when <b> has attributes. When <b> has no attributes and contains only text, PowerShell returns the text itself as a String. A String has no '#text' property, so the lookup returns nothing.

> $xml = New-Object System.Xml.XmlDocument
> $xml.LoadXml('<a><b id="1">foo</b></a>')
> $xml.SelectNodes('//a').b.GetType().Name
XmlElement
> $xml.LoadXml('<a><b>foo</b></a>')
> $xml.SelectNodes('//a').b.GetType().Name
String
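
Because the property access can return either type, one way to cope is to normalize the result before using it. The following is only a minimal sketch: Get-AdaptedText is a hypothetical helper name that does not appear in the original post, and it assumes that reading InnerText is acceptable for your data.

```powershell
# Hypothetical helper (not part of the original post): turns whatever the
# XML property access returned into plain text.
function Get-AdaptedText {
    param($Value)
    if ($Value -is [System.Xml.XmlNode]) {
        # <b> had attributes, so we received an XmlElement.
        return $Value.InnerText
    }
    # <b> had no attributes, so we already received a String.
    return [string]$Value
}

$xml = New-Object System.Xml.XmlDocument

$xml.LoadXml('<a><b id="1">foo</b></a>')
$withAttr = Get-AdaptedText ($xml.SelectNodes('//a').b)

$xml.LoadXml('<a><b>foo</b></a>')
$withoutAttr = Get-AdaptedText ($xml.SelectNodes('//a').b)

$withAttr      # foo
$withoutAttr   # foo
```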

Solution:

Use an XPath method rather than property access on the DOM object. SelectSingleNode() returns the matched node itself (an XmlElement here) whether or not the tag has attributes, so you can always use the '#text' property.

> $xml = New-Object System.Xml.XmlDocument
> $xml.LoadXml('<a><b id="1">foo</b></a>')
> $xml.SelectNodes('//a').SelectSingleNode('//b').'#text'
foo
> $xml.LoadXml('<a><b>foo</b></a>')
> $xml.SelectNodes('//a').SelectSingleNode('//b').'#text'
foo