How to convert IDN Domains to ACE encoding with PowerShell


The other day I needed to look up a lot of domains in connection with some DNS changes. The list was long and contained plenty of IDNs (Internationalized Domain Names), which contain special characters such as “æøå”. I used the Resolve-DnsName command, and for a name to resolve one must use the ACE-encoded version of the domain. So this ended with two small scripts that convert domains such as krøllalfa.no to xn--krllalfa-64a.no, and xn--krllalfa-64a.no back to krøllalfa.no.

There are tools out there that do this for you, but what’s the fun in that when you can create a tool of your own with some spare time in the evening? You also get the learning benefits. So here are the scripts; let’s break them down.

Here I’m creating an advanced PowerShell function that takes one parameter as input, the domain name. In the Process block I create a new object from the IdnMapping class in the .NET System.Globalization namespace, which contains classes that define culture-related information. These classes are often useful when writing internationalized applications.

Then I use the GetAscii method to convert the internationalized domain name to Punycode, the ACE-encoded version of the domain. Punycode looks like this: xn--krllalfa-64a.no.

function ConvertTo-AceEncoding {
    [CmdletBinding()]
    Param (
        # Domain name
        [Parameter(Mandatory = $true)]
        [String]
        $Domain
    )
    Process {
        $Idn = New-Object System.Globalization.IdnMapping
        $Idn.GetAscii("$Domain")
    }
}

To run the function,

# Either use dot source, to load the function into memory
. .\ConvertTo-AceEncoding.ps1

# Or you can use CTRL+A to mark all the code, then hit run (Play button) to load it into memory.

# Then you can run the function like any other PowerShell function, like so:
ConvertTo-AceEncoding -Domain alfakrøll.no

This will output the converted version:

Output from ConvertTo-AceEncoding
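Since the starting point was a long list of domains, here is a minimal sketch of converting several names in one go. The sample list and the property names Unicode and Ace are made up for illustration, and the "ø" is built from its code point so the snippet does not depend on how the file is saved.

```powershell
# Hypothetical sample list. 'kr' + [char]0xF8 + 'llalfa.no' spells
# krøllalfa.no without a non-ASCII character in the script file.
$Domains = @(
    ('kr' + [char]0xF8 + 'llalfa.no'),
    'example.com'
)

# One IdnMapping object can be reused for every name in the list
$Idn = New-Object System.Globalization.IdnMapping

$Converted = foreach ($Domain in $Domains) {
    # Keep the original and the ACE-encoded name side by side
    [PSCustomObject]@{
        Unicode = $Domain
        Ace     = $Idn.GetAscii($Domain)
    }
}

$Converted

# On Windows, each ACE name could then be passed on, for example:
# $Converted | ForEach-Object { Resolve-DnsName -Name $_.Ace }
```

Plain ASCII names such as example.com pass through GetAscii unchanged, so a mixed list needs no special handling.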

The script converting from ACE encoding (Punycode) to Unicode is built the same way as above; the only difference is that it uses the GetUnicode() method to convert from Punycode to Unicode. As I’m typing this I realize I could’ve just put both in the same script, but for now they’re separate.

function ConvertTo-Unicode {
    [CmdletBinding()]
    param (
        # Domain name
        [Parameter(Mandatory = $true)]
        [String]
        $Domain
    )
    process {
        $Idn = New-Object System.Globalization.IdnMapping
        $Idn.GetUnicode("$Domain")
    }
}

This function can also be run the same way as the script above:

# Either use dot source, to load the function into memory
. .\ConvertTo-Unicode.ps1

# Or you can use CTRL+A to mark all the code, then hit run (Play button) to load it into memory.

# Then you can run the function like any other PowerShell function, like so:
ConvertTo-Unicode -Domain "xn--krllalfa-64a.no"

This will output the converted version:

Output from ConvertTo-Unicode
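As mentioned, both directions could live in one function. A minimal sketch of how that might look; the name Convert-IdnDomain and the -ToUnicode switch are hypothetical and not part of the scripts above.

```powershell
function Convert-IdnDomain {
    [CmdletBinding()]
    param (
        # Domain name, also accepted from the pipeline
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [String]
        $Domain,

        # Hypothetical switch: convert from ACE (Punycode) back to Unicode
        [Switch]
        $ToUnicode
    )
    begin {
        # Create the mapping object once, not once per input
        $Idn = New-Object System.Globalization.IdnMapping
    }
    process {
        if ($ToUnicode) {
            $Idn.GetUnicode($Domain)
        }
        else {
            $Idn.GetAscii($Domain)
        }
    }
}

# Usage: by parameter or from the pipeline
Convert-IdnDomain -Domain 'xn--krllalfa-64a.no' -ToUnicode
'xn--krllalfa-64a.no' | Convert-IdnDomain -ToUnicode
```

Without the switch, the function behaves like ConvertTo-AceEncoding; with it, like ConvertTo-Unicode.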

I had some issues I couldn’t quite figure out, though. The functions give the correct output in the console. Meanwhile, I thought this was a good example for trying out Pester, PowerShell’s module for unit testing.

Apparently this was not as straightforward as I thought. I tried to verify the output with the following Pester tests:

First the unit test for ConvertTo-AceEncoding.

# Loads the function into memory
. .\ConvertTo-AceEncoding.ps1

Describe "Converts IDN from Unicode to ACE encoding" {
    Context "Converting from Unicode to Punycode" {
        it "converts IDN domain to expected format" {
            $EncodedDomain = ConvertTo-AceEncoding -Domain "krøllalfa.no"
            $EncodedDomain | Should Be "xn--krllalfa-64a.no"
        }
    }
}

Then the Unit test for ConvertTo-Unicode.

. .\ConvertTo-Unicode.ps1

Describe "Converts IDN from ACE Encoding to UNICODE" {
    Context "Converting from Punycode to Unicode" {
        it "converts IDN domain to expected format" {
            $EncodedDomain = ConvertTo-Unicode -Domain "xn--krllalfa-64a.no"
            $EncodedDomain | Should Be "krøllalfa.no"
        }
    }
}

Well... As you can see below, the unit tests did not pass. The function gives the correct result when run outside the unit test, but when it is run inside the unit test it gives the wrong output.

The result from the Unit test for ConvertTo-AceEncoding.ps1

Unit Test error from ConvertTo-AceEncoding

The result from the unit test for ConvertTo-Unicode.ps1

Unit Test error from ConvertTo-Unicode

The struggle with encoding became real. I tested different approaches: I saved the output to a file with Out-File and the parameter -Encoding Ascii, then read it back from the file to see if that helped. Still the same result. I kept at it for a couple of hours, checking spacing, encoding formats, input, output, different consoles and so forth, until I almost gave up. Then I looked down at the status bar in VS Code and checked the document encoding, shown below.

VSCode encoding setting UTF8

I changed this to UTF-8 with BOM instead, and boom, it worked.

VSCode encoding setting UTF8 with BOM

The unit test passed and displayed the sweet satisfying color of green.

Result from ConvertTo-AceEncoding

ConvertTo-AceEncoding Unit Test Passed

Result from ConvertTo-Unicode

ConvertTo-Unicode Unit Test Passed
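What happens to an "ø" when its UTF-8 bytes are read with a single-byte encoding can be reproduced without any files. This is only a sketch: ISO-8859-1 stands in here for whatever legacy code page a system might use, which is an assumption on my part.

```powershell
# The letter "ø" (U+00F8), built from its code point so this sketch
# does not depend on how the script file itself is saved
$Original = [string][char]0xF8

# UTF-8 stores it as the two bytes C3 B8
$Utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($Original)

# Decoding those bytes with a single-byte encoding yields two characters.
# ISO-8859-1 is a stand-in for the system's ANSI code page (assumption).
$Latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$Misread = $Latin1.GetString($Utf8Bytes)

$Hex = ($Utf8Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
'{0} byte(s): {1}' -f $Utf8Bytes.Length, $Hex
'Characters after misreading: {0}' -f $Misread.Length
```

One character in, two different characters out, so a string comparison against the expected domain can no longer match. Building the test strings from code points, as done here with [char]0xF8, would be one way to make the Pester tests independent of the file encoding.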

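The most likely explanation is that Windows PowerShell reads a script file without a BOM using the system's legacy ANSI code page rather than UTF-8, so the “ø” in the test file was decoded as different characters before the test even ran. The BOM works as a UTF-8 signature that tells PowerShell how to decode the file. With that solved, thanks for reading.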

Enjoy!