08
Mar 18

Look-Alike Domains and Visual Confusion

How good are you at telling the difference between domain names you know and trust and impostor or look-alike domains? The answer may depend on how familiar you are with the nuances of internationalized domain names (IDNs), as well as which browser or Web application you’re using.

For example, how does your browser interpret the following domain? I’ll give you a hint: Despite appearances, it is most certainly not the actual domain for software firm CA Technologies (formerly Computer Associates Intl Inc.), which owns the original ca.com domain name:

https://www.са.com/

Go ahead and click on the link above or cut-and-paste it into a browser address bar. If you’re using Google Chrome, Apple’s Safari, or some recent version of Microsoft’s Internet Explorer or Edge browsers, you should notice that the address converts to “xn--80a7a.com.” This is called “punycode,” and it allows browsers to render domains with non-Latin alphabets like Cyrillic, which is used for Russian and Ukrainian.

Below is what it looks like in Edge on Windows 10; Google Chrome renders it much the same way. Notice what’s in the address bar (ignore the “fake site” and “Welcome to…” text, which was added as a courtesy by the person who registered this domain):

The domain https://www.са.com/ as rendered by Microsoft Edge on Windows 10. The rest of the text in the image (beginning with “Welcome to a site…”) was added by the person who registered this test domain, not the browser.

IE, Edge, Chrome and Safari all will convert https://www.са.com/ into its punycode output (xn--80a7a.com), in part to warn visitors about any confusion over look-alike domains registered in other languages. But if you load that domain in Mozilla Firefox and look at the address bar, you’ll notice there’s no warning of possible danger ahead. It just looks like it’s loading the real ca.com:

What the fake ca.com domain looks like when loaded in Mozilla Firefox. A browser certificate ordered from Comodo allows it to include the green lock (https://) in the address bar, adding legitimacy to the look-alike domain. The rest of the text in the image (beginning with “Welcome to a site…”) was added by the person who registered this test domain, not the browser.

The domain “xn--80a7a.com” pictured in the first screenshot above is punycode for the Ukrainian letters for “s” (which is represented by the character “c” in Russian and Ukrainian), as well as an identical Ukrainian “a”.
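For readers who want to check the conversion themselves, here is a minimal Python sketch using the standard library's built-in "punycode" and "idna" codecs. It only illustrates the encoding; it says nothing about how any particular browser decides what to display, and the variable names are my own.

```python
# Cyrillic "с" (U+0441) followed by Cyrillic "а" (U+0430); renders like Latin "ca".
lookalike = "\u0441\u0430"

# The raw Punycode transform (RFC 3492) of the label on its own.
print(lookalike.encode("punycode"))  # b'80a7a'

# The "idna" codec works label by label and adds the "xn--" prefix where needed.
ascii_host = ("www." + lookalike + ".com").encode("idna")
print(ascii_host)  # b'www.xn--80a7a.com'

# Decoding returns the Cyrillic form, which is not the same string as Latin "www.ca.com".
decoded = ascii_host.decode("idna")
print(decoded == "www.ca.com")  # False
```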

It was registered by Alex Holden, founder of Milwaukee, Wis.-based Hold Security Inc. Holden’s been experimenting with how the different browsers handle punycodes in the browser and via email. Holden grew up in what was then the Soviet Union and speaks both Russian and Ukrainian, and he’s been playing with Cyrillic letters to spell English words in domain names.

Letters like A and O look exactly the same and the only difference is their Unicode value. There are more than 136,000 Unicode characters used to represent letters and symbols in 139 modern and historic scripts, so there’s a ton of room for look-alike or malicious/fake domains.

For example, “a” in Latin is the Unicode value “0061” and in Cyrillic is “0430.” To a human, the graphical representation for both looks the same, but for a computer there is a huge difference. Internationalized domain names (IDNs) allow domain names to be registered in non-Latin letters (RFC 3492), provided the domain is all in the same language; trying to mix two different scripts in the same name causes the domain registries to reject the registration attempt.
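The difference between the two letters is easy to see in code even when it is invisible on screen. This short Python sketch prints the code point and official Unicode name of each "a"; the variable names are just for illustration.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430

for ch in (latin_a, cyrillic_a):
    # Show the code point in hex alongside the character's Unicode name.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The two characters look alike but compare as different strings.
print(latin_a == cyrillic_a)  # False
```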

So, in the Cyrillic alphabet (Russian/Ukrainian), we can spell АТТ, УАНОО, ХВОХ, and so on. As you can imagine, the potential for impersonation and abuse is great with IDNs. Here’s a snippet from a larger chart Holden put together showing some of the more common ways that IDNs can be made to look like established, recognizable domains:

Image: Hold Security.

Holden also was able to register a valid SSL encryption certificate for https://www.са.com from Comodo.com, which would only add legitimacy to the domain were it to be used in phishing attacks against CA customers by bad guys, for example.

A SOLUTION TO VISUAL CONFUSION

To be clear, the potential threat highlighted by Holden’s experiment is not new. Security researchers have long warned about the use of look-alike domains that abuse special IDN/Unicode characters. Most of the major browser makers have responded in some way by making their browsers warn users about potential punycode look-alikes.

The exception is Mozilla, whose Firefox is by most accounts the third most-popular Web browser, and I wanted to know why. I’d read the Mozilla Wiki’s “IDN Display Algorithm FAQ,” so I had an idea of what Mozilla was driving at in their decision not to warn Firefox users about punycode domains: Nobody wanted it to look like Mozilla was somehow treating the non-Western world as second-class citizens.

I wondered why Mozilla doesn’t just have Firefox alert users about punycode domains unless the user has already specified that he or she wants a non-English language keyboard installed. So I asked that in some questions I sent to their media team. They sent the following short statement in reply:

“Visual confusion attacks are not new and are difficult to address while still ensuring that we render everyone’s domain name correctly. We have solved almost all IDN spoofing problems by implementing script mixing restrictions, and we also make use of Safe Browsing technology to protect against phishing attacks. While we continue to investigate better ways to protect our users, we ultimately believe domain name registries are in the best position to address this problem because they have all the necessary information to identify these potential spoofing attacks.”
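To make the idea of a script mixing restriction concrete, here is a rough Python sketch. It is not Mozilla's algorithm: it simply guesses each character's script from the first word of its Unicode name, and the function names are hypothetical. Under that assumption, a label that mixes Latin and Cyrillic letters is flagged, while a label written entirely in Cyrillic look-alikes is not.

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts in a label using Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # digits and hyphens are shared by many scripts
        # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True if the label appears to draw letters from more than one script."""
    return len(scripts_in_label(label)) > 1

print(is_mixed_script("c\u0430"))       # Latin "c" + Cyrillic "а": True
print(is_mixed_script("\u0441\u0430"))  # all-Cyrillic "са": False
print(is_mixed_script("ca"))            # all-Latin "ca": False
```

A check like this catches a single swapped letter, but an all-Cyrillic label passes it untouched.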

If you’re a Firefox user and would like Firefox to always render IDNs as their punycode equivalent when displayed in the browser address bar, type “about:config” without the quotes into a Firefox address bar. Then in the “search:” box type “punycode,” and you should see one or two options there. The one you want is called “network.IDN_show_punycode.” By default, it is set to “false”; double-clicking that entry should change that setting to “true.”

Incidentally, anyone using the Tor Browser to anonymize their surfing online is exposed to IDN spoofing because the Tor Browser is built on Firefox as well. I could definitely see spoofed IDNs being used in targeted phishing attacks aimed at Tor users, many of whom have significant assets tied up in virtual currencies. Fortunately, the same “about:config” instructions work just as well on Tor to display punycode in lieu of IDNs.

Holden said he’s still in the process of testing how various email clients and Web services handle look-alike IDNs. For example, it’s clear that Twitter sees nothing wrong with sending the look-alike CA.com domain in messages to other users without any context or notice. Skype, on the other hand, seems to truncate the IDN link, sending clickers to a non-existent page.

“I’d say that most email services and clients are either vulnerable or not fully protected,” Holden said.

For a look at how phishers or other scammers might use IDNs to abuse your domain name, check out this domain checker that Hold Security developed. Here’s the first page of results for krebsonsecurity.com, which indicate that someone at one point registered krebsoṇsecurity[dot]com (that domain includes a lowercase “n” with a tiny dot below it, a character used by several dozen scripts). The results in yellow are just possible (unregistered) domains based on common look-alike IDN characters.
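As a rough illustration of how look-alike candidates for a name can be enumerated, here is a small Python sketch. It is not how Hold Security's checker works, which I have not seen; it assumes the character in question is an "n" with a combining dot below (U+0323), tries that one trick on each letter in turn, and shows the punycode form of each result. The function name is hypothetical.

```python
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

def dotted_variants(label):
    """Return (unicode_label, ascii_label) pairs with one letter given a dot below."""
    variants = []
    for i, ch in enumerate(label):
        # NFC composes the pair into a single character when one exists.
        composed = unicodedata.normalize("NFC", ch + DOT_BELOW)
        if len(composed) != 1:
            continue  # no precomposed dotted form for this letter
        variant = label[:i] + composed + label[i + 1:]
        variants.append((variant, variant.encode("idna").decode("ascii")))
    return variants

for unicode_label, ascii_label in dotted_variants("krebsonsecurity"):
    print(unicode_label, ascii_label)
```

Each line pairs a candidate with the "xn--" form that a punycode-displaying browser would show for it.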

The first page of warnings for Krebsonsecurity.com from Hold Security’s IDN scanner tool.

I wrote this post mainly because I wanted to learn more about the potential phishing and malware threat from look-alike domains, and I hope the information here has been interesting if not also useful. I don’t think this kind of phishing is a terribly pressing threat (especially given how far less complex phishing attacks seem to succeed just fine for now). But it sure can’t hurt Firefox users to change the default “visual confusion” behavior of the browser so that it always displays punycode in the address bar (see the solution mentioned above).

[Author’s note: I am listed as an adviser to Hold Security on the company’s Web site. However this is not a role for which I have been compensated in any way now or in the past.]


104 comments

  1. Very valuable info for those of us who have to judge the legitimacy of domains when analyzing potential phishing or other attacks. Thank you.

    I definitely get my money’s worth out of my subscription.

  2. Every time punycode comes up I’m dreading that someone is paying attention and will start to use it against victims. Whether your browser reveals it or not, it’s impossible for victims to hover and reveal the actual landing spot, and thus opens you up to a drive-by attack.

    I’m surprised that Mozilla didn’t take a more proactive approach to addressing this like the other browsers.

    • I think punycode is so common in Russia that Mozilla does not want to piss off their userbase there.

      • Yes, but their argument is silly. The remedy is showing punycode either when: 1) the user does not have the Cyrillic keyboard or locale installed (a big indication the user does not speak the language) 2) the Cyrillic is mixed with non-Cyrillic (which almost all scams do).

        Firefox rejected those arguments without good reason.

        Reality is that these kinds of spoofs are in practice almost exclusively replacing a few Latin characters with Cyrillic. So you don’t have to make one locale 2nd class or affect the Russian userbase to address it.

    • If you make the config change in Firefox to show the puny code, hovering over the link shows you puny code, not the “disguised” link.

      • Did you test before making the change?
        I tested before changing the default (Firefox 57.0, Linux). Hovering over the link showed me the punycode. Opening in a new tab showed the punycode in the tab as the page was loading, before changing to “Fake Site” when loading completed.

        • I’m running Firefox 58.02 with

          network.IDN_show_punycode default false

          When I hover over the link, it shows the punycode at the bottom. In these days of dangerous links, I always look there.
          I have done some custom config of Firefox, but perhaps only 4-5 things along the lines of ‘ask before running flash’ etc.

  3. When I hover my mouse pointer over a link using MS Edge, I get a look at the underlying URL in a small box popup at the bottom of the page. I use this to determine the validity of a link. Not sure this will catch all spoofers or “miscreants.”

    • Same here, mouse over to see what it really is. We teach all our people to do the same.

  4. This post was very interesting. I just checked the sample ca[.]com url in my proxy for kicks and it flags as a newly registered domain (which is a blocked category). It’s good to know I have a level of protection for our users for at least new URLs of this type.

    I’ve also blocked a large number of punycode URLs based on threat intel reports. I never paid much attention to those URLs (often included in large lists of random domains), but I recognize the format and now I understand what they represent. Thanks Brian.

  5. Diane Wilkinson Trefethen

    Thank you for the info. Changed my Firefox default + posted to my FB page with a link to this article.

  6. This is a can of really ugly worms. The vast majority of IDN domain names don’t look anything like English names, they’re like 普遍接受-测试.世界. The names you’ve noted are the hard corner case, Cyrillic or Greek letters that are homographs, they look like Latin letters. It would be rude to ban every Cyrillic name since most of them are benign, e.g. Россия.РФ, and it’s surprisingly hard to identify the ones that are Latin homographs or near-homographs.

    • Just do what Chrome did and ban all domains that consist entirely of Latin look-alike characters. Or do what Edge does and show punycode for scripts that aren’t in languages the current user speaks.

      This isn’t an insurmountable problem. Other browsers already fixed it: Firefox just refuses to.

  7. Brian
    Have you looked into Quad9 (9.9.9.9) from the Global Cyber Alliance….yes, this is a valid DNS url…. 🙂

    https://www.globalcyberalliance.org/initiatives/quad9.html

    Simply put – Quad8 (Google) wants your Meta Data for $, Quad9 does not. Quad9 is partnership of Research and Law Enforcement organizations. The IP Address 9.9.9.9 was donated by IBM. It is worth the research & investigation. It is very secure and accurate.
    Regards;
    John

  8. The Mozilla response is ““Visual confusion attacks are not new…” But there is no visual confusion to Firefox! Firefox knows punycode, it’s not confused at all.

    Does Mozilla really think that colorizing a punycode URL would somehow be insulting to non-Western world citizens? All they need to do is indicate to the user that the use of a non-Latin URL may be unexpected if they are using a Latin keyboard. If the user is expecting a non-Latin URL, they have no reason to be insulted.

  9. Not usually one to defend Mozilla on any account, but in all fairness, the situation, at least with v58, is NOT quite as dire as you imply.

    First, if I hover over your booby-trapped link, FF shows the full punycode version in the link preview that pops up in the LL corner of the browser window.

    Second, if I right-click and copy your booby-trapped link, when I paste it into the address bar of a new tab, FF again shows the full punycode version of the link in the address bar AND also the “funny” version of the link in the suggestions list.

    So it is NOT TRUE that FF gives NO indication of the boobytrap. One just has to be a bit more creative to see it.

    [ Tested on both Win7 and MacOS 10.12.6 Sierra. ]

    • 99% of normal people will just click on the link, then perhaps inspect the domain name if they’re being careful. On my Firefox, clicking results in the raw punycode being shown for a fraction of a second in the URL bar, then switches to the IDN rendered version.

  10. Thank you, Brian. Your research and the information you provide are invaluable, keeping all of us much safer from cyber threats.

    I use Firefox and I haven’t clicked on the ca[dot]com link, but when I copy and paste the link to a new browser window, Firefox displays “xn--80a7a.com” (without the quotes) on the address line. Also, when I hover over the link on your page, the little box at the bottom left corner of my browser also displays the xn-- etc. Like petepall who commented earlier, I check the URL in that box against the link and avoid sites where there are discrepancies.

  11. There is another system for identifying internationalized domain squats here: https://xntwist.tk/.

  12. After posting my earlier comment, I copied and pasted the ca[dot]com URL from your email alert into Firefox. This time, Firefox did NOT show the punycode version in the address line. I’m no expert, so I’m curious why the difference.

    • Liz, did you use the same copy technique each time?

      If you highlight the link, then right-click and select Copy, you’ll get the phony link. If you select Copy link location, you’ll get the punycode version.

      • Hi James.

        In both instances I right clicked the link and used the Copy Link Location option. I tried again, using both “highlighting the URL + pasting” and “right clicking Copy Link Location” methods. The link in the email consistently shows up as ca[dot]com in FF, while the link on the webpage reveals the punycode. I tried again after changing the network.IDN_show_punycode setting to “true.” Same thing. No punycode with email URL link and punycode revealed with webpage URL link.

        It’s really not a problem for me since I’m religious about not clicking on links in emails, but it is curious. I’m using FF Quantum 58.0.2 (64-bit) Win7.

    • Hey,
      if you’re using Thunderbird: network.IDN_show_punycode is available in its about:config. (Here’s how to get there: http://mzl.la/1ApHliI )
      Maybe try to toggle that setting and see what it does to the email.

      I’d be curious how email addresses from fake domains would display, too. IDNs are supported by exim (disabled by default I think) and postfix, so the problem exists here, too.

      • Yup, I do use TBird. I changed the network.IDN_show_punycode to true as you suggested and that did the trick. Thank you Dennis, and James, too!

      • Yes, thank you. I didn’t even know Thunderbird had an about:config. D’oh.

  13. I checked my Firefox Quantum (58.0.2 (64-bit) under Win7/64) config for punycode and the default is TRUE, so it might help FF users to migrate to the latest version of the browser (which has been steadily improving since Quantum came out – I have some 500+ tabs open at any one time as part of a project and after a period of adjustment the browser now seems to handle that load well).

    I’m paranoid enough that I tend to mouseover all embedded links to check them for veracity before I even think of copying them to the clipboard and then checking them in a plain text editor, and Quantum shows the punycode version of the displayed URL, which is helpful.

    I’ll be sure to check out Alex Holden’s software, and thanks again for all you do, Brian.

  14. guys,is any fraud carder or transfer jobs gusy here?
    i need good partners for transfers and all kind of business jobs.
    i found few carding forums but they are rippers!!
    no honor they are liars, im looking real business no rippers!!
    why the hell…90% carders are just rippers nowdays?==?=?==

    • Seriously? You’re using krebsonsecurity to advertise for partners in cybercrime? Ballsy move….

  15. While in Chrome I click “view source” on the above story and here’s how the hyperlink is rendered:

    https://www.са.com/

    I thought I’d see it pointing to xn--80a7a.com in the raw HTML page source, is Chrome protecting me from myself in this case?

  16. I checked my Firefox Quantum (58.0.2 (64-bit) under Win10/64) on two systems sitting side by side in my home office. One had the punycode set to true, the other set to false. Not something I had set on either of them in the past so I’m not sure what the default is.

  17. Wow interesting stuff.

    I ran just a few popular websites through the checker:

    Facebook – 100 registered domains
    PayPal – 28

    So then I wondered about banks

    USBank.com – 1
    wellsfargo.com – 8
    jpmorganchase.com – 2
    citigroup.com – 1
    bankofamerica.com – 14
    goldmansachs.com – 5
    morganstanley.com – 4

    How about investment companies?

    Schwab.com – 1
    fidelity.com – 3

    Entertainment?

    Comcast.net – 1
    netflix.com – 15

    Employment?

    adp.com – 1

    Every state government I checked had at least one domain already registered.

    Ouch!

    • That is very interesting.

      • Just goes to show, it doesn’t pay to follow email links, or type domain URLs into the address bar. I’ve become too lazy to read the URL like I used to – I still do occasionally, but the font trick could easily fool me, even though I use the magnifier to see details as well as I can!

        This is another great KOS story – thanks be to Brian!

  18. Thanks, Brian. I’m a Firefox user and while I do not often click on email links there is always the possibility. The link looked genuine until I adjusted my Firefox settings to show the punycode.

  19. This website is not entirely compatible with IE 11 – unless the settings I have on IE are blocking the captcha. The error messages do not come up on IE, but they do on Chrome.

    Interesting article Brian. Thank you!

  20. I work for Thinking Objects, a security company from Germany. Our primary domain is to.com and I did some tests last year.

    tᴏ.com vs to.com vs tᴑ.com

    My focus was on Apple’s Safari and Mail; this ended up in CVE-2017-7106 and CVE-2017-7152

    I wrote about this in
    https://blog.to.com/phishing-with-an-apple-as-bait/

    Additionally I built a “live js injection reverse proxy” for demonstration purposes on https://tᴏ.com/ aka https://xn--t-26l.com/

  21. This is already VERY widespread with Steam phishing and a few other Steam-related scam sites like fake cashout and gambling sites. Has been for a while now. I’m just surprised it’s not widespread elsewhere.

  22. I created an IDN domain builder visualizing the script rules for several domains allowed for registrations in particular registries
    http://dom.xn--enea.com/

    Worst are .com and .net; .de and .org for example are rather strict.

  23. Worth noting: in Firefox, if you highlight a link, it does display the punycode on the bottom left of the browser window. This is true even if you have not yet changed the network.IDN_show_punycode value to “true.” That’s usually where I look when checking out a link, as pasting one into the URL bar is a bit more risky.

    • Oliver Paukstadt

      Might not help:
      For this an attacker can create links with the genuine website in href and add an on click handler which replaces the href right before executing the link with the bad site and return true.

    • I viewed this article in Firefox on Ubuntu (latest version) and hovering over the ca.com link below the second paragraph showed https://www.ca.com

      • Brian — When I hover over the URL link in your article, I can see the punycode using FF (Win7), but when I hover over the link in your Reply post, I don’t. Same web page, same browser (FF), but different results.

      • I neglected to mention that when I hover over the URL link in thedinger’s comment, I can see the punycode. The only difference between the URL in his post vs yours is the forward slash at the end.

      • I just tried it in the latest Firefox version (58.0.2) and the tab said Fake Site

        The result showed: Welcome to a site that looks like ca.com but it is not ca.com Stay tuned for more

        This is a GREAT article. Thank you so much. 🙂

        • That’s interesting. I get a variant. Clicking on the link, FF (58.0.2 on Win7 64 bit) takes me to a page with https://www.xn--80a7a.com/ in the address bar, but with a green lock. It’s like Bach ……. variations on the theme, just not as pleasant.

    • You’re right when it comes to FF, but in Thunderbird, you have to change the setting to “true” in order to see the punycode, so those people using TB as their email client may want to change that setting. You can find the configuration editor button under Tools > Options > Advanced > General.

  24. Why not just apply punycode selectively based on the toplevel domain?

    I.e. Россия.РФ or 普遍接受-测试.世界 are fine, but Россия.com or 普遍接受-测试.co.uk is not, and neither would 普遍接受-测试.РФ or Россия.世界

    You can easily treat cyrillic, CJK or arabic scripts as “first class citizens” at their respective TLDs and at the same time immediately throw up a red flag with extreme prejudice if there’s a cyrillic character in the english TLDs.

    • Firefox *is* supposed to show punycode when alphabets are mixed, just like Pete suggested here!

      https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm

      It’s really really weird that са.com does not show up as punycode, it’s a bug! Report this to the bugzilla instead of writing blog posts 😉

      • Not a bug. Well, actually it is, but Firefox refuses to fix it. Both the “c” and “a” in that domain name are cyrillic characters; they’re not mixed script.

  25. Seems the rendering is very different from font to font.
    Switching back to FixedSys – it’s quite obvious that the ca in the url is not really ca.
    So let’s all switch back to fixed character spacing green screen terminals and dial up BBSs.
    No more URLs. No more SPAM. No more pop up ads either.
    Don’t we miss the good old days ?

  26. Thanks Brian.
    Long time Mozilla (since Netscape Navigator days!) user, and this is a surprise to me.
    So, thanks for sharing.
    (off to fix mom and dad’s browsers)

  27. Thank you, Brian, for the article.

    I just wanted to say though that I completely agree with Mozilla’s stance on this. The punycode is shown when you hover over the link. The TOR Browser should IMHO always show the punycode version.

  28. Henry S. Winokur

    FF Quantum is my default browser. I have just started using NoScript again, after it was updated to handle Quantum and even though FF did not present the address correctly–as Brian noted, it comes up as ca.com–NoScript does show the “xn--80a7a.com” as an item that can be blocked or enabled. Certainly the page doesn’t work until you enable it in NoScript.

    Following Brian’s directions I changed the way FF handles Unicode because I want to know beforehand, so I don’t “step in” anything.

  29. While punycode exploits pose a threat, the current mitigations limit the likelihood of impact. On the other hand, domain fronting, which is used for legitimate reasons, is also being used for nefarious efforts and can fool users quite easily. Honestly, how many people have a baseline of certificates and validate the hash of every site they visit? Not many. Brian Krebs and others who read this blog, if you haven’t researched what domain fronting is, you should.