The Phishing Guide: URL Obfuscation Attacks

The key to many phishing attacks is getting the message recipient to follow a hyperlink (URL) to the attacker’s server without realising that they have been duped.

Unfortunately, phishers have access to an increasingly large arsenal of methods for obfuscating the final destination of the customer’s web request.

The most common methods of URL obfuscation include:

• Bad domain names

• Friendly login URLs

• Third-party shortened URLs

• Host name obfuscation

• URL obfuscation

Bad Domain Names

One of the most trivial obfuscation methods is the purposeful registration and use of bad domain names. Consider the financial institution MyBank, with its registered domain and an associated customer transactional site.

The Phisher could set up a server using any of the following names to help obfuscate the real destination host:



• or even http://privatebanking.mybá


It is important to note that as domain registration organisations move to internationalise their services, it is possible to register domain names in other languages and their specific character sets. For example, the Cyrillic “o” looks identical to the standard ASCII “o” but is a different character, so a visually identical name can be registered as a separate domain – as demonstrated by a company that registered such a look-alike domain in Russia a few years ago. Finally, it is worth noting that even the standard ASCII character set allows for ambiguities such as upper-case “i” and lower-case “L”.

Friendly Login URLs

Many common web browser implementations allow for complex URLs that can include authentication information such as a login name and password. In general the format is URI://username:password@hostname/path. Phishers may fill the username and password fields with details associated with the target organisation.

For example, a URL of this form can set the username to a name associated with MyBank and the password to ebanking, while the destination hostname is the Phisher’s own server.

This friendly login URL can successfully trick many customers into thinking that they are actually visiting the legitimate MyBank page. Because of its success, many current browser versions have dropped support for this URL encoding method.
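To see how such a URL is actually split, the sketch below parses one with Python's standard `urllib.parse.urlsplit`. The URL is a made-up placeholder: `mybank.example` and `attacker.example` are assumptions for illustration, not addresses taken from the guide.

```python
from urllib.parse import urlsplit

# Hypothetical URL: both host names are placeholders for illustration.
url = "http://mybank.example:ebanking@attacker.example/login"

parts = urlsplit(url)
print("username:", parts.username)  # text before ":" in the userinfo part
print("password:", parts.password)  # text between ":" and "@"
print("hostname:", parts.hostname)  # where the request would actually go
```

Everything before the "@" is credential data. Only the text after it names the server that receives the request.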

Third-party Shortened URLs

Due to the length and complexity of many web-based application URLs, combined with the way URLs may be represented and displayed within various email systems (e.g. extra spaces and line feeds inserted into the URL), third-party organisations have sprung up offering free services designed to provide shorter URLs.

Through a combination of social engineering and deliberately broken, overly long or incorrect URLs, Phishers may use these free services to obfuscate the true destination. Several such free services are in common use.

Host Name Obfuscation

Most Internet users are familiar with navigating to sites and services using a fully qualified domain name. For a web browser to communicate over the Internet, this name must be resolved to an IP address. This resolution of host name to IP address is achieved through domain name servers.

A Phisher may wish to use the IP address as part of a URL to obfuscate the host and possibly bypass content filtering systems, or hide the destination from the end user.

For example, a URL containing a recognisable host name could be rewritten with the host’s IP address in its place.

While some customers are familiar with the classic dotted-decimal representation of IP addresses, most are not familiar with other possible representations. Using these other IP representations within a URL, it is possible to obscure the host destination even further from regular inspection.

Depending on the application interpreting an IP address, there may be a variety of ways to encode the address other than the classic dotted-decimal format. Alternative formats include:

• Dword – meaning double word because it consists essentially of two binary “words” of 16 bits; but it is expressed in decimal (base 10),

• Octal – address expressed in base 8, and

• Hexadecimal – address expressed in base 16.

These alternative formats are best explained using an example. Consider a URL whose host name resolves to the IP address 210.134.161.35. The host can be written as:

• Decimal – http://210.134.161.35/

• Dword – http://3532038435/

• Octal – http://0322.0206.0241.0043/

• Hexadecimal – http://0xD2.0x86.0xA1.0x23/ or even http://0xD286A123/

• In some cases, it may be possible to mix formats (e.g. http://0322.0x86.161.0043/).
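The conversions above can be reproduced with a short Python sketch. It starts from 210.134.161.35, the dotted-decimal value implied by the hexadecimal form 0xD2.0x86.0xA1.0x23, and also converts an obfuscated host back to dotted decimal. The function names are hypothetical, and the parsing rules (a "0x" prefix means hexadecimal, a leading "0" means octal) are an assumption about how lenient clients tend to read such hosts, not a statement about any particular browser.

```python
import ipaddress


def encodings(dotted: str) -> dict[str, str]:
    """Return alternative spellings of a dotted-decimal IPv4 address."""
    addr = ipaddress.IPv4Address(dotted)
    octets = addr.packed  # four bytes, most significant first
    return {
        "dword": str(int(addr)),
        "octal": ".".join(f"{o:04o}" for o in octets),
        "hex": ".".join(f"0x{o:02X}" for o in octets),
        "hex_single": f"0x{int(addr):08X}",
    }


def parse_part(part: str) -> int:
    """Read one host component as hex (0x..), octal (0..) or decimal."""
    if part.lower().startswith("0x"):
        return int(part, 16)
    if len(part) > 1 and part.startswith("0"):
        return int(part, 8)
    return int(part, 10)


def to_dotted(host: str) -> str:
    """Normalise a one-part or four-part numeric host to dotted decimal."""
    parts = [parse_part(p) for p in host.split(".")]
    if len(parts) == 1:
        return str(ipaddress.IPv4Address(parts[0]))
    if len(parts) != 4:
        raise ValueError("expected one or four numeric components")
    return str(ipaddress.IPv4Address(bytes(parts)))


for name, value in encodings("210.134.161.35").items():
    print(f"{name:10} http://{value}/")

print(to_dotted("0322.0x86.161.0043"))  # mixed octal, hex and decimal
```

Normalising a numeric host back to dotted decimal before applying any filtering rule is one way to stop these spellings slipping past a simple string match.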

URL Obfuscation

To ensure support for local languages in Internet software such as web browsers and email clients, most software will support alternate encoding systems for data. It is a trivial exercise for a Phisher to obfuscate the true nature of a supplied URL using one (or a mix) of these encoding schemes. These encoding schemes tend to be supported by most web browsers, and can be interpreted in different ways by web servers and their custom applications.

Typical encoding schemes include:

• Escape Encoding – Escaped encoding, sometimes referred to as percent-encoding, is the accepted method of representing characters within a URL that may need special syntax handling to be correctly interpreted. This is achieved by replacing the character with a sequence of three characters: the percent character “%” followed by the two hexadecimal digits representing the octet code of the original character.

For example, the US-ASCII character set represents a space with octet code 32, or hexadecimal 20. Thus its URL-encoded representation is %20.

• Unicode Encoding – Unicode Encoding is a method of referencing and storing characters with multiple bytes by providing a unique reference number for every character no matter what the language or platform.

It is designed to allow a Universal Character Set (UCS) to encompass most of the world’s writing systems. Many modern communication standards (such as XML, Java, LDAP, JavaScript, WML, etc.), operating systems and web clients/servers use Unicode character values.

Unicode (UCS-2 ISO 10646) is a 16-bit character encoding that contains all of the characters (2^16 = 65,536 different characters total) in common use in the world’s major languages. Microsoft Windows platforms allow for the encoding of Unicode characters in the following format – %u0000 – for example %u0020 represents a space, while %u01FC represents the accented Ǽ and %uFD3F is an ornate right parenthesis.

• Inappropriate UTF-8 Encoding – One of the most commonly utilised formats, Unicode UTF-8, has the characteristic of preserving the full US-ASCII character range. This great flexibility provides many opportunities for disguising standard characters in longer escape-encoded sequences. For example, the full stop character “.” may be represented as %2E, or %C0%AE, or %E0%80%AE, or %F0%80%80%AE, or %F8%80%80%80%AE, or even %FC%80%80%80%80%AE.

• Multiple Encoding – Various guidelines and RFCs carefully explain the method of decoding escape-encoded characters and hint at the dangers associated with decoding multiple times and at multiple layers of an application. However, many applications still incorrectly parse escape-encoded data multiple times.

Consequently, Phishers may further obfuscate the URL information by encoding characters multiple times (and in different fashions). For example, the back-slash “\” character may be encoded as %5C originally, but could be extended to: %255C, or %%35C, or %%35%63, or %25%35%63, etc.
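The encoding schemes above can be tried out with Python's standard `urllib.parse` module. The sketch below percent-encodes a space, decodes some of the multiply-encoded back-slash examples until they stop changing, and shows that the overlong sequences for “.” are rejected by a strict UTF-8 decoder but yield a full stop when decoded leniently. The helpers `fully_decode` and `lenient_code_point` are hypothetical names written for this illustration. The lenient decoder deliberately skips the validity checks a correct UTF-8 decoder must perform.

```python
from urllib.parse import quote, unquote, unquote_to_bytes


def fully_decode(text: str, max_rounds: int = 5) -> tuple[str, int]:
    """Percent-decode repeatedly until the text stops changing."""
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(text)
        if decoded == text:
            break
        text, rounds = decoded, rounds + 1
    return text, rounds


def lenient_code_point(seq: bytes) -> int:
    """Decode one UTF-8 style sequence WITHOUT rejecting overlong forms."""
    if len(seq) == 1:
        return seq[0]
    # Keep only the payload bits of the lead byte.
    value = seq[0] & (0xFF >> (len(seq) + 1))
    for byte in seq[1:]:
        # Each continuation byte contributes its low six bits.
        value = (value << 6) | (byte & 0x3F)
    return value


print(quote(" "))  # a space becomes %20

# Each of these needs more than one decoding pass to reach the back-slash.
for sample in ("%255C", "%%35%63", "%25%35%63"):
    print(sample, fully_decode(sample))

# Overlong forms of the full stop.
for sample in ("%C0%AE", "%E0%80%AE", "%F0%80%80%AE"):
    raw = unquote_to_bytes(sample)
    print(sample, "lenient:", repr(chr(lenient_code_point(raw))))
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        print("  rejected by a strict UTF-8 decoder")
```

The number of decoding passes returned by `fully_decode` is itself a useful signal: legitimate links rarely need more than one.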

