| 10.8. Evasion Techniques

Intrusion detection systems (IDSs) are an integral part of web application security. In Chapter 9, I introduced web application firewalls (also covered in Chapter 12), whose purpose is to detect and reject malicious requests. Most web application firewalls are signature-based: they monitor HTTP traffic looking for signature matches, where a "signature" is a pattern that suggests an attack. When a request matches a signature, an action is taken (as specified by the configuration). But if an attacker modifies the attack payload so that it keeps the same meaning for the target but no longer resembles a signature the web application firewall is looking for, the request will go through. Techniques that modify an attack payload to avoid detection are called evasion techniques.

Evasion techniques are a well-known tool in the TCP/IP world, having been used against network-level IDS tools for years. In the web security world, evasion is somewhat new. Here are some papers on the subject:
10.8.1. Simple Evasion Techniques

We start with the simple yet effective evasion techniques:
10.8.2. Path Obfuscation

Many evasion techniques are used in attacks against the filesystem. For example, many methods can obfuscate paths to make them less detectable:
10.8.3. URL Encoding

Some characters have a special meaning in URLs, and they have to be encoded if they are going to be sent to an application rather than interpreted according to their special meanings. This is what URL encoding is for. (See RFC 1738 at http://www.ietf.org/rfc/rfc1738.txt and RFC 2396 at http://www.ietf.org/rfc/rfc2396.txt.) I showed URL encoding several times in this chapter, and it is an essential technique for most web application attacks.

It can also be used as an evasion technique against some network-level IDSs. URL encoding is mandatory only for some characters but can be used for any. As it turns out, sending a string of URL-encoded characters may help an attack slip under the radar of some IDS tools. In reality, most tools have improved to handle this situation.

On rare occasions, you may encounter an application that performs URL decoding twice. This is not correct behavior according to the standards, but it does happen. In this case, an attacker could perform URL encoding twice. The URL:

    http://www.example.com/paynow.php?p=attack

becomes:

    http://www.example.com/paynow.php?p=%61%74%74%61%63%6B

when encoded once (since %61 is an encoded a character, %74 is an encoded t character, and so on), but:

    http://www.example.com/paynow.php?p=%2561%2574%2574%2561%2563%256B

when encoded twice (where %25 represents a percent sign). If you have an IDS watching for the word "attack", it will (rightly) decode the URL only once and fail to detect the word. But the word will reach the application that decodes the data twice.

There is another way to exploit badly written decoding schemes. As you know, a character is URL-encoded when it is represented with a percent sign followed by two hexadecimal digits (0-F, representing the values 0-15). However, some decoding functions never check to see if the two characters following the percent sign are valid hexadecimal digits.
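Both points above, double decoding and the hexadecimal-digit check, can be made concrete with a small sketch. The url_decode_once and hex_value functions are hypothetical names introduced only for illustration; they are not taken from any particular server or IDS. The sketch assumes the caller supplies an output buffer at least as large as the input, and it decodes exactly one layer of URL encoding, copying any percent sign that is not followed by two valid hexadecimal digits through unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit; the caller has already checked isxdigit(). */
static int hex_value(unsigned char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return toupper(c) - 'A' + 10;
}

/*
 * Hypothetical helper: decode exactly one layer of URL encoding from src
 * into dst (dst must hold at least strlen(src) + 1 bytes).
 * A '%' not followed by two valid hex digits is copied through unchanged.
 */
void url_decode_once(const char *src, char *dst) {
    while (*src != '\0') {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(hex_value((unsigned char)src[1]) * 16
                          + hex_value((unsigned char)src[2]));
            src += 3;   /* skip the percent sign and both digits */
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Calling this once on %2561%2574%2574%2561%2563%256B yields %61%74%74%61%63%6B, which is all a correctly behaving IDS would see; only a second call produces the word "attack". A decoder that omits the validity check behaves differently, as the function that follows shows.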
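Here is what a C function for handling the two digits might look like:

#include <ctype.h>  /* toupper() */

/* Convert the two characters after a percent sign into one byte. */
unsigned char x2c(unsigned char *what) {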
    unsigned char c0 = toupper(what[0]);
    unsigned char c1 = toupper(what[1]);
    unsigned char digit;
   
    digit = ( c0 >= 'A' ? c0 - 'A' + 10 : c0 - '0' );
    digit = digit * 16;
    digit = digit + ( c1 >= 'A' ? c1 - 'A' + 10 : c1 - '0' );
   
    return digit;
}

This code does not do any validation. It will correctly decode valid URL-encoded characters, but what happens when an invalid combination is supplied? By using characters beyond the valid hexadecimal range, we could smuggle a slash character, for example, without an IDS noticing. To do so, we would specify XV for the two characters: the algorithm above maps X to 33 and V to 31, and 33 * 16 + 31 = 559, which wraps around to 47 (0x2F) in an unsigned char, the ASCII character code for a slash. The URL:

    http://www.example.com/paynow.php?p=/etc/passwd

would therefore be represented by:

    http://www.example.com/paynow.php?p=%XVetc%XVpasswd

10.8.4. Unicode Encoding

Unicode attacks can be effective against applications that understand Unicode. Unicode is the international standard whose goal is to represent every character needed by every written human language as a single integer number (see http://en.wikipedia.org/wiki/Unicode). What is known as Unicode evasion should more correctly be referred to as UTF-8 evasion. Unicode characters are normally represented with two bytes, but this is impractical in real life. First, there are large amounts of legacy documents that need to be handled. Second, in many cases only a small number of Unicode characters are needed in a document, so using two bytes per character would be wasteful.

UTF-8, a transformation format of ISO 10646 (http://www.ietf.org/rfc/rfc2279.txt), allows most files to stay as they are and still be Unicode compatible. Until a special byte sequence is encountered, each byte represents a character from the ASCII character set. When a special byte sequence is used, two or more (up to six) bytes can be combined to form a single complex Unicode character.

One aspect of UTF-8 encoding causes problems: characters that need only a single byte (plain ASCII) can also be represented in multi-byte encoded form. What is worse, multiple such representations of each character can exist. These longer-than-necessary encodings are known as overlong characters, and may be signs of an attempted attack. Counting only the overlong forms, there are five ways to represent an ASCII character.
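To see why overlong forms are accepted at all, consider a minimal sketch of a lenient decoder for the original six-byte layout described in RFC 2279. The function names utf8_decode_lenient and utf8_is_overlong are hypothetical and chosen for illustration only. The decoder takes the sequence length from the lead byte and concatenates the payload bits without asking whether a shorter sequence would have sufficed; the second function is the check a stricter decoder would add.

```c
#include <stddef.h>

/*
 * Hypothetical lenient decoder: stores the decoded code point in *cp and
 * returns the number of bytes consumed, or 0 for a malformed or truncated
 * sequence. It does NOT reject overlong (non-shortest) forms.
 */
size_t utf8_decode_lenient(const unsigned char *s, size_t n, unsigned long *cp) {
    size_t len, i;
    unsigned long value;

    if (n == 0) return 0;
    if (s[0] < 0x80)      { len = 1; value = s[0]; }
    else if (s[0] < 0xC0) { return 0; }  /* stray continuation byte */
    else if (s[0] < 0xE0) { len = 2; value = s[0] & 0x1F; }
    else if (s[0] < 0xF0) { len = 3; value = s[0] & 0x0F; }
    else if (s[0] < 0xF8) { len = 4; value = s[0] & 0x07; }
    else if (s[0] < 0xFC) { len = 5; value = s[0] & 0x03; }
    else if (s[0] < 0xFE) { len = 6; value = s[0] & 0x01; }
    else                  { return 0; }  /* 0xFE and 0xFF are never valid */

    if (len > n) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        value = (value << 6) | (s[i] & 0x3F);  /* append six payload bits */
    }
    *cp = value;
    return len;
}

/* Returns 1 if a len-byte sequence is longer than code point cp requires. */
int utf8_is_overlong(unsigned long cp, size_t len) {
    /* Smallest code point that genuinely needs 1..6 bytes. */
    static const unsigned long min_cp[] = {
        0x0UL, 0x80UL, 0x800UL, 0x10000UL, 0x200000UL, 0x4000000UL
    };
    return len >= 2 && len <= 6 && cp < min_cp[len - 1];
}
```

Given the two bytes 0xC0 0x8A, the lenient decoder reports the code point 0x0A, and utf8_is_overlong flags the sequence because a newline needs only one byte. A signature matcher that compares raw bytes never sees 0x0A at all.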
The five encodings below all decode to a newline character (0x0A):

    0xc0 0x8A
    0xe0 0x80 0x8A
    0xf0 0x80 0x80 0x8A
    0xf8 0x80 0x80 0x80 0x8A
    0xfc 0x80 0x80 0x80 0x80 0x8A

Invalid UTF-8 byte combinations are also possible, with results similar to those of invalid URL encoding.

10.8.5. Null-Byte Attacks

Using URL-encoded null bytes is an evasion technique and an attack at the same time. This attack is effective against applications developed using C-based programming languages. Even with scripted applications, the application engine they were developed to work with is likely to be developed in C and possibly vulnerable to this attack. Even Java programs eventually use native file manipulation functions, making them vulnerable, too.

Internally, all C-based programming languages use the null byte for string termination. When a URL-encoded null byte is planted into a request, it often fools the receiving application, which happily decodes the encoding and plants the null byte into the string. The planted null byte will be treated as the end of the string during the program's operation, and the part of the string that comes after it and before the real string terminator will practically vanish.

We looked at how a URL-encoded null byte can be used as an attack when we covered source code disclosure vulnerabilities in the "Source Code Disclosure" section. This vulnerability is rare in practice, though Perl programs can be in danger of null-byte attacks, depending on how they are programmed.
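The vanishing act described above is easy to reproduce. The following is a minimal sketch, with hypothetical function names and a made-up buffer, comparing a search that relies on C string semantics with one that is told the real length of the data. The standard strstr function stops at the first null byte, so anything planted after it is invisible to that search.

```c
#include <stddef.h>
#include <string.h>

/* C-string search: strstr() stops looking at the first null byte. */
int contains_cstring(const char *buf, const char *needle) {
    return strstr(buf, needle) != NULL;
}

/*
 * Length-aware search: examines all len bytes of buf, including any that
 * follow an embedded null byte. The needle is an ordinary C string.
 */
int contains_bytes(const char *buf, size_t len, const char *needle) {
    size_t nlen = strlen(needle);
    size_t i;

    if (nlen == 0) return 1;
    if (nlen > len) return 0;
    for (i = 0; i + nlen <= len; i++) {
        if (memcmp(buf + i, needle, nlen) == 0) return 1;
    }
    return 0;
}
```

For a decoded buffer such as "name=value\0;cat /etc/passwd" (an illustrative value, where \0 is the planted null byte), strlen reports 10 and contains_cstring does not find /etc/passwd, while contains_bytes, given the full length, does.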
Null-byte encoding is used as an evasion technique mainly against web application firewalls when they are in place. These systems are almost exclusively C-based (they have to be for performance reasons), making the null-byte evasion technique effective.

Web application firewalls trigger an error when a dangerous signature (pattern) is discovered. They may be configured not to forward the request to the web server, in which case the attack attempt will fail. However, if the signature is hidden after an encoded null byte, the firewall may not detect the signature, allowing the request through and making the attack possible.

To see how this is possible, we will look at a single POST request, representing an attempt to exploit a vulnerable form-to-email script and retrieve the passwd file:

    POST /update.php HTTP/1.0
    Host: www.example.com
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 78

    firstname=Ivan&lastname=Ristic%00&email=ivanr@webkreator.com;cat%20/etc/passwd

A web application firewall configured to watch for the /etc/passwd string will normally prevent such an attack easily. But notice how we have embedded a null byte at the end of the lastname parameter. If the firewall is vulnerable to this type of evasion, it may miss our command execution attack, enabling us to continue with compromise attempts.

10.8.6. SQL Evasion

Many SQL injection attacks use unique combinations of characters. An SQL comment --%20 is a good example. Implementing an IDS protection based on this information may make you believe you are safe. Unfortunately, SQL is too versatile. There are many ways to alter an injected SQL query so that it stays valid yet slips past an IDS. The first of the papers listed below explains how to write signatures to detect SQL injection attacks, and the second explains how all that effort is useless against a determined attacker:
 "Determined attacker" is a recurring theme in this book. We are using imperfect techniques to protect web applications on the system administration level. They will protect in most but not all cases. The only proper way to deal with security problems is to fix vulnerable applications. |