Republished with permission from WatchGuard Technologies, Inc.

WatchGuard


Anatomy of a Cross-Site Scripting Attack

by David M. Piscitello, President, Core Competence, Inc.

The Hydra was a terrifying mythical beast with many heads. Greek mythology records it as one of Hercules' most formidable opponents. The Hydra seemed impossible to defeat: each time Hercules cut off one of the Hydra's heads, a new head grew to replace it. 

Cross-site scripting is the Web corollary of the Hydra, and like the mythological creature, the Web Hydra has many heads. Cross-site scripting attacks are perpetrated through Web browsers and facilitated by poorly written Web applications. No vendor operating system, Web server, or browser can claim immunity from cross-site scripting, largely because the root cause of the problem lies elsewhere. Attackers don't need to be especially clever or particularly selective to succeed with a cross-site scripting attack: casual reviews of well-known Web sites show many are vulnerable to cross-site scripting.

Attack Fundamentals

Many attacks could be categorized as cross-site scripting. Fundamentally, however, cross-site scripting attacks share the following characteristics:

  1. An attacker inserts malicious data into a hyperlink. (We'll discuss what "malicious" means in a moment.)
  2. The attacker convinces a user to open this hyperlink. This isn't as difficult as it might seem. Hyperlinks are embedded in e-mail and instant messages daily. Images are hyperlinked to URLs on nearly every Web page on earth. Cross-site scripting succeeds largely because users tend to trust URLs (or don't understand them), and rarely parse a URL before they click on a hyperlink. It's relatively easy for an attacker to create a hyperlink that professes to invite you to browse at Amazon.com, but in reality visits the attacker's Web site. 

  3. Having attracted the user to his site, the attacker serves the unsuspecting user a page that might look and feel like Amazon.com, but in truth is the attacker's site. In this example, if the user logs in to what he or she believes is Amazon, or enters credit card data, the attacker can capture all that information.
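One way to see the gap between what a link appears to promise and where it actually leads is to parse the URL instead of reading it. The sketch below uses Python's standard urllib.parse module. The link and the attacker.example host are invented for illustration, and the user-info trick it demonstrates is only one of several ways a destination can be disguised; the article does not say which technique an attacker would choose.

```python
from urllib.parse import urlsplit


def real_host(url: str) -> str:
    """Return the host a browser would actually contact for this URL."""
    # Everything before '@' in the authority part is user-info, not the host.
    return urlsplit(url).hostname or ""


# Hypothetical disguised link: it begins with a familiar name, but the
# host is whatever follows the '@'.
disguised = "http://www.amazon.com@attacker.example/login"
print(real_host(disguised))
```

Run as written, this prints attacker.example, which is the site the user would really visit.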

The attacker has provided a link to another site with an embedded script. This is why the attack is called cross-site scripting: the script crosses from one site to another. It's also quite easy and common for the malicious data in the hyperlink itself to gather data from the user; a malicious script, for example, can gather a user's Web cookie, then send that cookie to the attacker (see The Cross Site Scripting FAQ for several examples). If the cookie happens to contain account information from, say, E*Trade or another financial institution, the attacker has collected some pretty sensitive information!

A Clear and Present Danger

What makes cross-site scripting so pervasive is the widespread use of dynamic Web content. Dynamic Web content is a very general concept, but it is essentially anything that allows a Web site to provide interactive sessions with visitors. Web designers use programs (scripts) to perform actions such as processing the input from forms for searches, guest books, e-transactions, and more. Such dynamic Web content adds considerable value and function to Web sites. It is also the root of much of the angst cross-site scripting causes.

Many Web designers are not "secure code aware" programmers, and many organizations don't invest the time and effort to verify that scripts are secure before putting them into production. The consequence of this haste-to-market approach is woefully evident: poorly written scripts, especially those that do not filter input, are easily "tricked" by attackers into performing entirely unintended and potentially malicious functions.

So many forms of cross-site scripting exist that it's impossible to explain them all. However, cross-site scripting attacks typically rely on the fact that a Web designer has failed to consider what actions the Web server or browser may take if the text that users type into a form (for example, one requesting a name and address) is not the expected alphanumeric characters, but is one or more HTML tags, or a rogue script (JavaScript, VBScript, ActiveX, Perl, etc.). The list of nefarious acts that cross-site scripts facilitate includes account hijacking, cookie theft or poisoning, denial of service, and defacement/modification of Web appearance.
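The failure described above can be sketched in a few lines of Python. This is a minimal illustration, not any real site's code: the "name" field, the two render functions, and the submitted value are all hypothetical. The first function echoes form input into a page unchanged; the second neutralizes HTML meta-characters with the standard library's html.escape before doing so.

```python
import html
from urllib.parse import parse_qs


def render_unsafe(query_string: str) -> str:
    """Echo the 'name' field into the page exactly as received (vulnerable)."""
    name = parse_qs(query_string).get("name", [""])[0]
    return "<p>Hello, " + name + "</p>"


def render_escaped(query_string: str) -> str:
    """Echo the same field, but with HTML meta-characters neutralized."""
    name = parse_qs(query_string).get("name", [""])[0]
    return "<p>Hello, " + html.escape(name) + "</p>"


# Hypothetical form submission whose "name" is markup, not a name.
submitted = "name=%3Cscript%3Ealert(1)%3C%2Fscript%3E"
print(render_unsafe(submitted))   # the script tag reaches the browser intact
print(render_escaped(submitted))  # the same text arrives as inert characters
```

Escaping on output is only one layer of defense; it complements, and does not replace, checking that input contains only the characters the form expects.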

Examples of Malicious Cross-site Scripting Attacks

Let's look at some malicious and sobering examples. In October 2001, a security expert reported two cross-site scripting flaws in a popular log analysis tool. The purpose of the tool is to report information from your log files in a nicely organized and easier to understand format. To use it, you feed the tool your log files, and it uses them to generate pretty HTML reports of your Web server's activity. To make the log files more understandable, the log analysis tool also finds IP addresses in the log entries and converts them into their domain names. That way, instead of reading your Web server logs and asking yourself, "Who is 63.251.168.47?" you can read the report and say, "Wow, we're getting a strange amount of traffic from Akamai.com."

The log analysis tool translates IP addresses into domain names by doing a Domain Name Service (DNS) reverse lookup. To understand this attack, you must distinguish between forward DNS lookups, and reverse DNS lookups. A forward lookup works like the telephone book: you have the name, but you need the number. A reverse lookup works like the phone listings that telemarketing firms use: you have the number, but you need the name. The flaw that makes this log analysis tool susceptible to cross-site scripting attacks is that when it requests reverse lookups from a DNS server, the tool assumes that whatever comes back in response is legitimate. Bad assumption.
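To make the reverse lookup concrete, the following sketch shows the DNS name that is queried when a tool asks "who is this IP address?" It relies on the standard DNS convention of reversing the address under in-addr.arpa, a detail that comes from DNS itself and not from this column. The function name is hypothetical, and no network traffic is generated: the code only builds the name that a resolver would look up.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Return the DNS name queried (for a PTR record) in a reverse lookup."""
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_lookup_name("1.1.1.1"))
print(reverse_lookup_name("63.251.168.47"))
```

This prints 1.1.1.1.in-addr.arpa and 47.168.251.63.in-addr.arpa. Whatever text the DNS server responsible for that name returns is what the log analysis tool receives, which is why trusting the answer blindly is dangerous.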

Here's how an attacker exploits this assumption. First, he designates one of his PCs as the authoritative DNS server for some domain he owns, let's say, foo.com. Let's imagine its IP address is 1.1.1.1. He then sets up his DNS server so that when reverse-lookup DNS requests come in ("Who is 1.1.1.1?"), instead of returning the answer, "foo.com," the server sends back a string of HTML code containing an evil script. (Note that to avoid being traced, most attackers would compromise someone else's DNS server, but I'm intentionally keeping this example simple.)

Next, the attacker finds a site that uses the flawed log analysis tool (a simple discovery to make through social engineering, or by lurking on mailing lists where such topics might be discussed). Once he has identified a potential victim, he sends packets to the victim's Web server from 1.1.1.1. It doesn't matter whether the victim site accepts or denies these packets; the point is to get the poisoned IP address into a log entry on the victim's server.

The bomb is planted. The attacker now waits for the Web administrator to run the log analysis tool on the log file containing his IP address. The log analysis tool attempts to resolve the IP address via the attacker's booby-trapped DNS server. Since the log analyzer doesn't check the value returned by the local DNS resolver library -- after all, it expects to receive text, and sure enough, it receives text -- the HTML tags embedded in the booby-trapped DNS name are inserted without modification into the generated HTML reports. How malicious this attack might be, and what malicious scripts the attacker includes, is limited only by the amount and kind of HTML the attacker chooses to embed in the DNS name. A successful attack means the attacker's HTML code has the same permissions as the user running the log analysis tool, so it's not a stretch to imagine the attacker attaining the permissions of the Web administrator. His script could do anything from deleting the entire Web site to modifying Web pages, creating false log entries, or even deleting selected log files -- a great way for the attacker to hide the presence of a Trojan horse he might have loaded onto the victim system.
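The heart of the flaw can be sketched as follows. This is not the code of the log analysis tool in question, which the column does not name; build_report and fake_resolver are hypothetical, and the resolver is a stand-in that performs no real DNS lookup. The sketch shows how a name returned by a reverse lookup ends up in an HTML report, first unmodified and then escaped.

```python
import html
from typing import Callable, Iterable


def build_report(ips: Iterable[str], resolve: Callable[[str], str],
                 escape: bool) -> str:
    """Build an HTML list of the host names behind the logged IP addresses."""
    rows = []
    for ip in ips:
        name = resolve(ip)  # attacker-controlled if the attacker runs the DNS
        if escape:
            name = html.escape(name)
        rows.append("<li>" + name + "</li>")
    return "<ul>" + "".join(rows) + "</ul>"


def fake_resolver(ip: str) -> str:
    """Hypothetical stand-in for a reverse lookup; performs no network I/O."""
    answers = {"1.1.1.1": "<script>alert('evil')</script>.foo.com"}
    return answers.get(ip, ip)


logged = ["1.1.1.1", "63.251.168.47"]
print(build_report(logged, fake_resolver, escape=False))  # script tag intact
print(build_report(logged, fake_resolver, escape=True))   # script tag inert
```

With escape=False the report contains a live script tag that runs in the browser of whoever opens it. With escape=True the same text is displayed as harmless characters.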

Any browser used to access the compromised HTML reports may inadvertently execute the HTML tags. The tags may simply display an embarrassing image on the browser, or they may attempt to glean information from the user or his computer (remember, the user in many cases trusts this server). This particular attack can wreak havoc on the server itself. Recall that the HTML reports are written to disk, so the embedded HTML tags are now stored on the Web server's file system. Storing the script does not represent a danger, but subsequent processing of the script by anyone viewing the reports might. For example, the scripts might contain poisoned data, or perhaps false logging information that might interfere with an audit.

Although DNS resolver libraries in many operating systems now filter host names containing HTML meta-characters, there are probably plenty of hosts out there running old libraries. Chances are that the same site that makes no effort to evaluate scripts for vulnerabilities also fails to update libraries.
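An application need not depend on the resolver library to do this filtering. The sketch below is one possible check, assuming the conventional host name syntax of letters, digits, and hyphens in dot-separated labels. The function names are hypothetical, and the pattern is an illustration, not the filter used by any particular resolver library.

```python
import re

# Letters, digits, and hyphens in dot-separated labels; nothing else.
# This is the conventional host name syntax, assumed here for illustration.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_HOSTNAME = re.compile(_LABEL + r"(?:\." + _LABEL + r")*\.?")


def is_plausible_hostname(name: str) -> bool:
    """Return True only if the name uses ordinary host name characters."""
    return len(name) <= 253 and _HOSTNAME.fullmatch(name) is not None


def display_name(ip: str, resolved: str) -> str:
    """Fall back to the bare IP address when the resolved name looks wrong."""
    return resolved if is_plausible_hostname(resolved) else ip


print(display_name("63.251.168.47", "akamai.com"))
print(display_name("1.1.1.1", "<script>alert(1)</script>.foo.com"))
```

A name containing angle brackets, quotes, or spaces fails the check, so the report shows the raw IP address instead of attacker-supplied markup.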

Defeating the Hydra: Remedies and Emerging Best Practices

Hercules defeated the Hydra by cauterizing the wounds he inflicted each time he beheaded the beast. For the moment, severing and cauterizing every cross-site scripting vulnerability the Web Hydra generates consists of adopting the following best practices.

Advise your users to:

  • Be suspicious when visiting sites that require Java or JavaScript. Not all of them are benign.
  • Examine the display line at the bottom of your browser window before you click on links. Be particularly suspicious of hyperlinks whose host is numerically encoded (in decimal or hexadecimal) instead of being spelled out as a name, for example http://325696539.
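A numerically encoded host such as the one in the example link above is simply an IP address written as a single integer. As a minimal sketch, Python's standard ipaddress module can convert it back into the familiar dotted form; the function name is hypothetical, and the hexadecimal form mentioned in the comment is an alternative spelling of the same number.

```python
import ipaddress


def decode_numeric_host(host: str) -> str:
    """Convert a host written as one integer into dotted-quad notation."""
    # int(..., 0) accepts decimal ("325696539") and hex ("0x1369bc1b") forms.
    return str(ipaddress.IPv4Address(int(host, 0)))


print(decode_numeric_host("325696539"))
```

This prints 19.105.188.27, an address the link's appearance gives no hint of.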

Advise your Web programmers to:

  • Verify the origin of a script before you modify or use it.
  • Refrain from implicitly trusting a script you download, beg, or borrow from the Web or a friend. My colleague, Rik Farrow, warns "even trustworthy [script source] sites have been hacked and the scripts/sources modified. Origin means nothing."
  • Review all scripts to see that they filter problematic input characters such as HTML tags, Unicode characters or, in general, any character that might cause your underlying operating system to perform an unexpected function (such as a wildcard or "pipe" character, as explained in Rik Farrow’s previous editorial on this topic). Only accept characters that are expected. Test all scripts thoroughly before you put them into production.

  • If possible, write scripts in a programming language, such as Perl, that lets you restrict operations when user-provided input is processed. With a feature like Perl's taint mode, you can prevent user-supplied input from being used to open a file or to execute a system call or command.

  • If you are offering dynamic pages, use server-side includes conservatively, or not at all. Server-side includes allow the execution of arbitrary external programs to produce a Web page: if compromised, "arbitrary" might be an unintended or malicious program.
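The advice to accept only expected characters can be sketched as an allowlist check. The policy below (letters, digits, spaces, apostrophes, periods, and hyphens, up to 64 characters) is a hypothetical rule for a "name" form field, chosen for illustration; a real form needs its own list of expected characters for each field.

```python
import re

# Hypothetical policy for a "name" form field: letters, digits, spaces,
# apostrophes, periods, and hyphens, 1 to 64 characters long.
_NAME_FIELD = re.compile(r"[A-Za-z0-9 .'-]{1,64}")


def accept_name(value: str) -> str:
    """Return the value unchanged if it is acceptable; otherwise raise."""
    if _NAME_FIELD.fullmatch(value) is None:
        raise ValueError("unexpected characters in name field")
    return value


print(accept_name("Anne O'Neil-Smith"))
```

Because the check lists what is allowed instead of what is forbidden, HTML tags, wildcards, and pipe characters are all rejected without being named individually.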

In the general area of Web server security:

  • Restrict or disable external program execution in your Web server configuration and restrict the Web server's ability to execute external programs at the OS level.
  • Run the most recent versions of Web servers, and apply any patches that correct server-specific cross-site scripting vulnerabilities. Patch MS02-018, for example, mitigates a Cross-site Scripting in Custom 404 Error Page Vulnerability in IIS. All versions later than Apache 1.3.11 mitigate early, server-specific cross-site scripting vulnerabilities.

My list summarizes a handful of the practices recommended in the additional resources I list at the end of this column, and is not exhaustive. Unfortunately, no list of remedies at this time is exhaustive. Cross-site scripting is an ugly, unpleasant reality Web administrators can't entirely avoid or mitigate when they must offer dynamic pages. But if you are careful with scripting at your site, you have a good chance of minimizing this threat.

Additional Resources

The Cross Site Scripting FAQ

Security Focus tips on developing secure Web sites

CERT® Advisory CA-2000-02 Malicious HTML Tags Embedded in Client Web Requests

Configure the Web server to minimize the functionality of programs, scripts, and plug-ins, a practice from the CERT Security Improvement Module

How To Remove Meta-characters From User-Supplied Data In CGI Scripts

Cross Site Scripting Info, Apache.org

Cross Site Scripting Info: Apache Specific

Cross Site Scripting Executive Overview, Microsoft Corporation

Hacking with JavaScript: Stealing Cookies 

Microsoft Internet Explorer Cross Site Scripting Vulnerability

OWASP's Guide to Building Secure Web Applications and Web Services

Copyright© 2002, WatchGuard Technologies, Inc. All rights reserved. WatchGuard, LiveSecurity, Firebox and ServerLock are trademarks or registered trademarks of WatchGuard Technologies, Inc. in the United States and other countries.


