Exploiting XSS in Ajax Web Applications
Following up on yesterday's post, "Pluck SiteLife software multiple XSS vulnerabilities," let's take a look at how to exploit XSS in JSON responses using Internet Explorer.
Quick introduction to JSON
JSON is a format for encoding data, used by many web applications that want to serve dynamic or updating content within a single web page. It's formatted like so:
{"parameter":"value","next_parameter":"next_value"}
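As a minimal sketch (the variable names are illustrative, not taken from any particular application), this is how a script might turn that JSON text into an object and back again using the standard JSON.parse and JSON.stringify functions:

```javascript
// The JSON text from the example above, as it would arrive over HTTP.
const jsonText = '{"parameter":"value","next_parameter":"next_value"}';

// JSON.parse turns the text into a regular JavaScript object.
const data = JSON.parse(jsonText);

// JSON.stringify serializes an object back into JSON text.
const roundTripped = JSON.stringify(data);

console.log(data.next_parameter);        // next_value
console.log(roundTripped === jsonText);  // true
```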
Using a technique called Ajax, JSON data is normally transferred behind the scenes as a web page is loading. Some people may not realize that because Ajax uses the standard HTTP protocol, it's possible to access JSON data directly by navigating the web browser to a specific URL. An example of this is the Twitter API, which allows me to construct a URL that provides a JSON-encoded version of my Twitter profile and my last tweet. The JSON code in the response can be accessed directly or used with embedded scripts to display inline information.
<textarea id="nerds" style="width:700px; height:34px" disabled="true"></textarea>
<script>function callback(twitters){document.getElementById("nerds").value=twitters[0].text}</script>
<script src="https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&screen_name=superevr&count=1&callback=callback"></script>
Websites using JSON without proper output encoding are likely to be vulnerable to XSS
Like any other web page, JSON responses are likely to reflect back the values they are given. This becomes problematic when the response contains HTML syntax and characters. Web browsers are designed to render HTML, and as soon as they see it they want to render the code into an image, or a link, or a form field as quickly as possible. When testing for XSS, I inject sample code like the HTML strikeout tag <s> into one of the request parameters, and see if the browser displays text with a line through it. If it does, then that is a pretty good indication of a cross-site scripting vulnerability.
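To illustrate, here is a sketch of a hypothetical endpoint helper (buildSearchResponse and the "query" field are made up for this example) that reflects a request parameter into its JSON response. JSON.stringify escapes quotes and backslashes, but it leaves < and > untouched, so the injected tag survives intact:

```javascript
// Hypothetical helper: echoes the user's search term back in the JSON body.
function buildSearchResponse(query) {
  return JSON.stringify({ query: query, results: [] });
}

// A strikeout tag wrapped around some text, used as a harmless probe.
const reflected = buildSearchResponse("<s>xss test</s>");

console.log(reflected);                  // {"query":"<s>xss test</s>","results":[]}
console.log(reflected.includes("<s>"));  // true
```

If a browser can be convinced to treat that response as HTML, the tag is rendered rather than displayed as text.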
The catch
In a clever attempt to prevent browsers from incorrectly rendering JSON code, the web server presents these pages with a special Content-Type of application/json or application/x-javascript. This tells the browser that it shouldn't render any code here because it has a special use. Unfortunately, this isn't enough. 1
Content Sniffing for HTML in Internet Explorer
But web browsers really do love rendering code, and will render HTML regardless of the Content-Type if you give them a good enough excuse. This is called content sniffing, and it can be used by attackers in different scenarios to cause malicious JavaScript to run on a website that was thought to be immune to attack. Here are two facts on content sniffing that hackers already know about:
- Internet Explorer relies heavily on the file extension when content sniffing.
- File extensions can be spoofed by the requestor
This means that user/json will be displayed as plaintext, but user/json.htm can render as HTML! Depending on the web server, there are several ways to spoof the file extension. A few examples:
- /json.htm
- /json.html
- /json/.html (PHP and ASP.NET applications)
- /json;.html (JSP applications) (see three semicolon vulnerabilities)
- /json.cgi?a.html (discovered by Hasegawayosuke)
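When testing, it can help to generate these variants automatically. The following is a rough sketch, not a complete tool: spoofedExtensionVariants is a hypothetical helper, and the suffix list simply mirrors the examples above. Which variants a given server accepts has to be checked by hand.

```javascript
// Builds candidate URLs that add a fake HTML extension to a JSON endpoint.
// Uses the standard URL class available in browsers and Node.js.
function spoofedExtensionVariants(endpointUrl) {
  const url = new URL(endpointUrl);

  // Path-based tricks: append each suffix to the end of the path.
  const suffixes = [".htm", ".html", "/.html", ";.html"];
  const variants = suffixes.map((suffix) => {
    const candidate = new URL(url.href);
    candidate.pathname = url.pathname + suffix;
    return candidate.href;
  });

  // Query-string trick: add a parameter that ends in ".html".
  const withQuery = new URL(url.href);
  withQuery.search = withQuery.search ? withQuery.search + "&a.html" : "?a.html";
  variants.push(withQuery.href);

  return variants;
}

for (const candidate of spoofedExtensionVariants("http://example.com/user/json")) {
  console.log(candidate);
}
```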
Troubleshooting
Content sniffing is not always that easy. Here are some factors that may interfere with basic tests for content sniffing 2:
- Unable to add arbitrary file extensions in the URL path
- Site is using HTTPS
- Site has headers for cache-control: no-cache or pragma: no-cache
- Site has header content-disposition: attachment
- Site Content-Type header is set to image/[anything]
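Most of this checklist can be read straight off the response. As a sketch, the hypothetical helper below (sniffingObstacles, taking a URL and a plain object of response headers, both assumptions of this example) reports which of the header-related factors apply. Whether the server accepts arbitrary file extensions still has to be tested separately.

```javascript
// Returns the checklist items above that apply to a given response.
// `headers` is assumed to be a plain object mapping header names to values.
function sniffingObstacles(url, headers) {
  // Normalize header names and values so comparisons are case-insensitive.
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = String(value).toLowerCase();
  }

  const obstacles = [];
  if (url.toLowerCase().startsWith("https://")) {
    obstacles.push("served over HTTPS");
  }
  if ((h["cache-control"] || "").includes("no-cache") ||
      (h["pragma"] || "").includes("no-cache")) {
    obstacles.push("Cache-Control or Pragma set to no-cache");
  }
  if ((h["content-disposition"] || "").includes("attachment")) {
    obstacles.push("Content-Disposition set to attachment");
  }
  if ((h["content-type"] || "").startsWith("image/")) {
    obstacles.push("Content-Type is image/*");
  }
  return obstacles;
}

console.log(sniffingObstacles("https://example.com/user/json.htm", {
  "Cache-Control": "no-cache",
  "Content-Type": "application/json",
}));
```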
Remediation
To protect against this type of vulnerability, several changes must be made. As always, programs should first validate that user input contains only appropriate text characters. Also, any time user input is reflected back to a web browser, that text should be encoded properly (e.g. replace < with an escaped form such as &lt;, or with the Unicode escape \u003C inside JSON strings; \x3c is only valid in JavaScript string literals, not in strict JSON). Finally, as an extra protection measure, have the web server include the additional header X-Content-Type-Options: nosniff to prevent content sniffing in Internet Explorer 8+ and other browsers.
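A minimal sketch of the encoding and header steps, assuming the JSON is produced with JSON.stringify (safeJsonStringify and responseHeaders are names invented for this example). Because <, > and & can only occur inside string values in JSON text, replacing them with \uXXXX escapes leaves the parsed data unchanged:

```javascript
// Serialize a value as JSON, then escape the characters HTML parsers care about.
// \uXXXX escapes are valid in JSON strings, so JSON.parse returns the original text.
function safeJsonStringify(value) {
  return JSON.stringify(value)
    .replace(/</g, "\\u003C")
    .replace(/>/g, "\\u003E")
    .replace(/&/g, "\\u0026");
}

// Headers to send with the response. nosniff asks the browser to trust the
// declared Content-Type instead of guessing from the URL or body.
const responseHeaders = {
  "Content-Type": "application/json; charset=utf-8",
  "X-Content-Type-Options": "nosniff",
};

const safeBody = safeJsonStringify({ query: "<s>xss test</s>" });
console.log(safeBody);                    // {"query":"\u003Cs\u003Exss test\u003C/s\u003E"}
console.log(JSON.parse(safeBody).query);  // <s>xss test</s>
```

Even if the response is sniffed as HTML, the body no longer contains a literal < for the browser to interpret as a tag, while JavaScript consumers still receive the original string.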
JSON is generally designed to be processed in the background by JavaScript, so I understand why developers forget or are unaware of what can happen when the JSON data is accessed directly. Hopefully this post can raise awareness of possible security issues.