SQL injection (SQLi)
Improper handling of input during SQL query generation
Select your ecosystem
What is SQL injection?
SQL injection (or SQLi) is one of the most widespread code vulnerabilities. To perform a SQL injection attack, an attacker inserts or "injects" malicious SQL code via the input data of the application. SQL injection allows the attacker to read, change, or delete sensitive data as well as execute administrative operations on the database.
About this lesson
In this lesson, you will learn how SQL injection works and how to protect your code against it. We will begin by using SQL injection to bypass the login screen of a vulnerable web application. We will then dive deeper into the code of that vulnerable application and explain why the SQL injection attack was effective. Finally, we will teach you how to fix and prevent SQL injection in your applications.
To see how easy it is to perform a SQL injection attack, let's play with a vulnerable web application–the login screen of a fictional company called startup.io. We will attempt to bypass their login screen by supplying malicious input to the password field.
Let's start with a valid email and password. The form has already been prepopulated with their correct email and password.
Try it out. After you press the login button, you should see a successful login message. If you type in anything else, you'll get a failed login attempt. This all makes sense so far.
Strange input, strange errors
It’s time to start hacking! Change the password field to evil' password
. You are probably expecting to see a standard login failure message again. But, surprisingly, the app displays:
Incorrect syntax near il’
In fact, you may recognise this error message. Remember the last time you tried to run a malformed SQL query? You likely saw a similar error message returned by the database client you were using at the time. But why are we seeing it as the error of a web application?
The error implies that the input password (evil' password
) was executed as part of a backend SQL query. The apostrophe character which we placed in evil’
made the syntax of that query invalid.
This is excellent news for us: an input string that we control is directly injected into some backend SQL code! Equipped with that knowledge, we can achieve something much more malicious than causing a harmless error.
Bypass login
Now that we know the app’s backend will execute anything we enter into the password field let’s fully exploit it. We’ll use a very particular password: idontknow' or 1=1; --
.
Try to log in with the password above and our standard email (user1@startup.io
).
Bingo! Our new password is obviously incorrect, but we see a successful login message. Congratulations, you’ve just hacked the application! You supplied a peculiar input string that you knew would be injected and run in a SQL query. By doing so, you managed to bypass the login validation of the app. You’ve just executed a SQL injection attack.
But why was it successful? Unlike the previous example, our payload contains a valid SQL (or 1=1; --
) after the apostrophe character. In a nutshell, this changed the meaning of the query which is used to decide if users can log in or not. Read on to discover how it actually worked under the hood! We will do a deep dive on both the backend code and the query in the next section. But first, let’s try one more hack.
Delete the database
Change the password to idontknow'; DROP TABLE credentials; --
and press the login button. You should see an error message:
Email user1@startup.io does not exist.
What is going on here? Our email field has not changed but the application now claims that the email does not exist.
Did you notice the DROP TABLE
credentials part of the password we just supplied? It gives a big hint to what just happened. You've managed to delete all the credential data from the application—the emails and passwords of all users!
What happened when we hacked the login page?
We used the quote character ('
) in the password field to inject an unintended string into the query in order to change the query's logic. Using other metacharacters, such as -
or ;
, is also possible.
Let's look at this example in more detail. First, consider the backend logic of startup.io’s login functionality. The following code sample shows a function which aims to verify the user's credentials.
How does string manipulation work?
The key here is the construction of the query through string concatenation. Let's try to inject the malicious password which we use to bypass the login into the query. After the string concatenation is done, the WHERE
clause of the query becomes:
WHERE email='user1@startup.io'AND password='idontknow' OR 1=1;
If we consider the following:
AND
has precedence overOR
(true for most SQL implementations)password='idontknow'
isFALSE
for the given email1=1
is alwaysTRUE
We can reduce the WHERE
clause as follows:
-> email='user1@startup.io' AND password='idontknow' OR 1=1 -> (TRUE AND FALSE) OR TRUE -> FALSE OR TRUE -> TRUE
By injecting malicious input into the query, we managed to change the query's logic! The query will evaluate as TRUE
even when the supplied password is invalid. Since the query result is used to decide if a given user should be allowed to log in, we’ve bypassed the application's authentication mechanism.
What is the impact of SQL injection?
Now that you know how SQL injection works, imagine the implications. By using SQL injection, an attacker can change the logic of the vulnerable query. They can read or modify any data stored in the database. If the vulnerable query is executed as a privileged user, the attacker can run any administrative operations, including deleting the entire database schema.
Why is SQL injection common?
SQL injection is widespread because it is easily detected and exploited. Any website or application which uses a database is subject to a SQL injection attack. In the past, almost all popular SQL clients and libraries were vulnerable to SQL injection attacks.
How do you mitigate SQL injection?
To protect your app against SQL injection, you need to prevent user-supplied input from affecting the logic of the executed query. One way to achieve that is through parameterized queries which almost all SQL libraries support. In a parameterized query, the input is "escaped" and handled literally.
For instance, if our example used a parameterized query, and you were to insert ' OR 1=1'
as a password input to that query, a database client would not interpret that input as SQL. Instead, it would look for a password string ' OR 1=1'
.
Parameterized queries
We can fix our vulnerable checkLogin
function by using a parameterized query. Take a look at the diff below, which compares the vulnerable code with the fixed code. Notice the use of ? to indicate the location of the query parameters. The parameters are passed as an input to the query
method. Internally, the query
method ensures that the input parameters are interpreted literally and not as separate SQL statements.
Most SQL Frameworks and libraries - whether for MySQL, Postgres, Oracle or your SQL flavor of choice - support parameterized queries. You should aim to use them by default, and avoid any methods in your framework that use string concatenation.
Test your knowledge!
Keep learning
To learn more about SQL injection, check out some other great content produced by Snyk:
- The SQL injection cheat sheet which lists other methods of preventing SQL injection in your code
- Our blog post on SQL injection in ORM for more information on a different prevention method (ORM) and why sometimes it is not enough to mitigate SQL injection
- Snyk OWASP Top 10 for an overview of other common vulnerabilities present in modern web applications