Improper input validation
What's the first rule of input validation? Don't trust user input.
JavaScript
What is improper input validation?
Untrusted input is any data that is provided by an external source. The most common source of untrusted input into a web application is user input, such as form data, query strings, and POST requests. However, any data that is not generated or managed by our application should also be treated as untrusted input. This includes cookies, external API responses, transferred files, and many more.
Untrusted input has the ability to interact with our application and modify its execution flow. This is a fundamental system requirement: a decision is made, and an action is performed based on the input to the application.
When an application receives untrusted input that it does not expect, such as data in the wrong format or a value outside an expected range, unintended consequences can occur if the input is not validated correctly. Exploitation can occur when the untrusted input has been crafted for malicious purposes. The vulnerability that allows this to happen is "improper input validation".
About this lesson
In this lesson, you will learn about vulnerabilities stemming from improper input validation and how to protect your applications against them. We will step into the shoes of a hacker who exploits BigCorp Hat Co's website for personal gain, all made possible by improper input validation.
We'll show you how to safeguard your applications from untrusted input.
Improper input validation occurs when untrusted input is not validated for syntactical and semantic correctness.
Syntactical validation ensures that the input data is in the correct format (or syntax) that the application expects. For example, a transaction ID should be in Globally Unique Identifier (GUID) format, or a dollar value should be a Number.
If syntactical input validation is not implemented, an attacker can send any form of data to the application to be processed, for example, an SQL injection payload where only a number should be expected.
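The two syntactical checks mentioned above can be sketched as follows. This is a minimal illustration, not a complete validator: the helper names `isGuid` and `isDollarValue` are hypothetical, and the pattern assumes the common 8-4-4-4-12 hexadecimal GUID layout.

```javascript
// Hypothetical helpers for illustration; the names do not come from any library.

// Assumes the common 8-4-4-4-12 hexadecimal GUID layout.
const GUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isGuid(value) {
  // Must be a string and must match the whole pattern from start to end.
  return typeof value === "string" && GUID_PATTERN.test(value);
}

function isDollarValue(value) {
  // Rejects strings, NaN, and Infinity; only finite numbers pass.
  return typeof value === "number" && Number.isFinite(value);
}

console.log(isGuid("3f2504e0-4f89-11d3-9a0c-0305e82c3301")); // true
console.log(isGuid("1 OR 1=1"));                             // false
console.log(isDollarValue(19.99));                           // true
console.log(isDollarValue("19.99; DROP TABLE orders"));      // false
```

Because both checks reject anything that is not in the expected shape, an injection payload never reaches the code that would otherwise process it.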
Semantic validation ensures that the input data is correct in its business context. For example, a start date is before the end date, or a dollar amount is a positive number greater than 0 but less than 1,000,000.
If semantic input validation is not implemented, an attacker can send data to the application that does not adhere to its expected business context. For example, a birth date in the future or a negative amount on an account withdrawal.
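The semantic rules used as examples above could be expressed roughly like this. The function names are hypothetical, and the sketch assumes the inputs have already passed syntactical validation (the dates are `Date` objects and the amount is a number).

```javascript
// Hypothetical helpers; the limits mirror the examples in the text.

function isValidDateRange(startDate, endDate) {
  // The start date must fall strictly before the end date.
  return startDate.getTime() < endDate.getTime();
}

function isValidDollarAmount(amount) {
  // Greater than 0 but less than 1,000,000.
  return amount > 0 && amount < 1000000;
}

function isValidBirthDate(birthDate, now = new Date()) {
  // A birth date cannot lie in the future.
  return birthDate.getTime() <= now.getTime();
}

console.log(isValidDateRange(new Date("2024-01-01"), new Date("2024-02-01"))); // true
console.log(isValidDollarAmount(-50));                                         // false
console.log(isValidBirthDate(new Date("2999-01-01")));                         // false
```

Each value here is well-formed, so a syntactical check alone would accept it; only the business rule catches the problem.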
In addition to syntactical and semantic input validation, we need to pay close attention to where within our application the input validation is occurring. Input validation performed on the client side is trivially bypassable and should NEVER be used as a security control. Client-side input validation is ONLY for usability (helping the user complete a form).
Instead, input validation MUST be implemented on the server side before any application function acts on the data.
Posting to the BigCorp Hat Co /basket endpoint may look something like this:
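As a hypothetical sketch of the flawed logic (this is not BigCorp Hat Co's actual code; the function name, the price list, and the shape of the request body are all assumptions), the handler behind such an endpoint might behave like this:

```javascript
// Hypothetical sketch of the flawed logic; names and prices are assumptions.
const HAT_PRICES = new Map([
  ["top-hat", 50],
  ["beanie", 15],
]);

function addToBasket(basket, body) {
  const { productId, quantity } = body;

  // Syntactical validation only: the product must exist and the quantity must be an integer.
  if (!HAT_PRICES.has(productId) || !Number.isInteger(quantity)) {
    return { status: 400, error: "Invalid request" };
  }

  // Missing semantic validation: nothing stops the quantity from being negative.
  basket.items.push({ productId, quantity });
  basket.total += HAT_PRICES.get(productId) * quantity;
  return { status: 200, basket };
}

const penelopeBasket = { items: [], total: 0 };
addToBasket(penelopeBasket, { productId: "beanie", quantity: 1 });
addToBasket(penelopeBasket, { productId: "top-hat", quantity: -2 });
console.log(penelopeBasket.total); // -85
```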
While the quantity input is being validated syntactically, it is not being validated semantically. A negative quantity should not be allowed. This improper input validation allowed Penelope to set a negative value for the quantity of the items in her shopping basket, giving her a negative total value resulting in what is effectively store credit.
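One way to close this gap is to add a semantic check next to the existing syntactical one. The sketch below is illustrative only: `validateQuantity` is a hypothetical helper, and the upper limit of 100 is an assumed business rule rather than something taken from the scenario.

```javascript
// Hypothetical sketch; MAX_BASKET_QUANTITY is an assumed business limit.
const MAX_BASKET_QUANTITY = 100;

function validateQuantity(quantity) {
  // Syntactical check: the quantity must be an integer.
  if (!Number.isInteger(quantity)) {
    return { ok: false, reason: "Quantity must be a whole number" };
  }
  // Semantic check: the quantity must make sense for a shopping basket.
  if (quantity < 1 || quantity > MAX_BASKET_QUANTITY) {
    return { ok: false, reason: `Quantity must be between 1 and ${MAX_BASKET_QUANTITY}` };
  }
  return { ok: true };
}

console.log(validateQuantity(2).ok);  // true
console.log(validateQuantity(-2).ok); // false
```

With this check running on the server before the basket total is updated, a negative quantity is rejected instead of being turned into store credit.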
What is the impact of improper input validation?
Improper input validation enables an attacker to affect the behavior of an application, resulting in unintended execution flow, data manipulation, or even malicious code execution.
It can be challenging to quantify the impact of improper input validation as it is the initial attack vector for many other vulnerability classes. Improper input validation can lead to SQL injection, OS command injection, cross-site scripting (XSS), denial of service (DoS), buffer overflow, remote code execution (RCE), and many other categories of exploitation.
Effective input validation controls reduce an application's attack surface by placing restrictions (syntactical and semantic validation) on the application's untrusted input.
For basic data types, such as JavaScript Numbers, syntactical validation can be performed by simply checking the variable's type. The Number.isInteger() method determines whether the passed value is an integer.
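A few calls show how `Number.isInteger()` behaves. Note that form fields and query strings arrive as strings, so they need converting first; `parseIntegerField` below is a hypothetical helper sketching one strict way to do that.

```javascript
console.log(Number.isInteger(5));        // true
console.log(Number.isInteger(5.5));      // false
console.log(Number.isInteger("5"));      // false (a string, not a number)
console.log(Number.isInteger(NaN));      // false
console.log(Number.isInteger(Infinity)); // false

// Hypothetical helper: converts a raw string field to an integer, or returns null.
function parseIntegerField(raw) {
  // Accept only an optional minus sign followed by digits, nothing else.
  if (typeof raw !== "string" || !/^-?\d+$/.test(raw)) {
    return null;
  }
  const parsed = Number(raw);
  // Reject values too large to be represented exactly.
  return Number.isSafeInteger(parsed) ? parsed : null;
}

console.log(parseIntegerField("42"));    // 42
console.log(parseIntegerField("42abc")); // null
console.log(parseIntegerField("4.2"));   // null
```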
For simple validation, basic functions like the one below can be written. This one checks if a number is within a range:
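One way such a function could look, as a minimal sketch (the name `isNumberInRange` and the choice of inclusive bounds are assumptions):

```javascript
// Hypothetical helper: returns true only for finite numbers between min and max, inclusive.
function isNumberInRange(value, min, max) {
  return (
    typeof value === "number" &&
    Number.isFinite(value) &&
    value >= min &&
    value <= max
  );
}

console.log(isNumberInRange(5, 1, 10));   // true
console.log(isNumberInRange(-3, 1, 10));  // false
console.log(isNumberInRange("5", 1, 10)); // false (a string, not a number)
```

The explicit type check matters: without it, JavaScript would coerce the string "5" to a number during the comparison and the check would pass.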
However, validating more complex data types can become complicated quickly. Implementing your own input validation is a tedious and error-prone process. You only need to see some of the complex Regular Expressions (RegEx) available to validate an email address to realize this is not something you should be manually implementing.
Luckily, many input validation libraries and packages are available that take the hard work out of it. A widely used JavaScript library is validator.js, which provides many functions for validating a wide variety of untrusted inputs.
For example:
- isEmail
- isAlphanumeric
- isAscii
- isCreditCard
- isCurrency
- isDate
- isFQDN
- isJWT
- isLength
- and many more
Many validation libraries (including validator.js) allow you to define your own regular expression or logic for custom validation to enable you to validate your own custom data types and application or business-specific semantic validation.
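Independent of any particular library, a custom validator usually pairs a regular expression (syntactical check) with a business rule (semantic check). The sketch below uses plain JavaScript; the SKU format, the list of active products, and the name `validateSku` are all hypothetical.

```javascript
// Hypothetical format: "HAT-" followed by exactly four digits.
const SKU_PATTERN = /^HAT-\d{4}$/;

// Assumed business rule: only these products are currently on sale.
const ACTIVE_SKUS = new Set(["HAT-0001", "HAT-0042"]);

function validateSku(input) {
  // Syntactical check: the value must be a string in the expected format.
  if (typeof input !== "string" || !SKU_PATTERN.test(input)) {
    return { valid: false, error: "SKU has an invalid format" };
  }
  // Semantic check: the SKU must refer to a product that can be bought.
  if (!ACTIVE_SKUS.has(input)) {
    return { valid: false, error: "SKU is not available" };
  }
  return { valid: true };
}

console.log(validateSku("HAT-0042").valid);        // true
console.log(validateSku("HAT-9999").error);        // SKU is not available
console.log(validateSku("HAT-0042' OR '1").error); // SKU has an invalid format
```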
Keep learning
There's more to learn! Check out these links:
- The always useful OWASP input validation cheat sheet
- Take a direct look at CWE-20: Improper input validation