Common mistakes when using input validation and how to avoid them

Input validation is a widely adopted technique in software development. It ensures that only properly formed user input is processed by the system and prevents malformed data from compromising it. A robust input validation method can significantly reduce exposure to common web attacks such as injection and XSS, though it should not be the primary defense against these vulnerabilities.

However, implementing robust validation is challenging, and there are many aspects to consider, for example: 1) which validation method to use (blacklist, whitelist or regex based), 2) when validation should be performed, 3) whether the validation is efficient, and 4) how to ensure validation is executed in every component of a complicated architecture.

Without careful consideration of all these areas, your input validation might be flawed and turn out to be useless against malicious user input.

Common mistakes when implementing input validation

Here are some common mistakes I have observed when performing penetration tests and code reviews.

  • Confuse Server Side validation with Client Side Validation
  • Perform Input Validation before proper decoding
  • Poor validation Regex leads to ReDOS
  • Input validation implemented without the context of the entire system
  • Reinvent the wheel by creating your own input validation method
  • Blacklist input validation is not comprehensive

Confuse Server Side Validation with Client Side validation

Client side validation exists for user experience and usability, and is typically performed by the browser while executing JavaScript code, whereas server side validation is a security control that ensures only proper data is supplied to the server or service. In other words, client side validation does not add any security to your application.

Nowadays, many web frameworks, for example AngularJS and React, offer client side input validation to improve user experience and make developers' lives easier. For example, the following input field will validate whether the user input is a valid email address.

<html><script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.6.9/angular.min.js"></script>  <body ng-app="">
<p>Try writing an E-mail address in the input field:</p>
<form name="myForm"><input type="email" name="myInput" ng-model="myInput"></form>
</body></html>

This built-in client side validation gives developers the false impression that input validation has already been done by the framework. As a consequence, server side validation is not implemented, and any attacker can bypass the client side validation, for example by sending requests directly to the server, and launch an attack.

Solutions

Educate your developers and test engineers to understand the difference between client side validation and server side validation so that the correct validation method is implemented.
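
A minimal sketch of a server side counterpart to the form above, written as plain Node.js without any framework. The function name validateSignupInput, the use of myInput as a request body key, and the 254 character limit are assumptions made for illustration; the pattern is deliberately simple and is not a full email address grammar.

```javascript
// Illustrative server side check; it runs regardless of what the browser validated.
const MAX_EMAIL_LENGTH = 254; // assumed upper bound for this sketch
const SIMPLE_EMAIL = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Returns a list of error messages; an empty list means the input was accepted.
function validateSignupInput(body) {
  const errors = [];
  const email = body && body.myInput;
  if (typeof email !== 'string') {
    errors.push('email is required');
  } else if (email.length > MAX_EMAIL_LENGTH) {
    errors.push('email is too long');
  } else if (!SIMPLE_EMAIL.test(email)) {
    errors.push('email is not valid');
  }
  return errors;
}
```

The point of the sketch is only that the same rule is enforced again on the server, where the attacker cannot switch it off.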

Perform validation before decoding the input data

You may have been advised that input validation should happen as soon as the data is received by the server in order to minimize the risk. That is true: input validation should be executed before the user supplied data is consumed by the server.

While working on bug bounty programs, I have found that input validation is very commonly executed at the wrong time. Sometimes, the validation is performed before the input has been converted to the format in which the system will consume it.

For example, in one test case, an application was vulnerable to XSS through a parameter: https://evils.com/login?para=vuln_code. Input validation was performed to check whether the parameter contained malicious code, so input like javascript:alert(1) or java%09script:alert(1) was blocked. However, when an attacker changed the payload into hex format, the validation was not able to detect the malicious code.

\x6A\x61\x76\x61\x73\x63\x72\x69\x70\x74:\x64\x6F\x63\x75\x6D\x65\x6E\x74\x2E\x74\x69\x74\x6C\x65\x3D\x61\x6C\x65\x72\x74\x28\x31\x29

Solutions

When input validation is executed, you need to ensure you are validating the user input in the same format in which the system or service will consume it. Sometimes it is necessary to convert and decode the user input before applying input validation functions.
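
The ordering can be sketched as follows: decode first, then validate the decoded value. The helper names canonicalize and isRejected, the limit of five decoding rounds, and the choice of percent-encoding and \xHH escapes as the encodings to undo are all assumptions for illustration. The check itself is the same deny-list idea as in the example above and is kept only to show the order of operations.

```javascript
// Decode repeatedly until the value stops changing, so nested encodings are undone.
// Returns null if the input is still changing after maxRounds (treated as suspicious).
function canonicalize(input, maxRounds = 5) {
  let current = input;
  for (let round = 0; round < maxRounds; round++) {
    let next = current;
    try {
      next = decodeURIComponent(next);
    } catch {
      // Malformed percent-encoding: keep the value as it is.
    }
    // Turn \xHH escapes into the characters they stand for.
    next = next.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) =>
      String.fromCharCode(parseInt(hex, 16)));
    if (next === current) return current;
    current = next;
  }
  return null;
}

// Validate the decoded form, not the raw string that arrived on the wire.
function isRejected(input) {
  const decoded = canonicalize(input);
  if (decoded === null) return true;
  const normalized = decoded.replace(/[\s\u0000-\u001f]/g, '').toLowerCase();
  return normalized.includes('javascript:');
}
```

Which encodings need to be undone depends on what the consuming component actually decodes, so the list here should mirror the real system rather than this sketch.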

Improper Regex Pattern for validation leads to ReDOS

Many input validation routines leverage regular expressions to define an allowlist. This is a great way to create an allowlist without adding too much restriction on the user input data. However, developing a robust and functional regex is complicated. If not handled properly, it can do more harm than good to your application.

Take the following regex as an example. It is used by the server to check whether an HTML page contains a script block of type application/json before scraping it.

var regex = /<script type="application\/json">((.|\s)*?)<\/script>/;

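This regex is open to a ReDOS attack because it contains a so-called “evil regex” pattern, ((.|\s)*?). A whitespace character can be matched by either . or \s, so when the closing </script> tag is missing the engine backtracks through every combination of the two alternatives.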

Here is a POC that demonstrates how the time to evaluate the regex grows as the test string gets longer.

var regex = /<script type="application\/json">((.|\s)*?)<\/script>/;
for (var i = 1; i <= 500; i++) {
  var time = Date.now();
  // Opening tag, i spaces and no closing tag, so the match has to fail.
  var payload = "<script type=\"application/json\">" + " ".repeat(i) + "test";
  payload.match(regex);
  var time_cost = Date.now() - time;
  console.log(payload);
  // The time grows very quickly with i; stop the script once the trend is clear.
  console.log("Match time : " + payload.length + ": " + time_cost + " ms");
}

A detailed example can be found in another blog post, “ReDOS, it could be the cause of your next security incident”, which gives a better explanation of how ReDOS occurs and how it could damage your applications.

Solutions

Creating a very robust regex is hard, but here are some common methods you might follow:

  1. Set a length limit on the input if possible.
  2. Set a time limit for the regex matching. If the matching takes longer than expected, kill the process.
  3. Optimize your regex with atomic grouping, where your regex engine supports it, to prevent excessive backtracking.
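
A minimal sketch of the first point, combined with a rewritten pattern. JavaScript regular expressions do not offer atomic groups, so this sketch swaps in a different technique: the overlapping alternation (.|\s) is replaced with the single character class [\s\S], which leaves the engine only one way to match each character. The function name extractJsonScript and the 100000 character cap are assumptions for illustration.

```javascript
const MAX_HTML_LENGTH = 100000; // assumed cap; choose one that fits your pages

// Same intent as the original pattern, but without two alternatives
// that can both match the same whitespace character.
const JSON_SCRIPT = /<script type="application\/json">([\s\S]*?)<\/script>/;

// Returns the text inside the first application/json script block, or null.
function extractJsonScript(html) {
  if (typeof html !== 'string' || html.length > MAX_HTML_LENGTH) {
    return null; // refuse oversized or non-string input before matching
  }
  const match = JSON_SCRIPT.exec(html);
  return match ? match[1] : null;
}
```

With this pattern the payload from the POC above fails to match without the rapid growth in matching time, and the length cap bounds the remaining work.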

Input validation without clear context of the entire system

As more and more businesses adopt microservices, the architecture can bring challenges for input validation. When data flows between multiple microservices, the input validation implemented for microservice A might not be sufficient for microservice B, or input validation might not be implemented for all the microservices due to the lack of centralized input validation functions.

To illustrate this common mistake better, I would like to use the following typical AWS microservice diagram as an example.

Here are two scenarios where input validation could go wrong.

Scenario 1: Input validation is not implemented for all microservices

In some scenarios, there might be multiple services behind the API Gateway that consume the user input data. Some services respond to the user input directly, for example Service B in the above diagram, whereas other microservices are designed to handle background jobs, for example Service A and Service C.

Since Service A and Service C handle background jobs and do not respond to the user input directly, the developers might neglect to implement input validation for these two services if centralized input validation is not enforced for this microservice architecture. As a consequence, the lack of input validation in Service A and Service C could lead to exploitation.

Scenario 2: Input validation for one service is insufficient for its downstream services

In this scenario, input validation is implemented for microservice B, and it is sufficient for microservice B to block malicious user input. However, that validation might not be sufficient for its downstream Service D.

A good example can be found in my previous blog post, “Steal restricted sensitive data with template language”. Microservice B validates whether the user input is a valid template, and this validation is robust for that service. However, when Service D compiles the user supplied template validated by microservice B with some data to produce the final output, the process could lead to data leakage because Service D does not validate the compiled template.

Solutions

Before implementing user input validation, the developers and security engineers should obtain a comprehensive understanding of the entire system and ensure input validation is applied in all the components and microservices.
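
One way to approach this is a shared validation module that every service calls at its own boundary, including the background workers. The sketch below is hypothetical: the field names userId and quantity, their rules, and the function names are invented for illustration and do not come from the diagram.

```javascript
// Hypothetical shared module that every service imports.
const FIELD_RULES = {
  userId: (v) => typeof v === 'string' && /^[a-z0-9-]{1,36}$/.test(v),
  quantity: (v) => Number.isInteger(v) && v >= 1 && v <= 1000,
};

// Returns the names of fields that are missing, malformed or unexpected.
function findInvalidFields(message) {
  if (message === null || typeof message !== 'object') return ['<message>'];
  const invalid = [];
  for (const [field, isValid] of Object.entries(FIELD_RULES)) {
    if (!isValid(message[field])) invalid.push(field);
  }
  for (const field of Object.keys(message)) {
    if (!Object.prototype.hasOwnProperty.call(FIELD_RULES, field)) invalid.push(field);
  }
  return invalid;
}

// A background worker (like Service A or C) validates as well,
// even though it never responds to the user directly.
function handleBackgroundJob(message) {
  const invalid = findInvalidFields(message);
  if (invalid.length > 0) {
    throw new Error('rejected message, invalid fields: ' + invalid.join(', '));
  }
  return 'processing ' + message.quantity + ' item(s) for ' + message.userId;
}
```

A shared module covers Scenario 1. For Scenario 2, a downstream service still needs its own rules for whatever it does with the data, since a check that is sufficient upstream may not be sufficient there.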

“Reinvent the wheel” by creating your own input validation methods

Another common mistake that I have observed during code reviews is that many engineers create their own input validation methods even though there are very mature input validation libraries used by other organizations.
For example, if you need to validate whether the input is an email address or a valid credit card number, you have many mature input validation libraries to choose from. Creating your own validation method is time consuming, and it could be defective without robust tests.

Solution

To avoid reinventing the wheel, figure out the purpose of your input validation and search for existing validation that has already been implemented. If there are popular libraries you could use, use them instead of creating new ones.
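
The same principle applies to parsers that already ship with the platform. As a small illustration, and switching from the email and credit card cases to URLs so that the sketch needs no third party package, the following relies on the built-in URL class instead of a hand-written URL regex. The function name parseHttpsUrl and the HTTPS-only rule are assumptions for this example.

```javascript
// Let the platform's URL parser do the parsing, then apply a simple rule to the result.
function parseHttpsUrl(input) {
  let url;
  try {
    url = new URL(input); // throws a TypeError on input that is not an absolute URL
  } catch {
    return null;
  }
  return url.protocol === 'https:' ? url : null;
}
```

For email addresses or credit card numbers, the equivalent step would be to call a well maintained validation library rather than writing the pattern yourself.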

Blacklist is not comprehensive

A frequently cited quote is “You cannot control what you cannot measure”. This quote explains the pain of using the blacklist method for user input validation.

The blacklist approach to input validation defines which kinds of user input should be blocked. This means developers and security engineers need to understand which inputs are considered “bad” and should be blocked by the blacklist. The effectiveness of the blacklist method largely depends on the knowledge of the developers and their expectation of bad user inputs.

However, security incidents or breaches are most likely to occur when malicious users inject something unexpected.

Solution

In many cases, blacklisting and whitelisting are implemented together to meet the requirement. If possible, try to employ both methods to combat malicious user inputs.
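
A minimal sketch of the two approaches working together, using a username field as a hypothetical example. The pattern, the reserved names and the function name are assumptions for illustration.

```javascript
// Whitelist: only these characters and lengths are accepted at all.
const USERNAME_PATTERN = /^[A-Za-z0-9_]{3,20}$/;

// Blacklist: values that pass the whitelist but are still unwanted.
const RESERVED_NAMES = new Set(['admin', 'root', 'support']);

function isAcceptableUsername(name) {
  return typeof name === 'string'
    && USERNAME_PATTERN.test(name)
    && !RESERVED_NAMES.has(name.toLowerCase());
}
```

The whitelist does most of the work here, because anything unexpected fails it by default. The blacklist only removes a few known-bad values from what remains.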

Conclusion

The importance of input validation in helping your organization combat malicious attacks cannot be overemphasized. Without a robust input validation method in your service or system, you are likely to open the door to potential security incidents.

It is easy to start implementing input validation in your service, but you really need to pay attention to these common mistakes found in many validation methods. Try to understand your system or service, choose the validation methods suitable for your organization and, once decided, test your method thoroughly.

Security checklists when implementing your API  Keys

When building modern API endpoints for your customers, how to keep API keys secure is likely to be the most crucial question to ask in the initial phase of designing your APIs. There is no silver bullet for this question, as you need to consider the nature, usage and requirements of your API endpoints, but there are still some checklists you can refer to in order to avoid or reduce the potential security risks.

Taken from https://www.cyberark.com/resources/

Checklist 1: Identify the usage of your API Keys

Before you can implement your API keys in a secure way, it is vital to figure out how your API keys are going to be used by your clients. Is your API key just an identification string that lets your server identify and log the API activity of an app, or is it used for authentication purposes? Based on the usage of the API keys, different security concerns and the corresponding controls should be evaluated.

API keys are mostly used for app (mobile app or web application) identification and application authentication. In some scenarios they can also be used for user authentication (though, to be precise, in most of these scenarios they should be called access tokens rather than API keys).

API Key Application Identification

API keys are typically used to identify the application that is making a call to the API. In this scenario, it is very likely that the API key will be left in your application, where it is easy for any user to spot and extract.

Take the widely used Google Analytics API as an example: open some major websites that use the Google Analytics tool and you should be able to spot the Google API key in the source code very easily. Below is a screenshot of an application using a Google API key.

As API keys for application identification are used only to identify the app, these keys a) reside in the application and b) should not bear permissions to perform any sensitive operations. Given the nature of this kind of API key, we need to check how we can make the keys hard to extract from your application and ensure restrictions are implemented for them. Details are expanded under Checklist 3.

API Key for Application Authentication

API keys can be used for project authentication as well. When a request with this API key reaches the backend, the backend checks whether the calling application has been granted access to call the API and has enabled the API in this project.

As opposed to the API key for project identification, this kind of API key is not publicly accessible. Only a limited set of users under the project have access to the key and can use it to perform sensitive operations with the API. In a typical use case, this kind of API key can only be retrieved after a user passes the authentication check (for example, the API key could be generated in the dashboard after an authenticated user logs in).

Since these API keys are used for authentication and can perform sensitive operations, it is important to understand 1) how this kind of API key can be accessed and whether any protection is implemented, and 2) whether the correct permissions are granted to these API keys. Details and some real use cases are explained in the following section.

API key for user authentication

In some scenarios, the API key can also authenticate users, verifying that the person making the call is actually the person they claim to be. Unlike an API key for app authentication, each user is granted their own API key for more granular access control, rather than one identical API key for the entire app. We will not go into the security concerns for this kind of API key (authentication token) because it is a different story altogether.

For API keys used for app identification, we cannot really control WHO can access the key, because it has to be part of your app. We can only make it harder for unauthorized users to extract and access it.

Checklist 2: Check who could access the API keys for App authentication

API keys for app authentication, however, are not supposed to be publicly accessible, and we can control who can access them. Missing access control to restrict who can access the API keys is exactly the common security risk that I have observed when performing penetration testing.

For example, a project has a group of users with different roles, such as admin, coordinator and team user, and only the admin users are supposed to extract and access the API key for this project. However, in many cases, a user who should have no permission to access the API keys can still access them due to a lack of correct access control or misconfiguration. The following two use cases are real ones that I found and reported under two private bug bounty programs.

Real Use Case 1 (lack of access control): Under a project hosted at https://vulnerable-example1.com/dashboard, the API key for the project can be extracted from https://vulnerable-example1.com/dashboard/configuriation after an admin user logs in. When a normal user under this project logs in, the page https://vulnerable-example1.com/dashboard/configuriation is not rendered with the API key. That seems correct. However, the application only performs a front end check to disable the rendering of the API key when the logged in user is not an admin. The API key can still be extracted by making a backend request to https://example.com/configuration/api/getAPIkey with the session cookie of the normal user, provided the normal user knows the request URL.

Real Use Case 2 (misconfiguration): The API key is leaked to users with less privilege under a different subdomain due to misconfiguration. For example, a service provider has two subdomains, https://dashboard.vulnerable-example2.com/ and https://document.vulnerable-example2.com/. An admin user logs in to the dashboard at https://dashboard.vulnerable-example2.com/ and can extract the API key from https://dashboard.vulnerable-example2.com/dashboard/configuriation . To give a better user experience, the API key is also rendered at https://document.vulnerable-example2.com/howtouseAPI (which is a different subdomain) after the user logs in. Now a user without admin privilege logs in to the dashboard. He is not able to get the API key there, even if he bypasses the front end restriction. However, when the user navigates to the other subdomain, https://document.vulnerable-example2.com/howtouseAPI, the API key is rendered, because the backend for document.vulnerable-example2.com only checks which app the user belongs to and renders the API key on the page as long as the user belongs to that app.

Both issues were discovered in two private bounty programs and fixed after I reported them.
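
Both cases come down to the same missing piece: a server side authorization check that every endpoint returning the key goes through. A minimal sketch follows, assuming a hypothetical user object with projectId and role fields and a project object with id and apiKey fields. None of these names come from the reported applications.

```javascript
const ROLES_ALLOWED_TO_READ_KEY = new Set(['admin']);

// Single authorization rule, meant to be shared by the dashboard backend,
// the documentation backend and any other endpoint that can return the key.
function canReadApiKey(user, projectId) {
  return Boolean(user)
    && user.projectId === projectId
    && ROLES_ALLOWED_TO_READ_KEY.has(user.role);
}

// Handler logic: the role is checked on the server, not only hidden in the UI.
function getApiKeyResponse(user, project) {
  if (!canReadApiKey(user, project.id)) {
    return { status: 403, body: { error: 'forbidden' } };
  }
  return { status: 200, body: { apiKey: project.apiKey } };
}
```

Checking only that the user belongs to the app, as in the second use case, corresponds to dropping the role condition from canReadApiKey.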

Checklist 3: Deploy methods to reduce the attack surface for API Keys for App Identification

For API keys residing in the app, it is not a matter of whether the keys can be stolen or accessed by a malicious user, regardless of your efforts to hide them, but of whether the effort needed to steal them is worth the return. However, there are still some ways to reduce the potential attack surface.

Make it harder for unauthorized users to extract the API Keys from your App

We cannot remove the API keys from our app completely, otherwise the app will not be able to make calls to the API endpoints. We can, however, reduce the risk by making it harder for unauthorized users to extract them from our app. In this security blog post, the author lists several ways to improve API key security:

  • Use a hash-based message authentication code (HMAC) for each HTTP request to avoid sending the API key in the request.
  • Hide the API keys in the source code by using code obfuscation.
  • Do not store API keys in device storage.
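
The first item in the list can be sketched with Node's built-in crypto module. The client sends a signature computed from the request instead of the key itself, and the server recomputes it. The layout of the signed string (method, path, timestamp and body joined by newlines) and the function names are assumptions of this sketch, not a standard.

```javascript
const crypto = require('node:crypto');

// Builds the string both sides sign; the field order is an assumption of this sketch.
function canonicalRequest(method, path, timestamp, body) {
  return [method.toUpperCase(), path, String(timestamp), body].join('\n');
}

// Client side: derive a per-request signature from the shared secret.
function signRequest(secret, method, path, timestamp, body) {
  return crypto.createHmac('sha256', secret)
    .update(canonicalRequest(method, path, timestamp, body))
    .digest('hex');
}

// Server side: recompute the signature and compare in constant time.
function verifyRequest(secret, method, path, timestamp, body, signatureHex) {
  const expected = Buffer.from(signRequest(secret, method, path, timestamp, body), 'hex');
  const provided = Buffer.from(String(signatureHex), 'hex');
  return provided.length === expected.length
    && crypto.timingSafeEqual(provided, expected);
}
```

A real deployment would also reject old timestamps to limit replay, which this sketch leaves out. The secret still lives in the app, so this only keeps it off the wire.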

Apply API Key Restriction

When API keys are used for app identification, it is assumed that they serve ONLY as an identifier when making API calls, so they should not be granted permissions to operate on sensitive data. However, overly permissive keys are another common implementation mistake we have observed. For example, the documentation for some analytics software APIs states that the API key is used only for identification when sending requests to the API endpoints, and that no sensitive operation or malicious API request can be performed with it even if a malicious user steals the key. In practice, it turned out that the API key could be used to change the app's configuration and settings.

To ensure the API keys are implemented correctly, developers should restrict the usage and permissions of API keys, especially when they are intended to be used as identifiers and not for app authentication.
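
A minimal sketch of such a restriction, assuming each key is stored on the server together with an explicit set of scopes. The key values, scope names and function name are hypothetical.

```javascript
// Hypothetical key registry: the identification key can only submit events.
const API_KEYS = new Map([
  ['pk_identify_only', { purpose: 'identification', scopes: new Set(['events:write']) }],
  ['sk_backend', { purpose: 'authentication', scopes: new Set(['events:write', 'config:write']) }],
]);

// Every endpoint names the scope it needs; unknown keys and missing scopes are refused.
function isOperationAllowed(apiKey, requiredScope) {
  const record = API_KEYS.get(apiKey);
  return record !== undefined && record.scopes.has(requiredScope);
}
```

With this layout, a stolen identification key cannot change configuration, because the configuration endpoint asks for a scope the key was never given.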

Conclusion

API keys are generally not considered secure, and they are typically accessible to clients, which makes it easy for someone to steal them. Since API keys can be implemented and used for different purposes, you will need to consider a variety of factors during the implementation. The above checklist is just a starting point to help you avoid some common API key security risks. There are more best practices you can find in the security field.

Using JWT? You need to avoid these implementation mistakes

JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way of securely transmitting information between parties as a JSON object. JWTs are widely used in authentication, authorization and information exchange. Among these use cases, the most common is authorization. Once a user is logged in, each subsequent request includes the JWT, allowing the user to access routes, services and resources that are permitted with that token. You may ask why not use a session token instead. One reason is that a JWT is stateless compared with a session token: the server does not have to store the JWT in a centralized DB to validate it when a client sends a request, whereas it has to store session tokens for validation.

To understand the common implementation mistakes, we first need to figure out the structure of a JWT, as most implementation errors are caused by misunderstanding the structure and what each component of the JWT is used for.

Structure of JWT Token

A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots (.):

  • Header : contains metadata about the type of token and the cryptographic algorithms used to secure its contents. For example
{
  "alg": "HS256",
  "typ": "JWT", 
  "kid": "path/value" //optional
}
  • Payload : contains verifiable security statements, such as the identity of the user and the permissions they are allowed. An example of the payload could be
{
  "username": "test@example.com",
  "id": "new user"
}
  • Signature : it is used to validate that the token is trustworthy and has not been tampered with. The signature is computed over the encoded header and payload using a signing algorithm (such as HMAC-SHA256) with a secret, which should reside only on the server side.
HMACSHA256(
  base64UrlEncode(header) + "." +
  base64UrlEncode(payload),
  secret)

Put together, a JWT has the form header.payload.signature, with each of the three parts Base64url-encoded.
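
As an illustration of how the three parts are assembled, here is a minimal HS256 sketch using only Node's built-in crypto module. It assumes a Node.js version that supports the 'base64url' encoding, the helper names are invented for this example, and in a real application a maintained JWT library is the better choice.

```javascript
const crypto = require('node:crypto');

// Base64url-encode the JSON form of an object.
function base64UrlJson(value) {
  return Buffer.from(JSON.stringify(value)).toString('base64url');
}

// header.payload.signature, where the signature is an HMAC-SHA256
// over the first two encoded parts.
function signHs256(header, payload, secret) {
  const signingInput = base64UrlJson(header) + '.' + base64UrlJson(payload);
  const signature = crypto.createHmac('sha256', secret)
    .update(signingInput)
    .digest('base64url');
  return signingInput + '.' + signature;
}

const token = signHs256(
  { alg: 'HS256', typ: 'JWT' },
  { username: 'test@example.com', id: 'new user' },
  'demo-secret-for-illustration-only'
);
console.log(token.split('.').length); // three dot-separated parts
```

Note that the header and payload are only encoded, not encrypted, so anyone holding the token can read them.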

Common Implementation Error

JWT itself is considered a secure method. Most of the security issues discovered in JWT deployments are caused by implementation mistakes.

Leak the secret key

A JWT is signed with a secret key that is private to you, which means you should never reveal it to the public or embed it inside the JWT. However, developers sometimes leak the signing secret by mistake, for example by storing it in a client side JavaScript function or in a publicly accessible file. As a consequence, an attacker can sign any payload with the secret, which renders the server side signature validation useless. Here is one example listed under HackerOne where the developers added a hardcoded secret to a client side JavaScript file.

Using Predictable secret key

It is also possible that the developers copy a piece of code from a sample code base when implementing JWT on the server side, or use a simple word (rather than a randomly generated complex string) as the secret value. A brute force attack can then easily recover the secret key, as the attacker knows the algorithm, the original header and payload, and the resulting signature. One example listed under HackerOne shows how a developer used ‘secret’ as the secret when implementing JWT on the server side.
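
A minimal sketch of generating an unpredictable secret with Node's built-in crypto module. The 32 byte length and the function name are assumptions for illustration, and it assumes a Node.js version that supports the 'base64url' encoding.

```javascript
const crypto = require('node:crypto');

// 32 random bytes (256 bits) from the operating system's secure random source,
// encoded as text so the value can be kept in a secret store or environment variable.
function generateSigningSecret() {
  return crypto.randomBytes(32).toString('base64url');
}
```

The value should be generated once per environment and kept on the server side, not copied from sample code or committed to source control.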

Broken JWT Validation

There are numerous ways in which JWT validation can be broken by an insecure implementation. Here are some typical examples.

  • Not validating the signature at all. Some developers implement JWT validation incorrectly: they only decode the payload without validating the signature before accepting the JWT as a valid token.
  • Not validating the JWT correctly when a None alg is provided. An attacker creates a token with alg (the signing algorithm) set to None in the header. If the server does NOT check whether None was really the algorithm used when the JWT was generated, any JWT provided by the attacker with alg set to None will be treated as a valid token. To prevent this, the server should check whether the alg value is the one used when the JWT was generated. If not, the JWT should be treated as invalid.
  • Not validating the JWT correctly when alg is changed from RS256 to HS256. HS256 is a symmetric signing algorithm: the server uses the same key to generate and validate the JWT. RS256, by contrast, is an asymmetric signing algorithm: the JWT is signed with a private key and validated with the corresponding public key, which is known to the public. An attacker creates a token with the signing algorithm set to the symmetric HS256 instead of the asymmetric RS256, leading the API to blindly verify the token with HS256 using the public key as the secret key. Since the RSA public key is known, the attacker can forge valid JWTs and bypass the validation.
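
The three cases above share one remedy: the server decides which algorithm it expects and always checks the signature before trusting the payload. Here is a minimal sketch for a server that only issues HS256 tokens, using Node's built-in crypto module. The function names are invented for this example, the signing helper exists only to produce a token for the demonstration, and a maintained JWT library with an explicit list of allowed algorithms is the better choice in production.

```javascript
const crypto = require('node:crypto');

const EXPECTED_ALG = 'HS256'; // fixed by the server, never taken from the token

function decodeJsonPart(part) {
  return JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));
}

// Returns the payload if the token is valid, otherwise null.
function verifyHs256(token, secret) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [encodedHeader, encodedPayload, encodedSignature] = parts;
  try {
    const header = decodeJsonPart(encodedHeader);
    // Rejects "none", RS256 and anything else the server did not issue.
    if (header === null || typeof header !== 'object' || header.alg !== EXPECTED_ALG) return null;
    const expected = crypto.createHmac('sha256', secret)
      .update(encodedHeader + '.' + encodedPayload)
      .digest();
    const provided = Buffer.from(encodedSignature, 'base64url');
    if (provided.length !== expected.length) return null;
    if (!crypto.timingSafeEqual(provided, expected)) return null;
    return decodeJsonPart(encodedPayload); // only decoded after the signature check
  } catch {
    return null; // malformed JSON in the header or payload
  }
}

// Demo-only helper to create a token to verify.
function signHs256ForDemo(header, payload, secret) {
  const enc = (v) => Buffer.from(JSON.stringify(v)).toString('base64url');
  const input = enc(header) + '.' + enc(payload);
  return input + '.' + crypto.createHmac('sha256', secret).update(input).digest('base64url');
}
```

A server that issues RS256 tokens would pin RS256 in the same way and verify with the public key, so that a token claiming HS256 is rejected before any HMAC is computed.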

Lack of kid header parameter validation

The kid (key ID) Header Parameter is a hint indicating which key was used to secure the JWS. This parameter allows originators to explicitly signal a change of key to recipients. With the key identifier, the consumer of a JWT can retrieve the proper cryptographic key to verify the signature. This simple mechanism works well when both the issuer and consumer have access to the same cryptographic key store.

Since the attacker can modify the kid header parameter, the attacker can supply any value in it when the token is sent to the server. If the server does not validate or sanitize the kid value correctly, it could lead to severe damage.

For example, the kid value in the JWT header specifies the location of the secret used to sign the JWT:

{
  "alg": "HS256",
  "typ": "JWT", 
  "kid": "http://localhost:3000/vault/privKey.key"
}

An attacker could change the kid value to point to a key under the attacker's control:

{
  "alg": "HS256",
  "typ": "JWT", 
  "kid": "http://evil.com/privKey.key" //Private key supplied by the attacker
}

Now, if the attacker supplies the location of his own key in the kid parameter and signs the JWT with that key, he is able to generate any valid JWT.
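
One way to avoid this is to treat kid as a lookup key into a fixed set of keys held by the server, and never as a path or URL to fetch. A minimal sketch follows, with hypothetical key IDs, placeholder secrets and an invented function name.

```javascript
// Keys the server itself has registered; kid can only select among these.
const SIGNING_KEYS = new Map([
  ['key-2023', 'placeholder-secret-1'],
  ['key-2024', 'placeholder-secret-2'],
]);

// Returns the key for a known kid, or null for anything else.
function resolveSigningKey(header) {
  const kid = header && header.kid;
  if (typeof kid !== 'string' || !SIGNING_KEYS.has(kid)) {
    return null; // unknown kid: reject the token instead of loading a key from it
  }
  return SIGNING_KEYS.get(kid);
}
```

With this approach, the header pointing at http://evil.com/privKey.key simply fails the lookup and the token is rejected.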

Other wrong implementation

The above implementation mistakes can lead to very severe security issues, for example account hijacking and authentication bypass. There are some other bad practices that should be avoided when using JWT.

  • Using a JWT as a session cookie. Some applications use a JWT as a session cookie, which can cause havoc. Say you want to revoke a token because of a user logout or some other action: you cannot really invalidate the token before its expiration unless you keep a DB of these tokens and build a revocation list, which defeats the purpose of a stateless JWT.
  • Leaking the JWT in the Referer header. It is common for a developer to attach the JWT to a URL, for example in user registration and password reset services, while the HTML page at that URL loads some 3rd party resources. As a consequence, the JWT could be leaked to the 3rd parties via the Referer header if noreferrer is not properly deployed in the web application.

Conclusion

JWT is being adopted by more and more applications across the web. If used correctly, it can help you build robust and agile authentication, authorization and information exchange functions. However, if misused or implemented incorrectly, this technology may put entire systems at risk. This JWT Security Cheat Sheet provides an overview of the best practices for using JWT in your applications.