
Mistakes Frequently Encountered in Access Control Implementation

Effective access control is essential for securing your application, but implementing robust access control is often challenging and error-prone. This is precisely why Broken Access Control is listed as the number one issue in the OWASP Top 10. Below, we highlight some common access control errors identified through code review and penetration testing experience.

Common Errors in Access Control Implementation

OAuth2 implementation mistakes 

OAuth2 has become a fundamental component of authentication and authorization in numerous applications by providing secure delegated access. Although OAuth2 emerged as the dominant industry-standard authorization framework after replacing OAuth1 in 2012, its complexity and the misunderstandings surrounding its implementation have made it a significant contributor to broken access control.

In a previous article, we listed the common mistakes made when implementing OAuth2 in your organization.

  •  Missing validation in redirect_uri leads to access token takeover
  •  Missing state parameter validation leads to a CSRF attack
  •  Client_secret mistakenly disclosed to the public
  •  Pre-account takeover
  •  OAuth2 access_token is leaked through the referrer header
  •  OAuth2 login bypass due to lack of access_token validation

In addition to the errors listed in that blog post, two other common mistakes reported against OAuth2 implementations are an overly permissive scope grant and opting for an in-house, less mature OAuth2 service instead of a battle-tested solution.

  • Overly permissive scope grant
  • Immature self-developed OAuth2 server

In most cases, the overly permissive scope grant issue arises when the application itself has very granular access control, but the defined scopes are not fine-grained enough to match it. As a consequence, a broader scope could be granted to a user. In some cases, an attacker might be able to upgrade an access token (whether acquired through theft or by using a malicious client application) by exploiting inadequate validation performed by the OAuth service.

Certain organizations opt to create their own OAuth2 service rather than use a well-established and secure OAuth2 server. In some instances, these internally developed OAuth2 services lack rigorous testing and may harbor security vulnerabilities that result in access-related problems.

Role-Based Access Control alone may not suffice for a complex system

Role-based access control (RBAC) relies on a user's role to grant the corresponding permissions. It is widely used because it is simple to implement and less prone to errors. Below is a minimal sample of code with a simple role-based access control.
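The original snippet is not reproduced here, so the following is a minimal JavaScript sketch of the same idea; the role names and the hasPermission helper are illustrative assumptions:

// Illustrative role-to-permission mapping (hypothetical role and permission names)
const rolePermissions = {
  admin_user: ['edit_post', 'delete_post', 'publish_post'],
  author_user: ['edit_post'],
  viewer_user: [],
};

function hasPermission(user, permission) {
  // Grants access based on the role alone
  return (rolePermissions[user.role] || []).includes(permission);
}

function editPost(user, post, newContent) {
  if (!hasPermission(user, 'edit_post')) {
    throw new Error('Forbidden');
  }
  // Note: nothing here verifies user.id === post.authorId
  post.content = newContent;
  return post;
}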

However, role-based access control often falls short of meeting the needs of complex systems. In the example above, the author_user can edit any post, because the code only checks whether the user has the edit_post permission without validating whether the user is the author of that specific post. The absence of proper validation of whether a user actually owns a specific resource is a fundamental cause of numerous access-related problems, including Insecure Direct Object Reference (IDOR) issues.

For a complicated system with very granular access control, a more advanced attribute-based access control (ABAC) model could be implemented to ensure an object or resource is only consumed by users with the right permissions. Attribute-based access control leverages multiple dimensions of the data's and the data consumer's unique attributes to determine whether to grant or deny access.

In a well-established application, it has been proven that combining RBAC and ABAC can be a highly efficient way to perform access control. An illustrative example involves using RBAC as middleware to initially validate whether a role is authorized to access a specific endpoint, followed by the application of ABAC for a final validation once RBAC authorization is confirmed.
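A minimal sketch of that layered check in JavaScript, using an Express-style middleware (the route, the db data layer, and the role names are assumptions for illustration):

const express = require('express');
const app = express();
app.use(express.json());

// RBAC layer: coarse check that the role may reach this endpoint at all
function requireRole(...roles) {
  return (req, res, next) => {
    if (!req.user || !roles.includes(req.user.role)) {
      return res.status(403).json({ error: 'role not permitted' });
    }
    next();
  };
}

// ABAC layer: fine-grained check against the resource's attributes
app.put('/posts/:id', requireRole('admin_user', 'author_user'), async (req, res) => {
  const post = await db.posts.findById(req.params.id); // hypothetical data layer
  if (!post) return res.status(404).end();
  // Admins may edit anything; authors only their own posts
  if (req.user.role !== 'admin_user' && post.authorId !== req.user.id) {
    return res.status(403).json({ error: 'not the owner of this post' });
  }
  post.content = req.body.content;
  await db.posts.save(post);
  res.json(post);
});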

Authorization Token or Passcode is improperly handled

Passcodes, stateful sessions, stateless JWTs, and authorization tokens are highly sensitive and play critical roles in robust access control. But sometimes they are mishandled when implementing access controls.

The following typical mishandling instances often surface when performing source code reviews during the development lifecycle.

  • JWT tokens have a very long expiration time
  • A one-time-use JWT token does not expire once consumed
  • Lack of a revocation mechanism for issued JWT tokens
  • Sessions are valid for too long, and a session could be reused due to session-fixation issues
  • Authorization headers and JWT tokens are leaked to log files
  • Token validation is not robust enough
  • Hard-coded sensitive access tokens in the source code

Mishandling such sensitive data can undermine the effectiveness of your access control system and potentially result in authorization bypasses.
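As a concrete illustration of the first three items in the list, a token can be issued with a short lifetime and a unique id so that it can be verified and revoked; a sketch using the jsonwebtoken library (the secret handling and the revocation store are assumptions):

const jwt = require('jsonwebtoken');
const crypto = require('crypto');

const secret = process.env.JWT_SECRET; // read from the environment, never hard-coded

// Issue a short-lived token with a unique id (jti)
function issueToken(userId) {
  return jwt.sign({ sub: userId }, secret, {
    expiresIn: '15m',            // a short expiration limits the damage of a leak
    jwtid: crypto.randomUUID(),  // the jti claim enables a server-side revocation list
  });
}

const revoked = new Set(); // hypothetical revocation store (use Redis or a DB in practice)

function verifyToken(token) {
  const payload = jwt.verify(token, secret); // throws if expired or tampered with
  if (revoked.has(payload.jti)) throw new Error('token has been revoked');
  return payload;
}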

Lack of Authentication and Authorization between microservice communication

Microservice architecture brings many benefits, including scalability, flexibility, and ease of deployment and testing. But it also brings some security challenges: all the microservices run independently and need to communicate with each other, which increases the attack surface from a security perspective.

There is a shortage of research concerning security in the context of microservices architecture, and the scarcity becomes more pronounced for the practical aspects of authentication and authorization. To compound the issue, certain developers mistakenly assume that authentication between microservices is unnecessary when they are deployed within an organization's internal network or infrastructure, and that requests from internal resources should be trusted.

During source code reviews or design assessments, it’s often observed that the authentication and authorization practices between microservices are loosely defined. For instance, many microservices tend to inherently trust requests from any other microservice if both operate within the organization’s internal network. This can be likened to a “Wild West” scenario, where security controls may be lax or insufficiently enforced.

An example might help: suppose we have three microservices running internally: a "payment" microservice to handle payments, an "order" microservice responsible for handling orders, and an "inventory" microservice to handle the product inventory. The payment microservice should exclusively accept requests originating from the order service and must reject any requests from the inventory microservice. Additionally, within the "order" microservice, various roles may be assigned specific payment responsibilities. Without proper authentication and authorization mechanisms between these microservices, there is no assurance that the payment service will only handle requests from trusted services and authorized users.
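One common way to enforce this is to require a signed service-identity token on every internal call and check both the caller and the audience; a minimal JavaScript sketch (the issuer names, the audience value, and the shared secret are assumptions):

const jwt = require('jsonwebtoken');

// Payment service: only accept calls whose service token targets "payment"
// and whose issuer is on the allowed-caller list.
const ALLOWED_CALLERS = new Set(['order']); // "inventory" is deliberately absent

function authenticateService(req, res, next) {
  try {
    const token = (req.headers.authorization || '').replace('Bearer ', '');
    const claims = jwt.verify(token, process.env.SERVICE_JWT_SECRET, { audience: 'payment' });
    if (!ALLOWED_CALLERS.has(claims.iss)) {
      return res.status(403).json({ error: 'calling service is not permitted' });
    }
    req.callerService = claims.iss;
    next();
  } catch (e) {
    res.status(401).json({ error: 'invalid or missing service token' });
  }
}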

Misunderstanding of Authentication/Authorization

Although it may come as a surprise, developers still misunderstand authentication and authorization, and this leads to broken access control issues when implementing access controls.

Consider a basic web application featuring a Login Form based on Username and Password and various user roles. In this context, the authentication process occurs when a user attempts to log in using the Login Form. Essentially, authentication verifies your identity to confirm whether you are who you claim to be. Authorization, on the other hand, is the subsequent step, ensuring that you possess the necessary permissions to perform actions after being authenticated. Nevertheless, there are cases where developers may overlook the authorization component, mistakenly assuming that once a user logs in, they should automatically be granted all permissions.
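The distinction can be made concrete with two separate checks; a sketch in Express-style JavaScript (the session handling and the can() permission helper are assumptions):

// Authentication: who are you? Runs once, at login.
app.post('/login', async (req, res) => {
  const user = await users.verifyCredentials(req.body.username, req.body.password); // hypothetical
  if (!user) return res.status(401).json({ error: 'bad credentials' });
  req.session.userId = user.id; // identity established
  res.json({ ok: true });
});

// Authorization: what may you do? Runs on every protected action.
app.delete('/reports/:id', (req, res) => {
  if (!req.session.userId) return res.status(401).end();  // authenticated?
  if (!can(req.session.userId, 'delete_report')) {         // authorized?
    return res.status(403).json({ error: 'logged in, but not permitted' });
  }
  // ... delete the report here
  res.status(204).end();
});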

Final Remarks

Access control continues to be a crucial element of cybersecurity and data protection within application security. Implementing strong authentication and authorization mechanisms to establish a robust access control system can be intricate and fraught with potential issues, which may result in unintended errors.

For an organization, establishing a strong access control system necessitates a comprehensive approach that includes meticulous design assessments, secure code implementation, rigorous security code reviews, and robust security testing, including functional access control unit tests and penetration testing verification.

Common mistakes when using input validation and how to avoid them

Input validation is a widely adopted technique in software development to ensure that only proper user input is processed by the system and to prevent malformed data from compromising it. A robust input validation method can significantly reduce common web attacks, such as injection and XSS, though it should not be used as the primary defense against these vulnerabilities.

However, implementing a robust validation method is a challenging task; you have to consider many aspects, for example: 1) which input validation method should be used: blacklist, whitelist, or regex-based; 2) when input validation should be performed; 3) whether the input validation is efficient; 4) how to ensure input validation is executed across multiple components in a complicated architecture.

Without careful consideration of all these areas, your input validation might be flawed and turn out to be useless against malicious user input.

Common mistakes when implementing input validation

Here are some common mistakes observed when performing penetration tests and code reviews.

  • Confuse Server Side validation with Client Side Validation
  • Perform Input Validation before proper decoding
  • Poor validation Regex leads to ReDOS
  • Input validation implemented without the context of the entire system
  • Reinvent the wheel by creating your own input validation method
  • Blacklist input validation is not comprehensive

Confuse Server Side Validation with Client Side validation

Client-side validation is for user experience and usability, and is typically performed by the browser while executing JavaScript code; whereas server-side validation is employed as a security control to ensure proper data is supplied to the server or service. In other words, client-side validation does not add any security to your application.

Nowadays, many web frameworks, for example AngularJS and React, offer client-side input validation to improve the user experience and make developers' lives easier. For example, the following input field will validate whether the user input is a valid email address.

<html>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.6.9/angular.min.js"></script>
<body ng-app="">
<p>Try writing an E-mail address in the input field:</p>
<form name="myForm"><input type="email" name="myInput" ng-model="myInput"></form>
</body>
</html>

This built-in client-side validation gives developers the false impression that input validation has already been done by the framework. As a consequence, server-side validation is not implemented, and any attacker could bypass the client-side validation and launch an attack.

Solutions

Educate your developers and test engineers on the difference between client-side validation and server-side validation so that the correct validation method is implemented.
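For completeness, the same email check has to be repeated on the server, regardless of what the browser did; a minimal Express-style sketch (the route and field names are assumptions):

const express = require('express');
const app = express();
app.use(express.json());

// A simple allowlist pattern for the email shape; server-side, so it cannot be bypassed
const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

app.post('/signup', (req, res) => {
  const email = String(req.body.myInput || '');
  if (!EMAIL_RE.test(email)) {
    return res.status(400).json({ error: 'invalid email address' });
  }
  // ... proceed with the validated input
  res.json({ ok: true });
});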

Perform validation before decoding the input data

As you might have been advised, input validation should happen as soon as the data is received by the server in order to minimize risk. That is true, and input validation should be executed before the user-supplied data is consumed by the server.

When running some bug bounty programs, I found it is very common for input validation to be executed at the wrong time. Sometimes the input validation is performed before the data is converted to the format in which the system will actually consume it.

For example, in one test case, an application was vulnerable to XSS through a parameter: https://evils.com/login?para=vuln_code. Input validation checks whether the parameter contains malicious code; input like javascript:alert(1) or java%09script:alert(1) will be blocked. However, if an attacker encodes the payload in hex format, the input validation method is not able to detect the malicious code.

\x6A\x61\x76\x61\x73\x63\x72\x69\x70\x74:\x64\x6F\x63\x75\x6D\x65\x6E\x74\x2E\x74\x69\x74\x6C\x65\x3D\x61\x6C\x65\x72\x74\x28\x31\x29

Solutions

When input validation is executed, you need to ensure you are validating the user input in the same format in which the system or service will consume it. Sometimes it is necessary to convert and decode the user input before applying input validation functions.
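A sketch of the decode-then-validate order in JavaScript (the hex-unescaping step and the blocked schemes are illustrative assumptions, not a complete filter):

function isSafeParam(rawParam) {
  // 1. Decode the input into the form the browser will actually consume
  let decoded = decodeURIComponent(rawParam);
  // Fold hex escapes such as \x6A into real characters (assumed for this example)
  decoded = decoded.replace(/\\x([0-9a-fA-F]{2})/g,
    (_, hex) => String.fromCharCode(parseInt(hex, 16)));

  // 2. Validate the decoded value, not the raw one
  return !/^\s*(javascript|data|vbscript)\s*:/i.test(decoded);
}

console.log(isSafeParam('javascript:alert(1)'));                 // false
console.log(isSafeParam('\\x6A\\x61\\x76\\x61script:alert(1)')); // false only after decoding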

Improper Regex Pattern for validation leads to ReDOS

Many input validations leverage regular expressions to define an allowlist. This is a great way to create an allowlist without adding too many restrictions on the user input data. However, developing a robust and functional regex is complicated; if not handled properly, it could do more harm than good to your application.

Take the following regex for example: it is used to check whether an HTML page contains an application/json script block for JSON data before the server scrapes it.

var regex = /<script type="application\/json">((.|\s)*?)<\/script>/;

This regex can lead to a ReDOS attack because it contains a so-called "evil regex" pattern, ((.|\s)*?), which can introduce catastrophic backtracking.

Here is a POC demonstrating how long it takes to evaluate the regex as the test string grows.

var regex = /<script type="application\/json">((.|\s)*?)<\/script>/;
for (var i = 1; i <= 500; i++) {
  var time = Date.now();
  var payload = "<script type=\"application/json\">" + " ".repeat(i) + "test";
  payload.match(regex);
  var time_cost = Date.now() - time;
  console.log(payload);
  console.log("Trim time : " + payload.length + ": " + time_cost + " ms");
}

A detailed example can be found in another blog post, ReDOS, it could be the cause of your next security incident, which gives a better explanation of how ReDOS occurs and how it can damage your applications.

Solutions

Creating a very robust regex is hard, but here are some common methods you can follow, with a sketch after the list:

  1. Set a length limitation if possible
  2. Set a time limitation for the regex matching; if the matching takes longer than expected, kill the process
  3. Optimize your regex with atomic grouping to prevent endless backtracking
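A sketch of the first two guards around the earlier regex (the length limit is an assumed value; a hard timeout in JavaScript generally requires running the match in a separate worker process or using a linear-time engine such as RE2):

var regex = /<script type="application\/json">((.|\s)*?)<\/script>/;
var MAX_LEN = 10000; // assumed cap; tune it for your inputs

function safeMatch(input) {
  // Guard 1: reject oversized input before it ever reaches the regex
  if (typeof input !== 'string' || input.length > MAX_LEN) return null;
  return input.match(regex);
}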

Input validation without clear context of the entire system

With more and more businesses adopting microservices, the microservices architecture can bring challenges for input validation. When data flows between multiple microservices, the input validation implemented for microservice A might not be sufficient for microservice B; or input validation is not implemented in all the microservices due to the lack of a centralized input validation function.

In order to illustrate this common mistake better, I would like to use the following typical AWS microservice diagram as an example.

Here are two scenarios where input validations could go wrong

Scenario 1: Input validation not implemented for all microservices

In some scenarios, there might be multiple services behind the API gateway consuming the user input data. Some services respond to the user input directly, for example, Service B in the diagram above; whereas some microservices are designed to handle background jobs, for example, Service A and Service C.

Since Service A and Service C handle background jobs and do not respond to user input directly, developers might neglect to implement input validation for these two services if centralized input validation is not enforced across the architecture. As a consequence, the lack of input validation in Service A and Service C could lead to exploitation.

Scenario 2: Input validation for one service is insufficient for its downstream services

In this scenario, input validation is implemented in microservice B and is sufficient for microservice B to block malicious user input. However, it might not be sufficient for its downstream Service D.

A good example can be found in my previous blog post, Steal restricted sensitive data with template language. Microservice B validates whether the user input is a valid template, and that validation is robust for this service. However, when Service D compiles the user-supplied template (already validated by microservice B) with data to produce the final output, the process can leak data because microservice D does not validate the compiled template.

Solutions

Before implementing user input validation, developers and security engineers should obtain a comprehensive understanding of the entire system and ensure input validation is applied in all components/microservices.

“Reinvent the wheel” by creating your own input validation methods

Another common mistake I observed when performing code reviews is that many engineers create their own input validation methods even though there are very mature input validation libraries used by other organizations.
For example, if you need to validate whether the input is an email address or a valid credit card number, you have many mature input validation libraries to choose from. Creating your own validation method is time-consuming, and it could be defective without robust tests.

Solution

To avoid reinventing the wheel, figure out the purpose of your input validation and check whether suitable validations already exist. If there are popular libraries you could use, prefer the existing libraries over creating new ones.
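For instance, the widely used validator package already covers the two checks mentioned above; a quick sketch:

const validator = require('validator'); // npm install validator

console.log(validator.isEmail('user@example.com'));      // true
console.log(validator.isEmail('not-an-email'));          // false
console.log(validator.isCreditCard('4111111111111111')); // true (a well-known test number)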

Blacklist is not comprehensive

One of the most frequently quoted sayings is "You can't control what you can't measure." This quote captures the pain of using the blacklist method for user input validation.

The blacklist approach to input validation defines which kinds of user input should be blocked. With that said, developers and security engineers need to understand what inputs are considered "bad" and should be blocked by the blacklist. The efficiency of the blacklist method largely depends on the developers' knowledge and their expectations of bad user input.

However, security incidents or breaches are most likely to occur when malicious users are injecting something unexpected. 

Solution

In many cases, blacklisting and whitelisting are implemented together to meet the requirement. If possible, try to employ both methods to combat malicious user input.
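A sketch of combining the two methods for a single field (the allowed character set and the blocked fragments are assumptions for illustration):

// Allowlist constrains the overall shape; blocklist adds defense in depth
const ALLOW = /^[\w .,!?@-]{1,64}$/;
const BLOCK = /(<script|javascript:|onerror=)/i;

function acceptDisplayName(input) {
  return ALLOW.test(input) && !BLOCK.test(input);
}

console.log(acceptDisplayName('Alice Doe'));                 // true
console.log(acceptDisplayName('<script>alert(1)</script>')); // false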

Conclusion

It cannot be overemphasized how important input validation is in helping your organization combat malicious attacks. Without a robust input validation method in your service or system, you are likely to leave the door open for potential security incidents.

It can be very easy to start implementing input validation in your service, but you really need to pay attention to these common mistakes found in many validation methods. Try to understand your system or service, choose the right validation methods for your organization, and once decided, perform thorough testing against your method.

Develop Secure Code and Services with Github Advanced Security

Back in November 2020, I got the chance to evaluate Github Code Scanning, mainly CodeQL, as part of our effort to improve the security posture of our source code, right after Github announced that code scanning became available in September 2020. I selected 4 different services written in Java, Python, C++, and JavaScript respectively and ran CodeQL scans against them. Though there were some great advantages in ease of use and collaboration, the overall CodeQL code scanning results were average compared to other traditional commercial SAST tools. The results were not good enough for our team to replace the existing SAST tools.

Recently, our team started to assess Github Advanced Security (GHAS) again to understand whether we could use it as a unified platform to secure our source code, by evaluating its three main features: Code Scanning, Secret Scanning, and dependency vulnerability detection. The overall evaluation totally surpassed my expectations, as I saw a significant improvement in Github Advanced Security compared with the evaluation conducted one and a half years ago.

In this post, I would like to share and highlight some valuable findings and features with which GHAS surprised me, and how they could help your organization secure its code and build secure services.

How did we start the re-evaluation?

Before we got involved with Github Advanced Security, we were clear that what we really wanted was a unified platform that could perform code scanning (SAST), secret detection, and software composition analysis (third-party dependency vulnerability detection), and that could be easily integrated into our current CI pipelines.

With the previous evaluation and experience with Github Code Scanning, we figured GHAS could be a one-stop solution to meet all the requirements. However, because of that earlier evaluation, I was really concerned about the Code Scanning performance before we started. It turns out the concern was unwarranted.

In order to conduct a thorough evaluation, we selected 15 services/repos covering all languages supported by Github, ran GHAS against them, diagnosed the findings, and compared them with the existing tools we have deployed.

How GHAS outperformed other tools

With the completion of the GHAS evaluation, the following are some highlights where we think GHAS outperforms other tools.

1. Code Scanning: Excellent Auto Build with Flexible Configuration

Needless to say, code scanning is a resource-consuming task. Some of the repos I was evaluating are monolithic, so building and scanning them is time- and resource-consuming. When scanning one of these monolithic repos with a popular open-source tool we were evaluating, my personal laptop froze completely after running the scan for 10 minutes, as the scanning task was consuming more than 9 GB of memory.

However, with Github Code Scanning (we only enabled CodeQL scanning by default), we found that this is not an issue because it provides an excellent auto build and scanning process in Github-hosted runners deployed in the Github network. 

If you can use a Github-hosted runner to build and scan the service, you don't have to bother your IT team to set up a self-hosted runner, either on your own laptop or on a remote server in your network. We were able to use Github-hosted runners to build and launch scans against 14 of the 15 selected repos. That means more than 93% of the scans could be completed with Github-hosted runners. That is a significant advantage, as a high success rate with Github-hosted runners means fewer resources required from our organization to build and maintain self-hosted servers to run the scans.

  • Flexible Configuration to add manual build commands 

Some of the code in the selected repos has a non-standard build process, and we could NOT simply run the default Maven build or CMake commands provided by the auto-build function. In this situation, the flexibility to add manual build commands is really necessary and powerful to ensure a successful build. For example, we were able to build our Java service by adding some customized configuration for the Maven settings, defining manual build commands in the YAML configuration file, as sketched below.
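A sketch of such a workflow file (the repository-specific values, such as the settings file path and the secret name, are assumptions):

# .github/workflows/codeql.yml
name: "CodeQL"
on: [push, pull_request]

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: github/codeql-action/init@v2
        with:
          languages: java
      # Manual build commands instead of autobuild, with customized Maven settings
      - name: Build
        run: mvn -B package -DskipTests --settings .mvn/custom-settings.xml
        env:
          MAVEN_TOKEN: ${{ secrets.MAVEN_TOKEN }} # kept out of the file via Github secrets
      - uses: github/codeql-action/analyze@v2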

  • Github secrets to keep your build information safe

As mentioned above, we had to set up some environment variables in the build process. These environment variables are very sensitive and should not be exposed in the CodeQL YAML configuration file directly. The Github secrets function lets us keep the sensitive values out of the YAML configuration file (see the MAVEN_TOKEN line in the sketch above).

2. Code Scanning: Fewer False Positives with a high true positive rate

One of the biggest challenges with SAST/code scanning tools is that they tend to yield a high number of false positives, which costs the engineering team a great deal of time and effort to validate. The main reason for the high number of false positives is that static code analysis is largely based on assumptions and modeling after building the call stack (from source to sink), unlike DAST tools, where the test payloads are actually executed by the application code.

As weeding out false positives is time- and resource-intensive, a low false positive rate and a high true positive rate were the key factors for the entire evaluation. During the evaluation, we went through all the critical, high, and medium vulnerabilities reported by CodeQL. Here are some key findings.

  • Fewer false positives than we expected

For the projects written in Java, C++, and C#, the false positive rates are really low. The best result was with Java, where we saw a false positive rate of 0% with 2 valid findings. We double-checked with another popular open-source tool; the performance was equivalent, with the same 2 valid findings reported. Overall, for the compiled languages, most of the SAST tools we compared have a low false positive rate.

CodeQL stands out when it comes to scripting languages, for example, JavaScript and Ruby. In general, Github CodeQL has a lower false positive rate for scripting languages; for example, CodeQL had a false positive rate of 44%, compared to a 62% false positive rate for another tool, when scanning a repo written in a scripting language.

  • Relatively high True Positive rates

A good false positive rate does not guarantee the tool is a good one; a SAST tool could produce a 0% false positive rate with zero vulnerability detections. When analyzing the CodeQL scanning results, we calculated that the true positive detection rate was higher than other tools for most of the repos. For example, the CodeQL scan reported 25 valid findings against 22 in one repo, and 5 versus 3 findings in another, compared with the results generated by one popular SAST tool.

Note: Some reported vulnerabilities are vulnerable but not really exploitable or reachable; in these scenarios, most such vulnerabilities are categorized as false positives.

3. Code Scanning: some vulnerability detections are intelligent

Most code analysis SAST tools use a set of rules to detect potential vulnerabilities when scanning code, and Github CodeQL is no exception: it utilizes a set of predefined rules to detect vulnerabilities. Because of that, many security engineers and developers think SAST is just a dumb tool that matches code against a rule set to detect a vulnerability. That argument is true to a large extent.

However, we found that some vulnerabilities reported by CodeQL seem to be intelligent, and these detections were only reported by CodeQL scanning. Here are a couple of examples based on real detections we found.

  • Inefficient regular expression detection

This detection checks whether your regex pattern is potentially vulnerable to a ReDOS attack. For example, CodeQL reported the Inefficient regular expression vulnerability against code in an open source library similar to the sketch below.
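The original snippet is not reproduced here; the following illustrative pattern (an assumption, not the library's actual code) shows the kind of nested quantifier this detection flags:

// A classic "evil regex": the nested quantifiers backtrack catastrophically
// on almost-matching input, so matching time grows exponentially
var evil = /^(a+)+$/;
console.log(evil.test('a'.repeat(28) + '!')); // already takes seconds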

This is a true security issue, which had been missed by other tools: an attacker could dramatically slow down the performance of a server with a malicious string of less than 100 characters. You can find a detailed write-up in another post.

  • Incomplete URL substring sanitization detection

Here is an example of the kind of code where the Incomplete URL substring sanitization vulnerability is detected.
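The original flagged code is not shown here; a minimal sketch of the pattern this rule catches, along with a safer alternative:

// Flawed check: "example.com" appearing ANYWHERE in the URL passes
function isTrusted(url) {
  return url.indexOf('example.com') !== -1;
}
console.log(isTrusted('https://example.com/page'));          // true, intended
console.log(isTrusted('https://example.com.evil.net/page')); // true, a bypass

// Safer: parse the URL and compare the hostname exactly
function isTrustedHost(url) {
  try {
    return new URL(url).hostname === 'example.com';
  } catch (e) {
    return false;
  }
}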

These detections look simple but also intelligent from my perspective. There are many more smart detections we found with CodeQL; I chose these two on purpose because I found these kinds of vulnerabilities prevalent in many public Github repos when performing a brief code scan against a tiny portion of open source repos.

4. Code Scanning: Easy to track the origin of the vulnerable code

This unique advantage makes the entire triaging process much easier and quicker, as we could simply use the git blame function to track down which engineer committed the vulnerable code, what the vulnerable code is supposed to accomplish, and the corresponding Jira ticket for the changes.

After collecting all the related information of the vulnerable findings, we could tell the potential impact of the vulnerability, the potential remediation method and how quickly we could fix it. 

5. Code Scanning: multiple languages supported in one scan

Coverage is another factor when evaluating a code analysis SAST tool. Many SAST tools ask you to predefine the language before running a scan, as they can only scan one language at a time; whereas CodeQL allows you to specify multiple languages and scan them in a single scan without predefined settings.

Even for compiled languages, you could specify multiple builds for different languages in one scan. It is a really useful feature if you have some monolithic repos written with different languages and you want to cover all the code in the repo.

6. Secret Scanning: a powerful feature worth a try

Secret Scanning is another feature we partly evaluated, and we think it is worth mentioning, as it offers some unique, real value when used properly.

  • Scans your entire Git history on all present branches

Github secret scanning will scan your entire Git history on all branches present in your Github repos to find potential secrets exposed in your code base. That is a huge difference compared with other tools, where the scan is performed against the main remote branch, or against a local branch when the scan is run locally.

  • Empowers users to define their own secret patterns

If your secrets or tokens cannot be detected by Github's default patterns, you can define custom patterns to identify them.

  • Block pushes containing suspected secrets

Some developers might accidentally add secrets to the code when pushing changes to the remote branch. This can be prevented by enabling push protection, which allows Github to reject the push when secret scanning finds any suspected secrets.

7. Code Scanning: clear-text logging of sensitive data detection, a hidden gem 

Insecure logging could cause a security breach or incident in many cases; I shared some thoughts in one of my blog posts. When analyzing all the code scanning results, it was refreshing to realize that CodeQL has a detection to check whether sensitive data is written to log files.

I believe this detection has great value, which is mostly underestimated by many SAST tools. From my experience as a security engineer and penetration tester, I have found it is very common for engineers to add sensitive data to log files for debugging purposes and eventually forget to remove it before the changes are deployed to production. As a consequence, they accidentally collect sensitive data from customers.

With the help of this detection method, many logging issues could be detected before the code is pushed into a production environment.

Limitations in Github Advanced Security (GHAS)

I could certainly list more ways in which CodeQL outperforms other tools, but I think it is important to remind people that GHAS, as a newly emerged and growing security product, has some limitations, as many other security tools do.

Here are some limitations that we could summarize from our evaluation.

Limitation 1: Insufficient disk space in Github-hosted runner to build large projects

We were able to use Github-hosted runners to build 14 of the 15 selected repositories. We had issues building one large project, as the build kept hitting a `not enough space on the disk` error no matter how we customized the build commands. After some analysis, we found the disk space allocated to Github-hosted runners is really limited for the Windows runners.

Suggestion: At this moment, there are two types of Windows runners supported by Github: windows-2022 and windows-2019. If the Github team could assign specific roles to these two types of Windows runners, it might help resolve the issue. For example, windows-2022 could be used ONLY to build .NET projects, with only the .NET environment set up in the VM, whereas windows-2019 could be used for other types of build environments.

Limitation 2: Certain frameworks are not supported in CodeQL

Though the code analysis tool CodeQL supports a large range of frameworks, certain frameworks were not well supported at the time of the evaluation. For example, the Ruby on Rails framework was not yet supported in CodeQL, and we saw some false negatives due to the lack of support for this framework.

Limitation 3: Some False Positives could be filtered out

Some vulnerabilities reported by CodeQL are genuinely vulnerable patterns, but the vulnerable piece of code will never be executed because multiple validation or whitelisting methods are applied before user-supplied input can reach the vulnerable code. I believe this kind of finding could be filtered out by tuning the detection method.

Limitation 4:  Current dependency vulnerability detection is too loose

In my opinion, the software composition analysis (third-party dependency vulnerability) feature in GHAS is too loose because it mainly scans package management files, like pom.xml and package.json, to 1) extract the package name and version and 2) identify vulnerabilities based on the version number. That means Github Dependabot will flag a vulnerability in your code even if you never call the vulnerable functions in the vulnerable dependency library.

It seems the Github team is implementing changes to check whether your code actually calls the vulnerable function, rather than relying on the version number alone. Once this is fully rolled out, I believe it will bring dependency vulnerability detection to a totally new level.

Conclusion

Though Github Advanced Security is a relatively new player in the security market, I can say that the code analysis tool CodeQL competes with any other SAST tool on the market that I have evaluated so far. Across two separate evaluations of GHAS, I observed a huge improvement in scanning quality and newly adopted features in just one and a half years. That really surprises me and makes me believe GHAS will be adopted by more and more organizations, given the quality of its detection and the speed of innovation in the tool.

Github Advanced Security (GHAS) is not a silver bullet that catches all issues by scanning the code base, as it has its own limitations, but it is clearly the best of the SAST tools that I have evaluated.

Steal Restricted Sensitive Data with Template Languages

A template language allows developers to define placeholders that are later replaced with dynamic data (variables). As the definition indicates, the main purpose of a template language is to give developers the flexibility to insert dynamic data into a predefined template. The dynamic data could be generated by a different server or service based on the condition of existing sessions or use cases. Numerous templating languages are widely used in web development; among them, Handlebars, EJS, Django, Mustache, and FreeMarker are very popular. The three main components when using a template language are the dynamic data (variables), the template, and the template engine that compiles the two.
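A minimal Handlebars sketch showing the three components together:

const Handlebars = require('handlebars'); // npm install handlebars

const template = 'Hello {{ user.name }}, welcome to {{ organization.name }}!'; // the template
const data = { user: { name: 'Alice' }, organization: { name: 'Acme' } };      // the dynamic data

const render = Handlebars.compile(template); // the template engine compiles the two
console.log(render(data)); // "Hello Alice, welcome to Acme!"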

How Template Language works

While template languages provide more flexibility for web development, they also introduce some security issues. SSTI is clearly the most notorious vulnerability discovered across various template languages.

Security Concerns beyond SSTI with Template Languages 

SSTI vulnerabilities can be avoided

Server-Side Template Injection (SSTI) issues are the most common vulnerabilities discovered across many template languages. Server-side template injection occurs when an attacker is able to use native template syntax to inject a malicious payload into a template, which is then executed on the server side when the template engine processes the user-supplied template. A list of vulnerable template languages and their exploitation payloads can be found here, and it is quite comprehensive.

Most SSTI exploits lead to arbitrary code execution and server compromise. Because of that, many template languages deploy default sandbox and sanitization features to prevent the template engine from accessing risky modules by disabling them in default settings. This means that when a user-provided template or data is processed by the engine, it cannot access these risky modules even if the malicious template contains a call to them. For example, Handlebars introduced a new restriction since 4.6.0 that forbids access to the prototype properties and methods of the context object by default, to mitigate the code execution caused by server-side template injection. Some applications using template languages also deploy very strict sanitization to disallow certain characters or patterns in order to prevent other vulnerabilities caused by SSTI, such as running a sanitize function against the final output to prevent XSS issues.

Even with a strong sandbox provided by the template language itself and a robust sanitization method deployed on top of it to ensure the template cannot be abused by an SSTI attack, your applications could still be at risk due to improper configuration of which dynamic data the template engine may consume.

Data leakage can still occur when the template engine can process data outside the permitted scope.

Take the following instance as an example.


In one application, an Admin user could create an organization and perform sensitive operations through the dashboard or API requests. Once an organization is created, the Admin can add multiple users with limited permissions to the organization. A user can invite new users to join the organization by sending them an invitation email. To make the email more dynamic and allow users to modify the email template, the application uses a template language to compile the email template.

Under a standard operation, a user sends an invitation email by taking the following steps.

Step 1: A user creates the following email template from the dashboard and uses it to send an email to a new user.

<h2>Dear Friends</h2>
<div>
  <p>Please join {{ organization.name }} to share your fun moments by clicking the invitation link {{ organization.invitation_link }}. Your friends are waiting for you.</p>
  <p>Best {{ user.name }}</p>
</div>

Step 2: The application processes the email template with the template engine once the user saves the template.

The application server will a) validate whether there are potential template injection threats, using both the sanitization and sandbox methods; b) if the template is safe and the syntax is correct, replace placeholders like {{ organization.name }} and {{ user.name }} with the dynamic data extracted from the server. For example, the app server could query the DB, get the current organization and user data, and present it as a JSON object.
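The context object handed to the template engine might look like the following sketch; note that it carries far more fields than the email template needs (all field names and values here are assumptions):

// Hypothetical context object passed to the template engine
const context = {
  user: { name: 'Alice', email: 'alice@example.com' },
  organization: {
    name: 'Acme',
    invitation_link: 'https://app.example.com/invite/abc123',
    api_key: '...',           // sensitive: should never reach a user-editable template
    api_private_token: '...', // sensitive: admin-only field
  },
};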

Step 3:  The invitation email will be sent to another user with the final output.

Once the template engine replaces all the placeholders in the email template with the dynamic data to generate the final email output, an email will be sent to the invited user. 

Suppose the security control implemented on the server side is robust enough to prevent a server-side template injection attack through its sanitization and sandbox methods. It could still leave a security hole open due to a lack of access control on the dynamic data and insufficient validation when consuming it.

In this case, the organization data pulled from the application server contains more data than the user is permitted to access, for example, the api_key and api_private_token, which should NOT be accessible by a team user in a normal workflow. A non-admin user has no other way to extract this sensitive data.

However, a user could now access them by crafting a deliberate template, without triggering any violations. If the user uses the following crafted template, the organization's api_key and api_private_token will be disclosed when the invitation email is sent with this template.

<h2>Dear Friends</h2>
<div>
  <p>Please join {{ organization.name }} to share your fun moments by clicking the invitation link {{ organization.invitation_link }}. Your friends are waiting for you.</p>
  <p>Best {{ user.name }}</p>
  {{ organization.api_key }} {{ organization.api_private_token }}
</div>

Why does the template engine access more data than the user is permitted?

There are various reasons why the server provides data outside the user's permission scope to the template engine when processing the template. Here are three common ones, drawn from a couple of real scenarios I experienced.

Reason 1: Sanitization and sandbox methods are only applied to check for SSTI attack patterns

If the user-supplied template does not violate the rules defined to match SSTI attack patterns, the template engine proceeds with the replacement without validating whether the template is attempting to consume data beyond its designed scope.

Reason 2:  Insufficient integration testing between micro services

It is very common for a company to have different teams for frontend and backend service development. The frontend team is in charge of providing an interface for users to define a template and of validating the user-supplied template, whereas the backend team provides the functions that extract the dynamic data and fill in the template once the frontend passes a validated template to the backend. Both teams seem to perform their responsibilities correctly; however, without a good suite of integration tests, the frontend is blind to what kind of dynamic data the backend service provides, and the backend has no way to validate which data the frontend is allowed to consume.

Reason 3: Access control is not implemented in internal microservices

In a microservice development environment, I have seen many times that no access controls are deployed on the internal microservices. Once a request passes the access control implemented in the public service, the internal microservice does not perform another layer of validation when the public service calls it. In this case, the internal service that pulls the organization data from the DB does not validate whether the user has permission to access certain fields.

How to prevent data leakage from abusing Template Language

To avoid data leakage caused by abuse of the template language, developers can adopt various measures during the development phase; a sketch of the first one follows the list below.

  • Use a whitelist of dynamic data (the variables allowed inside {{ }}) rather than a blacklist, if a whitelist method is possible when validating the user-supplied template
  • Perform sanitization and validation after the user-supplied template is compiled by the template engine, to check whether sensitive data appears in the compiled output
  • Add access control and permission validation between services: if service A is going to consume data from service B, perform a permission check to ensure the user calling service A has the right permission to consume all the data provided by service B
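A sketch of the whitelist approach, allowlisting the placeholders a user template may reference (mustache-style {{ }} syntax and the variable names are assumptions):

const ALLOWED_VARS = new Set([
  'organization.name',
  'organization.invitation_link',
  'user.name',
]);

function validateTemplate(template) {
  // Extract every {{ ... }} placeholder and compare against the allowlist
  const placeholders = [...template.matchAll(/\{\{\s*([\w.]+)\s*\}\}/g)].map(m => m[1]);
  const illegal = placeholders.filter(name => !ALLOWED_VARS.has(name));
  if (illegal.length > 0) {
    throw new Error('template references forbidden variables: ' + illegal.join(', '));
  }
  return template;
}

validateTemplate('Join {{ organization.name }}'); // ok
validateTemplate('{{ organization.api_key }}');   // throws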

Besides adopting strict rules when processing the template language during the development phase, comprehensive and thorough testing is vital to catch overlooked areas.

Conclusion

While enjoying the flexibility provided by template languages, developers and security teams should bear in mind that more flexibility also provides more attack surface for malicious users. SSTI is not the only security issue you should be aware of; you also need to pay attention to potential data leakage caused by insufficient sanitization or a lack of access control on sensitive data. This means your sanitization patterns should match not only potential SSTI attack patterns but sensitive data patterns as well.

Is your CSP header implemented correctly?

A Study of CSP Headers employed in Alexa Top 100 Websites

Introduction

The Content Security Policy (CSP) is a security mechanism web applications can use to reduce the risk of attacks, such as XSS, code injection or clickjacking, by informing the browser that something should be blocked when loading or parsing the HTML content. The CSP header has become a standard metric to improve the security posture of modern applications as most application security tools would likely flag a security issue in your applications if it detects the absence of the CSP headers.

How Content-Security-Policy works

Recently I was tasked with adding a CSP header to one of our applications to ensure it is fully equipped to combat potential XSS issues. After spending a while investigating which CSP policies would be good candidates, I found it is not an easy task to implement a thorough CSP header while avoiding breaking legitimate site functionality. So I decided to check how other popular web applications utilize CSP headers and what I could learn from them to build a robust CSP header.

How Alexa Top 100 websites are adopting CSP header 

I started to evaluate how the Alexa Top 100 websites adopt the CSP header to harden their security posture, by checking whether these websites add CSP headers and analyzing whether the headers are really useful to protect against common attacks, such as XSS and clickjacking. When analyzing the CSP headers of these top websites, I used the Google CSP Evaluator to check how each CSP directive is defined, in addition to manual testing. The result is kind of bittersweet, as there are some unexpected behaviors and implementations of CSP headers on these top websites. Below are some findings worth mentioning.

Findings 

Finding 1: 51 out of Alexa Top 100 websites have CSP header added

Though I expected every website in the Alexa Top 100 to have a CSP header implemented, considering these websites attract millions of users on a daily basis, it turns out only 51 of the Alexa Top 100 have CSP headers enabled.

Granted, more than 50% of the websites are at least using CSP headers (some of them only in Content-Security-Policy-Report-Only mode); that is not bad compared to the statistic that less than 4% of URLs carry CSP headers, according to a Google research study.

But if you take a closer look at the CSP headers employed by these 51 websites, some are only used to protect against clickjacking attacks, and some use the CSP header in Report-Only mode. The worst part is that most of these CSP headers are not implemented correctly to mitigate potential attacks, due to misconfiguration.

Finding 2: More than half of the websites suffer from common CSP misconfigurations

Misconfiguration 1: 'unsafe-inline' keyword without specifying a nonce in the script-src directive

According to Google research, 'unsafe-inline' within the script-src directive is the most common security misconfiguration for Content Security Policy (CSP), and 87.6% of CSPs employing the 'unsafe-inline' keyword do so without specifying a nonce, which essentially disables the protective capabilities of CSP against XSS exploitation.

There are 34 websites where 'unsafe-inline' is specified under the script-src directive in the CSP configuration, and 18 of these 34 websites (roughly 50%) use the 'unsafe-inline' keyword without specifying a nonce or a hash, which means the CSP header is not configured correctly to mitigate XSS exploitation.

This finding is really astonishing, as it means around 50% of these 34 heavily visited websites (including Facebook, eBay, and Shopify) do not configure the CSP header correctly. The following is a snapshot where 'unsafe-inline' is specified without a nonce in the CSP header employed by one of the Alexa Top 100.

Content-security-policy: default-src 'self' blob: wss: data: https:; img-src 'self' data: https:; script-src 'self' 'unsafe-eval' 'unsafe-inline' blob: data: https:; style-src 'self' 'unsafe-inline' data: https:; report-uri /csp/report

Misconfiguration 2: data: URI scheme allowed in some directives

While around 50% of CSPs employ the 'unsafe-inline' keyword without specifying a nonce, there is another misconfiguration scenario where the data: URI scheme is allowed for the script-src, frame-src, or object-src directives; this misconfiguration also defeats the XSS protection of the CSP header.

Around 25% of the CSP headers employed by the Alexa Top 100 websites allow data: URIs under the script-src, frame-src, or object-src directives (or the default-src directive when script-src is missing). For example, the following XSS attack utilizes the data: URI scheme to smuggle malicious JavaScript code into your application.

<iframe/src="data:text/html,<svg onload=alert(1)>"></iframe>
<script src="data:text/javascript,alert(1)"></script>

Misconfiguration 3:  object-src directive allows * as source or is missing (no fallback due to absence of default-src)

In some CSP headers employed by the Alexa Top 100, * (wildcard) is used in the object-src or default-src directives, which significantly reduces the protection of the CSP header, as there are multiple ways to inject malicious JavaScript code when * is used for these directives.

The following CSP header is extracted from one of the websites:

Content-Security-Policy: default-src data: 'self' 'unsafe-inline' 'unsafe-eval'; worker-src blob: 'self'; connect-src * wss: blob:; font-src * data: blob:; frame-src * blob: 'self'; img-src * data: blob: about:; media-src * data: blob:; object-src *; report-uri /csp/report;

If the website has an XSS vulnerability, an attacker could use the following payload to bypass its CSP header:

<object data="data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4="></object>

These misconfigurations do not render the CSP header entirely ineffective, but they make the CSP protection weak, even useless in some cases.

Finding 3: Some minor issues are ignored in the CSP headers

Ignored issue 1: unsafe-inline is widely added without a nonce for the style-src directive

Most security engineers downplay the potential security risks imposed by inline styles, which is clearly borne out by the data we collected by reviewing the CSP headers of the Alexa Top 100 websites. Among these CSP headers, many more allow inline styles than allow inline scripts.

  • No. of websites using the unsafe-inline keyword without a nonce under the script-src directive: 16
  • No. of websites using the unsafe-inline keyword without a nonce under the style-src directive: 22

Though allowing inline styles is not as bad as allowing inline scripts without a nonce, inline styles can open the door to a number of attacks, such as injecting a CSS keylogger to steal sensitive data. It therefore still makes sense to add a nonce under the style-src directive to prevent potential attacks using inline styles.

Ignored issue 2: No access control or throttling on the report-uri endpoint to prevent malicious users from abusing it

The report-uri directive is a very powerful CSP feature that allows website administrators to gain insight into their deployed policy by instructing the user agent to report attempted violations of the CSP to the report-uri endpoint. You can enable CSP's reporting feature by specifying the URL of your reporting endpoint with a report-uri directive in your policy. Take the CSP header employed by instagram.com, for example: all violations of the CSP policy are reported to https://www.instagram.com/security/csp_report/.

Content-security-policy: report-uri https://www.instagram.com/security/csp_report/; default-src 'self' https://www.instagram.com; img-src data: blob: https://*.fbcdn.net https://*.instagram.com https://*.cdninstagram.com https://*.facebook.com https://*.fbsbx.com https://*.giphy.com; font-src data: https://*.fbcdn.net https://*.instagram.com https://*.cdninstagram.com; media-src 'self' blob: https://www.instagram.com https://*.cdninstagram.com https://*.fbcdn.net; manifest-src 'self' https://www.instagram.com; script-src 'self' https://instagram.com https://www.instagram.com https://*.www.instagram.com https://*.cdninstagram.com wss://www.instagram.com https://*.facebook.com https://*.fbcdn.net https://*.facebook.net 'unsafe-inline' 'unsafe-eval' blob:; style-src 'self' https://*.www.instagram.com https://www.instagram.com 'unsafe-inline'; connect-src 'self' https://instagram.com https://www.instagram.com https://*.www.instagram.com https://graph.instagram.com https://*.graph.instagram.com https://i.instagram.com/graphql_www https://graphql.instagram.com https://*.cdninstagram.com https://api.instagram.com https://i.instagram.com https://*.i.instagram.com wss://www.instagram.com wss://edge-chat.instagram.com https://*.facebook.com https://*.fbcdn.net https://*.facebook.net chrome-extension://boadgeojelhgndaghljhdicfkmllpafd blob:; worker-src 'self' blob: https://www.instagram.com; frame-src 'self' https://instagram.com https://www.instagram.com https://*.instagram.com https://staticxx.facebook.com https://www.facebook.com https://web.facebook.com https://connect.facebook.net https://m.facebook.com; object-src 'none'; upgrade-insecure-requests

There are many benefits to enabling a report-uri directive for CSP, as CSP violation reports might indicate attempts to bypass or violate your CSP policy to exploit a vulnerability. But this feature also introduces some concerns, owing to the way report-uri endpoints are implemented.

One concern is that any user could send massive numbers of invalid CSP violation reports to the report-uri endpoint, as most of these endpoints have no access control or throttling to prevent this kind of attack. The flood of invalid CSP violation reports may make it much harder to spot legitimate attempts to violate the CSP policy. And if the report-uri endpoint is not scalable, a high volume of invalid reports could cause a DoS of the endpoint.

CSP itself is a very rich feature, with a dozen directives a user can specify. I am pretty sure you would spot other funky or interesting CSP implementations among the Alexa Top 100 websites. Besides that, defining a robust CSP policy without breaking the application is not easy; for example, some CSP policies disallowing inline scripts could break desired features of jQuery. That could explain why some of these top-tier websites loosen their policies.

Conclusion

While CSP can be very helpful as part of a defense-in-depth strategy, your application should not rely on CSP headers as the sole defensive mechanism, as misconfigurations can make the protection easy to bypass. The CSP data collected from the Alexa Top 100 is just the tip of the iceberg; I believe there are many more misconfigurations in the wild.

Applying a DAST or SAST tool to find potential vulnerabilities, for example XSS and clickjacking, and eliminating them is the most effective solution, as the CSP header does not eliminate the security flaws but only makes exploitation harder.

Using HTML Entity Encoding to mitigate XSS vulnerabilities? Double-check it

HTML Entity Encoding (HTML Encoding) is a commonly deployed escaping/encoding method for mitigating XSS vulnerabilities, and its use has grown along with awareness of XSS. A very large portion of web applications use HTML Entity Encoding to handle untrusted data, and most of the time this method is robust enough to protect them from XSS attacks. However, in some situations your web application may still be exposed to XSS even though HTML Entity Encoding is implemented.

A real-world example

The following example is a mock-up of one client website (the original web application is a single-page application that relies heavily on JavaScript), where HTML Entity Encoding was deployed but failed to eliminate the XSS vulnerability. Suppose the vulnerable URL is http://www.example/test.jsp?query=userinput and the injection point is the query parameter. After requesting it in a modern web browser, the source code looks like this:
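
(Reconstructed here as a sketch from the fragments quoted in the rest of this section; the element IDs and function name follow the code discussed below.)

<body onload="myFunction()">
  <input type="text" id="query" value="htmlencode(userinput)">
  <div id="search_result"></div>
  <script>
    function myFunction() {
      // The HTML-encoded value is re-parsed as HTML here.
      document.getElementById("search_result").innerHTML =
          document.getElementById("query").value;
    }
  </script>
</body>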

Here htmlencode is a customized server-side function that applies HTML encoding to a given string in order to combat XSS. The above snippet shows two pieces of information: a) the user input value is HTML encoded and reflected in the response inside one <input> field; b) the HTML-encoded value is then assigned to the innerHTML property of an element when the page loads.

HTML Entity Encoding is not sufficient here

At first glance, the mitigation looks robust enough: the user input is correctly HTML encoded and encapsulated in double quotes. Nevertheless, it turns out this web application still carries an XSS vulnerability.

When the attack URL http://www.example/test.jsp?query=<img src=x onerror=alert(1)> is requested in a web browser, the malicious code <img src=x onerror=alert(1)> is still parsed by the browser and the embedded JavaScript executes, even though the user input is HTML encoded as &lt;img src=x onerror=alert(1)&gt; in the response page.

What is behind this scenario?

To get a closer look at the problem, let us analyze the source code of the response to the request carrying the attack vector.

<body onload="myFunction()">
<input type="text" id="query" value="&lt;img src=x onerror=alert(1)&gt;">

The JavaScript statement document.getElementById("search_result").innerHTML=document.getElementById("query").value; is the culprit that spoils the HTML Entity Encoding. The HTML parser (one of the most complicated and important components of a web browser; it controls how raw HTML source code is turned into web pages) decodes the entity-encoded value &lt;img src=x onerror=alert(1)&gt; in the input field's value attribute when it first builds the page. Although the value is decoded at this step, it is not yet interpreted as HTML content. Later, the decoded value is assigned to innerHTML, which instructs the HTML parser to parse it as HTML content. In short, the encoded value in the input field is parsed twice. As a consequence, the injected malicious code is executed in the web browser, leading to an XSS attack.

Same flaws observed in some open-source web applications

After researching some open-source web applications with Qualys Web Application Scanner, WAS detected similar XSS vulnerabilities in several of them even though HTML Entity Encoding was applied. The following pattern was observed among these vulnerabilities:

<input onfocus="JavaScriptCode htmlencode(userinput) JavaScriptCode">

In this pattern, the user input is HTML-entity encoded and reflected inside an event handler (onfocus is one such handler). Similar to the scenario discussed above, the HTML Entity Encoding is defeated because the web browser (more precisely, the HTML parser) HTML-decodes the value of the event handler attribute before it is executed as JavaScript code.
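
For instance (a hedged sketch; doSearch is a hypothetical handler, not code from the affected applications), suppose the attacker submits ');alert(1);// and the application entity-encodes it into the handler:

<input onfocus="doSearch('&#39;);alert(1);//')">

Before handing the handler to the JavaScript engine, the HTML parser decodes the attribute value to doSearch('');alert(1);//'), so focusing the field executes alert(1) even though the quote character was entity-encoded.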

Conclusion

This example is not a rare or special case. Especially now that building single-page applications is the trendy, modern web development practice, it is common to see HTML-encoded user input reused within a single page. For web developers and security engineers, it is important to bear in mind that HTML parsing is tricky business. When HTML Entity Encoding is used to handle untrusted data, you should not only check whether the encoded user input is placed correctly in the response, but also pay attention to the whole context of the page.
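
For the mock-up above, one straightforward fix, offered as a sketch rather than the vendor's actual patch, is to stop re-parsing the value as HTML altogether:

function myFunction() {
  // textContent treats the value as plain text, so no second HTML parse occurs.
  document.getElementById("search_result").textContent =
      document.getElementById("query").value;
}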

Handling Cross-Site Scripting As Attacks Get More Sophisticated

Adopting third-party libraries to encode user input in the development phase and using a web application firewall in the deployment phase could fool web security managers into thinking their web applications are completely safe from Cross-Site Scripting (XSS) attacks. While it’s a good idea to employ these techniques, the illusion of safety could prove costly. These protection methods do not guarantee that your web applications are 100% free of XSS vulnerabilities, and XSS attacks that use more sophisticated techniques still occur, so care should still be taken.

In the past several months, Yahoo and Facebook patched two critical XSS vulnerabilities. These clearly show that XSS vulnerabilities continue to occur in modern, mature web applications, even at major Internet companies. The XSS vulnerability in Yahoo email was straightforward: the input validation was not robust enough to escape malicious code, and the attacker was able to break it. The one patched by Facebook was a little trickier because it exploited a bug in the file upload function to upload malicious JavaScript code and then invoked that code by calling it from a different application. Just this week, as I was finishing this blog post, an insufficient-input-validation XSS vulnerability was disclosed in the popular WordPress plugin Ninja Forms.

As I have observed from my work experience and as a bug bounty hunter, XSS vulnerabilities are definitely not going away — and many attacks are getting more sophisticated. The golden age of penetration testing, when pen testers could discover XSS vulnerabilities simply by inputting malicious code into a search box, is over. It often takes more skill and effort for webmasters to discover the XSS vulnerabilities they need to protect against.

XSS Hidden by Web Application Firewalls

Web application firewalls (WAFs) are commonly used to protect web applications. They are indeed effective in blocking a large number of web attacks. Some pen testers give up immediately when they determine the web application is deployed behind a web application firewall. However, a WAF is just like a cast that by itself does not fix the broken limb. And like a cast, a WAF is best used as a temporary protection until the underlying issue, in this case a coding error, is fixed and redeployed.

Instead of fixing the issue, WAFs just hide XSS vulnerabilities and make them harder for attackers to exploit, which is their purpose; but WAFs also make it more difficult for penetration testers and automated scanners to discover these vulnerabilities. According to published research, over 70% of existing WAF rulesets can be bypassed through XSS obfuscation techniques. As rule-based tools, WAFs trap the main cases for which rules are defined, but not all of the corner cases; the development effort spent chasing a perfect ruleset would be better spent fixing the underlying coding error.

Recommendation: When running a security audit, e.g. via automated tools or penetration tests, always disable your WAF so that the XSS vulnerabilities can be discovered to the greatest extent. You want to make it easy on yourself to find XSS vulnerabilities, so you can fix them in your code.

In my work, I have seen many examples where customers claimed that Qualys Web Application Scanning (WAS) had generated false positives, when in fact they were true positives but the customer didn’t see the exploit because they were protected by a WAF. In these cases, I could often exploit the XSS anyway by coding exploits that bypassed the WAF protection.

Example: Methods to bypass WAFs can be found on the Internet: using escape sequences (%00onload%00=%00) instead of plain text, or using an alternative payload such as document.body.outerHTML=maliciouscode, can break some WAFs.

More Sophisticated XSS Attacks

Three types of sophisticated XSS attacks are particularly difficult for pen testers and tools to discover. There is no easy new technique to combat them, beyond strengthening security practices during web development and employing a DAST (dynamic application security testing) tool for regular security audits to catch any that were inadvertently introduced into your code.

DOM-Based XSS

Attacks against DOM-based (Document Object Model) XSS vulnerabilities modify the client-side DOM tree in the victim's browser and run malicious code there, as opposed to a traditional XSS vulnerability, which exploits server-side code. DOM-based XSS vulnerabilities are harder to detect than traditional XSS because they reside in the site's own script code and the injection payloads are not reflected directly in the response. It has been estimated that 30% of XSS attacks on live websites are XSS inside JavaScript code and cannot be blocked by a WAF. Due to the difficulty of discovering them, DOM-based XSS vulnerabilities are a blind spot for many scanning tools and penetration tests.

Example: Here is the normal format of a DOM-based XSS attack vector

http://www.some.site/page.html#name=<Malicious-JavaScript-Code>
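
A hedged sketch of the kind of client-side code that makes such a vector exploitable (the element ID and fragment handling are illustrative assumptions, not taken from a specific site):

<script>
  // Reads "name" from the URL fragment, e.g. page.html#name=value.
  // The fragment never reaches the server, so a WAF cannot inspect it.
  var name = decodeURIComponent(location.hash.replace('#name=', ''));
  // Sink: the attacker-controlled value is parsed as HTML.
  document.getElementById('greeting').innerHTML = 'Hello ' + name;
</script>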

Multiple Step XSS

Multiple-step XSS vulnerabilities require the user to perform several actions in the application before the injected malicious JavaScript executes. Their main characteristic is that the attack vector is injected in one page and then echoed later in another page or application. This makes it challenging for penetration tests or ordinary DAST tools to identify this kind of vulnerability.

Example: Qualys WAS reported an XSS vulnerability in a customer's application. The customer's security team claimed they could not find the injection point and wanted to flag it as a false positive. After investigation, we found the injection point was on a different subdomain, and the malicious code was invoked by performing a search with a keyword matching the malicious code injected in the response.

Path-Based XSS

Targets of path-based XSS attacks are applications that render the request URL directly in the response body without proper encoding or input validation. The following source code snippet demonstrates how a path-based XSS vulnerability can reside in a web application.

<a href=" <?php echo $_SERVER['REQUEST_URI'];?>">Click Here</a>

Example: Path-based XSS vulnerabilities are a special breed, but they are not rare. I have discovered path-based XSS in phpBB3 and some other open-source web applications, and multiple path-based XSS vulnerabilities have been flagged in our clients' web applications. The attack vector looks like: https://example.com/public/<Malicious JavaScript Code>/directory1

Conclusion

While WAFs are a great protection measure against attacks on vulnerabilities resulting from coding errors that you have not yet fixed and deployed, you should always disable WAFs for internal testing. This helps ensure you don’t inadvertently overlook any simple XSS vulnerabilities in your code. It’s always safer to fix the underlying vulnerability than to rely on a WAF for long-term protection.

In addition, developers should take care in their coding to look for the more subtle XSS vulnerabilities, since we are regularly finding attacks that try to exploit these.