Not All Dependencies are Equal: An Empirical Study on Production Dependencies in NPM


Jasmine Latendresse

Data-driven Analysis of Software (DAS) Lab

Concordia University

Montreal, Canada

jasmine.latendresse@concordia.ca

Suhaib Mujahid

Mozilla Corporation

San Francisco, United States

[email protected]

Diego Elias Costa

LATECE Lab

Université du Québec à Montréal

Montreal, Canada

[email protected]

Emad Shihab

Data-driven Analysis of Software (DAS) Lab

Concordia University

Montreal, Canada

[email protected]

ABSTRACT

Modern software systems are often built by leveraging code written by others in the form of libraries and packages to accelerate their development. While there are many benefits to using third-party packages, software projects often become dependent on a large number of software packages. Consequently, developers are faced with the difficult challenge of maintaining their project dependencies by keeping them up-to-date and free of security vulnerabilities. However, how often are project dependencies used in production where they could pose a threat to their project’s security?

We conduct an empirical study on 100 JavaScript projects using the Node Package Manager (npm) to quantify how often project dependencies are released to production and analyze their characteristics and their impact on security. Our results indicate that less than 1% of the installed dependencies are released to production. Our analysis reveals that the functionality of a package is not enough to determine if it will be released to production or not. In fact, 59% of the installed dependencies configured as runtime dependencies are not used in production, and 28.2% of the dependencies configured as development dependencies are used in production, debunking two common assumptions of dependency management. Findings also indicate that most security alerts target dependencies not used in production, making them highly unlikely to be a risk for the security of the software. Our study unveils a more complex side of dependency management: not all dependencies are equal. Dependencies used in production are more sensitive to security exposure and should be prioritized. However, current tools lack the appropriate support in identifying production dependencies.


ASE ’22, October 10–14, 2022, Rochester, MI, USA

ACM ISBN 978-1-4503-9475-8/22/10...$15.00

https://doi.org/10.1145/3551349.3556896

KEYWORDS

third-party packages, dependencies, security, npm

ACM Reference Format:

Jasmine Latendresse, Suhaib Mujahid, Diego Elias Costa, and Emad Shihab. 2022. Not All Dependencies are Equal: An Empirical Study on Production Dependencies in NPM. In 37th IEEE/ACM International Conference on Automated Software Engineering (ASE ’22), October 10–14, 2022, Rochester, MI, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3551349.3556896

1 INTRODUCTION

The vast majority of modern software systems are built by using modular functionalities provided by open source packages. Reports estimate that more than 90% of open source and proprietary projects rely substantially on reusing open source packages [ ]. As a testament to the popularity of open source, popular package managers such as npm host more than 2 million reusable packages, covering all sorts of software functionalities [25].

While the use of open source packages significantly reduces development time and costs [ ], it also exposes software applications to vulnerabilities. In the 2020 State of the Octoverse security report, GitHub reveals that active repositories with a supported package ecosystem have a 59% chance of getting a security alert in the next 12 months [ ]. This problem is even more widespread in the JavaScript ecosystem, where nearly 40% of all npm packages rely on code with known vulnerabilities [ ]. Software vulnerabilities may lead to significant financial and reputation loss. A popular example is the 2017 Equifax cybersecurity incident caused by a web-server vulnerability in the Apache Struts package. The incident led to a data breach of millions of American citizens, costing Equifax 1.8 billion USD in security upgrades and lawsuits [26].

The problem is that developers struggle to identify what vulnerabilities may affect their software application [ ]. Current security scanners report on the severity of a vulnerability, but lack support for identifying whether the dependency is 1) used in the code and 2) part of the production software the project delivers. Developers constantly complain that security alert tools report too many false positives [ ], as even the most critical vulnerability may be unexploitable if the vulnerable dependency is never released in the production software.

arXiv:2207.14711v2 [cs.SE] 29 Aug 2022


In this paper, we study how often dependencies are actually part of production software and their impact on security based on their characteristics, usage, and context. We perform this study on 100 JavaScript projects in npm, the largest and fastest growing software ecosystem to date [ ], to answer the following three research questions:

RQ1. How many installed dependencies are production dependencies?

RQ2. What are the characteristics of production dependencies?

RQ3. How often are npm security alerts emitted for production dependencies?

Findings show that production dependencies represent a very small fraction of the total number of dependencies in each project. While projects tend to depend on hundreds of dependencies (both direct and transitive), 51 projects did not have any production dependencies, and the other 49 have a median of 5 production dependencies. Contrary to common assumptions, most dependencies declared as runtime are not shipped to production while some development dependencies are included in the production software. Consequently, we find that dependency usage and context give better insight into whether a dependency will be used in production than the nature of the dependency itself. Furthermore, our results show that not all security vulnerabilities reported by npm are an actual threat to the software in production. Our paper makes the following contributions:

• To the best of our knowledge, this is the first study to investigate the discrepancy between installed and production dependencies in open source projects.

• We report on results that challenge the assumptions of dependency management and should be revisited by researchers and practitioners.

• We investigate the support of current tools in providing better information for developers regarding the scope and context of vulnerable dependencies.

• We make our dataset of 100 projects available (https://zenodo.org/record/6518765), including all scripts used to collect and pre-process data, to facilitate replication and foment more research in the field.

The rest of the paper is organized as follows: we start by motivating our problem with an example in Section 2. We describe and justify our methodology in Section 3 and explain our results in Section 4. Implications of our findings are discussed in Section 5. We present the related work in Section 6, and discuss the limitations to our study in Section 7. Finally, we conclude our study in Section 8.

2 MOTIVATION & BACKGROUND

To motivate our study and illustrate the terminology used in this paper, we walk the reader through the creation of a simple application using create-react-app [ ]. The terms used in this example and throughout this paper are formally defined in Table 1. This example application is a single-page "Hello World" application that is provided by React when initializing a Create React App project. We create our application by simply running the command npx create-react-app my-app.

https://zenodo.org/record/6518765

Figure 1: A snippet of the package.json file listing the dependencies of our example project.

  "dependencies": {
    "@testing-library/jest-dom": "^5.16.2",
    "@testing-library/react": "^12.1.3",
    "@testing-library/user-event": "^13.5.0",
    "react": "^17.0.2",
    "react-dom": "^17.0.2",
    "react-scripts": "5.0.0",
    "web-vitals": "^2.1.4"
  },
  "devDependencies": {
    "@webpack-cli/generators": "^2.4.2",
    "css-loader": "^6.6.0",
    "html-webpack-plugin": "^5.5.0",
    "prettier": "^2.5.1",
    "style-loader": "^3.3.1",
    "webpack": "^5.69.1",
    "webpack-cli": "^4.9.2",
    "webpack-dev-server": "^4.7.4",
    "workbox-webpack-plugin": "^6.5.0"
  }

How many dependencies in our project? To achieve this single-page React application without further programming, our generated application reuses several open source packages published in npm. We refer to each of the packages as a dependency of our project. The dependency configuration of our project is stored in the package.json file, shown in Figure 1. Dependencies are grouped into two groups: runtime dependencies (“dependencies”) and development dependencies (“devDependencies”). Runtime dependencies are dependencies required by the application to function, e.g., as we build a React application, our project depends on react version 17.0.2. Development dependencies, on the other hand, are needed to develop the project, e.g., to format the code (prettier 2.5.1), and are not required by the software to run. As can be seen in Figure 1, our small application has 7 runtime dependencies and 9 development dependencies.

Once we install these dependencies locally to build and test our application (npm install), we may be surprised to see that a total of 1,764 dependencies were installed. The dependencies shown in Figure 1 are direct dependencies of our project, each of which has dependencies of its own. For instance, the package loose-envify is a dependency of react. These are called transitive dependencies and represent the vast majority of installed dependencies. As such, loose-envify is a transitive dependency of our example application. We use the term installed dependencies to refer to all dependencies of a project, both direct/transitive and development/runtime dependencies.

Is our application vulnerable? Security vulnerabilities are a widespread problem in npm [ ]. Given that our application depends on 1,764 installed dependencies, is our application affected by vulnerabilities? To verify this, we resort to using a Software Composition Analysis (SCA) tool. SCA tools are used to identify open source components in software codebases to evaluate security, license compliance and overall code quality [ ]. In our example, we use npm audit, a native tool of npm that reports vulnerabilities affecting software dependencies and maintains its own database of vulnerabilities. If a dependency is affected by one or more vulnerabilities, we refer to the dependency as a vulnerable dependency. In our example application, upon running npm audit, we receive the report that our simple application contains 6 moderate severity vulnerabilities, 13 high severity vulnerabilities, and 1 critical severity vulnerability. That is, without any further programming, our project already started with an alarming number of vulnerabilities of moderate, high, and critical severity. Examples of the reported high severity and critical severity vulnerabilities include Regular Expression Denial of Service, Template Injection, and Prototype Pollution.

Table 1: Concepts and definitions.

Concept | Definition | Example
Runtime dependency | Refers to the "runtime" configuration of a dependency in the package.json file and is needed for the application to function. | react is a runtime dependency as shown in Figure 1.
Development dependency | Refers to the "development" configuration of a dependency in the package.json file and indicates that the dependency is needed to develop the application. | webpack is a development dependency as shown in Figure 1.
Installed dependency | Refers to the dependencies installed in the project and the result of the npm install command. | The dependencies depicted in Figure 1 are part of the installed dependencies, and so are their dependencies.
Depth | Refers to the level of a dependency in the dependency tree. | npm ls is used to obtain the dependency tree.
Direct dependency | Refers to a dependency with a depth of 1. | The dependencies shown in Figure 1 are direct dependencies.
Transitive dependency | Refers to a dependency with a depth greater than 1. | The dependencies of the dependencies shown in Figure 1 are transitive dependencies.
Usage | Refers to the scope in which a dependency is used. | Figure 2 shows that react-dom is used in production.
Context | Refers to the context of the application in which a dependency is used. | In our example application, webpack is a development dependency used to bundle the application’s resources.

Can reported vulnerabilities really affect our example application in production? Vulnerable dependencies are problematic and may affect the security of our project in multiple ways. However, the risk of vulnerable dependencies reaches its peak when the dependency is needed for the software to run in a production environment. To find which dependencies are part of our production software, i.e., production dependencies, we use a module bundler. A module bundler is a tool that assists the building process of a software by resolving the software dependencies and pruning the dependencies that are not needed in the production software. The process of pruning dependencies is referred to as tree shaking. We use webpack [ ], a popular JavaScript module bundler, to build our production software and export a list of production dependencies.

Upon building our project with webpack, the tool generates a source map file, which contains the list of production dependencies of our application. From the 1,764 dependencies in our example project, Figure 2 shows that only 6 are released to production: react, object-assign, scheduler, react-dom, style-loader, and css-loader. Moreover, none of our production dependencies contained any reported vulnerability; thus, our original report of 15 vulnerabilities affected dependencies that would not be present in the application in production.

The problem: security alert fatigue. Our example showcases an important problem in current software development. Even small applications may depend on thousands of dependencies and vulnerabilities are constantly being reported by the open source community. Developers face the difficult challenge of separating security alerts that are relevant to their application security from the long reports yielded by current SCA tools [ ]. In this paper, we evaluate this problem on a scale of 100 popular JavaScript projects.

Figure 2: A snippet of the source map generated by building our example project.

  "version": 3,
  "file": "main.js",
  "mappings": "KAAK,CAACC,EAAOC,GAAI...",
  "sources": ["node_modules/css-loader/dist/runtime/api.js",
              "node_modules/object-assign/index.js",
              "node_modules/react-dom/cjs/react-dom.production.min.js",
              "node_modules/react/cjs/react.production.min.js",
              "node_modules/scheduler/index.js",
              "node_modules/style-loader/injectStylesIntoStyleTag.js"]

3 STUDY DESIGN

The goal of the paper is to study how often project dependencies are

shipped to production and their impact on the security of software

projects. In this section, we describe how we select and curate the

set of study projects (Sections 3.1 and 3.2), and how we identify

production dependencies (Section 3.3). We provide an overview of

our methodology in Figure 3.

3.1 Dataset of Candidate Projects

The focus of our study is to investigate how active software development JavaScript projects use their dependencies. To this aim, we start by collecting data of a large number of JavaScript repositories as candidate projects for our study. Many studies have used the number of GitHub stargazers as a way to select candidate projects [ ]. Thus, we start with 11,860 popular JavaScript projects that were collected on July 27th, 2020 with at least 100 stargazers.


Finding what dependencies are shipped to production is a very challenging task, making it impractical to apply this analysis on a large scale [ ]. In our study, we opt to select projects that already make use of tree shaking (see Section 2 for a more in-depth explanation). Specifically, we select projects using either webpack or rollup [ ] because they are two of the most popular module bundlers for JavaScript projects and they have integrated tree shaking support.

To find out which projects use webpack and rollup, we automatically parse the package.json files of the 11,860 projects to identify 1) if any of the bundlers are declared as a dependency and 2) if the tree shaking algorithm is enabled for the project. Through this process, we find that 155 JavaScript projects make use of webpack or rollup, and have tree shaking enabled.
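The first of the two checks above can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the function name `declared_bundlers` is hypothetical, and the second check (whether tree shaking is enabled) is left out because the text does not detail how it was detected.

```python
import json

# The two module bundlers considered in the study.
BUNDLERS = ("webpack", "rollup")


def declared_bundlers(package_json_text):
    """Return the bundlers declared in a package.json, in either dependency section."""
    manifest = json.loads(package_json_text)
    declared = set()
    for section in ("dependencies", "devDependencies"):
        # A missing or null section is treated as empty.
        declared.update(manifest.get(section) or {})
    return sorted(name for name in BUNDLERS if name in declared)


example = '{"devDependencies": {"webpack": "^5.69.1", "prettier": "^2.5.1"}}'
print(declared_bundlers(example))
```

A project would be kept as a candidate when the returned list is non-empty and the (separate) tree shaking check also passes.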

3.2 Building Candidate Projects

To assess whether a dependency is used in production, we have to successfully build each project in a production environment with a module bundler. During the build of a project, the module bundler first looks for all of the dependencies in the project and constructs a dependency graph (dependency resolution). The dependency graph is then converted along with source code into a single file (packing) called the bundle. Source maps are then generated after a successful build.

To build the candidate projects, we first clone each of the 155 JavaScript projects locally. We planned to build a framework to automate the build of all the 155 JavaScript projects. However, we soon realized that many projects require specific building commands and setup to be built successfully. In fact, the majority of the projects did not support the standard build command (npm run-script build). Furthermore, the environmental settings varied across projects, e.g., some projects require specific NodeJS versions and identifying this automatically is very challenging. We then proceed to semi-manually build each project using the following methodology:

(1) Read project documentation. We read the documentation of all the 100 studied projects to identify the specifics of each project build. The goal of this step is to identify all the steps of the building process: the build commands, supported Node versions, supported package manager (e.g., Yarn or npm), and any other specificity of the project building configuration. At this point, we also confirmed that all selected projects are related to software development, i.e., are not personal toy-projects.

(2) Install dependencies. We install all dependencies specified in the package.json file by using the npm install command. This generates a node_modules folder in every project’s home directory which contains all installed dependencies.

(3) Build project. Following each project’s documentation, we build each project in the dataset. The first author manually followed the steps of the building process to ensure the build was successful, the source maps containing the production dependencies were generated, and the yielded artifacts targeted the production environment.

(4) Generate source maps. Upon the successful completion of the building process, source maps are generated and saved in the project’s temporary folder or home directory.

Table 2: Descriptive Statistics of the Selected Projects.

                  Mean     Median   Min   Max
# stars           4827.6   1224     112   74201
# commits         1364.9   496      31    6188
# contributors    62.3     26       4     401
age (years)       5.2      5        1     12

After our careful process, we successfully build and generate source maps for 100 JavaScript projects. Of the 55 projects that failed in our process, the main culprit was the generation of the source maps file. In most of the failed cases, the projects’ configuration did not have the flexibility to generate the source maps file. For example, we found some projects created using the create-react-app package that does support module bundlers and tree shaking, but does not have the option to output source maps.

We present descriptive statistics of the 100 projects for which we successfully built and generated source maps in Table 2. The projects of our dataset are very popular (median 1,224 stargazers), tend to be mature projects (median of 5 years of development and 496 commits) and are developed by medium-sized teams of developers (median of 26 developers).

3.3 Identifying Dependencies in Production

To identify production dependencies, we first collect all dependencies found in the source maps of each project using a mix of a source map parser [ ] and regular expressions (regex). Then, to obtain the version of each dependency found in the source maps, we locate its package.json file in the respective project’s node_modules folder and parse it. This results in a dataset of production dependencies with their corresponding version.
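The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser used in the study: it assumes a source map already loaded as a dictionary with a "sources" list (as in Figure 2), and the function name `production_packages` and the regular expression are hypothetical.

```python
import re

# Captures the path segment that follows "node_modules/", including an
# optional "@scope/" prefix for scoped packages such as "@babel/runtime".
_PACKAGE_RE = re.compile(r"node_modules/((?:@[^/]+/)?[^/]+)/")


def production_packages(source_map):
    """Return the sorted package names referenced in a source map's "sources"."""
    packages = set()
    for source in source_map.get("sources", []):
        matches = _PACKAGE_RE.findall(source)
        if matches:
            # For nested paths (node_modules/a/node_modules/b/...), the last
            # match is the package that directly owns the file.
            packages.add(matches[-1])
    return sorted(packages)


# The "sources" entries shown in Figure 2.
figure_2 = {
    "version": 3,
    "file": "main.js",
    "sources": [
        "node_modules/css-loader/dist/runtime/api.js",
        "node_modules/object-assign/index.js",
        "node_modules/react-dom/cjs/react-dom.production.min.js",
        "node_modules/react/cjs/react.production.min.js",
        "node_modules/scheduler/index.js",
        "node_modules/style-loader/injectStylesIntoStyleTag.js",
    ],
}
print(production_packages(figure_2))
```

Applied to the Figure 2 snippet, this yields the six production dependencies named in Section 2. The version of each package would then be read from its package.json under node_modules, as described above.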

In addition to identifying dependencies used in production, we also want to identify two very important characteristics of all dependencies, as they have an influence on the risk of vulnerabilities [ ]: 1) the dependency scope, runtime or development, and 2) whether the dependency is a direct or transitive dependency of the project. To classify a dependency into runtime or development, we analyse the package.json file of a project, classifying dependencies configured in the "dependencies" section as runtime dependencies, and classifying dependencies declared in "devDependencies" as development dependencies. Since transitive dependencies are not listed in the package.json file, we identify the type of the original dependency, which determines the type of the transitive dependency.
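A minimal sketch of this scope classification is shown below. The function names and the `introduced_by` mapping are hypothetical, and the tie-break for a package listed in both sections is an assumption that the text does not address.

```python
def direct_scopes(manifest):
    """Map each direct dependency in a parsed package.json to its scope."""
    scopes = {}
    for name in manifest.get("devDependencies", {}):
        scopes[name] = "development"
    for name in manifest.get("dependencies", {}):
        # Assumption: a package listed in both sections counts as runtime.
        scopes[name] = "runtime"
    return scopes


def transitive_scopes(scopes, introduced_by):
    """Give each transitive dependency the scope of the direct dependency
    that pulls it in (introduced_by: transitive name -> direct name)."""
    return {pkg: scopes[root] for pkg, root in introduced_by.items()}


# A shortened version of the manifest in Figure 1.
manifest = {
    "dependencies": {"react": "^17.0.2", "react-dom": "^17.0.2"},
    "devDependencies": {"webpack": "^5.69.1", "prettier": "^2.5.1"},
}
scopes = direct_scopes(manifest)
inherited = transitive_scopes(scopes, {"loose-envify": "react"})
print(scopes["react"], scopes["webpack"], inherited["loose-envify"])
```

Here loose-envify, a dependency of react, inherits the runtime scope of react.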

To classify installed dependencies into direct or transitive dependencies, using the command npm list we generate the dependency tree, a hierarchical representation of the relationship between dependencies. The npm list command lists all installed dependencies in JSON format, including the name, version, path, and depth of each dependency. From the depth, we identify each dependency as direct or transitive, i.e., direct dependencies have depth = 1, while transitive dependencies have depth > 1.
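The depth rule can be sketched as follows. This is a minimal illustration that assumes the dependency tree is available as nested "dependencies" mappings (the general shape of JSON output from npm list), with depth inferred from the nesting level. The function names are hypothetical, the versions in the example tree are illustrative, and keeping the smallest depth per package and version follows the aggregation step named in Figure 3.

```python
def min_depths(tree):
    """Map (name, version) to the smallest depth at which it appears in the tree."""
    depths = {}
    stack = [(tree, 0)]  # the project root sits at depth 0
    while stack:
        node, depth = stack.pop()
        for name, child in (node.get("dependencies") or {}).items():
            key = (name, child.get("version"))
            child_depth = depth + 1
            if key not in depths or child_depth < depths[key]:
                depths[key] = child_depth
            stack.append((child, child_depth))
    return depths


def kind(depth):
    """Direct dependencies have depth 1; anything deeper is transitive."""
    return "direct" if depth == 1 else "transitive"


# Illustrative tree: versions of the transitive packages are made up.
tree = {
    "name": "my-app",
    "dependencies": {
        "react": {
            "version": "17.0.2",
            "dependencies": {
                "loose-envify": {"version": "1.4.0"},
                "object-assign": {"version": "4.1.1"},
            },
        },
        "react-dom": {
            "version": "17.0.2",
            "dependencies": {"loose-envify": {"version": "1.4.0"}},
        },
    },
}
depths = min_depths(tree)
print(kind(depths[("react", "17.0.2")]), kind(depths[("loose-envify", "1.4.0")]))
```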

Our methodology has one limitation: we cannot automatically resolve missing peer dependencies. Peer dependencies are used to decouple dependencies between projects, to ensure a single version of the package is installed for all dependencies. For example, in applications with many npm packages depending on react, react can be declared as a peer dependency to prevent the installation of multiple (possibly conflicting) versions of react. Unlike runtime and development dependencies, peer dependencies are not automatically installed by npm. Instead, they must be included by the code that uses the package as a dependency. We find that 37 projects in our dataset have missing peer dependencies. Since it is not possible to automatically resolve missing peer dependencies for all 37 projects, we exclude these dependencies from our analysis.

Figure 3: Overview of our approach for filtering projects and collecting dependencies. The pipeline stages are: dataset of candidate projects (11,680 JavaScript projects); selecting projects using a module bundler and tree shaking (155 JavaScript projects); build projects and generate source maps (100 JavaScript projects); extract dependencies from all projects' dependency tree (219,829 npm dependencies); group dependencies by project and version, aggregate by minimal depth (95,902 npm dependencies).

4 RESULTS

In this section, we present the results of our three research questions.

For each research question, we present its motivation, the approach

to answer the question, and the results.

RQ1: How many installed dependencies are production dependencies?

Motivation: While reusing packages may reduce development time, developers have to constantly maintain their dependencies to fix bugs in the packages and mitigate the problems of vulnerable dependencies [ ]. However, identifying dependencies used in production is not a trivial task, making it difficult to prioritize dependency-related maintenance activities [37].

In this research question, we want to assess how often dependencies of the selected projects are actually production dependencies. Answering this question is the first step to understand how often a runtime and development dependency is used in production. It will also help us better understand how dependencies are used in practice and how they impact the security of software.

Approach: To approach this research question, we use the methodology described in Section 3.3. That is, we start by installing all dependencies from each project to retrieve the list of installed dependencies and their respective versions. To classify an installed dependency into direct or transitive, we generate the dependency tree of each studied project. Then, to identify production dependencies, we build all software projects with their respective module bundler (webpack or rollup). This building process was done manually by following the building steps specified in the project documentation, to ensure each project is built correctly and without errors. After building the project, we analyze the yielded source maps to identify the production dependencies. Finally, we cross-reference the installed dependencies and production dependencies to classify each project dependency into production/non-production, runtime/development, direct/transitive and report our findings.
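The final cross-referencing step can be sketched as a simple tally. This is a minimal illustration with a hypothetical function name and input shape; the sample records are a small subset of the example application from Section 2, not study data.

```python
from collections import Counter


def cross_tabulate(installed, production):
    """Count dependencies per (usage, scope, kind) combination.

    installed:  iterable of (name, scope, depth) records
    production: set of package names found in the source maps
    """
    table = Counter()
    for name, scope, depth in installed:
        usage = "production" if name in production else "non-production"
        kind = "direct" if depth == 1 else "transitive"
        table[(usage, scope, kind)] += 1
    return table


# A few dependencies of the example application (see Figures 1 and 2).
installed = [
    ("react", "runtime", 1),
    ("react-dom", "runtime", 1),
    ("web-vitals", "runtime", 1),
    ("css-loader", "development", 1),
    ("prettier", "development", 1),
    ("scheduler", "runtime", 2),
]
production = {"react", "react-dom", "css-loader", "scheduler"}
table = cross_tabulate(installed, production)
for key in sorted(table):
    print(key, table[key])
```

In this subset, web-vitals is a runtime dependency that is not shipped, while css-loader is a development dependency that is, which are the two cases the findings below quantify.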

Finding 1: Of the 100 projects, 51 projects contain no production dependencies. To make better sense of our results, we split our dataset of 100 projects into two sets: projects with production dependencies (49 projects) and projects without production dependencies (51 projects). Table 3 shows the total number of dependencies and their characteristics in both sets of projects. The 51 projects with no production dependencies have installed a total of 46 thousand dependencies, including direct and transitive packages; however, none of the installed dependencies are used in production. More interestingly, among the installed dependencies, there were 1,005 packages that were declared to be runtime dependencies, which are supposedly required at the runtime of the final software, but were not included in the final production artifact.

Table 3: Dependency profile of projects with and without production dependencies in absolute numbers and median of aggregated value per project. The percentages are always in relation to the # of Installed Dependencies.

                Projects with Zero Production Deps    Projects with 1+ Production Deps
Dependencies    Total             Median              Total             Median
Installed       46,031 (100%)     851                 53,421 (100%)     1,017
Runtime         1,005 (2.1%)      0                   873 (1.6%)        5
Dev             45,025 (97.9%)    832                 52,542 (98.4%)    1,017
Direct          1,539 (3.4%)      40                  2,098 (3.9%)      29
Transitive      44,492 (96.6%)    809                 51,307 (96.1%)    963
Production      –                 –                   497 (0.9%)        5

We also note that the set of 51 projects with no production dependencies has a median of zero runtime dependencies. We confirm this finding through manual investigation and find that 39 projects in our dataset only declare development dependencies. Most of these projects are libraries meant to be used by other projects as development tools. Examples of such projects are Vuex, a state management pattern for Vue.js applications; three.js, a popular cross-browser 3D library; and polished, a lightweight toolset for writing styles in JavaScript. All those library projects have the incentive to depend on little to no runtime dependencies, as the fewer dependencies they have, the less constrained their users may be to rely on their libraries [15, 20].

Finding 2: From the 49 projects with production dependencies, production dependencies represent less than 1% of the installed dependencies. The results show that projects with production dependencies have a total of 53,421 dependencies, of which only 497 dependencies (0.9%) are released to production (see Table 3). Analyzing the median number of dependencies per project (see Median column in Table 3), we find that projects have a median of 5 production dependencies while depending on a median of over a thousand dependencies. However, we notice that not all runtime dependencies are used in production. The total number of runtime dependencies installed (873) far exceeds the number of production dependencies (497), indicating that many runtime dependencies may be incorrectly configured or not used in the code.

Table 4: Characteristics of dependencies in projects with production dependencies.

                           Direct   Transitive   Total
Production       Dev       62       77           139
                 Runtime   175      178          353
Non-production   Dev       1,809    50,594       52,403
                 Runtime   52       458          510
Total                      2,098    51,307       53,405

Figure 4: Number of production dependencies on the 49 projects with one or more production dependencies.

Figure 4 shows the distribution of the number of production dependencies per project. We can observe that 15 projects contain a single production dependency and the vast majority of projects (65.3%) have fewer than 10 dependencies used in production. Still, we found some projects that depend heavily on packages in their production build, with ProjectMirador/mirador being the project with the most production dependencies in our dataset with 92.

Finding 3: More than half (59%) of the runtime dependencies are not used in production. As shown in the "Runtime" rows of Table 4, we find that 510 out of 863 of the total runtime dependencies are not shipped to the production bundle. Runtime dependencies are dependencies (supposedly) required by the application to run. Our results, however, show that in the majority of the cases, dependencies are declared as runtime but are not actually used in the code, and thus are excluded by the module bundler during the build. This finding suggests developers mistakenly maintain unused dependencies in their project configuration, which indicates that they lack the necessary information to determine whether a dependency is actually used by the software in production. This is corroborated by related work [ ], where authors reported that unused dependencies occur in 80% of studied projects.

51 out of 100 projects do not use any dependencies in production. In the 49 projects that ship dependencies to production, production dependencies account for less than 1% of the installed dependencies. Contrary to common belief, 59% of runtime dependencies are not used in production.

RQ2: What are the characteristics of production

dependencies?

Motivation: Production dependencies are the prime security liability in software systems since they can compromise running software [ ]. Current SCA tools may not distinguish dependency scope (i.e., production, non-production) [ ], which may lead to reporting unexploitable vulnerabilities (false positives). They may also only consider direct dependencies although vulnerabilities can be introduced transitively [ ]. The problem is that assumptions about production dependencies are not always correct. In fact, RQ1 showed that runtime dependencies are not always in production. In this research question, we study the characteristics of production dependencies to establish a practical understanding of how and in what context they are used. Such findings help improve current SCA tools as they provide insights on how dependencies are used in practice.

Approach: To identify the characteristics of dependencies used in production, we consider the production dependencies identified in RQ1 and classify them based on their scope (runtime, development), depth, and usage.

In theory, one can identify the scope of a dependency by looking at the nature of the functionality provided by a package. For instance, packages that provide development utilities should not become production dependencies. To investigate to what extent the nature of the package determines whether it will be used as a production dependency, we analyze how packages are released to production across the 100 studied projects. We analyze a total of 1,269 unique packages. We then classify the packages into three categories: 1) packages that are always used in production, 2) packages that are never used in production, and 3) packages that are sometimes used in production.
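The three-way classification described above can be sketched as follows. This is a minimal illustration and not the authors' tooling: the `Installation` record shape, the `classifyPackages` function, and the sample rows are assumptions introduced here, and the sample data is made up.

```typescript
// Hypothetical record shape: one row per (project, package) installation.
interface Installation {
  project: string;
  pkg: string;
  inProduction: boolean; // true if the package ends up in the project's production bundle
}

type Usage = "always" | "never" | "sometimes";

// Group installations by package name, then label each package by how
// consistently it is released to production across the projects that install it.
function classifyPackages(rows: Installation[]): Record<string, Usage> {
  const counts: Record<string, { prod: number; total: number }> = Object.create(null);
  for (const row of rows) {
    if (!counts[row.pkg]) counts[row.pkg] = { prod: 0, total: 0 };
    counts[row.pkg].total += 1;
    if (row.inProduction) counts[row.pkg].prod += 1;
  }
  const labels: Record<string, Usage> = Object.create(null);
  for (const name of Object.keys(counts)) {
    const c = counts[name];
    labels[name] = c.prod === 0 ? "never" : c.prod === c.total ? "always" : "sometimes";
  }
  return labels;
}

// Made-up sample rows, for illustration only.
const sampleRows: Installation[] = [
  { project: "p1", pkg: "eslint", inProduction: false },
  { project: "p2", pkg: "eslint", inProduction: false },
  { project: "p1", pkg: "is-promise", inProduction: true },
  { project: "p1", pkg: "react", inProduction: true },
  { project: "p2", pkg: "react", inProduction: false },
];
const labels = classifyPackages(sampleRows);
```

A package seen in a single project can only be labeled "always" or "never", so the "sometimes" category is only meaningful for packages installed in at least two projects.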

Finding 4: 28.2% of production dependencies are development dependencies.

The first section of Table 4 shows the characteristics of production dependencies. We find that 28.2% of production dependencies are development dependencies and the remaining 71.8% are declared as runtime dependencies. One would expect all dependencies released to production to be runtime dependencies, since they provide the application with specific functionalities to be used by the client. It is therefore surprising to find that almost 30% of the dependencies released to production are development dependencies, since such dependencies are, by default, not included in the production bundle.

While unusual, having a development dependency in production occurs in 37 of the projects in our dataset. To better understand this, we perform an exhaustive inspection of the dependency configuration of the 37 projects and deduce two possible causes for a development dependency to be in production. First, the selected projects use module bundlers, which disregard the configuration of the package.json file and use source code analysis to identify what should be a production dependency. Developers may not be as

Not All Dependencies are Equal: An Empirical Study on Production Dependencies in NPM ASE ’22, October 10–14, 2022, Rochester, MI, USA

careful to specify their development dependencies as their building process does not depend on a correct specification of development and runtime dependencies [ ]. In fact, of the 37 projects with development dependencies in production, 4 (10.8%) declared all of their dependencies as development dependencies although all of them have at least 1 dependency in production. Second, the dependency may be initially declared under the "dependencies" property of the package.json file but intentionally moved by the developer to "devDependencies" to get rid of security warnings, as explained in a create-react-app GitHub issue [ ]. The author of the issue explains that npm audit reports vulnerabilities for code that never runs in production, but strictly at build time in development. They then suggest moving vulnerable dependencies to "devDependencies" to get rid of the security warning. We believe development dependencies are unlikely to end up in production in projects that do not use module bundlers. By default, npm does not include development dependencies in a production build. This means that a project that requires a development dependency at runtime will not function because of the missing dependency.
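The mismatch between declared scope and actual usage can be made concrete by comparing a project's package.json with the set of packages observed in its production bundle. The sketch below assumes that set has already been extracted (for example, from a source map); the `findScopeMismatches` function, the `ScopeMismatch` shape, and the version ranges in the sample manifest are hypothetical.

```typescript
// Minimal subset of package.json that matters for scope.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

interface ScopeMismatch {
  unusedRuntime: string[];   // declared under "dependencies" but absent from the bundle
  devInProduction: string[]; // declared under "devDependencies" but present in the bundle
}

// `bundled` lists the package names found in the production bundle (assumed input).
function findScopeMismatches(manifest: PackageJson, bundled: string[]): ScopeMismatch {
  const runtime = Object.keys(manifest.dependencies || {});
  const dev = Object.keys(manifest.devDependencies || {});
  const isBundled = (name: string) => bundled.indexOf(name) !== -1;
  return {
    unusedRuntime: runtime.filter((name) => !isBundled(name)).sort(),
    devInProduction: dev.filter(isBundled).sort(),
  };
}

// Illustrative manifest; the version ranges are placeholders.
const manifest: PackageJson = {
  dependencies: { "query-string": "^7.0.0", lodash: "^4.17.21" },
  devDependencies: { "prop-types": "^15.8.0", eslint: "^8.0.0" },
};
const mismatches = findScopeMismatches(manifest, ["query-string", "prop-types"]);
```

In this made-up project, `lodash` is a runtime dependency that never reaches production, while `prop-types` is a development dependency that does.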

Finding 5: The majority of production dependencies (51.8%) are transitive dependencies.

Looking at the Transitive column of Table 4, we notice that 51.8% of production dependencies are transitive, i.e., dependencies of the project's direct dependencies. These results suggest that developers may not have control over the majority of production dependencies. Naturally, a transitive dependency can only be released to production if the dependency that introduces it is also released to production. Hence, developers should be extra careful when selecting production dependencies, preferably by selecting packages that have few or no production dependencies of their own, to reduce the attack surface through vulnerable dependencies.
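One way to act on this advice is to count how many packages a candidate dependency would pull in transitively before adopting it. The following sketch assumes a precomputed dependency graph; the `DependencyGraph` type, the `transitiveDependencies` function, and the `pkg-*` names are hypothetical.

```typescript
// Hypothetical dependency graph: package name -> names of its own runtime dependencies.
type DependencyGraph = Record<string, string[]>;

// Breadth-first traversal collecting every package reachable from `root`
// (excluding `root` itself). The `seen` map also guards against cycles.
function transitiveDependencies(graph: DependencyGraph, root: string): string[] {
  const seen: Record<string, boolean> = Object.create(null);
  const queue: string[] = [root];
  const reached: string[] = [];
  seen[root] = true;
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dep of graph[current] || []) {
      if (!seen[dep]) {
        seen[dep] = true;
        reached.push(dep);
        queue.push(dep);
      }
    }
  }
  return reached.sort();
}

// Made-up graph: pkg-a brings three packages with it, pkg-e brings none.
const graph: DependencyGraph = {
  "pkg-a": ["pkg-b", "pkg-c"],
  "pkg-b": ["pkg-d"],
  "pkg-c": ["pkg-d"],
  "pkg-d": [],
  "pkg-e": [],
};
const viaA = transitiveDependencies(graph, "pkg-a");
const viaE = transitiveDependencies(graph, "pkg-e");
```

All else being equal, `pkg-e` exposes a smaller attack surface than `pkg-a` because it adds no further packages to the bundle.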

Finding 6: The 237 production dependencies come from 183 unique npm packages. From these, 43 are sometimes not used in production in other projects.

To put things in perspective, we evaluate the number of unique npm packages in our dataset by grouping the packages by name and obtain 1,269 unique npm packages. From this, we find that 1,086 (85.6%) are never used in production since they offer functionalities that are development-only. For example, eslint, installed in 79 projects, is a static code analysis tool that is used to identify problematic patterns found in JavaScript code; @babel/core, installed in 70 projects, is a command line interface tool that facilitates working with babel; and rollup, installed in 65 projects, is a build tool for JavaScript projects.

For the rest of the packages, we find that 183 packages are used in production at least once. Taking a closer look at the production packages, we find that 140 (76.5%) packages are always used in production when installed in a project and that such packages do not occur frequently. In fact, they occur in at most 2 different projects and are installed as runtime dependencies. For example, is-promise, a library that tests whether an object is a promises-a+ promise, query-string, a library that parses and stringifies URL query strings, and react-fast-compare, a library that provides specific handling of fast deep equality comparison for React, are all installed in 2 projects and used in production 100% of the time they are installed.

Interestingly, we find that 43 (23.5%) of the 183 production packages are not always shipped to production. This indicates that some

Table 5: Frequently installed packages that are both used and not used in production.

Package           # Production installations   Total # installations   % in production
react             4                            40                      10%
react-dom         3                            37                      8.1%
prop-types        13                           23                      56.5%
@babel/runtime    10                           19                      52.6%
lodash            4                            14                      28.6%
core-js           5                            13                      38.5%
classnames        5                            8                       62.5%
react-is          1                            5                       20%
react-redux       4                            5                       80%

packages are used differently (in production and not in production) across projects regardless of their functionalities. We show in Table 5 examples of such packages, and how often they are in production versus how often they are installed. The results show that react, a library for building user interfaces, is the most frequently installed of these packages, appearing in 40 projects, but is only released to production in 4 projects. In contrast, react-redux, a React binding for Redux allowing React components to read data from a Redux store, only appears in 5 projects, but is released to production in 4 out of 5 projects. In only one project (redux-little-router) is react-redux not released to production and declared as a development dependency. We further inspect the package.json of redux-little-router and find that react-redux is a peer dependency; thus, it is not included in the production bundle of the project. It is also worth noting that redux-little-router is a lightweight library that provides flexible React bindings and components. Thus, the project makes a conscious effort to mitigate dependency bloat, declaring most of its dependencies as development dependencies, and including react, react-dom, react-redux, and redux as peer dependencies.

The main takeaway from this finding is that we cannot identify production dependencies by looking at the functionalities of a package alone. As we have shown, the scope of a dependency can vary based on the context and usage of a package, which means it may differ from project to project. For example, a module bundler may be used in production in one application since it uses some of its functionalities at runtime, but may only be used in development in another application. Thus, it is important for SCA tools to include this scope analysis in their approach so that developers can more easily identify production dependencies based on their own usage and context.

Our findings indicate that 28.2% of production dependencies come from development dependencies and that 51.8% come from transitive dependencies. The functionality of a package alone does not determine whether it will be shipped to production: 43 of 183 packages encountered in production in one project are not shipped to production in other projects.

ASE ’22, October 10–14, 2022, Rochester, MI, USA Jasmine Latendresse, Suhaib Mujahid, Diego Elias Costa, and Emad Shihab

RQ3: How often are npm security alerts emitted

for production dependencies?

Motivation: The observations made in RQ1 suggest that the majority of the dependencies are not used in production. While vulnerabilities in non-production dependencies may affect the development environment (e.g., installing packages with malicious code), it is when a vulnerable dependency is released to production that the threat of exploitation reaches its peak [ ]. Developers should constantly run scanners to identify security alerts in their project and prioritize fixes in production dependencies, to avoid having their software compromised. The problem is that tools such as npm audit often report many false alerts for deployed code, making vulnerability reports noisy and bloating audit resources [9, 39]. In this research question, we investigate how often security alerts are emitted for production dependencies compared to non-production dependencies and the characteristics of vulnerable dependencies.

Approach: To investigate how often vulnerabilities are encountered in production and non-production dependencies, we first generate the npm vulnerability report of each project by using the npm audit tool. Next, to obtain the npm audit reports in a parseable CSV format, we adapt the npm-deps-parser [6], a tool that parses, summarizes, and prints npm audit JSON output to markdown. From this, each vulnerability report is identified with the project name, the vulnerable dependency and version, the severity, and a unique link to the GitHub Advisory Database (GAD) [ ], a database of security advisories affecting the open source world. To obtain the scope and depth of each vulnerable dependency, we cross-reference the set of vulnerable dependencies with the set of production dependencies and installed dependencies for each project. Because of the limitations discussed in Section 3.3, we could not identify the scope and depth of 29 vulnerable dependencies and exclude them from further analysis.
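The cross-referencing step can be sketched as a simple lookup. The record shape below mirrors the fields listed above (project, dependency, version, severity, advisory link), but the `Alert` and `ScopedAlert` types, the `tagAlerts` function, and the placeholder advisory URLs are assumptions made for illustration; this is not the authors' script and not the native output format of npm audit.

```typescript
// Hypothetical alert record with the fields described in the approach.
interface Alert {
  project: string;
  dependency: string;
  version: string;
  severity: "low" | "moderate" | "high" | "critical";
  advisoryUrl: string;
}

interface ScopedAlert extends Alert {
  scope: "production" | "non-production" | "unknown";
}

// Both maps go from project name to the package names observed for that project.
function tagAlerts(
  alerts: Alert[],
  productionByProject: Record<string, string[]>,
  installedByProject: Record<string, string[]>
): ScopedAlert[] {
  return alerts.map((alert) => {
    const production = productionByProject[alert.project] || [];
    const installed = installedByProject[alert.project] || [];
    let scope: ScopedAlert["scope"] = "unknown"; // not found in either set
    if (production.indexOf(alert.dependency) !== -1) scope = "production";
    else if (installed.indexOf(alert.dependency) !== -1) scope = "non-production";
    return { ...alert, scope };
  });
}

// Made-up alerts; the URLs are placeholders, not real advisories.
const alerts: Alert[] = [
  { project: "p1", dependency: "pkg-a", version: "1.0.0", severity: "high", advisoryUrl: "https://example.com/advisory/1" },
  { project: "p1", dependency: "pkg-b", version: "2.1.0", severity: "low", advisoryUrl: "https://example.com/advisory/2" },
  { project: "p1", dependency: "pkg-z", version: "0.3.0", severity: "moderate", advisoryUrl: "https://example.com/advisory/3" },
];
const tagged = tagAlerts(alerts, { p1: ["pkg-a"] }, { p1: ["pkg-a", "pkg-b"] });
```

Alerts tagged "unknown" correspond to dependencies that cannot be matched against either set, which is the kind of case excluded from further analysis.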

Finding 7: A total of 608 security alerts are emitted for dependencies of 32 projects, yet none are related to production dependencies.

In our dataset, no security alerts were emitted for 68 projects. The remaining 32 projects reported a total of 608 security alerts for 456 vulnerable dependencies, i.e., the same dependency may trigger multiple security alerts. These 32 projects reported a median of 16 security alerts, none related to production dependencies.

There are a few reasons why security alerts may have been emitted only for non-production dependencies. First, as seen in RQ1, the vast majority of dependencies (99%) are not released to production, so vulnerabilities are roughly 99 times more likely to be encountered in non-production dependencies than in production dependencies. Second, developers of the selected projects are likely making a conscious effort to update production dependencies to mitigate security vulnerabilities, since they may be aware of which dependencies are used in production (e.g., developers open a PR in the project InstantSearch to update a vulnerable dependency [ ]). The problem, however, is that tools such as npm audit make no distinction as to whether security alerts refer to non-production dependencies. Developers have to know themselves which dependencies are released to production in order to pick out the relevant security alerts that need urgent action, making it harder to prioritize management efforts. This is shown in related work [ ], where authors

Table 6: Characteristics of vulnerable dependencies reported by npm vulnerability alerts.

              Direct   Transitive   Total
Development   9        410          419
Runtime       1        7            8
Total         10       417          427

Table 7: Count of vulnerability reports per severity level with the npm recommended action.

Vulnerability severity   Recommended action                Runtime dependency   Dev dependency
low                      Address at your discretion        3                    33
moderate                 Address as time allows            1                    226
high                     Address as quickly as possible    5                    263
critical                 Address immediately               0                    45

reported that 69% of the surveyed developers claimed to be unaware

of their vulnerable dependencies and that dependency updates are

perceived as extra workload and responsibility.

Finding 8: 98.1% (419) of the vulnerable dependencies are development dependencies.

In this analysis, we switch from security alert reports to vulnerable dependencies, as multiple reports may be issued for the same dependency under different vulnerabilities. The first row of Table 6 shows the number of development and runtime vulnerable dependencies reported by our experiment. The npm audit tool reports security alerts for a total of 419 vulnerable development dependencies, representing 98.1% of all vulnerable dependencies identified. Of the 419 vulnerable development dependencies, 9 (2.1%) are direct dependencies, and 410 (97.9%) are transitive dependencies. Next, we analyze the severity of the vulnerability reports in relation to the characteristics of vulnerable dependencies, as shown in Table 7. We find that all critical and most of the high-severity reports are emitted for development dependencies.

In some cases, tools such as npm audit allow developers to filter out development dependencies from the security reports, as they are supposedly not released to production [ ]. It is dangerous, however, to completely ignore the security maintenance of development dependencies. Some development dependencies are used in production, as seen in RQ2, and hence run the risk of being exploited in a production environment. Vulnerable transitive dependencies that are released to production are equally dangerous since, even if they are reported by npm, developers are not in control of their update.

32 projects in our dataset reported a total of 608 security alerts, but none of the alerts referred to a production dependency. Projects have a median of 16 security alerts, but the vast majority refer to non-production development dependencies, which do not represent a threat to their running application.


5 DISCUSSION

In this section, we discuss the implications of our work and possible solutions for current source-code-based tools.

5.1 Implications

Tracking production dependencies is very challenging.

While our study focuses on a selection of 100 popular JavaScript projects, the results showcase the difficulties of mapping a project's production dependencies. This difficulty arises primarily because assumptions commonly held by the development community regarding dependency management do not hold in practice:

(1) Assumption 1: Runtime dependencies are always shipped to production. Our results showed that the majority of dependencies declared as runtime are not used in production (RQ1). Developers may spend time ineffectively managing runtime dependencies due to security alerts, without confirming that such dependencies are bundled in their delivered software.

(2) Assumption 2: Development dependencies are never shipped to production. In projects that use a module bundler, dependencies declared as development may be shipped to production (RQ2). In fact, development dependencies represent more than a quarter (28.2%) of all production dependencies identified in our study. Developers may disregard all their development dependencies as being irrelevant for security upgrades when, in fact, some vulnerable development dependencies are shipped in their delivered software.

(3) Assumption 3: The functionality provided by the package is sufficient to determine if it is a production dependency. Particularly in cases where packages provide runtime utilities, our results show that 43 out of 183 packages are released to production in some projects but not in others. Thus, the package's functionality is not sufficient to determine whether a package is used in production (RQ2).

These assumptions have the potential to affect the security of the delivered software, as developers may wrongly assume which dependencies are sensitive to security exploits.

Not all vulnerabilities in dependencies are a security risk for the software in production.

Prior research has shown that not all vulnerabilities are relevant for the software in production [ ]. In this paper, we expand on this by studying the relevance of dependencies for the security risk of software in production. Our findings indicate that, given the prominence of non-production dependencies (RQ1), the vast majority of security alerts will be emitted for dependencies that do not impact the security of software in production (RQ3).

To put things in perspective, we analyze the types of vulnerabilities reported by npm audit. The most common type of vulnerability identified in the studied projects is Regular Expression Denial of Service (ReDoS), accounting for 25.3% of all reported vulnerabilities and for 27% of high-severity vulnerabilities. While diligent dependency management is of utmost importance to mitigate security risks, developers should be mindful of the types of alerts they should prioritize. In the case of a ReDoS attack, the performance of an application is compromised if it contains a regular expression whose matching time grows exponentially on malicious input. However, we find that 97.3% of the dependencies affected by a ReDoS vulnerability are development-only, which tend not to be part of production software. Previous research shows that developers tend to ignore security alerts when they receive a lot of them [ ]. Our approach allows them to focus on the important ones first (i.e., security alerts for vulnerable dependencies in production).
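To illustrate why ReDoS only matters where attacker-controlled input reaches the regular expression at runtime, the sketch below times a textbook vulnerable pattern on inputs that almost match. The pattern `/^(a+)+$/` and the helper names are chosen here for illustration and do not come from any of the studied packages; the exponential slowdown assumes a backtracking regex engine, which is what JavaScript engines typically use.

```typescript
// Nested quantifiers make this pattern prone to catastrophic backtracking.
const vulnerablePattern = /^(a+)+$/;

// Build an input that almost matches: n 'a' characters followed by '!'.
function almostMatching(n: number): string {
  return new Array(n + 1).join("a") + "!";
}

// Time a single match attempt, in milliseconds.
function timeMatch(pattern: RegExp, input: string): { matched: boolean; ms: number } {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

// Each extra 'a' roughly doubles the work in a backtracking engine,
// so the reported times should climb steeply as n grows.
for (const n of [12, 16, 20, 22]) {
  const result = timeMatch(vulnerablePattern, almostMatching(n));
  console.log("n=" + n + " matched=" + result.matched + " ms=" + result.ms);
}
```

If such a pattern lives in a linter or build tool, it only ever sees the developer's own files; the same pattern in bundled code that parses user input is the case worth prioritizing.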

Source maps and tree shaking can benefit developers beyond client-side applications.

In this paper, we use module bundlers to accurately differentiate between installed dependencies and production dependencies. Module bundlers are most commonly used in client-side applications, which are generally defined as libraries or frameworks running in a Web browser (e.g., React, Vue, Angular) to support the development of Web applications. Module bundlers, however, can be beneficial far beyond client-side applications by helping developers:

(1) Prioritize addressing security alerts on production dependencies. As security alerts are very commonly issued for projects that rely on open source code, developers should prioritize addressing security issues that have the potential to affect their production software, by identifying vulnerabilities affecting their production dependencies.

(2) Prioritize maintenance tasks on production dependencies. As projects depend on an increasingly high number of software dependencies, updating all dependencies in every release may become increasingly prohibitive. Updating dependencies always carries the risk of breaking changes [ ], leading to software bugs and mistrust between project maintainers [ ]. Hence, developers should prioritize updating production dependencies to focus their maintenance tasks on packages that may affect their delivered software.

5.2 Towards Better Tool Support

SCA tools are constantly used by software projects to control the risk related to software dependencies, such as vulnerabilities and compliance with open source licenses [ ]. To understand the support current SCA tools provide for production dependencies, we investigate four popular tools: npm audit, Snyk, Dependabot, and OWASP Dependency-Check. We analyze the documentation of the SCA tools and apply them to some of our studied projects to assess their capabilities and limitations.

We present in Table 8 an overview of the features related to dependency scope and usage from four popular SCA tools. All SCA tools we assess cover all dependencies of a software project, including both direct and transitive dependencies. Given that a project may have thousands of installed dependencies, we now dive into the filtering capabilities of the tools. We note that only npm audit and Snyk [ ] provide ways of filtering security alerts based on whether the vulnerability affects runtime/development dependencies or direct/transitive dependencies. The filtering of runtime/development dependencies is based on project configuration (e.g., the package.json file); thus, it is subject to limitations when it comes to identifying production versus non-production dependencies. Neither Dependabot nor OWASP Dependency-Check allows users to filter security alerts based on the scope or depth of their dependencies.

It is worth noting that none of the tools provide a way to differentiate between production and non-production dependencies.


Table 8: Vulnerable dependency (VD) characteristics based metrics reported by current tools and support to locate vulnerable code (VC) in JavaScript projects.

                          Cover all      Filter security alerts by                                       Locate
Tool                      dependencies   Runtime   Development   Direct   Transitive   In production    dep code
npm audit                 Yes            Yes       Yes           Yes      Yes          No               No
Snyk                      Yes            Yes       Yes           Yes      Yes          No               No
Dependabot                Yes            No        No            No       No           No               No
OWASP Dependency-Check    Yes            No        No            No       No           No               No

There is no support, for instance, for inputting source map files into the tools to help filter out vulnerabilities that concern non-production dependencies. We believe adding source map support to SCA tools would offer developers better insight into their production bundle without relying so much on dependency configurations, which have been shown to be inconsistent across different projects.
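As a rough sketch of what such source map support could look like, the code below reads the `sources` array of a Source Map v3 object, keeps the entries that point into `node_modules`, and uses the resulting package names to filter a list of vulnerable dependencies. The function names and the sample paths are assumptions; real bundlers emit differently prefixed paths, and the sketch assumes forward slashes.

```typescript
// Subset of the Source Map v3 format used here.
interface SourceMap {
  version: number;
  sources: string[];
}

// Extract the package name from a path that goes through node_modules,
// using the innermost node_modules segment and handling scoped packages.
function packageFromSourcePath(path: string): string | null {
  const marker = "node_modules/";
  const idx = path.lastIndexOf(marker);
  if (idx === -1) return null; // first-party source file
  const parts = path.slice(idx + marker.length).split("/");
  if (parts[0].charAt(0) === "@") {
    return parts.length > 1 ? parts[0] + "/" + parts[1] : null;
  }
  return parts[0] || null;
}

// Unique, sorted package names that contributed code to the bundle.
function productionPackages(map: SourceMap): string[] {
  const seen: Record<string, boolean> = Object.create(null);
  for (const source of map.sources) {
    const name = packageFromSourcePath(source);
    if (name !== null) seen[name] = true;
  }
  return Object.keys(seen).sort();
}

// Keep only the vulnerable dependencies that are present in the bundle.
function filterToProduction(vulnerable: string[], map: SourceMap): string[] {
  const inBundle = productionPackages(map);
  return vulnerable.filter((name) => inBundle.indexOf(name) !== -1);
}

// Made-up source map content, for illustration only.
const sampleMap: SourceMap = {
  version: 3,
  sources: [
    "webpack:///./src/index.js",
    "webpack:///./node_modules/react/index.js",
    "webpack:///./node_modules/react/cjs/react.production.min.js",
    "webpack:///./node_modules/@babel/runtime/helpers/extends.js",
  ],
};
const bundledPackages = productionPackages(sampleMap);
const relevantAlerts = filterToProduction(["eslint", "react"], sampleMap);
```

The filter depends only on what the bundler actually emitted, so it is unaffected by whether a package was declared under "dependencies" or "devDependencies".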

Finally, we find that none of the tools provide a way to locate where the vulnerable dependency is used in code (column "Locate dep code"). Developers have to rely on their own set of static/dynamic analysis tools to know exactly where the vulnerable dependency is used in the codebase. We believe static analysis tools would benefit from using the features provided by module bundlers that scan the code for import statements to provide the path to the source file in which a dependency is imported and used.
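A simplified version of such an import scan is sketched below. It uses a regular expression rather than a real parser, so it only approximates what a module bundler does and can be fooled by comments or string literals; the `locateImports` function, the `ImportLocation` shape, and the in-memory `files` map are assumptions made for illustration.

```typescript
interface ImportLocation {
  file: string;
  line: number; // 1-based line number
  specifier: string;
}

// Captures the module specifier in `import ... from "x"`, `import "x"`,
// `import("x")`, and `require("x")`. Only the first match per line is used.
const SPECIFIER = /(?:\bfrom\s+|\bimport\s*\(?\s*|\brequire\s*\(\s*)["']([^"']+)["']/;

// `files` maps a file path to its source text (assumed to be loaded already).
function locateImports(files: Record<string, string>, pkg: string): ImportLocation[] {
  const found: ImportLocation[] = [];
  for (const file of Object.keys(files).sort()) {
    const lines = files[file].split("\n");
    for (let i = 0; i < lines.length; i++) {
      const match = SPECIFIER.exec(lines[i]);
      if (match === null) continue;
      const specifier = match[1];
      // Accept the package itself or a subpath such as "core-js/stable".
      if (specifier === pkg || specifier.indexOf(pkg + "/") === 0) {
        found.push({ file, line: i + 1, specifier });
      }
    }
  }
  return found;
}

// Made-up project files, for illustration only.
const files: Record<string, string> = {
  "src/App.js": 'import React from "react";\nimport { render } from "react-dom";\nconst qs = require("query-string");\n',
  "src/polyfills.js": 'import "core-js/stable";\n',
};
const reactUses = locateImports(files, "react");
const coreJsUses = locateImports(files, "core-js");
```

Matching on the exact specifier or a `pkg/` prefix keeps `react` from also reporting imports of `react-dom`.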

6 RELATED WORK

In this section, we discuss the related literature divided into three

aspects. First, we discuss works that have focused on the challenges

related to the Software Bill of Materials. Then, we discuss works

describing the challenges of dependency management in software

ecosystems. Finally, we discuss existing tools and approaches to

detect vulnerable dependencies.

6.1 Software Bill of Materials

The Cybersecurity and Infrastructure Security Agency (CISA) defines the Bill of Materials (BOM) as a nested inventory of components in a piece of software [ ]. The process of identifying production dependencies is part of constructing the BOM of a software system. Several studies have proposed approaches to consolidate the BOM of software applications [ ]. Zajdel et al. discussed that users of open source software tend to arbitrarily download the software into their build systems but rarely keep track of which versions they use, which results in unnecessary software being left in the application, increasing the risk of potential vulnerabilities [ ]. Coelho et al. proposed a data-driven approach to measure the level of maintenance activity in GitHub projects [ ]. The authors found that 16% of the studied open source projects became unmaintained over the course of one year. They also reported that software tools such as compilers and editors have the highest maintenance activity over time and proposed that a metric of the level of maintenance activity of GitHub projects can help developers in selecting open source projects.

These prior studies focus on the importance of selecting well-maintained software libraries, and propose approaches to alleviate the challenges related to open source code reuse. However, the challenges of constructing the BOM for dynamic languages like JavaScript are still present. Thus, our study focuses on JavaScript projects and leverages existing approaches (i.e., module bundlers and tree shaking) to help developers maintain their software and decrease the risks related to open source code reuse by analyzing the software's dependencies and reporting on the ones that are actually used in the code.

6.2 Dependency Studies

Package ecosystems and the presence of vulnerable dependencies have been studied in the literature [ ]. Hejderup et al. report that one-third of the npm packages use vulnerable dependencies [ ]. Similar to our study, the authors suggest that the context in which a package is used is a possible reason for not fixing vulnerable dependencies. Abdalkareem et al. conduct an empirical analysis of security vulnerabilities in Python packages [ ]. They find that the number of vulnerabilities in the PyPI ecosystem increases over time and that vulnerabilities take a median of more than 3 years to be discovered, regardless of their severity. They emphasize the need for a more effective process to detect vulnerabilities in open source packages since both npm and PyPI allow a package release to be published to the registry with no security checks. Lauinger et al. conduct the first large-scale study of JavaScript open source projects and investigate the relationship between outdated dependencies and dependencies with known vulnerabilities [ ]. They report that transitive dependencies are more likely to be vulnerable since developers may not be aware of them and have less control over them, which further corroborates our findings in RQ3. Similarly, Williams et al. report that 26% of open source Maven packages have known vulnerabilities and refer to a lack of meaningful controls of the components used in proprietary projects as a partial explanation for this high number of vulnerable dependencies [46].

These prior studies focus on the presence of known vulnerabili-

ties in popular package ecosystems and the reason why the number

of vulnerable dependencies is so high. However, their analysis does

not consider the scope of dependencies (i.e., they do not distinguish

production and non-production dependencies). As a result, the

studied vulnerable dependencies may not be exploitable. Zapata et

al. investigate vulnerable dependency migrations of

npm

packages

and evaluate the impact of a vulnerability in the

package on 60

JavaScript projects using the vulnerable version of the package [

The authors find that up to 73.3% of the dependent applications were safe from the vulnerability since they did not actually use the vulnerable code. The study also highlights that it is not trivial to map vulnerable code to client usage for JavaScript, which further corroborates our findings in RQ1.


6.3 Detecting Vulnerable Dependencies

Alfadel et al. study the use of Dependabot security pull requests in 2,904 JavaScript open source GitHub projects [ ]. Results show that the majority (65.42%) of the security-related pull requests are merged, often within a day, and that the severity of the vulnerable dependency or the potential risk of breaking changes is not associated with the merge time. Ponta et al. propose a pragmatic approach to facilitate the assessment of vulnerable dependencies in open source libraries by mapping patch-based changes of vulnerabilities onto the affected components of the application [ ]. Sejfia et al. present Amalfi, a machine-learning based approach for automatically detecting potentially malicious packages [ ]. The authors evaluate their approach on 96,287 npm package versions published over the course of one week and identify 95 previously unknown vulnerabilities. Pashchenko et al. propose Vuln4Real, a methodology that addresses the over-inflation problem of academic and industrial approaches for reporting vulnerable dependencies in free and open source software (FOSS) [ ]. Vuln4Real extends state-of-the-art approaches to analyzing dependencies by filtering development-only dependencies, grouping dependencies by project, and assessing dead dependencies. Their evaluation of Vuln4Real shows that the methodology significantly reduces the number of false alerts for code in production (i.e., dependencies wrongly flagged as vulnerable). Pashchenko et al.'s work is the closest to ours since it considers similar aspects in relation to the relevance of vulnerable dependencies: exploitability and dependency scope. Our study touches on another aspect that is not discussed in Vuln4Real, namely the context in which a dependency is used versus how it is configured. Our paper shows that there is a discrepancy between the configuration of dependencies and their usage, and that this discrepancy may affect the exploitability of a vulnerability (i.e., its relevance to the application).

Imtiaz et al. [ ] present an in-depth case study comparing the analysis reports of 9 SCA tools on OpenMRS, a large web application composed of Maven and npm projects. The study shows that the tools vary in their vulnerability reporting and that the count of vulnerable dependencies reported for npm projects ranges from 32 to 239. Of the 9 studied SCA tools, 4 freely available tools could be applied to npm projects: OWASP Dependency-Check, Snyk, Dependabot, and npm audit. The results show that all 4 tools detect vulnerable dependencies across all scopes and depths and that reported vulnerabilities are mostly introduced through transitive dependencies, except for Dependabot. While the authors of this paper report on the coverage capabilities of SCA tools, our study mainly focuses on the data that is shown to the user. For example, npm audit covers dependencies of all scopes when reporting vulnerabilities, but it is the user's responsibility to filter the vulnerable dependencies by scope (production or development). That is, SCA tools do not explicitly report on the scope of vulnerable dependencies, and when this is done manually by users, the analysis depends on the project's dependency configurations rather than dependency usage.
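To make the distinction between configured scope and actual usage concrete, the following is a minimal sketch in JavaScript and not the tooling used in this study. It assumes a bundler-generated source map whose `sources` entries contain `node_modules/<package>/` path segments, together with a parsed `package.json`. The function names `packagesInSourceMap` and `compareScopes` are hypothetical and introduced only for illustration.

```javascript
// Hypothetical helper: collect the npm package names that appear in the
// `sources` array of a source map (i.e., packages whose code was bundled).
function packagesInSourceMap(sourceMap) {
  const marker = "node_modules/";
  const names = new Set();
  for (const source of sourceMap.sources ?? []) {
    // Use the last node_modules segment so nested installs resolve
    // to the innermost package.
    const idx = source.lastIndexOf(marker);
    if (idx === -1) continue; // first-party source file
    const parts = source.slice(idx + marker.length).split("/");
    // Scoped packages span two path segments, e.g. @babel/runtime.
    const name = parts[0].startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
    if (name) names.add(name);
  }
  return names;
}

// Hypothetical helper: contrast the configured scope in package.json
// with the packages actually present in the production bundle.
function compareScopes(packageJson, sourceMap) {
  const bundled = packagesInSourceMap(sourceMap);
  const runtime = Object.keys(packageJson.dependencies ?? {});
  const dev = Object.keys(packageJson.devDependencies ?? {});
  return {
    // Configured as runtime dependencies but absent from the bundle.
    runtimeNotBundled: runtime.filter((n) => !bundled.has(n)),
    // Configured as development dependencies but present in the bundle.
    devBundled: dev.filter((n) => bundled.has(n)),
  };
}
```

Given a project's `package.json` and a source map from its production build, `compareScopes` returns the direct dependencies whose configured scope disagrees with their presence in the bundle. The sketch only inspects direct dependencies and forward-slash paths; transitive dependencies and bundlers that rewrite source paths would need additional handling.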

7 THREATS TO VALIDITY

Threats to internal validity concern experimenter bias and errors. Our method of analysis relies on building software projects with their configured module bundler to identify production dependencies, and errors in this process may introduce false positives/negatives in our analysis. We mitigate this threat by 1) only selecting projects that already use module bundlers to minimize any intervention that could introduce bugs in the process, 2) building each project manually by following the project's documentation, 3) manually inspecting the built artifacts (e.g., installed dependencies, source map files), and 4) removing 55 projects that showed evidence of failed builds (e.g., errors, empty source map files). To further confirm the validity of this process, we also sampled 7 projects from our dataset and asked contributors to validate our results by checking the accuracy of the yielded production dependencies. We received responses from 4 projects, and contributors confirmed the yielded classifications, helping us validate the soundness of our methodology.

Threats to external validity concern the generalizability of the findings. We purposefully select projects that already use module bundlers, which could limit the types of projects to which our findings generalize. First, module bundlers tend to be used primarily by projects that want to minimize their production dependencies, such as client-side packages like web applications and libraries. In fact, our finding that development dependencies are shipped to production is unlikely to occur in projects that do not use module bundlers. Second, our dataset is strictly composed of open source JavaScript projects; thus, our results may differ if a study is performed on proprietary projects or projects written in other languages.

8 CONCLUSION AND FUTURE WORK

This research investigates project dependencies that are released to production and their impact on security and dependency management. We conducted our study on 100 projects from npm, one of the largest and fastest growing software ecosystems. Our results showed that production dependencies are rare among the installed dependencies of a project, but are difficult to identify. Commonly held assumptions of dependency management do not hold in practice, and context is more important in determining the scope of a dependency than its configuration. Furthermore, we evaluated how often security alerts are reported for production dependencies, and found that none of the vulnerability reports are emitted for dependencies released to production. Rather, the majority of the alerts are emitted for development, transitive dependencies, which has two main implications: 1) not every vulnerability is a threat to the software in production, and 2) vulnerabilities can be introduced transitively regardless of their scope, which further motivates the need for SCA tools to provide such an analysis.

Our paper outlines directions for future work. Using module bundlers as a way to identify production dependencies may augment current SCA tools to provide better insights on the scope of their dependencies within their project's context and usage. Consequently, module bundlers or similar tools may benefit far more than just client-side applications and should be part of the build process of projects that extensively rely on open source code.

REFERENCES

[1] [n. d.]. Software Bill of Materials | CISA. https://www.cisa.gov/sbom

[2] 2019. 2019 State of the Software Supply Chain. https://www.sonatype.com/hubfs/SSC/2019%20SSC/SON_SSSC-Report-2019_jun16-DRAFT.pdf


[3] 2019. Eight Key Findings Illustrating How to Make Open Source Work Even Better for Developers. https://cdn2.hubspot.net/hubfs/4008838/Resources/The-2019-Tidelift-managed-open-source-survey-results.pdf

[4] 2019. webpack. https://webpack.js.org/

[5] 2020. Do "dependencies" and "devDependencies" matter when using Webpack? https://jsramblings.com/do-dependencies-devdependencies-matter-when-using-webpack/

[6] 2020. npm-deps-parser. https://github.com/nVisium/npm-deps-parser

[7] 2020. Securing the World's Software. https://octoverse.github.com/static/github-octoverse-2020-security-report.pdf

[8] 2021. Create react app. https://create-react-app.dev/

[9] 2021. Help, `npm audit` says I have a vulnerability in react-scripts! Issue #11174, facebook/create-react-app. https://github.com/facebook/create-react-app/issues/11174

[10] 2021. rollup.js. https://rollupjs.org/guide/en/

[11] 2022. The Complete Guide to Software Composition Analysis - FOSSA. https://fossa.com/complete-guide-software-composition-analysis

[12] 2022. GitHub Advisory Database. https://github.com/advisories

[13] 2022. Snyk | Developer security | Develop fast. Stay secure. https://snyk.io/

[14] Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab. 2017. Why do developers use trivial packages? an empirical case study on npm. Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (08 2017). https://doi.org/10.1145/3106237.3106267

[15] Rabe Abdalkareem, Vinicius Oda, Suhaib Mujahid, and Emad Shihab. 2020. On the impact of using trivial packages: an empirical case study on npm and PyPI. Empirical Software Engineering 25 (01 2020), 1168–1204. https://doi.org/10.1007/s10664-019-09792-9

[16] Mahmoud Alfadel, Diego Elias Costa, Emad Shihab, and Mouafak Mkhallalati. 2021. On the Use of Dependabot Security Pull Requests. In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR). 254–265. https://doi.org/10.1109/MSR52588.2021.00037

[17] Md Atique, Reza Chowdhury, Rabe Abdalkareem, and Emad Shihab. 2019. On the Untriviality of Trivial Packages: An Empirical Study of npm JavaScript Packages. Journal of IEEE Transactions on Software Engineering 01 (2019). http://das.encs.concordia.ca/uploads/atique_tse2021.pdf

[18] Victor R. Basili, Lionel C. Briand, and Walcélio L. Melo. 1996. How reuse influences productivity in object-oriented systems. Commun. ACM 39 (10 1996), 104–116. https://doi.org/10.1145/236156.236184

[19] Chris Bogart, Christian Kästner, James Herbsleb, and Ferdian Thung. 2021. When and How to Make Breaking Changes. ACM Transactions on Software Engineering and Methodology 30 (07 2021), 1–56. https://doi.org/10.1145/3447245

[20] Xiaowei Chen, Rabe Abdalkareem, Suhaib Mujahid, Emad Shihab, and Xin Xia. 2021. Helping or not Helping? Why and How Trivial Packages Impact the npm Ecosystem. Empirical Software Engineering 26 (03 2021). https://doi.org/10.1007/s10664-020-09904-w

[21] Jailton Coelho, Marco Túlio Valente, Luciano Milen, and Luciana Lourdes Silva. 2020. Is this GitHub Project Maintained? Measuring the Level of Maintenance Activity of Open-Source Projects. CoRR abs/2003.04755 (2020). arXiv:2003.04755 https://arxiv.org/abs/2003.04755

[22] Diego Elias Costa, Suhaib Mujahid, Rabe Abdalkareem, and Emad Shihab. 2021. Breaking Type-Safety in Go: An Empirical Study on the Usage of the unsafe Package. IEEE Transactions on Software Engineering (2021), 1–1. https://doi.org/10.1109/TSE.2021.3057720

[23] Diego Elias Costa, Suhaib Mujahid, Rabe Abdalkareem, and Emad Shihab. 2021. Breaking Type-Safety in Go: An Empirical Study on the Usage of the unsafe Package. IEEE Transactions on Software Engineering (2021), 1–1. https://doi.org/10.1109/TSE.2021.3057720

[24] Joel Cox, Eric Bouwers, Marko van Eekelen, and Joost Visser. 2015. Measuring Dependency Freshness in Software Systems. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol. 2. 109–118. https://doi.org/10.1109/ICSE.2015.140

[25] Alexandre Decan, Tom Mens, and Philippe Grosjean. 2019. An Empirical Comparison of Dependency Network Evolution in Seven Software Packaging Ecosystems. Empirical Software Engineering 24 (02 2019). https://doi.org/10.1007/s10664-017-9589-y

[26] Josh Fruhlinger. 2020. Equifax data breach FAQ: What happened, who was affected, what was the impact? https://www.csoonline.com/article/3444488/equifax-data-breach-faq-what-happened-who-was-affected-what-was-the-impact.html

[27] Emitza Guzman, David Azócar, and Yang Li. 2014. Sentiment Analysis of Commit Comments in GitHub: An Empirical Study. In Proceedings of the 11th Working Conference on Mining Software Repositories (Hyderabad, India) (MSR 2014). Association for Computing Machinery, New York, NY, USA, 352–355. https://doi.org/10.1145/2597073.2597118

[28] J. I. Hejderup. 2015. In Dependencies We Trust: How vulnerable are dependencies in software modules? repository.tudelft.nl (2015). https://repository.tudelft.nl/islandora/object/uuid:3a15293b-16f6-4e9d-b6a2-f02cd52f1a9e?collection=education

[29] Nasif Imtiaz, Seaver Thorn, and Laurie Williams. 2021. A comparative study of vulnerability reporting by software composition analysis tools. Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (10 2021). https://doi.org/10.1145/3475716.3475769

[30] Abbas Javan Jafari, Diego Elias Costa, Rabe Abdalkareem, Emad Shihab, and Nikolaos Tsantalis. 2021. Dependency Smells in JavaScript Projects. IEEE Transactions on Software Engineering (2021), 1–1. https://doi.org/10.1109/tse.2021.3106247

[31] Riivo Kikas, Georgios Gousios, Marlon Dumas, and Dietmar Pfahl. 2017. Structure and Evolution of Package Dependency Networks. In Proceedings of the 14th International Conference on Mining Software Repositories (Buenos Aires, Argentina) (MSR '17). IEEE Press, 102–112. https://doi.org/10.1109/MSR.2017.55

[32] Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, and Katsuro Inoue. 2017. Do developers update their library dependencies? Empirical Software Engineering 23, 1 (may 2017), 384–417. https://doi.org/10.1007/s10664-017-9521-5

[33] Tobias Lauinger, Abdelberi Chaabane, Sajjad Arshad, William Robertson, Christo Wilson, and Engin Kirda. 2017. Thou Shalt Not Depend on Me: Analysing the Use of Outdated JavaScript Libraries on the Web. In Proceedings 2017 Network and Distributed System Security Symposium. Internet Society. https://doi.org/10.14722/ndss.2017.23414

[34] Suhaib Mujahid, Diego Elias Costa, Rabe Abdalkareem, Emad Shihab, Mohamed Aymen Saied, and Bram Adams. 2021. Toward Using Package Centrality Trend to Identify Packages in Decline. IEEE Transactions on Engineering Management (2021), 1–15. https://doi.org/10.1109/tem.2021.3122012

[35] Emerson Murphy-Hill, Ciera Jaspan, Caitlin Sadowski, David Shepherd, Michael Phillips, Collin Winter, Andrea Knight, Edward Smith, and Matt Jorde. 2019. What Predicts Software Developers' Productivity? IEEE Transactions on Software Engineering (2019), 1–1. https://doi.org/10.1109/tse.2019.2900308

[36] Stack Overflow. [n. d.]. Stack Overflow Developer Survey 2021. https://insights.stackoverflow.com/survey/2021

[37] Ivan Pashchenko, Henrik Plate, Serena Ponta, Antonino Sabetta, and Fabio Massacci. 2018. Vulnerable open source dependencies: counting those that matter. 1–10. https://doi.org/10.1145/3239235.3268920

[38] Ivan Pashchenko, Henrik Plate, Serena Ponta, Antonino Sabetta, and Fabio Massacci. 2020. Vuln4Real: A Methodology for Counting Actually Vulnerable Dependencies. IEEE Transactions on Software Engineering PP (09 2020), 1–1. https://doi.org/10.1109/TSE.2020.3025443

[39] Ivan Pashchenko, Duc-Ly Vu, and Fabio Massacci. 2020. A Qualitative Study of Dependency Management and Its Security Implications. Association for Computing Machinery, New York, NY, USA, 1513–1531. https://doi.org/10.1145/3372297.3417232

[40] Henrik Plate, Serena Ponta, and Antonino Sabetta. 2015. Impact assessment for vulnerabilities in open-source software libraries. 411–420. https://doi.org/10.1109/ICSM.2015.7332492

[41] Baishakhi Ray, Daryl Posnett, Vladimir Filkov, and Premkumar Devanbu. 2014. A Large Scale Study of Programming Languages and Code Quality in Github. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (Hong Kong, China) (FSE 2014). Association for Computing Machinery, New York, NY, USA, 155–165. https://doi.org/10.1145/2635868.2635922

[42] Adriana Sejfia and Max Schäfer. 2022. Practical Automated Detection of Malicious npm Packages. arXiv preprint arXiv:2202.13953 (2022).

[43] unisil. 2021. Source Map Parser. https://github.com/unisil/source-map-parser

[44] Haroen Viaene. 2021. feat(dependencies): update algoliasearch-helper. https://github.com/algolia/instantsearch.js/pull/4936. (Accessed on 05/04/2022).

[45] Stefan Wagner and Emerson Murphy-Hill. 2019. Factors That Influence Productivity: A Checklist. 69–84. https://doi.org/10.1007/978-1-4842-4221-6_8

[46] Jeff Williams and Arshan Dabirsiaghi. 2012. The unfortunate reality of insecure libraries. Asp. Secur. Inc (2012), 1–26.

[47] Stan Zajdel, Diego Elias Costa, and Hafedh Mili. 2022. Open Source Software: An Approach to Controlling Usage and Risk in Application Ecosystems. In Proceedings of the 26TH ACM International Systems and Software Product Line Conference. arXiv. https://doi.org/10.48550/ARXIV.2206.10358

[48] Rodrigo Zapata, Raula Kula, Bodin Chinthanet, Takashi Ishio, Kenichi Matsumoto, and Akinori Ihara. 2018. Towards Smoother Library Migrations: A Look at Vulnerable Dependency Migrations at Function Level for npm JavaScript Packages. 559–563. https://doi.org/10.1109/ICSME.2018.00067