Not All Dependencies are Equal: An Empirical Study on
Production Dependencies in NPM
Jasmine Latendresse
Data-driven Analysis of Software (DAS) Lab
Concordia University
Montreal, Canada
jasmine.latendresse@concordia.ca
Suhaib Mujahid
Mozilla Corporation
San Francisco, United States
Diego Elias Costa
LATECE Lab
Université du Québec à Montréal
Montreal, Canada
Emad Shihab
Data-driven Analysis of Software (DAS) Lab
Concordia University
Montreal, Canada
ABSTRACT
Modern software systems are often built by leveraging code written
by others in the form of libraries and packages to accelerate their
development. While there are many benefits to using third-party
packages, software projects often become dependent on a large
number of software packages. Consequently, developers are faced
with the dicult challenge of maintaining their project dependen-
cies by keeping them up-to-date and free of security vulnerabilities.
However, how often are project dependencies used in production
where they could pose a threat to their project’s security?
We conduct an empirical study on 100 JavaScript projects using
the Node Package Manager (npm) to quantify how often project
dependencies are released to production and analyze their char-
acteristics and their impact on security. Our results indicate that
less than 1% of the installed dependencies are released to produc-
tion. Our analysis reveals that the functionality of a package is not
enough to determine if it will be released to production or not. In
fact, 59% of the installed dependencies configured as runtime dependencies
are not used in production, and 28.2% of the dependencies
used in production are configured as development dependencies,
debunking two common assumptions of dependency management.
Findings also indicate that most security alerts target dependencies
not used in production, making them highly unlikely to be a risk
for the security of the software. Our study unveils a more complex
side of dependency management: not all dependencies are equal.
Dependencies used in production are more sensitive to security
exposure and should be prioritized. However, current tools lack the
appropriate support in identifying production dependencies.
ASE ’22, October 10–14, 2022, Rochester, MI, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9475-8/22/10.. . $15.00
https://doi.org/10.1145/3551349.3556896
KEYWORDS
third-party packages, dependencies, security, npm
ACM Reference Format:
Jasmine Latendresse, Suhaib Mujahid, Diego Elias Costa, and Emad Shihab.
2022. Not All Dependencies are Equal: An Empirical Study on Production
Dependencies in NPM. In 37th IEEE/ACM International Conference on Auto-
mated Software Engineering (ASE ’22), October 10–14, 2022, Rochester, MI, USA.
ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3551349.3556896
1 INTRODUCTION
The vast majority of modern software systems are built by using
modular functionalities provided by open source packages. Reports
estimate that more than 90% of open source and proprietary projects
rely substantially on reusing open source packages [3, 7]. As a testament
to the popularity of open source, popular package managers
such as npm host more than 2 million reusable packages, covering
all sorts of software functionalities [25].
While the use of open source packages significantly reduces
development time and costs [18, 35, 45], it also exposes software
applications to vulnerabilities. In the 2020 State of the Octoverse
security report, GitHub reveals that active repositories with a supported
package ecosystem have a 59% chance of getting a security
alert in the next 12 months [7]. This problem is even more widespread
in the JavaScript ecosystem, where nearly 40% of all npm
packages rely on code with known vulnerabilities [2]. Software vulnerabilities
may lead to significant financial and reputation loss. A
popular example is the 2017 Equifax cybersecurity incident caused
by a web-server vulnerability in the Apache Struts package. The in-
cident led to a data breach of millions of American citizens, costing
Equifax 1.8 billion USD in security upgrades and lawsuits [26].
The problem is that developers struggle to identify which vulnerabilities
may affect their software application [32]. Current security
scanners report on the severity of a vulnerability, but lack support
for identifying whether the dependency is 1) used in the code and 2)
part of the production software the project delivers. Developers
constantly complain that security alert tools report too many false
positives [37, 38], as even the most critical vulnerability may be
unexploitable if the vulnerable dependency is never released in the
production software.
arXiv:2207.14711v2 [cs.SE] 29 Aug 2022
In this paper, we study how often dependencies are actually part
of a production software and their impact on security based on
their characteristics, usage, and context. We perform this study
on 100 JavaScript projects in npm, the largest and fastest growing
software ecosystem to date [36], to answer the following three
research questions:
RQ1. How many installed dependencies are production dependencies?
RQ2. What are the characteristics of production dependencies?
RQ3. How often are npm security alerts emitted for production dependencies?
Findings show that production dependencies represent a very
small fraction of the total number of dependencies in each project.
While projects tend to depend on hundreds of dependencies (both
direct and transitive), 51 projects did not have any production dependencies,
and the remaining 49 have a median of 5 production dependencies.
Contrary to common assumptions, most dependencies declared as
runtime are not shipped to production, while some development dependencies
are included in the production software. Consequently,
we find that dependency usage and context give better insight
into whether a dependency will be used in production than the
nature of the dependency itself. Furthermore, our results show that
not all security vulnerabilities reported by npm are an actual threat
to the software in production. Our paper makes the following contributions:
• To the best of our knowledge, this is the first study to investigate
  the discrepancy between installed and production
  dependencies in open source projects.
• We report results that challenge common assumptions of dependency
  management, which should be revisited by researchers
  and practitioners.
• We investigate the support of current tools in providing
  better information for developers regarding the scope and
  context of vulnerable dependencies.
• We make our dataset of 100 projects available¹, including
  all scripts used to collect and pre-process data, to facilitate
  replication and foster more research in the field.
The rest of the paper is organized as follows: we start by motivating
our problem with an example in Section 2. We describe
and justify our methodology in Section 3 and explain our results in
Section 4. Implications of our findings are discussed in Section 5. We
present the related work in Section 6, and discuss the limitations of
our study in Section 7. Finally, we conclude our study in Section 8.
2 MOTIVATION & BACKGROUND
To motivate our study and illustrate the terminology used in this
paper, we walk the reader through the creation of a simple application
using create-react-app [8]. The terms used in this example
and throughout this paper are formally defined in Table 1. This
example application is a single-page "Hello World" application that
is provided by React when initializing a Create React App project.
We create our application by simply running the command
npx create-react-app my-app.
¹ https://zenodo.org/record/6518765
Figure 1: A snippet of the package.json file listing the dependencies
of our example project.
"dependencies": {
  "@testing-library/jest-dom": "^5.16.2",
  "@testing-library/react": "^12.1.3",
  "@testing-library/user-event": "^13.5.0",
  "react": "^17.0.2",
  "react-dom": "^17.0.2",
  "react-scripts": "5.0.0",
  "web-vitals": "^2.1.4"
},
"devDependencies": {
  "@webpack-cli/generators": "^2.4.2",
  "css-loader": "^6.6.0",
  "html-webpack-plugin": "^5.5.0",
  "prettier": "^2.5.1",
  "style-loader": "^3.3.1",
  "webpack": "^5.69.1",
  "webpack-cli": "^4.9.2",
  "webpack-dev-server": "^4.7.4",
  "workbox-webpack-plugin": "^6.5.0"
}
How many dependencies in our project?
To achieve this single-page React application without further programming, our
generated application reuses several open source packages published
in npm. We refer to each of the packages as a dependency of
our project. The dependency configuration of our project is stored
in the package.json file, shown in Figure 1. Dependencies are
grouped into two groups: runtime dependencies (“dependencies”)
and development dependencies (“devDependencies”). Runtime dependencies
are dependencies required by the application to function,
e.g., as we build a React application, our project depends on react
version 17.0.2. Development dependencies, on the other hand, are
needed to develop the project, e.g., to format the code (prettier
2.5.1), and are not required by the software to run. As can be seen
in Figure 1, our small application has 7 runtime dependencies and
9 development dependencies.
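As a rough illustration of how the declared (direct) dependencies of a project could be counted, the following minimal Python sketch reads the two sections of a package.json file. The helper name count_declared and the abridged manifest are ours and purely illustrative; Figure 1 lists the full set of declared dependencies.

```python
import json


def count_declared(package_json_text):
    """Count direct runtime and development dependencies declared in a package.json."""
    manifest = json.loads(package_json_text)
    runtime = manifest.get("dependencies", {})
    development = manifest.get("devDependencies", {})
    return {"runtime": len(runtime), "development": len(development)}


# Abridged, illustrative manifest (not the full listing of Figure 1).
example = """{
  "dependencies": {"react": "^17.0.2", "react-dom": "^17.0.2"},
  "devDependencies": {"prettier": "^2.5.1"}
}"""

print(count_declared(example))
```

Note that this only counts direct dependencies; the packages they pull in transitively do not appear in package.json.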
Once we install these dependencies locally to build and test
our application (npm install), we may be surprised to see that
a total of 1,764 dependencies were installed. The dependencies
shown in Figure 1 are direct dependencies of our project, each of
which has dependencies of its own. For instance, the package
loose-envify is a dependency of react. These are called transitive
dependencies and represent the vast majority of installed dependencies.
As such, loose-envify is a transitive dependency of our
example application. We use the term installed dependencies to
refer to all dependencies of a project, both direct/transitive and
development/runtime dependencies.
Is our application vulnerable?
Security vulnerabilities are a widespread problem in npm [2]. Given that our
application depends on 1,764 installed dependencies, is our application affected
by vulnerabilities? To verify this, we resort to using a Software
Composition Analysis (SCA) tool. SCA tools are used to identify
open source components in software codebases to evaluate security,
license compliance and overall code quality [11]. In our example,
we use npm audit, a native tool of npm that reports vulnerabilities
affecting software dependencies and maintains its own database
Table 1: Concepts and definitions.
Concept | Definition | Example
Runtime dependency | Refers to the "runtime" configuration of a dependency in the package.json file and is needed for the application to function. | react is a runtime dependency as shown in Figure 1.
Development dependency | Refers to the "development" configuration of a dependency in the package.json file and indicates that the dependency is needed to develop the application. | webpack is a development dependency as shown in Figure 1.
Installed dependency | Refers to the dependencies installed in the project and the result of the npm install command. | The dependencies depicted in Figure 1 are part of the installed dependencies, and so are their dependencies.
Depth | Refers to the level of a dependency in the dependency tree. | npm ls is used to obtain the dependency tree.
Direct dependency | Refers to a dependency with a depth of 1. | The dependencies shown in Figure 1 are direct dependencies.
Transitive dependency | Refers to a dependency with a depth greater than 1. | The dependencies of the dependencies shown in Figure 1 are transitive dependencies.
Usage | Refers to the scope in which a dependency is used. | Figure 2 shows that react-dom is used in production.
Context | Refers to the context of the application in which a dependency is used. | In our example application, webpack is a development dependency used to bundle the application’s resources.
of vulnerabilities. If a dependency is affected by one or more vulnerabilities,
we refer to the dependency as a vulnerable dependency.
In our example application, upon running npm audit, we receive
the report that our simple application contains 6 moderate severity
vulnerabilities, 13 high severity vulnerabilities, and 1 critical severity
vulnerability. That is, without any further programming, our
project already started with an alarming number of vulnerabilities
of moderate, high, and critical severity. Examples of the reported
high severity and critical severity vulnerabilities include Regular
Expression Denial of Service, Template Injection, and Prototype
Pollution.
Can reported vulnerabilities really affect our example application in production?
Vulnerable dependencies are problematic and may affect the security
of our project in multiple ways.
However, the risk of vulnerable dependencies reaches its peak when
the dependency is needed for the software to run in a production
environment. To find which dependencies are part of our production
software, i.e., production dependencies, we use a module bundler.
A module bundler is a tool that assists the building process of
software by resolving the software dependencies and pruning the
dependencies that are not needed in the production software. The
process of pruning dependencies is referred to as tree shaking. We
use webpack [4], a popular JavaScript module bundler, to build our
production software and export a list of production dependencies.
Upon building our project with webpack, the tool generates a
source map file, which contains the list of production dependencies
of our application. From the 1,764 dependencies in our example
project, Figure 2 shows that only 6 are released to production:
react, object-assign, scheduler, react-dom, style-loader, and css-loader.
Moreover, none of our production dependencies contained
any reported vulnerability; thus, our original report of 15 vulnerabilities
affected dependencies that would not be present in the
application in production.
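A minimal sketch of how package names could be recovered from the "sources" entries of a source map such as the one in Figure 2. It assumes that bundled third-party files keep a node_modules segment in their path; the function name packages_in_source_map is hypothetical, and this is not necessarily the procedure used in the study.

```python
import json


def packages_in_source_map(source_map_text):
    """Return the sorted npm package names referenced in a source map's "sources"."""
    sources = json.loads(source_map_text).get("sources", [])
    packages = set()
    for path in sources:
        parts = path.split("/")
        if "node_modules" not in parts:
            continue  # first-party file, not a dependency
        # Use the last node_modules segment so nested installs resolve
        # to the innermost package.
        idx = len(parts) - 1 - parts[::-1].index("node_modules")
        rest = parts[idx + 1:]
        if not rest:
            continue
        # Scoped packages span two path segments, e.g. @scope/name.
        name = "/".join(rest[:2]) if rest[0].startswith("@") else rest[0]
        packages.add(name)
    return sorted(packages)


# Illustrative, abridged source map.
example = json.dumps({
    "version": 3,
    "file": "main.js",
    "sources": [
        "node_modules/css-loader/dist/runtime/api.js",
        "node_modules/react-dom/cjs/react-dom.production.min.js",
        "node_modules/react/cjs/react.production.min.js",
        "src/index.js",
    ],
})
print(packages_in_source_map(example))
```

Paths without a node_modules segment (such as src/index.js here) are treated as first-party code and ignored.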
The problem: security alert fatigue.
Our example showcases
an important problem in current software development. Even small
Figure 2: A snippet of the source map generated by building
our example project.
"version": 3,
"file": "main.js",
"mappings": "KAAK,CAACC,EAAOC,GAAI...",
"sources": ["node_modules/css-loader/dist/runtime/api.js",
  "node_modules/object-assign/index.js",
  "node_modules/react-dom/cjs/react-dom.production.min.js",
  "node_modules/react/cjs/react.production.min.js",
  "node_modules/scheduler/index.js",
  "node_modules/style-loader/injectStylesIntoStyleTag.js"]
applications may depend on thousands of dependencies, and vulnerabilities
are constantly being reported by the open source community.
Developers face the difficult challenge of separating security
alerts that are relevant to their application security from the long
reports yielded by current SCA tools [37, 38]. In this paper, we
evaluate this problem on a scale of 100 popular JavaScript projects.
3 STUDY DESIGN
The goal of the paper is to study how often project dependencies are
shipped to production and their impact on the security of software
projects. In this section, we describe how we select and curate the
set of study projects (Sections 3.1 and 3.2), and how we identify
production dependencies (Section 3.3). We provide an overview of
our methodology in Figure 3.
3.1 Dataset of Candidate Projects
The focus of our study is to investigate how active software development
JavaScript projects use their dependencies. To this aim, we
start by collecting data on a large number of JavaScript repositories
as candidate projects for our study. Many studies have used
the number of GitHub stargazers as a way to select candidate
projects [22, 27, 41]. Thus, we start with 11,860 popular JavaScript
projects, each with at least 100 stargazers, collected on July 27th, 2020.
Finding which dependencies are shipped to production is a very
challenging task, making it impractical to apply this analysis on
a large scale [48]. In our study, we opt to select projects that already
make use of tree shaking (see Section 2 for a more in-depth
explanation). Specifically, we select projects using either webpack
or rollup [10] because they are two of the most popular module
bundlers for JavaScript projects and they have integrated tree
shaking support.
To nd out which projects use
webpack
and
rollup
, we auto-
matically parse the
package.json
les of the 11,860 projects to
identify 1) if any of the bundlers are declared as a dependency and 2)
the tree shaking algorithm is enabled for the project. Through this
process, we nd that 155 JavaScript projects make use of
webpack
or rollup, and have tree shaking enabled.
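The first of the two checks above can be sketched as follows. This is only an assumption about how such a filter might look: the function name declared_bundlers is hypothetical, and the second check (whether tree shaking is enabled) depends on each bundler's configuration and is deliberately left out.

```python
import json

# Bundlers considered in the study.
BUNDLERS = ("webpack", "rollup")


def declared_bundlers(package_json_text):
    """Return the studied bundlers that a package.json declares as a dependency."""
    manifest = json.loads(package_json_text)
    declared = set()
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return [bundler for bundler in BUNDLERS if bundler in declared]


# Illustrative manifest.
example = '{"devDependencies": {"webpack": "^5.69.1", "prettier": "^2.5.1"}}'
print(declared_bundlers(example))
```

A project would be kept as a candidate when the returned list is non-empty.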
3.2 Building Candidate Projects
To assess whether a dependency is used in production, we have to
successfully build each project in a production environment with a
module bundler. During the build of a project, the module bundler
first looks for all of the dependencies in the project and constructs a
dependency graph (dependency resolution). The dependency graph
is then converted along with source code into a single file (packing)
called the bundle. Source maps are then generated after a successful
build.
To build the candidate projects, we first clone each of the 155
JavaScript projects locally. We planned to build a framework to automate
the build of all the 155 JavaScript projects. However, we soon
realized that many projects require specific building commands and
setup to be built successfully. In fact, the majority of the projects did
not support the standard build command (npm run-script build).
Furthermore, the environmental settings varied across projects,
e.g., some projects require specific NodeJS versions and identifying
this automatically is very challenging. We therefore proceeded to
semi-manually build each project using the following methodology:
(1) Read project documentation. We read the documentation
of all the 100 studied projects to identify the specifics of each
project build. The goal of this step is to identify all the steps
of the building process: the build commands, supported Node
versions, supported package manager (e.g., Yarn or npm), and
any other specificity of the project building configuration. At
this point, we also confirmed that all selected projects are related
to software development, i.e., are not personal toy-projects.
(2) Install dependencies. We install all dependencies specified
in the package.json file by using the npm install command.
This generates a node_modules folder in every project’s home
directory which contains all installed dependencies.
Table 2: Descriptive Statistics of the Selected Projects.
Metric | Mean | Median | Min | Max
# stars | 4827.6 | 1224 | 112 | 74201
# commits | 1364.9 | 496 | 31 | 6188
# contributors | 62.3 | 26 | 4 | 401
age (years) | 5.2 | 5 | 1 | 12
(3) Build project. Following each project’s documentation, we
build each project in the dataset. The first author manually followed
the steps of the building process to ensure the build was
successful, the source maps containing the production dependencies
were generated, and the yielded artifacts targeted the
production environment.
(4) Generate source maps. Upon the successful completion of
the building process, source maps are generated and saved in
the project’s temporary folder or home directory.
After our careful process, we successfully build and generate
source maps for 100 JavaScript projects. For the 55 projects that
failed in our process, the main culprit was the generation of the
source map file. In most of the failed cases, the project configuration
did not have the flexibility to generate the source map file. For example,
we found some projects created using the create-react-app
package, which does support module bundlers and tree shaking, but
does not have the option to output source maps.
We present descriptive statistics of the 100 projects that we successfully
built and generated source maps for in Table 2. The projects in our
dataset are very popular (median of 1,224 stargazers), tend to be mature
projects (median of 5 years of development and 496 commits),
and are developed by medium-sized teams of developers (median of
26 developers).
3.3 Identifying Dependencies in Production
To identify production dependencies, we rst collect all dependen-
cies found in the source maps of each projects using a mix of source
map parser [
43
] and regular expression (regex). Then, to obtain the
version of each dependency found in the source maps, we locate its
package.json
le in the respective project’s
node_modules
folder
and parse it. This results in a dataset of production dependencies
with their corresponding version.
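The version lookup described above could be sketched as follows. The function names installed_version and production_versions are hypothetical, and the sketch makes a simplifying assumption: it only inspects the top-level copy of a package in node_modules and ignores nested copies that may carry a different version.

```python
import json
from pathlib import Path


def installed_version(project_dir, package_name):
    """Read the version of an installed package from node_modules, or None if absent."""
    # Scoped names such as @scope/name map to two nested folders.
    manifest_path = Path(project_dir, "node_modules", *package_name.split("/"), "package.json")
    if not manifest_path.is_file():
        return None
    with manifest_path.open(encoding="utf-8") as handle:
        return json.load(handle).get("version")


def production_versions(project_dir, package_names):
    """Map each production dependency name to its installed version."""
    return {name: installed_version(project_dir, name) for name in package_names}
```

Given the package names extracted from a project's source maps, production_versions(project_dir, names) yields name-to-version pairs, with None marking packages that cannot be located.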
In addition to identifying dependencies used in production, we
also want to identify two very important characteristics of all dependencies,
as they have an influence on the risk of vulnerabilities [29]:
1) the dependency scope, runtime or development, and 2) whether
the dependency is a direct or transitive dependency of the project.
To classify a dependency into runtime or development, we analyse
the package.json file of a project, classifying dependencies configured
in the "dependencies" section as runtime dependencies, and
classifying dependencies declared in "devDependencies" as development
dependencies. Since transitive dependencies are not listed in
the package.json file, we identify the type of the original (direct)
dependency, which determines the type of the transitive dependency.
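The scope classification above can be sketched as a walk over a nested dependency tree in which every node may carry a "dependencies" mapping, as in JSON output of npm list. The function name classify_scope is hypothetical, and the tie-breaking rule (a package reachable from both a runtime and a development dependency counts as runtime) is our assumption; the text does not specify how such cases are handled.

```python
def classify_scope(manifest, tree):
    """Map each installed package name to "runtime" or "development".

    Direct dependencies take the scope of the package.json section that
    declares them; transitive dependencies inherit the scope of the direct
    dependency through which they are reached.
    """
    runtime_names = set(manifest.get("dependencies", {}))
    scopes = {}

    def visit(name, node, scope):
        # Assumption: "runtime" wins when a package is reachable both ways.
        if scopes.get(name) == "runtime":
            return
        if name in scopes and scope == "development":
            return
        scopes[name] = scope
        for child_name, child_node in node.get("dependencies", {}).items():
            visit(child_name, child_node, scope)

    for name, node in tree.get("dependencies", {}).items():
        visit(name, node, "runtime" if name in runtime_names else "development")
    return scopes


# Illustrative inputs.
manifest = {"dependencies": {"react": "^17.0.2"}, "devDependencies": {"prettier": "^2.5.1"}}
tree = {"dependencies": {
    "react": {"version": "17.0.2", "dependencies": {"loose-envify": {"version": "1.4.0"}}},
    "prettier": {"version": "2.5.1"},
}}
print(classify_scope(manifest, tree))
```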
To classify installed dependencies into direct or transitive dependencies,
we use the command npm list to generate the dependency
tree, a hierarchical representation of the relationships between dependencies.
The npm list command lists all installed dependencies
in JSON format, including the name, version, path, and depth of
each dependency. From the depth, we identify each dependency as
direct or transitive, i.e., direct dependencies have depth = 1, while
transitive dependencies have depth > 1.
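A minimal sketch of the depth-based classification, assuming the dependency tree is available as nested JSON in which each node has an optional "dependencies" mapping. Here the depth is derived from the nesting level rather than read from a field, and a package that appears at several levels keeps its minimal depth; the names minimal_depths and kind are ours.

```python
def minimal_depths(tree):
    """Return {(name, version): minimal depth} for a nested dependency tree."""
    depths = {}
    stack = [(tree, 0)]  # the project root sits at depth 0
    while stack:
        node, depth = stack.pop()
        for name, child in node.get("dependencies", {}).items():
            key = (name, child.get("version"))
            child_depth = depth + 1
            if key not in depths or child_depth < depths[key]:
                depths[key] = child_depth
            stack.append((child, child_depth))
    return depths


def kind(depth):
    """Direct dependencies have depth 1; everything deeper is transitive."""
    return "direct" if depth == 1 else "transitive"


# Illustrative tree.
tree = {"dependencies": {
    "react": {"version": "17.0.2", "dependencies": {"loose-envify": {"version": "1.4.0"}}},
}}
for (name, version), depth in sorted(minimal_depths(tree).items()):
    print(name, version, depth, kind(depth))
```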
Our methodology has one limitation: we cannot automatically
resolve missing peer dependencies. Peer dependencies are used to
decouple dependencies between projects, to ensure a single version
of the package is installed for all dependencies. For example, in
Dataset of candidate projects (11,680 JavaScript projects)
-> Selecting projects using a module bundler and tree shaking (155 JavaScript projects)
-> Build projects and generate source maps (100 JavaScript projects)
-> Extract dependencies from all projects' dependency tree (219,829 npm dependencies)
-> Group dependencies by project and version, aggregate by minimal depth (95,902 npm dependencies)
Figure 3: Overview of our approach for filtering projects and collecting dependencies.
applications with many npm packages depending on react, react
can be declared as a peer dependency to prevent the installation
of multiple (possibly conflicting) versions of react. Unlike runtime
and development dependencies, peer dependencies are not
automatically installed by npm. Instead, they must be included by
the code that uses the package as a dependency. We find that 37
projects in our dataset have missing peer dependencies. Since it is
not possible to automatically resolve missing peer dependencies
for all 37 projects, we exclude these dependencies from our analysis.
4 RESULTS
In this section, we present the results of our three research questions.
For each research question, we present its motivation, the approach
to answer the question, and the results.
RQ1: How many installed dependencies are production dependencies?
Motivation: While reusing packages may reduce development
time, developers have to constantly maintain their dependencies to
fix bugs in the packages and mitigate the problems of vulnerable
dependencies [16, 17, 30]. However, identifying dependencies used
in production is not a trivial task, making it difficult to prioritize
dependency-related maintenance activities [37].
In this research question, we want to assess how often dependencies
of the selected projects are actually production dependencies.
Answering this question is the first step to understanding how often
runtime and development dependencies are used in production. It
will also help us better understand how dependencies are used in
practice and how they impact the security of software.
Approach: To approach this research question, we use the methodology
described in Section 3.3. That is, we start by installing all
dependencies from each project to retrieve the list of installed dependencies
and their respective versions. To classify an installed
dependency into direct or transitive, we generate the dependency
tree of each studied project. Then, to identify production dependencies,
we build all software projects with their respective module
bundler (webpack or rollup). This building process was done
manually by following the building steps specified in the project
documentation, to ensure each project is built correctly and without
errors. After building the project, we analyze the yielded source
maps to identify the production dependencies. Finally, we cross-reference
the installed dependencies and production dependencies to
classify each project dependency into production/non-production,
runtime/development, direct/transitive and report our findings.
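The final cross-referencing step can be sketched as a simple tally over three attributes. The input layout (a mapping from package name to its scope and kind, plus a set of names found in the source maps) and the function name cross_tabulate are assumptions made for illustration only.

```python
from collections import Counter


def cross_tabulate(installed, production):
    """Count dependencies per (usage, scope, kind) cell.

    installed:  {name: {"scope": "runtime" | "development",
                        "kind": "direct" | "transitive"}}
    production: set of package names found in the source maps
    """
    counts = Counter()
    for name, info in installed.items():
        usage = "production" if name in production else "non-production"
        counts[(usage, info["scope"], info["kind"])] += 1
    return counts


# Illustrative inputs.
installed = {
    "react": {"scope": "runtime", "kind": "direct"},
    "loose-envify": {"scope": "runtime", "kind": "transitive"},
    "prettier": {"scope": "development", "kind": "direct"},
}
production = {"react"}
for cell, count in sorted(cross_tabulate(installed, production).items()):
    print(cell, count)
```

Summing the cells per project gives counts of the kind reported in the result tables of this section.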
Finding 1: Of the 100 projects, 51 projects contain no production dependencies.
To make better sense of our results, we split
Table 3: Dependency profile of projects with and without
production dependencies in absolute numbers and median
of aggregated value per project. The percentages are always
in relation to the # of Installed Dependencies.
Dependencies | Zero production deps: Total | Zero production deps: Median | 1+ production deps: Total | 1+ production deps: Median
Installed | 46,031 (100%) | 851 | 53,421 (100%) | 1,017
Runtime | 1,005 (2.1%) | 0 | 873 (1.6%) | 5
Dev | 45,025 (97.9%) | 832 | 52,542 (98.4%) | 1,017
Direct | 1,539 (3.4%) | 40 | 2,098 (3.9%) | 29
Transitive | 44,492 (96.6%) | 809 | 51,307 (96.1%) | 963
Production | - | - | 497 (0.9%) | 5
our dataset of 100 projects into two sets: projects with production
dependencies (49 projects) and projects without production
dependencies (51 projects). Table 3 shows the total number of dependencies
and their characteristics in both sets of projects. The 51
projects with no production dependencies have installed a total of
46 thousand dependencies, including direct and transitive packages;
however, none of the installed dependencies are used in production.
More interestingly, among the installed dependencies, there
were 1,005 packages that were declared to be runtime dependencies,
which are supposedly required at runtime by the final
software, but were not included in the final production artifact.
We also note that the set of 51 projects with no production
dependencies has a median of zero runtime dependencies. We
confirm this finding through manual investigation and find that 39
projects in our dataset only declare development dependencies.
Most of these projects are libraries meant to be used by other
projects as development tools. Examples of such projects are Vuex,
a state management pattern for Vue.js applications; three.js, a
popular cross-browser 3D library; and polished, a lightweight
toolset for writing styles in JavaScript. All those library projects
have the incentive to depend on few or no runtime dependencies,
as the fewer dependencies they have, the less constrained their
users may be to rely on their libraries [15, 20].
Finding 2: From the 49 projects with production dependencies,
production dependencies represent less than 1% of the
installed dependencies.
The results show that projects with production
dependencies have a total of 53,421 dependencies, of which
only 497 dependencies (0.9%) are released to production (see Table 3).
Analyzing the median number of dependencies per project
Table 4: Characteristics of dependencies in projects with production
dependencies.
Usage | Scope | Direct | Transitive | Total
Production | Dev | 62 | 77 | 139
Production | Runtime | 175 | 178 | 353
Non-production | Dev | 1,809 | 50,594 | 52,403
Non-production | Runtime | 52 | 458 | 510
Total | | 2,098 | 51,307 | 53,405
Figure 4: Number of production dependencies for the 49
projects with one or more production dependencies.
(see Median column in Table 3), we find that projects have a median
of 5 production dependencies while depending on a median of over a
thousand dependencies. However, we notice that not all runtime
dependencies are used in production. The total number of runtime
dependencies installed (873) far exceeds the number of production
dependencies (497), indicating that many runtime dependencies
may be incorrectly configured or not used in the code.
Figure 4 shows the distribution of the number of production
dependencies per project. We can observe that 15 projects contain
a single production dependency and the vast majority of projects
(65.3%) have fewer than 10 dependencies used in production. Still,
we found some projects that depend heavily on packages in their
production build, with ProjectMirador/mirador being the project
with the most production dependencies in our dataset, at 92.
Finding 3: More than half (59%) of the runtime dependencies
are not used in production.
As shown in the "Non-production" rows of Table 4, we find that 510 out of 863
of the total runtime dependencies are not shipped to the production bundle. Runtime
dependencies are dependencies (supposedly) required by the application
to run. Our results, however, show that in the majority of
the cases, dependencies are declared as runtime but are not actually
used in the code and, thus, are excluded by the module bundler during
the build. This finding suggests developers mistakenly maintain
unused dependencies in their project configuration, which indicates
that they lack the necessary information to determine whether a
dependency is actually used by the software in production. This
is corroborated by related work [30], where authors reported that
unused dependencies occur in 80% of studied projects.
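The headline percentages can be reproduced from the reported counts with a few lines of arithmetic. The helper name share is ours; the counts are those reported in Table 3 and Table 4.

```python
def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Runtime dependencies in projects with production dependencies (Table 4).
runtime_production = 353
runtime_non_production = 510
runtime_total = runtime_production + runtime_non_production  # 863

# Share of runtime dependencies that never reach production (Finding 3).
print(share(runtime_non_production, runtime_total))

# Share of installed dependencies released to production (Finding 2, Table 3).
print(share(497, 53421))
```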
51 out of 100 projects do not use any dependencies in production.
In the 49 projects that ship dependencies to production, production
dependencies account for less than 1% of the installed dependencies.
Contrary to common belief, 59% of runtime dependencies are not used in production.
RQ2: What are the characteristics of production dependencies?
Motivation: Production dependencies are the prime security liability
in software systems since they can compromise a running
software [48]. Current SCA tools may not distinguish dependency
scope (i.e., production, non-production) [32, 37], which may lead
to reporting unexploitable vulnerabilities (false positives). They
may also only consider direct dependencies although vulnerabilities
can be introduced transitively [24, 33, 37]. The problem is that
assumptions about production dependencies are not always correct.
In fact, RQ1 showed that runtime dependencies are not always in
production. In this research question, we study the characteristics
of production dependencies to establish a practical understanding of
how they are used and in what context they are used. Such findings
help in improving current SCA tools as they provide insights on
how dependencies are used in practice.
Approach: To identify the characteristics of dependencies used in production, we consider the production dependencies identified in RQ1 and classify them based on their scope (runtime, development), depth, and usage.
In theory, one can identify the scope of a dependency by looking at the nature of the functionality provided by a package. For instance, packages that provide development utilities should not become production dependencies. To investigate to what extent the nature of the package determines whether it will be used as a production dependency, we analyze how packages are released to production across the 100 studied projects. We analyze a total of 1,269 unique packages. We then classify the packages into three categories: 1) packages that are always used in production, 2) packages that are never used in production, and 3) packages that are sometimes used in production.
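The three-way classification described above can be sketched as follows. The record shape (`project`, `name`, `inProduction`) and the function name `classifyPackages` are assumptions made for illustration, and the sample records are invented rather than drawn from the dataset.

```javascript
// Sketch only: each record describes one package installed in one project.
function classifyPackages(installations) {
  const counts = new Map();
  for (const { name, inProduction } of installations) {
    const entry = counts.get(name) ?? { installed: 0, production: 0 };
    entry.installed += 1;
    if (inProduction) entry.production += 1;
    counts.set(name, entry);
  }
  const categories = new Map();
  for (const [name, { installed, production }] of counts) {
    if (production === 0) categories.set(name, "never");
    else if (production === installed) categories.set(name, "always");
    else categories.set(name, "sometimes");
  }
  return categories;
}

// Illustrative records (hypothetical projects "a" and "b").
const installations = [
  { project: "a", name: "eslint", inProduction: false },
  { project: "b", name: "eslint", inProduction: false },
  { project: "a", name: "is-promise", inProduction: true },
  { project: "b", name: "is-promise", inProduction: true },
  { project: "a", name: "react", inProduction: true },
  { project: "b", name: "react", inProduction: false },
];
const categories = classifyPackages(installations);
console.log(Object.fromEntries(categories));
```

Grouping by package name rather than by project is what makes the "sometimes" category visible: the same package can fall on either side of the production boundary depending on the project.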
Finding 4: 28.2% of production dependencies are development dependencies.
The first section of Table 4 shows the characteristics of production dependencies. We find that 28.2% of production dependencies are development dependencies and the remaining 71.7% are declared as runtime dependencies. One would expect all dependencies released to production to be runtime dependencies, since they provide the application with specific functionalities to be used by the client. It is therefore surprising to find that almost 30% of the dependencies released to production are development dependencies, since such dependencies are, by default, not included in the production bundle.
While unusual, having a development dependency in production occurs in 37 of the projects in our dataset. To better understand this, we perform an exhaustive inspection of the dependency configuration of the 37 projects and deduce two possible causes for a development dependency to be in production. First, the selected projects use module bundlers, which disregard the configuration of the package.json file and use source code analysis to identify what should be a production dependency. Developers may not be as
careful to specify their development dependencies as their building process does not depend on a correct specification of development and runtime dependencies [5]. In fact, of the 37 projects with development dependencies in production, 4 (10.8%) declared all of their dependencies as development dependencies although each of them has at least 1 dependency in production. Second, the dependency may be initially declared under the "dependencies" property of the package.json file but intentionally moved by the developer to "devDependencies" to get rid of security warnings, as explained in a create-react-app GitHub issue [9]. The author of the issue explains that npm audit reports vulnerabilities for code that never runs in production, but strictly at build time in development. They then suggest moving vulnerable dependencies to "devDependencies" to get rid of the security warning. We believe development dependencies in production are unlikely to occur in projects that do not use module bundlers. By default, npm does not include development dependencies in a production build. This means that a project that requires a development dependency at runtime will not function because of the missing dependency.
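To make the first cause concrete, the following is a minimal sketch of how development dependencies that reach the bundle could be detected from a source map. It assumes a version 3 source map whose `sources` entries contain `node_modules/` path segments, which is common for bundler output but not guaranteed; all function names and sample paths are hypothetical.

```javascript
// Extract the npm package name from a source path, or null if the path
// does not point inside node_modules.
function packageNameFromSource(sourcePath) {
  const marker = "node_modules/";
  const index = sourcePath.lastIndexOf(marker);
  if (index === -1) return null;
  const parts = sourcePath.slice(index + marker.length).split("/");
  if (parts[0].startsWith("@")) {
    // Scoped packages (e.g., @babel/runtime) span two path segments.
    return parts.length > 1 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0] || null;
}

// Collect the set of packages whose files appear in the bundle's source map.
function productionPackages(sourceMap) {
  const names = new Set();
  for (const source of sourceMap.sources ?? []) {
    const name = packageNameFromSource(source);
    if (name !== null) names.add(name);
  }
  return names;
}

// List the devDependencies that nevertheless ship in the bundle.
function devDependenciesInProduction(packageJson, sourceMap) {
  const shipped = productionPackages(sourceMap);
  return Object.keys(packageJson.devDependencies ?? {}).filter((name) => shipped.has(name));
}

// Illustrative inputs (not taken from any studied project).
const sourceMap = {
  version: 3,
  sources: [
    "webpack:///./src/index.js",
    "webpack:///./node_modules/prop-types/index.js",
    "webpack:///./node_modules/@babel/runtime/helpers/esm/extends.js",
  ],
};
const manifest = { devDependencies: { "prop-types": "^15.8.1", eslint: "^8.0.0" } };
const leaked = devDependenciesInProduction(manifest, sourceMap);
console.log(leaked);
```

Using the last `node_modules/` segment means a nested (transitive) package is attributed to itself rather than to its parent, which is a design choice of this sketch and not necessarily that of the study.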
Finding 5: The majority of production dependencies (51.8%) are transitive dependencies.
Looking at the Transitive columns of Table 4, we notice that 51.8% of production dependencies are dependencies of the project's direct dependencies. These results suggest that developers may not have control over the majority of production dependencies. Naturally, a transitive dependency can only be released to production if the original dependency is also released to production. Hence, developers should be extra careful when selecting production dependencies, preferably by selecting packages that have few or no production dependencies of their own, to reduce the attack surface through vulnerable dependencies.
Finding 6: The 237 production dependencies come from 183 unique npm packages. Of these, 43 are sometimes not used in production in other projects.
To put things in perspective, we evaluate the number of unique npm packages in our dataset by grouping the packages by name and obtain 1,269 unique npm packages. Of these, we find that 1,086 (85.6%) are never used in production since they offer functionalities that are development-only. For example, eslint, installed in 79 projects, is a static code analysis tool that is used to identify problematic patterns found in JavaScript code; @babel/core, installed in 70 projects, is a command line interface tool that facilitates working with babel; and rollup, installed in 65 projects, is a build tool for JavaScript projects.
For the rest of the packages, we find that 183 packages are used in production at least once. Taking a closer look at the production packages, we find that 140 (76.5%) packages are always used in production when installed in a project and that such packages do not occur frequently. In fact, they occur in at most 2 different projects and are installed as runtime dependencies. For example, is-promise, a library that tests whether an object is a promises-a+ promise, query-string, a library that parses and stringifies URL query strings, and react-fast-compare, a library that provides specific handling of fast deep equality comparison for React, are all installed in 2 projects and used in production 100% of the time they are installed.
Table 5: Frequently installed packages that are both used and not used in production.

Package          # Production installations   Total # installations   % in production
react            4                            40                      10%
react-dom        3                            37                      8.1%
prop-types       13                           23                      56.5%
@babel/runtime   10                           19                      52.6%
lodash           4                            14                      28.6%
core-js          5                            13                      38.5%
classnames       5                            8                       62.5%
react-is         1                            5                       20%
react-redux      4                            5                       80%

Interestingly, we find that 43 (23.5%) of the 183 production packages are not always shipped to production. This indicates that some packages are used differently (in production and not in production) across projects regardless of their functionalities. Table 5 shows 10 examples of such packages, and how often they are in production versus how often they are installed. The results show
that
react
, a library for building user interfaces, is the most fre-
quently installed package appearing in 40 projects, but is only
released to production in 4 projects. In contrast,
react-redux
, a
React binding for Redux allowing React components to read data
from a Redux store, only appears in 5 projects, but is released to
production in 4 out of 5 projects. In only one project (redux-little-
router) is
react-redux
not released to production and declared as a
development dependency. We further inspect the
package.json
of
redux-little-router
and nd that
react-redux
is a peer depen-
dency, thus, it is not included in the production bundle of the project.
It is also worth noting that redux-little-router is a lightweight library that provides flexible React bindings and components. Thus, the project makes a conscious effort to mitigate dependency bloat, declaring most of its dependencies as development dependencies,
and including
react
,
react-dom
,
react-redux
, and
redux
as peer
dependencies.
The main takeaway from this finding is that we cannot identify production dependencies by looking at the functionalities of a package alone. As we have shown, the scope of a dependency can vary based on the context and usage of a package, which means it may differ from project to project. For example, a module bundler may be used in production in one application since it uses some of its functionalities at runtime, but may only be used in development in another application. Thus, it is important for SCA tools to include this scope analysis in their approach so that developers can more easily identify production dependencies based on their own usage and context.
Our findings indicate that 28.2% of production dependencies come from development dependencies and that 51.8% come from transitive dependencies. The functionality of a package alone does not determine whether it will be shipped to production: 43 of 183 packages encountered in production in one project are not shipped to production in other projects.
RQ3: How often are npm security alerts emitted
for production dependencies?
Motivation: The observations made in RQ1 suggest that the majority of the dependencies are not used in production. While vulnerabilities in non-production dependencies may affect the development environment (e.g., installing packages with malicious code), it is when a vulnerable dependency is released to production that the threat of exploitation reaches its peak [48]. Developers should constantly run scanners to identify security alerts in their project and prioritize fixes in production dependencies, to avoid having their software compromised. The problem is that tools such as npm audit often report many false alerts for deployed code, making vulnerability reports noisy and bloating audit resources [9, 39]. In this research question, we investigate how often security alerts are emitted for production dependencies compared to non-production dependencies, as well as the characteristics of vulnerable dependencies.
Approach: To investigate how often vulnerabilities are encountered in production and non-production dependencies, we first generate the npm vulnerability report of each project by using the npm audit tool. Next, to obtain the npm audit reports in a parseable CSV format, we adapt the npm-deps-parser [6], a tool that parses, summarizes, and prints npm audit JSON output to Markdown. From this, each vulnerability report is identified with the project name, the vulnerable dependency and version, the severity, and a unique link to the GitHub Advisory Database (GAD) [12], a database of security advisories affecting the open source world. To obtain the scope and depth of each vulnerable dependency, we cross-reference the set of vulnerable dependencies with the set of production dependencies and installed dependencies for each project. Because of the limitations discussed in Section 3.3, we could not identify the scope and depth of 29 vulnerable dependencies and exclude them from further analysis.
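The cross-referencing step can be approximated with the sketch below. It is not the adapted npm-deps-parser pipeline described above: it assumes the JSON report shape produced by `npm audit --json` in npm v7 and later, where `vulnerabilities` is an object keyed by package name, and it matches on package name only, ignoring versions. The package names in the sample report are hypothetical.

```javascript
// Sketch only: split audit entries by whether the package ships to production.
function splitAlertsByProduction(auditReport, productionPackages) {
  const production = [];
  const nonProduction = [];
  for (const [name, info] of Object.entries(auditReport.vulnerabilities ?? {})) {
    const alert = { name, severity: info.severity, isDirect: Boolean(info.isDirect) };
    if (productionPackages.has(name)) production.push(alert);
    else nonProduction.push(alert);
  }
  return { production, nonProduction };
}

// Hypothetical report fragment and production set.
const auditReport = {
  auditReportVersion: 2,
  vulnerabilities: {
    "example-build-tool": { name: "example-build-tool", severity: "high", isDirect: true },
    "example-helper": { name: "example-helper", severity: "moderate", isDirect: false },
  },
};
const shipped = new Set(["example-ui-lib"]);

const { production, nonProduction } = splitAlertsByProduction(auditReport, shipped);
console.log(production.length, nonProduction.length);
```

A faithful replication would also compare the installed version of each vulnerable package, since the same package name can be installed at several versions within one project.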
Finding 7: A total of 608 security alerts are emitted for dependencies of 32 projects, yet none are related to production dependencies.
In our dataset, no security alerts were emitted for 68 projects. The remaining 32 projects reported a total of 608 security alerts for 456 vulnerable dependencies, i.e., the same dependency may issue multiple security alerts. These 32 projects reported a median of 16 security alerts, none related to production dependencies.
There are a few reasons why security alerts may have been emitted only for non-production dependencies. First, as seen in RQ1, the vast majority of dependencies (99%) are not released to production, so the chances of vulnerabilities being encountered in non-production dependencies are 99x higher than in production dependencies. Second, developers of the selected projects are likely making a conscious effort to update production dependencies to mitigate security vulnerabilities, since they may be aware of which dependencies are used in production (e.g., developers open a PR in the project InstantSearch to update a vulnerable dependency [44]). The problem, however, is that tools such as npm audit make no distinction as to whether security alerts refer to production or non-production dependencies. Developers have to know themselves which dependencies are released to production to single out the relevant security alerts that need urgent action, making it harder to prioritize management efforts. This is shown in related work [32], where the authors reported that 69% of the surveyed developers claimed to be unaware of their vulnerable dependencies and that dependency updates are perceived as extra workload and responsibility.

Table 6: Characteristics of vulnerable dependencies reported by npm vulnerability alerts.

              Direct   Transitive   Total
Development   9        410          419
Runtime       1        7            8
Total         10       417          427

Table 7: Count of vulnerability reports per severity level with the npm recommended action.

Severity   Recommended action                Runtime dependencies   Development dependencies
low        Address at your discretion        3                      33
moderate   Address as time allows            1                      226
high       Address as quickly as possible    5                      263
critical   Address immediately               0                      45
Finding 8: 98.1% (419) of the vulnerable dependencies are development dependencies.
In this analysis, we switch from security alert reports to vulnerable dependencies, as multiple reports may be issued for the same dependency under different vulnerabilities. The first row of Table 6 shows the number of development and runtime vulnerable dependencies reported by our experiment. The npm audit tool reports security alerts from a total of 419 vulnerable development dependencies, representing 98.1% of all vulnerable dependencies identified. Of the 419 vulnerable development dependencies, 9 (2.1%) are direct dependencies, and 410 (97.9%) are transitive dependencies. Next, we analyze the severity of the vulnerability reports in relation to the characteristics of vulnerable dependencies, as shown in Table 7. We find that all critical and most of the high-severity reports are emitted for development dependencies.
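As a worked check of this observation, the share of reports that target development dependencies can be computed per severity level directly from the counts in Table 7. The variable and function names below are illustrative only.

```javascript
// Counts copied from Table 7 (runtime vs. development dependencies).
const table7 = {
  low: { runtime: 3, dev: 33 },
  moderate: { runtime: 1, dev: 226 },
  high: { runtime: 5, dev: 263 },
  critical: { runtime: 0, dev: 45 },
};

// Fraction of reports at one severity level that target development dependencies.
function devShare({ runtime, dev }) {
  return dev / (runtime + dev);
}

for (const [severity, counts] of Object.entries(table7)) {
  console.log(severity, (100 * devShare(counts)).toFixed(1) + "%");
}
```

Under these counts, every critical report and roughly 98% of the high-severity reports concern development dependencies.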
In some cases, tools such as npm audit allow developers to filter out development dependencies from the security reports, as they are supposedly not released to production [9]. It is dangerous, however, to completely ignore the security maintenance of development dependencies. Some development dependencies are used in production, as seen in RQ2, and hence are at risk of being exploited in a production environment. Vulnerable transitive dependencies that are released to production are equally dangerous since, even if they are reported by npm, developers are not in control of their update.
32 projects in our dataset reported a total of 608 security alerts, but none of the alerts referred to a production dependency. Projects have a median of 16 security alerts, but the vast majority refer to development, non-production dependencies, which do not represent a threat to their running application.
5 DISCUSSION
In this section, we discuss the implications of our work and possible solutions for current source-code-based tools.
5.1 Implications
Tracking production dependencies is very challenging.
While our study focuses on a selected set of 100 popular JavaScript projects, the results showcase the difficulties of mapping a project's production dependencies. This difficulty arises primarily because assumptions commonly held by the development community regarding dependency management do not hold in practice:
(1) Assumption 1: Runtime dependencies are always shipped to production. Our results showed that the majority of dependencies declared as runtime are not used in production (RQ1). Developers may spend time ineffectively managing runtime dependencies due to security alerts, without confirming that such dependencies are bundled in their delivered software.
(2) Assumption 2: Development dependencies are never shipped to production. In projects that use a module bundler, dependencies declared as development may be shipped to production (RQ2). In fact, development dependencies represent 28.2% of all production dependencies identified in our study. Developers may disregard all their development dependencies as being irrelevant for security upgrades, when in fact, some vulnerable development dependencies are shipped in their delivered software.
(3) Assumption 3: The functionality provided by the package is sufficient to determine if it is a production dependency. Particularly in cases where packages provide runtime utilities, our results show that 43 out of 183 packages are released to production in some projects but not in others. Thus, the package's functionality is not sufficient to determine whether a package is used in production (RQ2).
These assumptions have the potential to affect the security of the delivered software, as developers may wrongly assume which dependencies are sensitive to security exploits.
Not all vulnerabilities in dependencies are a security risk for the software in production.
Prior research has shown that not all vulnerabilities are relevant for the software in production [38, 39]. In this paper, we expand on this by studying the relevance of dependencies for the security risk of software in production. Our findings indicate that, given the prominence of non-production dependencies (RQ1), the vast majority of security alerts will be emitted for dependencies that do not impact the security of software in production (RQ3).
To put things in perspective, we analyze the types of vulnerabilities reported by npm audit. The most common type of vulnerability identified in the studied projects is Regular Expression Denial of Service (ReDoS), accounting for 25.3% of all reported vulnerabilities and for 27% of high-severity vulnerabilities. While diligent dependency management is of utmost importance to mitigate security risks, developers should be mindful of the types of alerts they should prioritize. In the case of a ReDoS attack, the performance of an application is compromised if there is a regular expression that, with malicious input, slows it down exponentially. However, we find that 97.3% of the dependencies affected by a ReDoS vulnerability are development-only, which tend not to be part of production software. Previous research shows that developers tend to ignore security alerts when they receive a lot of them [32]. Our approach allows them to focus on the important ones first (i.e., security alerts for vulnerable dependencies in production).
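One way to act on this prioritization is sketched below: alerts are ordered so that those affecting production dependencies come first, and severity breaks ties within each group. The alert shape, the function name `prioritizeAlerts`, and the package names are hypothetical; the severity levels follow the npm levels listed in Table 7.

```javascript
// Lower rank means higher urgency; levels follow Table 7.
const SEVERITY_RANK = { critical: 0, high: 1, moderate: 2, low: 3 };

// Sketch only: production alerts first, then by severity within each group.
function prioritizeAlerts(alerts, productionPackages) {
  const scopeRank = (alert) => (productionPackages.has(alert.name) ? 0 : 1);
  const severityRank = (alert) => SEVERITY_RANK[alert.severity] ?? 4;
  return [...alerts].sort(
    (a, b) => scopeRank(a) - scopeRank(b) || severityRank(a) - severityRank(b)
  );
}

// Hypothetical alerts and production set.
const alerts = [
  { name: "example-dev-tool", severity: "critical" },
  { name: "example-util", severity: "moderate" },
  { name: "example-ui-lib", severity: "high" },
];
const inProduction = new Set(["example-util", "example-ui-lib"]);

const ordered = prioritizeAlerts(alerts, inProduction);
console.log(ordered.map((alert) => alert.name));
```

Note that the critical alert on the development-only package is ranked last here, which reflects the paper's argument but should not be read as advice to ignore such alerts entirely.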
Source maps and tree shaking can benefit developers beyond client-side applications.
In this paper, we use module bundlers to accurately differentiate between installed dependencies and production dependencies. Module bundlers are most commonly used in client-side applications, which are generally defined as libraries or frameworks running in a Web browser (e.g., React, Vue, Angular) to support the development of Web applications. Module bundlers, however, can benefit developers far beyond client-side applications by helping them:
(1) Prioritize addressing security alerts on production dependencies. As security alerts are very commonly issued for projects that rely on open source code, developers should prioritize addressing security issues that have the potential to affect their production software, by identifying vulnerabilities affecting their production dependencies.
(2) Prioritize maintenance tasks on production dependencies. As projects depend on an increasingly high number of software dependencies, updating all dependencies in every release may become increasingly prohibitive. Updating dependencies always carries the risk of breaking changes [19], leading to software bugs and mistrust between project maintainers [30]. Hence, developers should prioritize updating production dependencies to focus their maintenance tasks on packages that may affect their delivered software.
5.2 Towards Better Tool Support
SCA tools are constantly used by software projects to control the risk related to software dependencies, such as vulnerabilities and compliance with open source licenses [11]. To understand the support current SCA tools provide for production dependencies, we investigate four popular tools: npm audit, Snyk, Dependabot, and OWASP Dependency-Check. We analyze the documentation of the SCA tools and apply them to some of our studied projects to assess their capabilities and limitations.
We present in Table 8 an overview of the features related to dependency scope and usage from four popular SCA tools. All SCA tools we assess cover all dependencies of a software project, including both direct and transitive dependencies. Given that a project may have thousands of installed dependencies, we now dive into the filtering capabilities of the tools. We note that only npm audit and Snyk [13] provide ways of filtering security alerts based on whether the vulnerability affects runtime/development dependencies or direct/transitive dependencies. The filtering of runtime/development dependencies is based on project configuration (e.g., the package.json file); thus, it is subject to limitations when it comes to identifying production versus non-production dependencies. Neither Dependabot nor OWASP Dependency-Check allows users to filter security alerts based on the scope or depth of their dependencies.
It is worth noting that none of the tools provide a way to differentiate between production and non-production dependencies. There is no support, for instance, for inputting source map files into the tools to help filter out vulnerabilities that concern non-production dependencies. We believe adding source map support to SCA tools would offer developers better insight into their production bundle without relying so much on the dependency configurations, which have been shown to be inconsistent across different projects.

Table 8: Vulnerable dependency (VD) characteristics based metrics reported by current tools and support to locate vulnerable code (VC) in JavaScript projects.

                         Cover all      Filter security alerts by                                      Locate
Tool                     dependencies   Runtime   Development   Direct   Transitive   In production   dep code
npm audit                Yes            Yes       Yes           Yes      Yes          No              No
Snyk                     Yes            Yes       Yes           Yes      Yes          No              No
Dependabot               Yes            No        No            No       No           No              No
OWASP Dependency-Check   Yes            No        No            No       No           No              No
Finally, we find that none of the tools provide a way to locate where the vulnerable dependency is used in code (column "Locate dep code"). Developers have to rely on their own set of static/dynamic analysis tools to know exactly where the vulnerable dependency is used in the codebase. We believe static analysis tools would benefit from using the features provided by module bundlers that scan the code for import statements to provide the path to the source file in which a dependency is imported and used.
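As a rough illustration of locating where a dependency is used, the sketch below scans source text line by line for `import` and `require` statements that reference a given package. It is a regular-expression approximation and not how module bundlers resolve imports (they parse the code); the function names and the sample source are hypothetical.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the 1-based line numbers (and text) where `packageName` is imported.
function findImports(sourceText, packageName) {
  const name = escapeRegExp(packageName);
  // Matches: from "pkg", import "pkg", import("pkg"), require("pkg"),
  // and subpath imports such as "pkg/sub".
  const pattern = new RegExp(
    `(?:from\\s*|import\\s*\\(?\\s*|require\\s*\\(\\s*)["']${name}(?:/[^"']*)?["']`
  );
  const hits = [];
  sourceText.split("\n").forEach((line, index) => {
    if (pattern.test(line)) hits.push({ line: index + 1, text: line.trim() });
  });
  return hits;
}

// Illustrative source file content.
const sampleSource = [
  'import React from "react";',
  "import { render } from 'react-dom';",
  'const isEqual = require("react-fast-compare");',
  'import "react/jsx-runtime";',
].join("\n");

const reactHits = findImports(sampleSource, "react");
console.log(reactHits);
```

Because the check is textual, it can be fooled by imports inside comments or strings; a parser-based tool would avoid those false matches.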
6 RELATED WORK
In this section, we discuss the related literature divided into three
aspects. First, we discuss works that have focused on the challenges
related to the Software Bill of Materials. Then, we discuss works
describing the challenges of dependency management in software
ecosystems. Finally, we discuss existing tools and approaches to
detect vulnerable dependencies.
6.1 Software Bill of Materials
The Cybersecurity and Infrastructure Security Agency (CISA) defines the Bill of Materials (BOM) as a nested inventory of components in a piece of software [1]. The process of identifying production dependencies is part of constructing the BOM of a software system. Several studies have proposed approaches to consolidate the BOM of software applications [14, 23, 34]. Zajdel et al. discussed that users of open source software tend to arbitrarily download the software into their build systems but rarely keep track of which versions they use, which results in unnecessary software being left in the application, increasing the risk of potential vulnerabilities [47]. Coelho et al. proposed a data-driven approach to measure the level of maintenance activity in GitHub projects [21]. The authors found that 16% of the studied open source projects have become unmaintained over the course of one year. They also reported that software tools such as compilers and editors have the highest maintenance activity over time and proposed that a metric about the level of maintenance activity of GitHub projects can help developers in selecting open source projects.
These prior studies focus on the importance of selecting well-maintained software libraries and propose approaches to alleviate the challenges related to open source code reuse. However, the challenges of constructing the BOM for dynamic languages like JavaScript are still present. Thus, our study focuses on JavaScript projects and leverages existing approaches (i.e., module bundlers and tree shaking) to help developers maintain their software and decrease the risks related to open source code reuse by analyzing the software's dependencies and reporting on the ones that are actually used in the code.
6.2 Dependency Studies
Package ecosystems and the presence of vulnerable dependencies have been studied in the literature [15, 25, 30, 31]. Hejderup et al. report that one-third of the npm packages use vulnerable dependencies [28]. Similar to our study, the authors suggest the context of usage of a package as a possible reason for not fixing the vulnerable dependencies. Abdalkareem et al. conduct an empirical analysis of security vulnerabilities in Python packages [15]. They find that the number of vulnerabilities in the PyPI ecosystem increases over time and that vulnerabilities take a median of more than 3 years to be discovered, regardless of their severity. They emphasize the need for a more effective process to detect vulnerabilities in open source packages, since both npm and PyPI allow a package release to be published to the registry with no security checks. Lauinger et al. conduct the first large-scale study of JavaScript open source projects and investigate the relationship between outdated dependencies and dependencies with known vulnerabilities [33]. They report that transitive dependencies are more likely to be vulnerable since developers may not be aware of them and have less control over them, which further corroborates our findings in RQ3. Similarly, Williams et al. report that 26% of open source Maven packages have known vulnerabilities and refer to a lack of meaningful controls of the components used in the proprietary projects as a partial explanation for this high number of vulnerable dependencies [46].
These prior studies focus on the presence of known vulnerabilities in popular package ecosystems and the reason why the number of vulnerable dependencies is so high. However, their analysis does not consider the scope of dependencies (i.e., they do not distinguish production and non-production dependencies). As a result, the studied vulnerable dependencies may not be exploitable. Zapata et al. investigate vulnerable dependency migrations of npm packages and evaluate the impact of a vulnerability in the ws package on 60 JavaScript projects using the vulnerable version of the package [48]. The authors find that up to 73.3% of the dependent applications were safe from the vulnerability since they did not actually use the vulnerable code. The study also highlights that it is not trivial to map vulnerable code to client usage for JavaScript, which further corroborates our findings in RQ1.
6.3 Detecting Vulnerable Dependencies
Alfadel et al. study the use of Dependabot security pull requests in 2,904 JavaScript open source GitHub projects [16]. Results show that the majority (65.42%) of the security-related pull requests are often merged within a day and that the severity of the vulnerable dependency or potential risk for breaking changes are not associated with the merge time. Ponta et al. propose a pragmatic approach to facilitate the assessment of vulnerable dependencies in open source libraries by mapping patch-based changes of vulnerabilities onto the affected components of the application [40]. Sejfia et al. present Amalfi, a machine-learning based approach for automatically detecting potentially malicious packages [42]. The authors evaluate their approach on 96,287 npm package versions published over the course of one week and identify 95 previously unknown vulnerabilities. Pashchenko et al. propose Vuln4Real, a methodology that addresses the over-inflation problem of academic and industrial approaches for reporting vulnerable dependencies in free and open source software (FOSS) [38]. Vuln4Real extends state-of-the-art approaches to analyzing dependencies by filtering development-only dependencies, grouping dependencies by project, and assessing dead dependencies. Their evaluation of Vuln4Real shows that the methodology significantly reduces the number of false alerts for code in production (i.e., dependencies wrongly flagged as vulnerable). Pashchenko et al.'s work is the closest to ours since it considers similar aspects in relation to the relevance of vulnerable dependencies: exploitability and dependency scope. Our study touches on another aspect that is not discussed in Vuln4Real: the context in which a dependency is used versus how it is configured. Our paper shows that there is a discrepancy between the configuration of dependencies and their usage, and that this discrepancy may affect the exploitability of a vulnerability (i.e., its relevance to the application).
Imtiaz et al. [29] present an in-depth case study comparing the analysis reports of 9 SCA tools on OpenMRS, a large web application composed of Maven and npm projects. The study shows that the tools vary in their vulnerability reporting and that the count of vulnerable dependencies reported for npm projects ranges from 32 to 239. Of the 9 studied SCA tools, 4 freely available tools could be applied to npm projects: OWASP Dependency-Check, Snyk, Dependabot, and npm audit. The results show that all 4 tools detect vulnerable dependencies across all scopes and depths and that reported vulnerabilities are mostly introduced through transitive dependencies, except for Dependabot. While the authors of that paper report on the coverage capabilities of SCA tools, our study mainly focuses on the data that is shown to the user. For example, npm audit covers dependencies of all scopes when reporting vulnerabilities, but it is the user's responsibility to filter the vulnerable dependencies by scope (production or development). That is, SCA tools do not explicitly report on the scope of vulnerable dependencies, and when this is done manually by users, the analysis depends on the project's dependency configurations rather than dependency usage.
7 THREATS TO VALIDITY
Threats to internal validity consider the experimenter’s bias
and errors. Our method of analysis relies on building software
projects with their configured module bundler to identify production
dependencies, and errors in this process may introduce false
positives/negatives in our analysis. We mitigate this threat by 1)
only selecting projects that already use module bundlers to minimize
any intervention that could introduce bugs in the process,
2) building each project manually by following the project’s documentation,
3) manually inspecting the built artifacts (e.g., installed
dependencies, source map files), and 4) removing 55 projects that
showed evidence of failed builds (e.g., errors, empty source map
files). To further confirm the validity of this process, we also sampled
7 projects from our dataset and asked contributors to validate
our results by checking the accuracy of the yielded production
dependencies. We received responses from 4 projects, and contributors
confirmed the yielded classifications, helping us validate the
soundness of our methodology.
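The inspection of source map files described above can be approximated
programmatically. The Python sketch below is a minimal illustration, not the
parser used in this study. It assumes a version 3 source map whose "sources"
entries contain node_modules/ path segments, as bundlers commonly emit; the
function names are hypothetical. An empty "sources" list is treated as evidence
of a failed build.

```python
import json
from typing import Optional, Set


def package_from_source(path: str) -> Optional[str]:
    """Return the npm package a bundled source path belongs to, or None."""
    marker = "node_modules/"
    index = path.rfind(marker)  # the last occurrence handles nested node_modules
    if index == -1:
        return None  # the project's own code, not a dependency
    parts = path[index + len(marker):].split("/")
    if parts[0].startswith("@") and len(parts) > 1:
        return "/".join(parts[:2])  # scoped package, e.g. @scope/name
    return parts[0] or None


def production_packages(source_map_text: str) -> Set[str]:
    """Collect the packages whose files appear in a source map."""
    source_map = json.loads(source_map_text)
    sources = source_map.get("sources", [])
    if not sources:
        # Mirrors the exclusion criterion above: an empty source map
        # suggests that the build did not succeed.
        raise ValueError("empty source map: the build likely failed")
    return {name for name in map(package_from_source, sources) if name}


if __name__ == "__main__":
    # Hypothetical source map content, for illustration only.
    example = json.dumps({
        "version": 3,
        "sources": [
            "webpack:///./src/index.js",
            "webpack:///./node_modules/lodash/lodash.js",
            "webpack:///./node_modules/@babel/runtime/helpers/esm/extends.js",
            "webpack:///./node_modules/a/node_modules/b/index.js",
        ],
    })
    print(sorted(production_packages(example)))
```

Taking the last node_modules/ segment attributes a nested file to the innermost
package, which is the package whose code is actually bundled.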
Threats to external validity consider the generalizability of
the findings. We purposefully select projects that already use module
bundlers, which could limit the type of project to which our findings
generalize. First, module bundlers tend to be used primarily by projects
that want to minimize their production dependencies, such as
client-side packages like web applications and libraries. In fact, our
finding that development dependencies are shipped to production
is unlikely to hold in projects that do not use module bundlers.
Second, our dataset is strictly composed of open source JavaScript
projects; thus, our results may differ if a study is performed on
proprietary projects or projects written in other languages.
8 CONCLUSION AND FUTURE WORK
This research investigates project dependencies that are released
to production and their impact on security and dependency management.
We conducted our study on 100 projects that use npm, one of
the largest and fastest growing software ecosystems. Our results
showed that production dependencies are rare among the installed
dependencies of a project, but are difficult to identify. Commonly
held assumptions of dependency management do not hold in practice,
and context is more important than configuration in determining
the scope of a dependency. Furthermore, we evaluated
how often security alerts are reported for production dependencies,
and found that none of the vulnerability reports are emitted for
dependencies released to production. Rather, the majority of the
alerts are emitted for development, transitive dependencies, which
has two main implications: 1) not every vulnerability is a threat to
the software in production, and 2) vulnerabilities can be introduced
transitively regardless of their scope, which further motivates the
need for SCA tools to provide such an analysis.
Our paper outlines directions for future work. Using module
bundlers to identify production dependencies may augment
current SCA tools, providing better insight into the scope of
dependencies within a project’s context and usage. Consequently,
module bundlers, or similar tools, may benefit far more
than just client-side applications and should be part of the build
process of projects that rely extensively on open source code.
REFERENCES
[1] [n. d.]. Software Bill of Materials | CISA. https://www.cisa.gov/sbom
[2] 2019. 2019 State of the Software Supply Chain.
https://www.sonatype.com/hubfs/SSC/2019%20SSC/SON_SSSC-Report-2019_jun16-DRAFT.pdf
[3] 2019. Eight Key Findings Illustrating How to Make Open Source Work Even
Better for Developers.
https://cdn2.hubspot.net/hubfs/4008838/Resources/The-2019-Tidelift-managed-open-source-survey-results.pdf
[4] 2019. webpack. https://webpack.js.org/
[5] 2020. Do "dependencies" and "devDependencies" matter when using Webpack?
https://jsramblings.com/do-dependencies-devdependencies-matter-when-using-webpack/
[6] 2020. npm-deps-parser. https://github.com/nVisium/npm-deps-parser
[7] 2020. Securing the World’s Software.
https://octoverse.github.com/static/github-octoverse-2020-security-report.pdf
[8] 2021. Create react app. https://create-react-app.dev/
[9] 2021. Help, `npm audit` says I have a vulnerability in react-scripts! · Issue #11174 ·
facebook/create-react-app. https://github.com/facebook/create-react-app/issues/11174
[10] 2021. rollup.js. https://rollupjs.org/guide/en/
[11] 2022. The Complete Guide to Software Composition Analysis - FOSSA.
https://fossa.com/complete-guide-software-composition-analysis
[12] 2022. GitHub Advisory Database. https://github.com/advisories
[13] 2022. Snyk | Developer security | Develop fast. Stay secure. https://snyk.io/
[14] Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad
Shihab. 2017. Why do developers use trivial packages? an empirical case study
on npm. Proceedings of the 2017 11th Joint Meeting on Foundations of Software
Engineering (08 2017). https://doi.org/10.1145/3106237.3106267
[15] Rabe Abdalkareem, Vinicius Oda, Suhaib Mujahid, and Emad Shihab. 2020. On
the impact of using trivial packages: an empirical case study on npm and PyPI.
Empirical Software Engineering 25 (01 2020), 1168–1204.
https://doi.org/10.1007/s10664-019-09792-9
[16] Mahmoud Alfadel, Diego Elias Costa, Emad Shihab, and Mouafak Mkhallalati.
2021. On the Use of Dependabot Security Pull Requests. In 2021 IEEE/ACM
18th International Conference on Mining Software Repositories (MSR). 254–265.
https://doi.org/10.1109/MSR52588.2021.00037
[17] Md Atique, Reza Chowdhury, Rabe Abdalkareem, and Emad Shihab. 2019. On the
Untriviality of Trivial Packages: An Empirical Study of npm JavaScript Packages.
Journal of IEEE Transactions on Software Engineering 01 (2019).
http://das.encs.concordia.ca/uploads/atique_tse2021.pdf
[18] Victor R. Basili, Lionel C. Briand, and Walcélio L. Melo. 1996. How reuse influences
productivity in object-oriented systems. Commun. ACM 39 (10 1996), 104–116.
https://doi.org/10.1145/236156.236184
[19] Chris Bogart, Christian Kästner, James Herbsleb, and Ferdian Thung. 2021. When
and How to Make Breaking Changes. ACM Transactions on Software Engineering
and Methodology 30 (07 2021), 1–56. https://doi.org/10.1145/3447245
[20] Xiaowei Chen, Rabe Abdalkareem, Suhaib Mujahid, Emad Shihab, and Xin Xia.
2021. Helping or not Helping? Why and How Trivial Packages Impact the npm
Ecosystem. Empirical Software Engineering 26 (03 2021).
https://doi.org/10.1007/s10664-020-09904-w
[21] Jailton Coelho, Marco Túlio Valente, Luciano Milen, and Luciana Lourdes Silva.
2020. Is this GitHub Project Maintained? Measuring the Level of Maintenance
Activity of Open-Source Projects. CoRR abs/2003.04755 (2020). arXiv:2003.04755
https://arxiv.org/abs/2003.04755
[22] Diego Elias Costa, Suhaib Mujahid, Rabe Abdalkareem, and Emad Shihab. 2021.
Breaking Type-Safety in Go: An Empirical Study on the Usage of the unsafe
Package. IEEE Transactions on Software Engineering (2021), 1–1.
https://doi.org/10.1109/TSE.2021.3057720
[23] Diego Elias Costa, Suhaib Mujahid, Rabe Abdalkareem, and Emad Shihab. 2021.
Breaking Type-Safety in Go: An Empirical Study on the Usage of the unsafe
Package. IEEE Transactions on Software Engineering (2021), 1–1.
https://doi.org/10.1109/TSE.2021.3057720
[24] Joel Cox, Eric Bouwers, Marko van Eekelen, and Joost Visser. 2015. Measuring
Dependency Freshness in Software Systems. In 2015 IEEE/ACM 37th
IEEE International Conference on Software Engineering, Vol. 2. 109–118.
https://doi.org/10.1109/ICSE.2015.140
[25] Alexandre Decan, Tom Mens, and Philippe Grosjean. 2019. An Empirical Comparison
of Dependency Network Evolution in Seven Software Packaging Ecosystems.
Empirical Software Engineering 24 (02 2019).
https://doi.org/10.1007/s10664-017-9589-y
[26] Josh Fruhlinger. 2020. Equifax data breach FAQ: What happened, who was
affected, what was the impact?
https://www.csoonline.com/article/3444488/equifax-data-breach-faq-what-happened-who-was-affected-what-was-the-impact.html
[27] Emitza Guzman, David Azócar, and Yang Li. 2014. Sentiment Analysis of
Commit Comments in GitHub: An Empirical Study. In Proceedings of the 11th
Working Conference on Mining Software Repositories (Hyderabad, India) (MSR
2014). Association for Computing Machinery, New York, NY, USA, 352–355.
https://doi.org/10.1145/2597073.2597118
[28] J. I. Hejderup. 2015. In Dependencies We Trust: How vulnerable
are dependencies in software modules? repository.tudelft.nl (2015).
https://repository.tudelft.nl/islandora/object/uuid:3a15293b-16f6-4e9d-b6a2-f02cd52f1a9e?collection=education
[29] Nasif Imtiaz, Seaver Thorn, and Laurie Williams. 2021. A comparative study of
vulnerability reporting by software composition analysis tools. Proceedings of
the 15th ACM / IEEE International Symposium on Empirical Software Engineering
and Measurement (ESEM) (10 2021). https://doi.org/10.1145/3475716.3475769
[30] Abbas Javan Jafari, Diego Elias Costa, Rabe Abdalkareem, Emad Shihab, and
Nikolaos Tsantalis. 2021. Dependency Smells in JavaScript Projects. IEEE Transactions
on Software Engineering (2021), 1–1. https://doi.org/10.1109/tse.2021.3106247
[31] Riivo Kikas, Georgios Gousios, Marlon Dumas, and Dietmar Pfahl. 2017. Structure
and Evolution of Package Dependency Networks. In Proceedings of the 14th
International Conference on Mining Software Repositories (Buenos Aires, Argentina)
(MSR ’17). IEEE Press, 102–112. https://doi.org/10.1109/MSR.2017.55
[32] Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, and Katsuro
Inoue. 2017. Do developers update their library dependencies? Empirical Software
Engineering 23, 1 (may 2017), 384–417. https://doi.org/10.1007/s10664-017-9521-5
[33] Tobias Lauinger, Abdelberi Chaabane, Sajjad Arshad, William Robertson, Christo
Wilson, and Engin Kirda. 2017. Thou Shalt Not Depend on Me: Analysing the
Use of Outdated JavaScript Libraries on the Web. In Proceedings 2017 Network
and Distributed System Security Symposium. Internet Society.
https://doi.org/10.14722/ndss.2017.23414
[34] Suhaib Mujahid, Diego Elias Costa, Rabe Abdalkareem, Emad Shihab,
Mohamed Aymen Saied, and Bram Adams. 2021. Toward Using Package Centrality
Trend to Identify Packages in Decline. IEEE Transactions on Engineering
Management (2021), 1–15. https://doi.org/10.1109/tem.2021.3122012
[35] Emerson Murphy-Hill, Ciera Jaspan, Caitlin Sadowski, David Shepherd, Michael
Phillips, Collin Winter, Andrea Knight, Edward Smith, and Matt Jorde. 2019.
What Predicts Software Developers’ Productivity? IEEE Transactions on Software
Engineering (2019), 1–1. https://doi.org/10.1109/tse.2019.2900308
[36] Stack Overflow. [n. d.]. Stack Overflow Developer Survey 2021.
https://insights.stackoverflow.com/survey/2021
[37] Ivan Pashchenko, Henrik Plate, Serena Ponta, Antonino Sabetta, and Fabio
Massacci. 2018. Vulnerable open source dependencies: counting those that matter.
1–10. https://doi.org/10.1145/3239235.3268920
[38] Ivan Pashchenko, Henrik Plate, Serena Ponta, Antonino Sabetta, and Fabio
Massacci. 2020. Vuln4Real: A Methodology for Counting Actually Vulnerable
Dependencies. IEEE Transactions on Software Engineering PP (09 2020), 1–1.
https://doi.org/10.1109/TSE.2020.3025443
[39] Ivan Pashchenko, Duc-Ly Vu, and Fabio Massacci. 2020. A Qualitative Study of
Dependency Management and Its Security Implications. Association for Computing
Machinery, New York, NY, USA, 1513–1531.
https://doi.org/10.1145/3372297.3417232
[40] Henrik Plate, Serena Ponta, and Antonino Sabetta. 2015. Impact assessment for
vulnerabilities in open-source software libraries. 411–420.
https://doi.org/10.1109/ICSM.2015.7332492
[41] Baishakhi Ray, Daryl Posnett, Vladimir Filkov, and Premkumar Devanbu. 2014. A
Large Scale Study of Programming Languages and Code Quality in Github. In
Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of
Software Engineering (Hong Kong, China) (FSE 2014). Association for Computing
Machinery, New York, NY, USA, 155–165. https://doi.org/10.1145/2635868.2635922
[42] Adriana Sejfia and Max Schäfer. 2022. Practical Automated Detection of Malicious
npm Packages. arXiv preprint arXiv:2202.13953 (2022).
[43] unisil. 2021. Source Map Parser. https://github.com/unisil/source-map-parser
[44] Haroen Viaene. 2021. feat(dependencies): update algoliasearch-helper.
https://github.com/algolia/instantsearch.js/pull/4936. (Accessed on 05/04/2022).
[45] Stefan Wagner and Emerson Murphy-Hill. 2019. Factors That Influence Productivity:
A Checklist. 69–84. https://doi.org/10.1007/978-1-4842-4221-6_8
[46] Jeff Williams and Arshan Dabirsiaghi. 2012. The unfortunate reality of insecure
libraries. Asp. Secur. Inc (2012), 1–26.
[47] Stan Zajdel, Diego Elias Costa, and Hafedh Mili. 2022. Open Source Software: An
Approach to Controlling Usage and Risk in Application Ecosystems. In Proceedings
of the 26th ACM International Systems and Software Product Line Conference.
arXiv. https://doi.org/10.48550/ARXIV.2206.10358
[48] Rodrigo Zapata, Raula Kula, Bodin Chinthanet, Takashi Ishio, Kenichi Matsumoto,
and Akinori Ihara. 2018. Towards Smoother Library Migrations: A Look at Vulnerable
Dependency Migrations at Function Level for npm JavaScript Packages.
559–563. https://doi.org/10.1109/ICSME.2018.00067