My Notes on npm and Yarn. You’re Welcome!

What is this post all about?

This post is a by-product of my own learning about npm and Yarn; perhaps you can benefit from it. It groups important topics related to npm and Yarn, each of which is covered in many other blog posts. Reading it may ease the learning process so you can stay focused on what’s important.

These sections are not necessarily related. Feel free to jump between them or read the whole post through.

  • History of npm and Yarn
  • How npm resolves dependencies
  • Package.lock / Shrinkwrap / Yarn.lock
  • Scoped packages
  • NPX
  • Yarn selective dependency resolution
  • Yarn workspaces
  • npm hooks

History of npm and Yarn

To supply a full package-management solution, npm consists of 3 parts:

  1. CLI (Command Line Interface) tool: used to download the packages and manage the dependency structure. This part is what we will focus on.
  2. Registry: a repository of open-source packages hosted by npm, e.g., Lodash and React.
  3. Website: serves as a catalog for the registry.

In 2014, with a rise in the popularity of npm, came npm Inc. which became the main maintainer of npm and started to offer premium services, like private packages which we cover later on.

In 2015 npm 3 was released and introduced a core change in the way it resolves dependencies, more on that later.

In 2016 Yarn was released by Facebook, as an alternative CLI tool for downloading and managing the packages from the npm registry.

Yarn came to solve the problems Facebook faced while using npm:

- Consistency issue when installing dependencies across different machines and users.

- The amount of time it took to pull dependencies in.

- Security concerns with the way the npm client executes code.

From Yarn: A new package manager for JavaScript

The great thing about Yarn is that it pushed npm forward: several features that Yarn introduced were later added to npm.

At the time of writing, npm is on version 6.11.3.

How npm resolves dependencies

The entry point for npm to start looking for dependencies is your app’s package.json dependencies section. When you run npm install, npm goes over this section and for each dependency downloads its packages from the npm registry. Each package contains its own package.json which npm goes over as well, and there you have a recursive iteration which eventually results in a node_modules directory, that contains all of the needed dependencies to run your app.

Let’s take for example a case where your app depends on 2 modules, A and C. These 2 modules depend on another module, called B, but they require different versions of B:

[Image: modules A and C, each requiring a different version of B]
From how npm works

Which version of B should be installed, v1 or v2? This problem is called Dependency Hell: as you go deeper and handle more dependencies and sub-dependencies, you will face this decision more and more often.

To overcome it, npm installs both versions of B, but in a flat way:

[Image: node_modules with A, B v1.0, and C at the top level, and B v2.0 nested under C]
From how npm works

This is the flow of how npm does it:

  1. Scan the app’s package.json:

"dependencies": {
"mod-a": "^1.0", // requires mod-b v1.0 in its own package.json
"mod-c": "^1.0" // requires mod-b v2.0 in its own package.json
}

  2. Download module A v1.0 and put it under the node_modules directory.

  3. Look at the dependencies of A, which is B v1.0. Does it exist in our node_modules? If not, download it and put it under node_modules.

  4. Module B does not require other dependencies, so we can continue to module C and download it.

  5. Module C depends on B v2.0. Does it exist in our node_modules? No, only v1.0 exists there.

  6. Download module B v2.0 and put it under the nested node_modules directory of module C:

[Image: B v2.0 installed under the node_modules directory of module C]
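
The flow above can be sketched in a few lines of JavaScript. This is only a rough model and not npm's actual implementation: the registry object is a hypothetical in-memory stand-in for the npm registry, exact versions are used instead of ranges, and a dependency is placed either at the top level or directly under the package that requires it.

```javascript
// Hypothetical in-memory registry: "name@version" -> that package's own dependencies.
// Exact versions are used instead of ranges to keep the sketch short.
const registry = {
  "mod-a@1.0": { "mod-b": "1.0" },
  "mod-b@1.0": {},
  "mod-b@2.0": {},
  "mod-c@1.0": { "mod-b": "2.0" },
};

// Place each dependency at the top level when the name is free there;
// otherwise nest it under the package that requires it.
function installInto(root, ownDir, deps) {
  for (const [name, version] of Object.entries(deps)) {
    const atTop = root[name];
    if (atTop && atTop.version === version) continue; // already available, reuse it
    const target = atTop ? ownDir : root; // name taken by another version: nest
    if (target[name]) continue; // already nested here
    target[name] = { version, node_modules: {} };
    installInto(root, target[name].node_modules, registry[`${name}@${version}`]);
  }
}

function install(appDeps) {
  const root = {};
  installInto(root, root, appDeps);
  return root;
}

const tree = install({ "mod-a": "1.0", "mod-c": "1.0" });
console.log(JSON.stringify(tree, null, 2));
```

In the printed tree, mod-b 1.0 sits next to mod-a and mod-c at the top level, while mod-b 2.0 appears only inside the node_modules of mod-c.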

Let’s add another dependency to our main app — module D v1.0.

Module D, like module C, requires B v2.0.

Now when we run npm install we encounter Module D which does not exist under node_modules, so we download it and also look for its own dependencies — module B v2.0. Again, it does not exist on the top-level node_modules, so we install it as we did for module C.

[Image: B v2.0 installed twice, once under C and once under D]
From how npm works

We got a duplication here. It would make more sense for B v2.0 to be installed only once, at the top level of node_modules, and for B v1.0 to be nested under A.

npm dedupe

Running this command would restructure node_modules and remove the duplication:

[Image: the restructured node_modules, with B v2.0 installed only once]

This is of course just an optimization. I personally do not encounter it much, but it is a good thing to know.

Before finishing this part, I would like to touch on one more topic — npm non-determinism. You may have noticed that when you run npm install, the node_modules directory on your machine is not always identical to the one on your teammates’ machine, or the one installed on your server. This is because the dependency tree depends on the installation order.

Let’s take a look at the following example:

John creates a new app with the following dependencies inside package.json.

"dependencies": {
"mod-a": "^1.0.0", // depends on B v1.0
"mod-c": "^1.0.0", // depends on B v2.0
"mod-d": "^1.0.0", // depends on B v2.0
"mod-e": "^1.0.0" // depends on B v1.0
}

John runs npm install which results in this node_modules structure on his machine:

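[Image: John’s node_modules after the first install, with A, B v1.0, C, D, and E at the top level and B v2.0 nested under both C and D]
From how npm works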

After a while, John creates a new feature which requires him to upgrade module A to v2.0. Module A v2.0 requires module B also in v2.0, so after running npm install again, John’s node_modules looks like this:

[Image: John’s node_modules after the upgrade, where B v1.0 stays at the top level and B v2.0 is nested under A, C, and D]
From how npm works

John pushes the new changes and another developer in the team, Dave, pulls the project for the first time. Now when Dave runs npm install, he gets the following node_modules structure:

[Image: Dave’s node_modules, where B v2.0 is at the top level and B v1.0 is nested under E]
From how npm works

Same project, same package.json, but different node_modules tree than John’s.

This happens due to the installation order. Dave installed the node_modules from scratch where module A was the first to be installed, so this time, B v2.0 was installed on the top level of the node_modules dir.

This difference in the node_modules tree structure is OK: the application will work the same on both machines, just with a different directory layout. The npm docs say it loud and clear.
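
To see why both layouts behave the same, here is a small sketch of the lookup Node performs when a module requires a dependency: it checks the module's own node_modules first and then walks up towards the top level. The two trees are my reconstruction of John's and Dave's directories from the description above, and the helper names (pkg, resolveFrom) are made up for this illustration.

```javascript
// Helper to describe an installed package and its nested node_modules.
const pkg = (version, node_modules = {}) => ({ version, node_modules });

// John's tree: B v1.0 stayed at the top level, B v2.0 is nested where needed.
const john = {
  "mod-a": pkg("2.0", { "mod-b": pkg("2.0") }),
  "mod-b": pkg("1.0"),
  "mod-c": pkg("1.0", { "mod-b": pkg("2.0") }),
  "mod-d": pkg("1.0", { "mod-b": pkg("2.0") }),
  "mod-e": pkg("1.0"),
};

// Dave's tree: B v2.0 landed at the top level, B v1.0 is nested under E.
const dave = {
  "mod-a": pkg("2.0"),
  "mod-b": pkg("2.0"),
  "mod-c": pkg("1.0"),
  "mod-d": pkg("1.0"),
  "mod-e": pkg("1.0", { "mod-b": pkg("1.0") }),
};

// Find which version of `dep` a module sees. `path` lists the package names
// leading from the top-level node_modules down to the requiring module.
function resolveFrom(root, path, dep) {
  const dirs = [root];
  let dir = root;
  for (const name of path) {
    dir = dir[name].node_modules;
    dirs.push(dir);
  }
  // Search from the innermost node_modules outwards.
  for (let i = dirs.length - 1; i >= 0; i--) {
    if (dirs[i][dep]) return dirs[i][dep].version;
  }
  return null;
}

for (const name of ["mod-a", "mod-c", "mod-d", "mod-e"]) {
  console.log(name, resolveFrom(john, [name], "mod-b"), resolveFrom(dave, [name], "mod-b"));
}
```

For every module, both trees resolve the same version of B: v2.0 for A, C, and D, and v1.0 for E.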

Package.lock / Shrinkwrap / Yarn.lock

As we said, package.json tells npm which versions of the dependencies it has to download. If we used only fixed versions in package.json, every developer who uses the project and every deployment of it would always get exactly the same package versions in node_modules.

// Fixed versions
"dependencies": {
"chalk": "2.4.1",
"commander": "2.19.0",
"concurrently": "3.4.0"
}

But we rarely see that, as we prefer the semantic versioning symbols ^ and ~, which give us the latest minor or patch version of a package (^ accepts newer minor and patch releases, ~ accepts newer patch releases only):

// Using semver
"dependencies": {
"chalk": "^2.0.0",
"commander": "^2.0.0",
"concurrently": "~3.4.0"
}
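
To make the two symbols concrete, here is a simplified matcher. It is a sketch under some assumptions, not the real semver library: it only handles exact versions and ^ / ~ ranges written as full x.y.z, with a major version of 1 or higher and no pre-release tags. The function names are my own.

```javascript
// Parse "1.2.3" into [1, 2, 3]. Assumes a full major.minor.patch string.
function parseVersion(v) {
  return v.split(".").map(Number);
}

// Negative if a < b, positive if a > b, zero if equal.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

function satisfies(version, range) {
  const v = parseVersion(version);
  if (range.startsWith("^")) {
    // Caret: same major, anything at or above the stated version.
    const min = parseVersion(range.slice(1));
    return v[0] === min[0] && compare(v, min) >= 0;
  }
  if (range.startsWith("~")) {
    // Tilde: same major and minor, patch at or above the stated one.
    const min = parseVersion(range.slice(1));
    return v[0] === min[0] && v[1] === min[1] && compare(v, min) >= 0;
  }
  // No symbol: only the exact version matches.
  return compare(v, parseVersion(range)) === 0;
}

// Pick the highest published version that satisfies the range, or null.
function maxSatisfying(versions, range) {
  const matching = versions.filter((v) => satisfies(v, range));
  matching.sort((a, b) => compare(parseVersion(a), parseVersion(b)));
  return matching.length ? matching[matching.length - 1] : null;
}

console.log(maxSatisfying(["2.5.0", "2.6.0", "3.0.0"], "^2.0.0"));
console.log(maxSatisfying(["3.4.4", "3.4.7", "3.5.0"], "~3.4.0"));
```

The first call picks 2.6.0 (3.0.0 is a new major version), and the second picks 3.4.7 (3.5.0 is a new minor version).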

Let’s make sure we understand what would be the result of this example:

John created the above configuration. He then ran npm install and got the latest versions that satisfied these ranges:

chalk: 2.5.0

commander: 2.2.1

concurrently: 3.4.4

A week later Jane started to work on the project, but during that week the packages released new versions, so she got the following node modules installed on her machine after running npm install:

chalk: 2.6.0

commander: 2.5.0

concurrently: 3.4.7

Jane added a new feature and then deployed the project to production. By that time, the packages were updated once more and now the server in production got the following node modules installed:

chalk: 2.7.0

commander: 2.6.0

concurrently: 3.4.8

Now John, Jane and the server in production all have different versions of these dependencies. This can result in unexpected behavior if new bugs are introduced in these dependencies, or if their authors do not comply with the conventions of semantic versioning.

The solution to this problem is to lock the versions of dependencies. This is what all three techniques do: package-lock.json, Shrinkwrap, and yarn.lock.

Shrinkwrap was the first, and it has existed since the early days of npm. If you run npm shrinkwrap, you get an npm-shrinkwrap.json file, which includes metadata that locks each dependency to a specific version.

{
  "name": "A",
  "version": "0.1.0",
  "dependencies": {
    "B": {
      "version": "0.0.1",
      "from": "B@^0.0.1",
      "resolved": "https://registry.npmjs.org/B/-/B-0.0.1.tgz",
      "dependencies": {
        "C": {
          "version": "0.0.1",
          "from": "org/C#v0.0.1",
          "resolved": "git://github.com/org/C.git#5c380ae319fc4efe9e7f2d9c78b0faa588fd99b4"
        }
      }
    }
  }
}

From npm docs

This new file needs to be committed together with your package.json. Next time the project is deployed to a server, or a developer gets the code and runs npm install, dependencies will get installed according to the metadata from npm-shrinkwrap.json.
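
The effect of a committed lock file can be sketched like this. It is a toy model, not npm's resolver: the "latest" maps reuse the illustrative version numbers from John and Jane's story above, and the lock object only mimics the dependencies/version shape of npm-shrinkwrap.json.

```javascript
// Versions the registry would hand out for each range at two points in time
// (the illustrative numbers from the text, not real release data).
const latestWhenJohnInstalled = { chalk: "2.5.0", commander: "2.2.1", concurrently: "3.4.4" };
const latestWhenJaneInstalled = { chalk: "2.6.0", commander: "2.5.0", concurrently: "3.4.7" };

// Use the locked version when the lock file has an entry; otherwise take
// whatever currently satisfies the range.
function resolveVersions(latestMatching, lock) {
  const installed = {};
  for (const [name, version] of Object.entries(latestMatching)) {
    const locked = lock && lock.dependencies[name];
    installed[name] = locked ? locked.version : version;
  }
  return installed;
}

// John installs without a lock file, then writes one from what he got.
const johnInstalled = resolveVersions(latestWhenJohnInstalled, null);
const lockFile = { dependencies: {} };
for (const [name, version] of Object.entries(johnInstalled)) {
  lockFile.dependencies[name] = { version };
}

// A week later Jane installs, once without and once with John's lock file.
const janeWithoutLock = resolveVersions(latestWhenJaneInstalled, null);
const janeWithLock = resolveVersions(latestWhenJaneInstalled, lockFile);
console.log(janeWithoutLock.chalk, janeWithLock.chalk);
```

Without the lock file Jane gets chalk 2.6.0; with it she gets 2.5.0, the same version John has.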

With Shrinkwrap you had to say explicitly that you want to create the file, and you also had to update it each time package.json changed.

In 2016 Yarn came out and introduced yarn.lock, which uses the same idea of a metadata file for specific versions.

The yarn.lock file is created for you automatically, and it is automatically updated each time you make a change in package.json. This made developers more aware of the version lock options, as it was more accessible than npm’s shrinkwrap.

npm saw the impact of the auto-generated yarn.lock, and in version 5 introduced package-lock.json to replace shrinkwrap. It uses the exact same structure as npm-shrinkwrap.json to allow backward compatibility. The shrinkwrap command still exists in newer versions of npm for backward compatibility.

To sum this part up: if you want to lock versions, with npm you would commit package-lock.json, and with Yarn you would commit yarn.lock.

Caveats — these lock files are auto-generated and you should not modify them manually. If you decide to switch between npm and yarn, make sure you remove the unnecessary lock file to avoid confusion with your teammates.

Scoped packages

Scoping your packages lets you:

  1. Put related packages together under one scope.
  2. Use the proper name for your package without having to worry it is taken by someone else.
  3. Make your packages private so only your organization can use them.

Let’s take a look at these scoped packages:

@angular/cli

@vue/cli

Each of these packages is a CLI tool but is under a different scope.

Let’s say that you work for an organization called abc and you also write a CLI tool, for your own purposes. You can still call it cli and place it with the other packages of your organization — @abc/cli

In order to scope your package, you have to prefix the package name:

// package.json
{
"name": "@abc/cli"
}

Only the user or organization abc, with the right permissions after logging in to npm, can create a scoped package under @abc.
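
As a small illustration of the naming convention, the sketch below splits a package name into its scope and its bare name. The function name parsePackageName is made up for this example, and the pattern only covers the simple @scope/name form.

```javascript
// Split "@scope/name" into its parts; unscoped names get a null scope.
function parsePackageName(fullName) {
  const match = /^@([^/]+)\/([^/]+)$/.exec(fullName);
  if (match) return { scope: match[1], name: match[2] };
  return { scope: null, name: fullName };
}

console.log(parsePackageName("@angular/cli"));
console.log(parsePackageName("@vue/cli"));
console.log(parsePackageName("lodash"));
```

Both @angular/cli and @vue/cli end up with the same bare name, cli, and differ only in their scope.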

Scoped packages can be public or private. By default, a scoped package is private, which means that only your organization is allowed to access it using authorization. Private scoped packages, in contrast to public ones, are a premium feature.

Making your scoped package public for everyone to use, like the ones we saw above (@angular/cli, @vue/cli and many more), can be done by passing the access flag with the value public when you publish your package:

npm publish --access public

NPX

Let’s say you want to create a new React application and you decide to use the create-react-app generator for it. Because this generator is not part of your app’s dependencies, what you would do prior to NPX is install it globally by running: npm install -g create-react-app. The thing is that once a package is installed globally by npm, it is not automatically updated. So you might have installed create-react-app a few months ago, used it, and now, when you decide to use it again, find that you are not using its latest version.

NPX solves this problem — when you run npx create-react-app, NPX downloads the latest version of create-react-app, installs it, executes it and then removes it.

So what do you gain by using NPX?

  1. You guarantee that you are using the latest version of the tool you wish to execute.
  2. You use a single command to run your tool, and afterwards it is removed completely. This also makes it easier for tool developers to get others to try their tools.

A word of caution: if you already have a package installed globally, for example create-react-app, running npx create-react-app will not download the latest one but will use the global one! So if you decide to use NPX, make sure you remove your global packages first, or at least cherry-pick them.

Yarn’s selective dependency resolution

Yarn lets you override the version of a nested dependency through the resolutions field in package.json. In the following example, we override left-pad, a dependency of our dependency d2.

{
"name": "project",
"version": "1.0.0",
"dependencies": {
"d2": "1.0.0"
},

// This is the new addition:
"resolutions": {
"d2/left-pad": "1.1.1"
}
}

Why would you want to do that?

You may be depending on a package that is not updated frequently, which depends on another package that got an important upgrade. In this case, if the version range specified by your direct dependency does not cover the new sub-dependency version, you are stuck waiting for the author.

A sub-dependency of your project got an important security update and you don’t want to wait for your direct-dependency to issue a minimum version update.

You are relying on an unmaintained but working package and one of its dependencies got upgraded. You know the upgrade would not break things and you also don’t want to fork the package you are relying on, just to update a minor dependency.

Your dependency defines a broad version range and your sub-dependency just got a problematic update so you want to pin it to an earlier version.

From selective dependency resolutions
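
Conceptually, a resolution is a lookup that wins over the version range a dependency asks for. The sketch below models only the exact "parent/child" key form used in the example above; it is not Yarn's implementation, and the function name and the d3 package are hypothetical.

```javascript
// Return the version to install for `child` when it is required by `parent`.
// A matching "parent/child" entry in `resolutions` overrides the requested range.
function resolveVersion(parent, child, requested, resolutions) {
  const key = `${parent}/${child}`;
  return Object.prototype.hasOwnProperty.call(resolutions, key)
    ? resolutions[key]
    : requested;
}

const resolutions = { "d2/left-pad": "1.1.1" };

// d2 asks for left-pad ^1.0.0, but the resolution pins it to 1.1.1.
console.log(resolveVersion("d2", "left-pad", "^1.0.0", resolutions));
// A dependency of a hypothetical package d3 is left untouched.
console.log(resolveVersion("d3", "left-pad", "^1.0.0", resolutions));
```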

Yarn workspaces

Workspaces let you develop several related packages side by side in one project. What you would normally do while developing a package that depends on another one you own is create a symbolic link using npm link or yarn link, or pack and install it using the npm pack / yarn pack commands.

Workspaces use symbolic links internally, but they save you from compiling each package and from the tedious link/unlink steps that require you to change your package.json.

Let’s say that you have 3 packages: A, B, C.

A depends on the other two. What you would do is create a root package.json and configure it this way:

{
"private": true,
"workspaces": ["A", "B", "C"]
}

Now, after you run yarn install from the root, the packages are linked to each other, so A sees the changes you make in B and C.

To see a practical example, step by step, I recommend this excellent post.

npm hooks

Each time a package you’re subscribed to is updated, npm sends an HTTP POST payload to a URL you configure. This way you can set up automated builds, deployment, testing, integration with Slack alerts, or whatever automated process you can think of.

Example:

npm hook add lodash http://your-server.com/handleLodashRelease my-shared-secret

See npm-hook docs

That’s it for today, I hope you gained a thing or two after reading this.

Until next time,

Moshe Kerbel

Many thanks to Itai Ben David and Lee Baror who helped me with this post!
