Pin exact dependency versions

Contents

TL;DR
Tragedy in three terminal windows
Betrayed by a dependency
Non-breaking vs bug-free changes
But I have a dependency lock file!
What are lock files?
Why are lock files great?
Why are lock files not enough?
Two rules for dependency management
How to pin by default
When not to pin
Summary

Buckle up: for some of you this may be controversial, and for others obvious. This is about preventing a problem that makes your application suddenly stop behaving correctly or, even funnier, stop behaving correctly on only one developer’s computer. We can solve this with two simple tricks.

TL;DR

Use a dependency manager that creates a lock file and commit it to the repository. Even then, pin your dependencies – explicitly specify their exact versions. You can probably configure your dependency manager to do it by default.

Not convinced? Want an explanation? Let’s start with a story.

Tragedy in three terminal windows

Actors:

Developer 1
Developer 2

Tools:

dep_manager – a dependency management tool

Developer 1:

cd projects/great_project

dep_manager add great_library
> Installing...
> New dependency great_library v1.0.0 installed!

./run_tests.sh
> All tests successful!

git add -A

git commit -m 'Added a great_library'

git push

Developer 2 – some time later:

cd projects/great_project

git pull
> Updating f4b5383..4069ef7

dep_manager install
> Installing...
> 1 new dependency installed:
>   - great_library v1.0.1

./run_tests.sh
> Some tests failed!

Developer 1 – trying to reproduce the problem:

cd projects/great_project

git pull
> Already up to date.

dep_manager install
> No new dependencies to install

./run_tests.sh
> All tests successful!

Betrayed by a dependency

Even without 20 years of experience in the field, I have seen situations like the one above several times. Maybe I’m just lucky.

Almost every project, no matter the size, uses some external dependencies: libraries and frameworks. That’s reasonable, since we are very rarely paid to reinvent the wheel. We manage dependencies with, well, dependency managers. Each language has one or several of them, and we will look at them more closely a little later.

Those dependencies, and the way we manage them, can cause hard-to-find problems that seem to come out of nowhere. The story above is one example.

The first developer added a new external dependency and everything worked well. Then the other developer pulled the project, installed the dependencies, and everything blew up. When the first developer tried to reproduce the issue, nothing changed for him. On his workstation, everything was still fine.

Can you spot the problem?

When the first developer installed great_library, the latest version at the time, v1.0.0, was added. Then the second developer installed the dependencies. By then a newer version, v1.0.1, was available, and that was the one that landed on his workstation. Unfortunately, this release happened to introduce a bug that broke the application.

And why couldn’t the first developer reproduce the problem, even though he ran the same commands? Well, the dependency manager is a little bit lazy. Since he already had great_library installed, it did not fetch the newer version for him.

The effect? Both of them performed the same operations and ended up with different results.
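
The whole tragedy fits in a few lines of Python. This is a toy model, not how any real tool is implemented: `REGISTRY`, `Workstation`, and `install` are names invented for this sketch, and the imaginary dependency manager has no lock file and simply takes the newest published version unless the library is already installed.

```python
# Toy registry: library name -> published versions, oldest first (hypothetical data).
REGISTRY = {"great_library": ["1.0.0"]}


class Workstation:
    """One developer's machine with its own set of installed libraries."""

    def __init__(self):
        self.installed = {}  # library name -> installed version

    def install(self, library):
        """Install the newest published version, unless the library is already there."""
        if library not in self.installed:
            self.installed[library] = REGISTRY[library][-1]
        return self.installed[library]


developer_1 = Workstation()
developer_2 = Workstation()

developer_1.install("great_library")        # gets 1.0.0, the only release so far
REGISTRY["great_library"].append("1.0.1")   # a new release is published
developer_2.install("great_library")        # gets 1.0.1
developer_1.install("great_library")        # lazy: already installed, stays on 1.0.0

print(developer_1.installed)  # {'great_library': '1.0.0'}
print(developer_2.installed)  # {'great_library': '1.0.1'}
```

Same command, two machines, two different versions.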

Believe it or not, this is the default behavior of quite a lot of dependency managers.

Before we dig into how to prevent this with different dependency managers, we need to understand two concepts: semantic versioning and lock files.

Non-breaking vs bug-free changes

Dependency managers allow you to specify a range of valid versions for each dependency. Moreover, some of them add dependencies with a version range by default, instead of a single, concrete version. For example, when you run:

npm install lodash

then your package.json is updated with something like: "lodash": "^4.15.0".

This little caret symbol (^) says that the minimum valid version of the lodash library is 4.15.0. But it also says that any later version is accepted, as long as it’s lower than 5.0.0.

Why? Because theoretically, if your code works with version 4.15.0, then it should be safe to use any later 4.x.x version. This is because the de facto standard for library versioning is SemVer, or Semantic Versioning.

In short, each subsequent 4.x.x version should be fully compatible with the previous ones, without any breaking changes. Breaking changes, which may cause our code to stop working, can be introduced only in the next “major” version, 5.0.0.
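
To make the caret rule concrete, here is a minimal Python sketch of the check. The function names `parse_version` and `satisfies_caret` are made up for this illustration, and this is not npm's actual implementation: npm treats 0.x versions and pre-release tags specially, which the sketch deliberately ignores.

```python
def parse_version(text):
    """Turn "4.15.0" into the tuple (4, 15, 0) so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies_caret(version, caret_range):
    """Check a version against a caret range such as "^4.15.0".

    Simplified on purpose: assumes a major version of 1 or higher and
    plain MAJOR.MINOR.PATCH numbers without pre-release tags.
    """
    minimum = parse_version(caret_range.lstrip("^"))
    candidate = parse_version(version)
    next_major = (minimum[0] + 1, 0, 0)
    # Accepted: at least the minimum, but below the next major version.
    return minimum <= candidate < next_major


for candidate in ["4.14.9", "4.15.0", "4.17.21", "5.0.0"]:
    print(candidate, satisfies_caret(candidate, "^4.15.0"))
```

Only 4.15.0 and 4.17.21 pass: 4.14.9 is below the minimum, and 5.0.0 is the next major version.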

That’s the theory. Of course, everyone does their best to keep reality as close to it as possible. But sometimes new releases introduce new bugs, and bugs do not care whether the version is 4.16.0 or 5.0.0. If you are lucky enough that the new bug affects your application, it can break after what should be a “safe” dependency update.

Of course, not all libraries follow Semantic Versioning. With those, your expectation of a non-breaking change when bumping from 1.2.0 to 1.3.0 has no basis whatsoever.

So we would probably not want updates like this to happen on their own, without our direct action, right?

But I have a dependency lock file!

Some dependency managers create an additional file when installing libraries: a “lock” file. For example, npm creates package-lock.json.

What are lock files?

The lock file contains a list of all installed dependencies and their versions. This includes both the dependencies we specified, and their dependencies, and their dependencies, and their dependencies, … In other words, direct and transitive dependencies.

As a result, we have two files listing dependencies. The first is the “regular” one, with the direct dependencies we specified. The second is the generated lock file. Both should be committed to the repository.

File	# of dependencies
`package.json`	1 (`"webpack": "5.0.0"`)
`package-lock.json`	124

The number of transitive dependencies may be huge (and not only in the Node.js world).

Then, when installing dependencies on another machine, the dependency manager uses the lock file to determine what to install. Thanks to that, it gives us all libraries in exactly the same versions as used previously. Even if some libraries were updated in the meantime.

This is true even if we specified a version range (like the ^ symbol above) in the “regular” dependencies file. I must admit that until recently I was convinced that a simple npm install on a fresh environment would fetch the newest possible dependencies matching the range, disregarding the lock file. It took me around 3 minutes of testing to confirm that it does not. When the lock file is present, the versions specified in it are always installed. The only exception is when the version in the lock file does not match the version range in the regular file.
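
That decision can be summed up in a short sketch. It is a simplified Python model, not npm's real resolution algorithm: `choose_version`, `satisfies`, and the sample version list are assumptions made for illustration, and only exact versions and caret ranges with a major version of 1 or higher are handled.

```python
def parse_version(text):
    """Turn "4.15.0" into (4, 15, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, declared):
    """True if version matches an exact pin ("4.15.0") or a caret range ("^4.15.0")."""
    if not declared.startswith("^"):
        return version == declared
    minimum = parse_version(declared[1:])
    return minimum <= parse_version(version) < (minimum[0] + 1, 0, 0)


def choose_version(declared, locked, published):
    """Pick the version to install.

    declared  -- what the "regular" dependencies file says
    locked    -- the version recorded in the lock file, or None if there is none
    published -- all versions available in the registry
    """
    if locked is not None and satisfies(locked, declared):
        return locked  # the lock file wins, even if newer versions exist
    # No usable lock entry: fall back to the newest version within the range.
    matching = [v for v in published if satisfies(v, declared)]
    return max(matching, key=parse_version)


published = ["4.15.0", "4.16.0", "4.17.21", "5.0.0"]  # hypothetical registry contents
print(choose_version("^4.15.0", "4.15.0", published))  # 4.15.0 (locked version kept)
print(choose_version("^4.15.0", None, published))      # 4.17.21 (no lock file)
print(choose_version("^4.17.0", "4.15.0", published))  # 4.17.21 (lock no longer matches)
```

The third call is the exception mentioned above: the locked version falls outside the declared range, so the lock entry is ignored.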

Why are lock files great?

They keep track of the exact versions of all our dependencies. And by “all” I mean really all, so not only the ones we specify directly.

This way they give us reproducible builds. When everything works well on your workstation, you want it to be built and deployed through the CI pipeline with exactly the same dependencies.

Why are lock files not enough?

Unfortunately, lock files do not solve everything. They are long, not easily readable, and auto-generated. That means developers rarely pay attention to them and just commit any generated changes.

There are a few reasons to still pin exact versions of the direct dependencies we declare.

Visibility. If I want to know the version of a library in the project, the first thing I do is open the dependencies file. If the version there does not contain a range operator such as ^, ~, or >, my search is over. Otherwise, I need to find the right command in the dependency manager I use to get the version actually installed. Finding it manually in the lock file is neither quick nor effortless.

Upgrading. Every once in a while somebody will decide to upgrade all dependencies. The dependency manager will then update all libraries to the latest possible versions, still respecting the version ranges we declared for top-level dependencies. I would argue that upgrading transitive dependencies has a lower chance of breaking our application than upgrading the top-level (direct) ones. They are often smaller, more widely used, and better tested. The top-level dependencies (and changes in them), on the other hand, tend to have the biggest impact on our code. For this reason, I prefer to upgrade direct dependencies knowingly, by manually changing the pinned version of each one in the “regular” dependencies file.

And, related to that, if things go south and the lock file is broken after a fatal merge or another cataclysm, you have a better chance of recovering by installing the top-level dependencies in the same versions you had before and taking the latest matching versions only for transitive dependencies.

Two rules for dependency management

Use a dependency manager that generates a lock file, and commit that lock file to the repository
Despite that, declare an exact dependency version (“pin it”) in the “regular” dependencies file
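
The second rule is easy to check automatically. Below is a minimal Python sketch that lists the dependencies in a package.json that are not pinned. The function name `find_unpinned` and the regular expression are simplifications made for this example: only plain MAJOR.MINOR.PATCH versions (optionally with a pre-release or build suffix) count as pinned, and everything else is flagged, including ranges, tags, and URLs.

```python
import json
import re

# Exact version: MAJOR.MINOR.PATCH with an optional pre-release/build suffix.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+([-+][0-9A-Za-z.+-]+)?$")


def find_unpinned(package_json_text):
    """Return {name: declared version} for every dependency that is not pinned."""
    manifest = json.loads(package_json_text)
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, declared in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(declared):
                unpinned[name] = declared
    return unpinned


# Hypothetical manifest used only to demonstrate the check.
example = """
{
  "dependencies": {"lodash": "^4.15.0", "webpack": "5.0.0"},
  "devDependencies": {"jest": "~29.7.0"}
}
"""
print(find_unpinned(example))  # {'lodash': '^4.15.0', 'jest': '~29.7.0'}
```

A check like this could run in CI and fail the build when the result is not empty, though whether that is worth it depends on the team.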

How to pin by default

Most dependency management tools, when you use a command to add a new library, save it with a version range. Here is how to change this behavior for a few popular managers.

Language	Tool	Lock file	Pin by default
JavaScript	npm	yes ✅	`npm config set save-exact true`
JavaScript	yarn	yes ✅	`yarn config set save-prefix ""` (also respects the npm config)
Python	pip	no ❌	dependencies usually added manually to the `requirements.txt` file¹
Python	Poetry	yes ✅	no option
Java	Maven	no ❌ (not needed)²	dependencies usually added manually to the `pom.xml` file
Java	Gradle	optional ✅	dependencies usually added manually to the `build.gradle` file
PHP	Composer	yes ✅	no option

¹ There is an approach of running pip freeze > requirements.txt, but it has some drawbacks.
² As per the discussion in the comments, with Maven you usually provide exact versions in libraries as well, and Maven resolves transitive dependency versions with dependency mediation.

This list could, of course, be a lot longer. JavaScript alone probably has a dozen different managers.

When not to pin

There is one important exception to everything I wrote above. When creating a library that is meant to be a dependency itself, you should not use exact dependency versions. Instead, provide the widest possible (but still safe) range for each dependency. Most often this means sticking to the same major version. This is where all those fancy range constraints come in handy.

This is because when your library and another library require the same dependency, but in different versions, the dependency manager has two choices:

install both, ensuring that each library has its own copy of the dependency in the version it wants, which increases the overall size of the application
or raise an error because of the conflicting versions and install nothing at all

Depending on the language and technology, different dependency managers will do one of those two things. That’s why it’s much better to accept a broader range of versions, hoping that some common ground will be found with the other libraries the application uses.
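
A small sketch shows why ranges help here. It models two libraries that each declare a requirement on the same shared dependency and looks for versions that satisfy both. The function `common_versions` and all version numbers are invented for illustration, and, as a simplification, only exact versions and caret ranges with a major version of 1 or higher are understood.

```python
def parse_version(text):
    """Turn "2.1.0" into (2, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, declared):
    """True if version matches an exact pin ("2.1.0") or a caret range ("^2.1.0")."""
    if not declared.startswith("^"):
        return version == declared
    minimum = parse_version(declared[1:])
    return minimum <= parse_version(version) < (minimum[0] + 1, 0, 0)


def common_versions(requirements, published):
    """Versions acceptable to every library's declared requirement."""
    return [v for v in published if all(satisfies(v, r) for r in requirements)]


published = ["2.0.0", "2.1.0", "2.3.0", "3.0.0"]  # hypothetical releases

# Both libraries pin exactly, to different versions: no common ground.
print(common_versions(["2.1.0", "2.3.0"], published))    # []

# Both libraries accept a range within the same major version.
print(common_versions(["^2.1.0", "^2.0.0"], published))  # ['2.1.0', '2.3.0']
```

With exact pins the dependency manager is left with the two choices listed above. With ranges it can pick a single version that works for both libraries.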

This, however, does not apply to Java, where you usually provide exact versions in libraries as well. Both Maven and Gradle have resolution strategies to handle this, although the two strategies differ.

Summary

I have used this approach for quite a few years now. I have seen how not using it can cause problems, so I think it makes sense.

Are you doing things differently and don’t think pinning exact versions is the way to go? At least use a dependency manager that creates a lock file. If you think pinning versions is not needed, I will be happy to hear why, so share a comment.
