Many scripts and executables accept command line arguments, which may be required or optional. There are flags, which are simple switches that change a command’s behavior. There are arguments that take values. And there are so-called positional arguments: parameters given in a particular order without any extra indication.
In this post, I analyze the anatomy of CLI arguments and point out how to read them in our own applications.
If you’re in a rush, skip to the end of the article for a summary of the CLI argument rules.
Types of arguments
Let’s take a look at the ls (“list”) command:
$ ls -l --block-size K ~/Documents
ls is the command name. It’s installed globally, so we can call it by name.
-l is a flag, an argument without a value. It makes ls print results in a long format, with every file on a separate line.
--block-size K is a parameter with a value. The “K” value changes the format of displayed file sizes from the default bytes to kilobytes. The value must come right after the parameter name. Providing this parameter without a value raises an error.
~/Documents is a positional argument. It tells ls which directory’s files should be listed.
In ls, all arguments are optional. It also accepts any number of positional arguments.
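To make the three kinds of arguments concrete, here is a minimal sketch using Python’s standard argparse module. The program name my-ls and the build_parser helper are hypothetical and only mirror the ls call above; this is not how ls itself is implemented.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "my-ls" is a hypothetical program name used only for illustration.
    parser = argparse.ArgumentParser(prog="my-ls")
    # Flag: either present or absent, carries no value.
    parser.add_argument("-l", action="store_true", dest="long_format")
    # Named argument: must be followed by a value.
    parser.add_argument("--block-size", default=None)
    # Positional arguments: zero or more paths, "." when none are given.
    parser.add_argument("paths", nargs="*", default=["."])
    return parser


# The list stands in for what the shell would pass after the command name
# (no "~" expansion happens here, the string is taken literally).
args = build_parser().parse_args(["-l", "--block-size", "K", "~/Documents"])
print(args.long_format, args.block_size, args.paths)
```

All three arguments are optional here, just like in ls: calling the parser with an empty list falls back to the defaults.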
Why does the format of CLI arguments matter?
Whoever calls the executable is your user. Designing for users, even for a small script that takes only a few arguments, should put UX first. And there is one important rule to apply here: predictability and familiarity.
It means that if our executable expects arguments, it should parse them in a standard way. Users should have to remember as little as possible in order to use it. The way to achieve this is to make usage as standard and as obvious as possible.
Basically, we would like to avoid a situation like this:
$ ./my-ls -l ~/Documents
Error: provide path first
$ ./my-ls ~/Documents -l
Error: "-l" argument not recognized
$ ./my-ls --long
Error: "--long" argument not recognized
$ ./my-ls ~/Documents --l
-rw-r--r-- file1
-rw-r--r-- file2
-rw-r--r-- file3
Flags and named arguments
If an argument name is a single letter, it’s standard to prefix it with a single dash. For longer names, we use two dashes.
Flags and named arguments are not distinguished by the length of the name. Both can have a short, single-letter name or a longer one. For example, --help is a flag, while -O (short for --output-document) is a named argument that requires a value.
A single-dash prefix often allows concatenating multiple single-letter flags. For example, the next two calls are equivalent. They both list all the files (including hidden ones) in the current directory in a long format.
$ ls -a -l
$ ls -al
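As a quick check of how a parsing library treats concatenated flags, here is a small sketch with Python’s argparse. The my-ls parser and the attribute names show_all and long_format are made up for this example.

```python
import argparse

# Hypothetical parser with two single-letter flags, mirroring ls -a and ls -l.
parser = argparse.ArgumentParser(prog="my-ls")
parser.add_argument("-a", action="store_true", dest="show_all")
parser.add_argument("-l", action="store_true", dest="long_format")

separate = parser.parse_args(["-a", "-l"])
combined = parser.parse_args(["-al"])

# Both spellings produce the same parsed result.
print(separate == combined)  # True
```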
Also, some commonly used flags and arguments may have two names: a short, single-letter one for speed of use by experienced users, and a longer, more verbose one for better readability. An example of such a flag in ls is --recursive (short form -R), which makes ls also print the content of subdirectories, recursively.
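In a parsing library, a short and a long name are usually declared together as one argument. A minimal sketch with Python’s argparse, again using a hypothetical my-ls parser and borrowing the -R and --recursive spellings from ls:

```python
import argparse

parser = argparse.ArgumentParser(prog="my-ls")  # hypothetical program name
# Both spellings set the same attribute; argparse derives its name,
# "recursive", from the long option.
parser.add_argument("-R", "--recursive", action="store_true")

short_form = parser.parse_args(["-R"])
long_form = parser.parse_args(["--recursive"])
print(short_form.recursive, long_form.recursive)  # True True
```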
Command arguments order
As a rule, a command should not expect flags and value arguments in any particular order. This makes sense: we shouldn’t expect the user to remember the order on top of the kinds of arguments the executable accepts. These two calls are both correct and equivalent:
$ ls -l --block-size K
$ ls --block-size K -l
Things look different with positional arguments. First of all, “positional” means that the position of those arguments matters. How serious the consequences of their order are depends on the command. In the ls command, if we provide multiple paths, their content is listed in the same order as we gave them. In the cp (“copy”) command the consequences are greater, as the first argument is the source file and the second is the target where the copy is put.
$ cp source_file target_file
We see that the order of flags and value arguments has no meaning, while the order of positional arguments is important. But can we mix these two groups with each other?
In short: no. In most cases, everything after the first positional argument is treated as a positional argument as well.
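This behavior can be observed with Python’s standard getopt module, which offers both a strict scanning mode and a more permissive GNU-style one. The sketch below is only an illustration of the two modes, with a made-up argument list; it says nothing about how any particular command is implemented.

```python
import getopt

# Made-up argument list: a flag, a positional argument, then another flag.
argv = ["-l", "Documents", "-a"]

# getopt.getopt stops scanning at the first non-option argument,
# so "-a" comes back as a positional argument, not as a flag.
strict_opts, strict_rest = getopt.getopt(argv, "la")
print(strict_opts, strict_rest)

# getopt.gnu_getopt lets options and positional arguments be intermixed
# (unless the POSIXLY_CORRECT environment variable is set).
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, "la")
print(gnu_opts, gnu_rest)
```

Since a user cannot know which mode a given executable uses, putting flags and value arguments before the positional ones is the form that works in both.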
What about the Git commands that we use every day? Doesn’t the git commit command break the general rules I mentioned above?
$ git commit -m 'My message'
Even if the word commit may look at first like a positional argument, after which the -m named argument should not appear, that is not the case here. commit is a subcommand.
Commands with subcommands can have two sets of arguments. First, after the main command, come its own arguments. Then comes the subcommand, followed by the arguments specific to that subcommand.
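The two sets of arguments can be sketched with argparse subparsers. Everything named here is hypothetical: the my-git program, its --verbose flag on the main command, and the --message long name are only stand-ins to show the structure.

```python
import argparse

parser = argparse.ArgumentParser(prog="my-git")  # hypothetical program name
# Argument that belongs to the main command.
parser.add_argument("--verbose", action="store_true")

# The chosen subcommand name is stored in args.command.
subparsers = parser.add_subparsers(dest="command", required=True)
commit = subparsers.add_parser("commit")
# Argument that belongs only to the "commit" subcommand.
commit.add_argument("-m", "--message", required=True)

args = parser.parse_args(["--verbose", "commit", "-m", "My message"])
print(args.verbose, args.command, args.message)
```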
Differences in argument parsing
Depending on the implementation used to parse command line arguments, the details and flexibility in accepting them may differ.
The short-flag concatenation mentioned earlier may not work, forcing us to provide all flags separately.
Sometimes long-style named arguments may accept, or even require, a value passed with an equals sign instead of a space, like --block-size=K. On the other hand, single-letter named arguments may accept or require a value passed without any space after the name.
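To see how one implementation handles these spellings, here is a sketch with Python’s argparse, which accepts all of them. The my-ls parser and the -b short name are assumptions made for this example; they are not taken from ls.

```python
import argparse

parser = argparse.ArgumentParser(prog="my-ls")  # hypothetical program name
# "-b" is an invented short name used only to show the short-option forms.
parser.add_argument("-b", "--block-size")

spellings = [
    ["--block-size", "K"],  # long name, value after a space
    ["--block-size=K"],     # long name, value after an equals sign
    ["-b", "K"],            # short name, value after a space
    ["-bK"],                # short name, value attached directly
]
values = [parser.parse_args(argv).block_size for argv in spellings]
print(values)  # ['K', 'K', 'K', 'K']
```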
Usually, each named argument may be given only once, but there are cases where an executable accepts it multiple times. One example is the git tag command, which lists all existing tags. It accepts a --sort named argument with a key describing the field to sort by. Used several times, it allows sorting by multiple fields.
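A repeatable named argument can be sketched in argparse with the append action. The my-tag program name is hypothetical, and the sketch only collects the keys; how a program then prioritizes them is up to its own logic.

```python
import argparse

parser = argparse.ArgumentParser(prog="my-tag")  # hypothetical program name
# action="append" collects every occurrence into a list, in the order given.
parser.add_argument("--sort", action="append", default=[])

args = parser.parse_args(["--sort", "creatordate", "--sort", "refname"])
print(args.sort)  # ['creatordate', 'refname']
```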
Argument parsing libraries
Taking all those rules and possibilities into consideration, it’s not so easy to parse command line arguments in a flexible way that lets users provide parameters without any surprises. That’s why, even when creating a small, simple executable, it’s worth considering a library for the job. Such libraries handle most of the common cases and accept input in a standard way that users already know. They often also offer some flexibility, accepting parameters in more than one standard form.
There are multiple options available for basically every language, for example:
Summary of CLI arguments rules
When parsing command line arguments, it’s important to stay consistent with what the user is already used to. This makes using an executable a much better experience.
The key rules are:
- single letter argument – single dash
- longer argument name – two dashes
- flags and value arguments before the positional arguments
- any order of non-positional arguments
- if parsing, don’t reinvent the wheel, consider using an existing library for better flexibility