Shell Style Guide from Google

ronjouch · on June 4, 2016

Tangentially, http://www.shellcheck.net/ is an awesome static linter for shell code. Available online, as a command-line tool, but most importantly as a plugin for your favorite $EDITOR.

careersuicide · on June 4, 2016

Ah! Thank you so much for posting this!

Whenever I attempt to learn a new language I find that static analysis tools and linters, if available, help me get over any initial confusion about what constitutes good idiomatic code. Guides like OP help too, but nothing beats a good linter. The thing is though, I've been writing Bash shell scripts for going on 15 years now, and I still overlook little details [0] all the time.

I've been mulling over a theory in the back of my head for a while about the need for style guides and linters: if a language makes them a near necessity for reasons other than aesthetics, then the language is probably best avoided. I'm not sure I agree totally with that. However, I think the most frustrating experiences I've had while programming are the result of languages that allow multiple ways, in terms of syntax, of doing the same thing, which behave the same most of the time.

0: https://github.com/koalaman/shellcheck/wiki/SC2086
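
For anyone who hasn't run into SC2086, here is a minimal sketch of the kind of detail it flags. The `count_args` helper and the file name are made up for illustration; the point is only that an unquoted expansion gets word-split.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

file="my report.txt"   # a value containing a space

unquoted=$(count_args $file)     # word-split into "my" and "report.txt"
quoted=$(count_args "$file")     # passed through as a single argument

echo "unquoted: $unquoted, quoted: $quoted"
```

With the default IFS the unquoted call sees two arguments and the quoted call sees one, which is exactly the difference the linter warns about.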

ronjouch · on June 4, 2016

100% agreed!

When writing bash, there are so many hoops to jump through (e.g. syntactic traps, bashisms to avoid) that people like me, who use these languages only occasionally, will shoot themselves in the foot repeatedly.

For this population (and I think it's big, given the glue nature of *sh), a guide won't help much because occasional practice won't give its rules any time to sink in; interactive linters are a godsend.

the_common_man · on June 4, 2016

Good rules overall, we follow similar guidelines. Any comments on:

* readonly - describing these in library scripts is a bit dangerous since the readonly is not specific to the file. So, unless you have some namespacing pattern for globals, this will bite you
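
One possible namespacing pattern, as a rough sketch only: prefix every global with the library's name and guard against the file being sourced twice (re-declaring a readonly variable is an error). The `mylib` name, the `/etc/mylib` path and the function are all hypothetical.

```shell
# mylib.sh (hypothetical library file, meant to be sourced, not executed)

# Guard: only declare the readonly globals the first time the file is sourced.
if [ -z "${MYLIB_LOADED:-}" ]; then
  readonly MYLIB_LOADED=1
  readonly MYLIB_CONFIG_DIR="/etc/mylib"   # prefixed so it cannot clash with a caller's CONFIG_DIR
fi

# Functions carry the same prefix as the globals they use.
mylib_config_path() {
  echo "${MYLIB_CONFIG_DIR}/$1"
}
```

A caller would then `source mylib.sh` and use `mylib_config_path app.conf`, without having to worry that its own `CONFIG_DIR` was frozen by the library.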

* I didn't see any recommendation, but always start with set -eux -o pipefail. set -e has its own set of pitfalls, but it's best to learn what the pitfalls are and work with it :-) It helps in the long run.

* When using set -u, use the FOO=${FOO:-} syntax to initialize defaults for all your env vars.

codys · on June 4, 2016

On point 2: indeed, given that they are restricting themselves to bash, I would have expected a note of `set -eu -o pipefail`, at least. `-x` (tracing), isn't appropriate for many scripts, though.

On point 3: `: ${FOO:=some default value}` is the typical pattern for ensuring a default value is set.
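
A quick sketch of how these expansions behave under `set -u`, in case the difference between `:-` and `:=` isn't obvious. The variable names and default values are placeholders.

```shell
#!/usr/bin/env bash
set -u

unset FOO BAR

# ${VAR:-default} expands to the default but does NOT assign it.
echo "FOO expands to '${FOO:-fallback}'"

# ${VAR:=default} assigns the default; ':' is a no-op that only triggers the expansion.
: "${BAR:=some default value}"
echo "BAR is '$BAR'"

# Explicit self-assignment: FOO is now set (to the empty string), so later
# references to "$FOO" no longer trip set -u.
FOO="${FOO:-}"
echo "FOO is now '$FOO'"
```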

mbakke · on June 4, 2016

I tend to avoid `set -e`. To borrow the Python adage: "explicit is better than implicit". In shell it translates to "if you need error handling, add it where necessary".

See http://mywiki.wooledge.org/BashFAQ/105 for some (most) of the pitfalls.
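
As a rough illustration of what "add it where necessary" can look like, here is a sketch with a made-up `backup_file` helper that checks the one command whose failure matters and reports it itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a file and report failure explicitly,
# instead of relying on set -e to abort the script.
backup_file() {
  local src=$1 dest=$2
  if ! cp -- "$src" "$dest" 2>/dev/null; then
    echo "backup_file: could not copy '$src' to '$dest'" >&2
    return 1
  fi
}
```

The caller then decides what a failure means, e.g. `backup_file "$conf" "$conf.bak" || exit 1`.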

d0mine · on June 4, 2016

"Errors should never pass silently." is more relevant for `set -e`

mbakke · on June 4, 2016

Except they do even with `set -e`. Especially if you forget to/can't set `pipefail` (which is bash-specific).

Or consider the last example from the wiki link above:

  set -e
  f() { local var=$(somecommand that fails); }
  f    # will not exit

lisivka · on June 5, 2016

Yep, it is inconsistent:

    $ ( set -e; foo() { f=$(false); }; foo ; ); echo $?
    1
    $ ( set -e; foo() { local f=$(false); }; foo ; ); echo $?
    0
    $ ( set -e; foo() { local f; f=$(false); }; foo ; ); echo $?
    1

However, it is still better to use -e to catch errors early.

stefco · on June 5, 2016

Except `set -e` (along with pipefail) is exactly what prevents you from implicitly catching failures, forcing you to be explicit (rather than implicit) with your error-handling, which is closer to how Python itself deals with unhandled exceptions. It has its limitations and caveats, and you can accomplish the same goals somewhat differently, but it is inaccurate to say that using `set -e` is tantamount to doing things implicitly.
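
For what pipefail changes in practice, a small sketch (each case runs in its own `bash -c` so the option does not leak into the surrounding shell):

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline's exit status is that of its LAST command.
without=$(bash -c 'false | true; echo $?')

# With pipefail, a failure anywhere in the pipeline is reported.
with=$(bash -c 'set -o pipefail; false | true; echo $?')

echo "without pipefail: $without, with pipefail: $with"
```

Without the option the failing `false` is invisible to `set -e`; with it, the pipeline as a whole counts as failed.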

the_common_man · on June 5, 2016

Ah yes, I put `-x` by mistake there!

nickysielicki · on June 4, 2016

    > Bash is the only shell scripting language permitted for
    > executables.

    > [...]

    > The only exception to this is where you're forced to by whatever
    > you're coding for. One example of this is Solaris SVR4 packages
    > which require plain Bourne shell for any scripts. 

    > [...]

    > When to use Shell
    > 
    > * If you're mostly calling other utilities and are doing
    > relatively little data manipulation, shell is an acceptable choice
    > for the task.
    > 
    > * If performance matters, use something other than shell.
    >
    > * If you find you need to use arrays for anything more than
    > assignment of ${PIPESTATUS}, you should use Python.
    > 
    > * If you are writing a script that is more than 100 lines long,
    > you should probably be writing it in Python instead.  Bear in mind
    > that scripts grow.
    > 
    > * Rewrite your script in another language early to avoid a
    > time-consuming rewrite at a later date.

Limiting the domain of shell scripts is the best advice in here. Shell scripts are terrible to maintain, and you should use perl or python if it's complicated.

But limiting the domain of shell scripts also probably means that you're not going to be using any advanced features of bash, so you should write it in sh instead. It's undoubtedly more portable, and it also serves to enforce the idea that you shouldn't write complicated shell scripts. I'll take a leap and say it outright, if you're using a feature of bash that isn't in bourne shell, you shouldn't be writing it in either of them.

The internet is littered with bash scripts that are compatible with bourne shell, but the shebang says '#!/bin/bash'. Well some systems don't have bash! But everyone has bourne shell.

I'm disappointed by their choice of bash and how this guide might reinforce that behaviour, but other than that, the actual style guidelines are great.

[Edit: rewording]

Zardoz84 · on June 4, 2016

    > * If you are writing a script that is more than 100 lines long,
    > you should probably be writing it in Python instead.  Bear in mind
    > that scripts grow.
    > 
    > * Rewrite your script in another language early to avoid a
    > time-consuming rewrite at a later date.

Hehe. I did that in D with a script we use to update some instances of our application on development machines. Our old bash scripts were getting bigger and could no longer handle a few cases.

nickysielicki · on June 4, 2016

What made you choose D?

ams6110 · on June 4, 2016

I've never learned perl in depth but I've never seen a perl script that didn't seem either a) overly complicated or b) terse to the point of being opaque.

Shell scripts, for all their warts, tend to be fairly readable and easy to follow unless deliberately obfuscated. I would include Python here to a point as well, though Python is certainly friendly to architecture astronauts also.

mbakke · on June 4, 2016

> Executables must start with #!/bin/bash

I've started using #!/usr/bin/env bash for compatibility with non-FHS distros such as NixOS.

Happy to see arrays omitted; if you find yourself in need of bash arrays you should definitely consider a proper language.

I tend to stick to POSIX shell to the extent possible, but readonly variables and [[ ]] syntactic sugar makes sense when maintainability is more important than portability.

jxy · on June 5, 2016

Those are exactly the two things I don't agree with.

> Executables must start with #!/bin/bash

Either they never use *BSD, or they put their bash there. Unbelievable choice.

> If you find you need to use arrays for anything more than assignment of ${PIPESTATUS}, you should use Python.

I use arrays all the time and never used PIPESTATUS. So what?
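
For context, the `PIPESTATUS` use the guide has in mind looks roughly like this sketch (the pipeline itself is contrived):

```shell
#!/usr/bin/env bash
# Run a pipeline, then copy PIPESTATUS immediately:
# the very next command overwrites it.
(exit 2) | false | true
statuses=("${PIPESTATUS[@]}")

echo "exit codes: ${statuses[*]}"
```

Each element holds the exit code of the corresponding pipeline stage, so a failure in the first stage can be detected even though the pipeline's overall status is 0.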

mbakke · on June 5, 2016

Frankly, I find hard-coding #!/bin/bash sensible in a large corporate setting such as Google. It prevents PATH manipulation and they obviously have a uniform and controlled environment.

Bash arrays can be convenient, but at that level of complexity Python or Ruby are much easier to read and understand (and necessarily maintain).

asuffield · on June 5, 2016

(Tedious disclaimer: my opinion only, not speaking for anybody else. I'm an SRE at Google.)

We run everything on prodimage, so operating system compatibility isn't really an issue.

Realistically, any time somebody sends me a code review with a shell script in it, my first comment is going to be "Why is this a shell script? Rewrite it in a supported language."

riskrisk · on June 5, 2016

> .. they never use *BSD ..

pretty much

mcsaucy · on June 5, 2016

Re: arrays -- I generally agree, but they are the best solution to certain otherwise simple problems, like storing command arguments. `cmd $FLAGS` is objectively more dangerous than `cmd "${FLAGS[@]}"` due to the globbing and word-splitting issues the former exposes. In my experience, devs are more likely to opt for the former when the latter is forbidden than to switch languages (because that's work).
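
A minimal sketch of the difference, using a made-up `count_args` function as a stand-in for `cmd` and a made-up flag value:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a real command: prints how many arguments it got.
count_args() { echo "$#"; }

# String form: the space inside the message is lost to word splitting.
FLAGS_STR='--message hello world'
string_count=$(count_args $FLAGS_STR)

# Array form: each element stays exactly one argument.
FLAGS=(--message 'hello world')
array_count=$(count_args "${FLAGS[@]}")

echo "string: $string_count args, array: $array_count args"
```

The string form hands the command three arguments, the array form two, and the string form would additionally glob-expand any `*` in the value.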

giovannibajo1 · on June 4, 2016

Any reason why they don't mandate something like "set -euo pipefail" at the beginning? I find it invaluable for writing and debugging scripts, and in general for avoiding weird (or dangerous) errors.

jdgiese · on June 5, 2016

Love this! If you think you are a BASH pro, check out your knowledge with some challenge problems:

http://innolitics.com/10x/advanced-bash-exercises/

Also has some other useful BASH links and comments. Here is a quote:

> BASH is the most widely-used and widely-supported shell for Linux. There are other shells that are better than BASH in various ways, but we feel that none of these other shells are better enough to warrant replacing BASH as the de-facto standard when writing shell scripts. BASH is installed by default on almost all Unix-based operating systems, and the majority of the world’s shell scripts are written in BASH. For this reason, we suggest that all of our developers learn BASH.

> BASH scripts are a domain-specific programming language that is well-suited to managing processes and files. That being said, the large number of special characters appropriated for process management, its text expansions, and its unusual syntax make BASH poorly-suited for general purpose programming. Accordingly, we think that BASH should only be used for scripts that are predominantly concerned with processes and files.

pwd_mkdb · on June 4, 2016

i try to write my scripts in such a way that if all newlines were lost, the script would still run. semicolons even where optional.

requiring the use of bash for non-interactive use? good grief.

is it possible this company has a linux bias?

the usefulness of a minimal scripting shell cannot be denied. even with linux distribs that use bash, we almost always see busybox in use. busybox is more like the almquist sh than bash.

with sh, bash-only features may not work.

for example, shellshock did not work with almquist sh.

why are bash scripts not given a .bash file extension to distinguish them from sh scripts (.sh extension)?

.ksh extension is often used for korn shell scripts.

chubot · on June 5, 2016

Given that every server and most desktops at Google are Linux, and that it created at least two operating systems based on Linux (Android and ChromeOS) -- yes, I think it would be safe to say there's a Linux bias :)

coldtea · on June 5, 2016

>is it possible this company has a linux bias?

Why would they have any other bias? Is there any other viable alternative on the server side they should care about? Are Google realistically going to switch to it? It's not like 1990, where you coded so that your script worked with 5+ UNIX vendors.

>with sh, bash-only features may not work.

Sounds like a self-inflicted problem. Just arrange so bash handles your scripts (which they do).

>.ksh extension is often used for korn shell scripts.

A trip down memory lane...

sytse · on June 5, 2016

Inspired by this we made a guide to shell out from ruby in a safe way: https://github.com/gitlabhq/gitlabhq/blob/master/doc/develop... (on mobile and only seeing the GitHub link in Google)

giantninja · on June 4, 2016

Are there any good example scripts that show actual examples of most of these cases? I think that would be a great supplement to this guide.

ronjouch · on June 4, 2016

Yes but they're hidden behind the little "Play" button at the left of each point. Clicking it reveals further explanation and, often, examples.

Not very discoverable; even though this feature is documented at the top right of the page, I skipped it too and only found it by accidentally clicking one of the buttons while obsessively-compulsively selecting text I was reading :D

Razengan · on June 4, 2016

The "disclosure triangle", I believe. :)

pyed · on June 4, 2016

is https://github.com/mvdan/sh relevant ?

LukeShu · on June 4, 2016

    > SUID and SGID are forbidden on shell scripts.

Kinda funny, the Linux kernel just ignores SUID/SGID on script files.

Retr0spectrum · on June 4, 2016

    > While bash does make it difficult to run SUID, it's still possible on
    > some platforms which is why we're being explicit about banning it.

superuser2 · on June 5, 2016

Why can't we have setuid on shell scripts which are non-writeable?

I have often wished for a facility to allow unprivileged users to execute specific predefined tasks, and writing C programs for such things would be a huge pain.

ekimekim · on June 5, 2016

The issue is that there's an unavoidable race condition which renders SUID for interpreted (ie. #!) executables insecure.

Suppose there's some SUID bash script owned by root in the system that starts with "#!/bin/bash".

I (an unprivileged user) create a symlink "./foo" to "/path/to/SUID_script". I execute "./foo".

The kernel follows foo, reads the SUID script's #!, and so runs "/bin/bash ./foo". Honoring the SUID, bash is run as root.

Here's the race. In between the kernel doing this and bash finishing initialisation and reading its script arg, I swap out the "foo" symlink to instead point at "./my_evil_code".

So bash reads "./foo" and executes the content as root. I now have arbitrary code execution as root.

Are there ways this could be fixed? Possibly. But not easily.

It's not enough to just say "SUID can't work through a symlink" - I could've symlinked to the SUID script's parent directory, for example.

You can't say "the kernel should pass the fully resolved path to the interpreter", this will break a million scripts that rely on changing behaviour based on $0.

I share your pain about wanting an easy, non-compiled means of creating a small SUID program. But I don't see any good way unless something changes. Maybe a util that takes a #! interpreted file, adds an ELF header and some machine code that execs the desired interpreter and feeds it static script content?

zem · on June 5, 2016

> I share your pain about wanting an easy, non-compiled means of creating a small SUID program. But I don't see any good way unless something changes.

i think the way to go is to just to have a compiled language that makes it easy to write bash-like programs. julia has a pretty nice-looking subprocess module, for instance: http://docs.julialang.org/en/release-0.4/manual/running-exte...

Tiksi · on June 5, 2016

Isn't this exactly what sudo / sudoers is for? You can specify a single command or even restrict down to which arguments an unprivileged user can pass to that one command.

superuser2 · on June 5, 2016

I suppose so, but I'd much rather work with a directory of files than a dense config file inside visudo.

Tiksi · on June 5, 2016

You could do something like

  unprivuser  ALL=NOPASSWD:/usr/bin/bash /path/to/directory/with/files/*

It'd be a mild annoyance to have to explicitly run them with bash:

  sudo bash /path/to/directory/with/files/thingtorun

but you wouldn't have to mess with the config past that.

Godel_unicode · on June 5, 2016

Pro-tip: if you're going to go this route, stick these lines in a file in sudoers.d as you're likely to end up with a lot of them, this will help make your config more readable.

treve · on June 4, 2016

Makes even more sense to forbid them then.

hobarrera · on June 4, 2016

> Bash is the only shell scripting language permitted for executables.

Why? That seems completely arbitrary.

I already lost interest in following this in the first line.

> Executables should have no extension (strongly preferred) or a .sh extension. Libraries must have a .sh extension and should not be executable.

So, not only are they bash-only, but you add an extension that makes one assume the contrary. UGH!

havetocharge · on June 4, 2016

You have much to learn. You should be reading more of this stuff, not less.

ams6110 · on June 4, 2016

The answer to your question is obvious if you've ever worked in a large development organization.

hobarrera · on June 7, 2016

You could have bothered sharing it, if it's so obvious.

SoapSeller · on June 4, 2016

I was expecting a style guide on how to design shell user interactions.

Nonetheless, a shell coding style guide is interesting.

xufi · on June 4, 2016

That's a good point. I've been trying to find a good guide on making a nice MOTD (message-of-the-day script) and some ASCII art, since I've been playing around with my environment more.

paule89 · on June 4, 2016

Google seems to prefer spaces over tabs

ams6110 · on June 4, 2016

I prefer spaces. Tabs can render differently in different editors, making alignment harder.

Always irks me when I come across some config file format that only accepts tab delimiters.

ekimekim · on June 5, 2016

Which is why I take (well, prefer) a third option:

Tabs for indentation. Spaces for alignment.

The idea is you should be able to set any tabstop you like and the code will still align correctly. For example:

    ^Ia_really_long_method(a_very_long_argument,
    ^I                     the_second_argument)

sorenjan · on June 5, 2016

This is the most logical option. Tabs to add an indentation level; the width of that is unimportant and can be configured by each user. Spaces to align, since they're like any other character in a monospaced font, thereby perfectly matching the line above.

e40 · on June 5, 2016

And 2 spaces is way too little. 4 is the sweet spot.

Reason: with 2 the visual lines at different indent levels are just too close, in many fonts/sizes.

tomlu · on June 5, 2016

It's 2 spaces for everything here, even python. I don't like it and don't think I'll ever get used to it, like you say 2 is too little.

rossjudson · on June 5, 2016

Wrong. 3 spaces is the sweet spot. What is it with you people and powers of two? ;)

lisivka · on June 5, 2016

It is easy to catch an indentation error with two spaces, while it is harder to make an indentation error with just two spaces. It is also easier to type two spaces than three or four, which matters when a program is typed without the help of an auto-indentation tool. Two spaces are enough for a fixed-width font, typical in a terminal, and they also save screen space, which is just 80 characters wide.

rifung · on June 4, 2016

Golang uses tabs and not spaces though.. I actually thought everyone used spaces instead of tabs till I had to use it.

riskrisk · on June 5, 2016

golang also had/has a style guide and a gofmt tool that would indent/align things the same way from the beginning.

gman83 · on June 4, 2016

  Indentation

    Indent 2 spaces. No tabs.
    Use blank lines between blocks to improve readability.
    Indentation is two spaces. 
    Whatever you do, don't use tabs. 
    For existing files, stay faithful to the existing indentation.

Richard Hendricks is not going to like that!

andrepd · on June 4, 2016

I don't wish to dig up a holy war, but why would you want to use spaces instead of tabs? That's what they're for, isn't it? One tab = one indentation level. No need for messing around with spaces, which serve other purposes.

mwfunk · on June 4, 2016

I would be happy with 100% tabs or 100% spaces, but in practice it seems like allowing tabs inevitably leads to source files indented with a combination of both, either by accident or because someone wants to have intermediate levels of indentation (for multiline conditionals for example).

Once you have mixed tabs and spaces, everything goes to hell because now that code's formatting is tied to a specific person's idea of how many spaces should a tab display as, or whether tabs refer to tab stops or just a fixed number of spaces.

I never had strong feelings about it until encountering some really pathological code bases, which had gotten to the point where there was literally no valid tab->spaces setting that would make everything look correct.

Ultimately, though, these things really just don't matter. I use 4 spaces per indent level, but it's not like I'm incapable of reading and writing code that uses tabs, or 2 spaces, or 8 spaces. Really the only thing that matters is that there is a standard and that it is consistently applied within a project.

etwigg · on June 5, 2016

https://github.com/diffplug/spotless

_phaq · on June 4, 2016

The usual argument is that a space is always the same width, whereas a tab can have varying widths depending on your editor.

So, if you indent something with 4 spaces, then other people viewing your code will also always see 4 spaces, therefore ensuring that whatever you found to be readable indentation, does actually come out like that on the other end.

Retr0spectrum · on June 4, 2016

But why does that matter? The reader can adjust their tab-width setting to whatever they prefer.

roblabla · on June 4, 2016

My personal reason : Because it makes aligning stuff harder than it should be. I often have conditions that span multiple lines, and if they aren't aligned, it makes everything harder to read. By using spaces, I make sure the alignment is good for everybody.

I'm sure there are other reasons too.

Retr0spectrum · on June 4, 2016

See the third example on this page: https://blog.codinghorror.com/death-to-the-space-infidels/

scrollaway · on June 4, 2016

Some people absolutely want indentation to be one specific size for everybody and don't fully understand that what might look readable for them at 2 space-widths, would be more readable for some others at 4 space-widths.

Tabs are accessibility. People use spaces for the same misguided reasons they use px font sizes in css.

(Rant over. Yeah, this annoys me.)

havetocharge · on June 4, 2016

Tabs get clobbered (turned into spaces) when you copy-paste them, especially between different environments. Spaces always get preserved.

lisivka · on June 5, 2016

ASCII defines tab as a shift to the next tab stop, every 8 characters. Terminals follow the ASCII standard, so a tab is too wide to use for indentation with a fixed-width font at 80 characters.

darekdk · on June 4, 2016

XML. Woah.

codemac · on June 4, 2016

Hard tabs for life.

Frankly, wish they had chosen a better shell like es or rc for all to be written in.