Reduced-Error Spreadsheets for Engineering

Spreadsheets can be designed in a way that make them fairly resilient to errors, but it’s much easier to design them to be terrible and dangerously failure prone.

Over the years I’ve developed some rules that make them less user-error-prone, easier to understand, and simpler to use.

I have not made a single units error in the last 5 years, and you can join me!

Here’s an example spreadsheet that can be used as a template.

Color Coding

It is immensely simpler to navigate and debug a sheet when all the cells are color coded. I use the following scheme. The extra setup effort is more than worth the time.

Commenting

Just like code, spreadsheets should be well commented. However using Google Sheets “Comments” for this purpose is BAD! Comments are intended to be mini conversation threads between people that are ultimately resolved. Instead use notes (right click cell, Insert note) which are not intended to be resolved, and do not send emails to the spreadsheet owner whenever they’re added.

  • Add a note to the name of a value when you are clarifying the purpose of that value (e.g. “this value represents the wall thickness of the fuel tank, not including coatings”)

  • Add a note to a value itself when you are commenting on something about the value itself (e.g. “4 mm was selected as a compromise between strength and mass”)

Linking

Additionally, it’s extremely helpful to link to any source material that guided the selection of a value. Especially powerful is to link to Slack threads.

No Magic Numbers

Calculations should only do math with values from other cells or constants that are a structural part of the formula (e.g. pi, factors of 2, etc). For example, if a calculation uses Avogadro’s number, that number should not be typed into the calculation, but rather live in a cell that is referred to. If the number is just typed into a calculation, then it is often hard to tell what it’s for because it has no label.

Structure

Divide the sheet up into tabs to organize the work.

  • A main tab with major inputs / outputs

  • A tab with lookup tables

  • An error checking tab

  • Multiple tabs for calculations, as needed

Units

Here are some rules regarding units that will make errors less likely:

  • Only ever use SI units in calculations, without exponent prefixes (e.g. joules, meters, kilograms are fine — nanometers, inches, acres, etc are not).

  • It’s fine to use other units for for user data input, but rather than using those units directly in calculations, add a cell to the right that first does the unit conversion and then use that value in calculations.

  • Use unit conversion functions whenever possible instead of manually inputing factors (e.g. CONVERT(C43, “nm”,”m” or RADIANS(C44)). Even for super easy conversions like grams to kg, use the convert function to communicate to your later self what you were doing.

  • When using scientific notation, use formatting like “10^6” rather than 1E6. The 1E6 notation gets expanded to 1000000, and it’s very hard to glance at that and count the number of zeros.

  • Never store percentages as numbers between 0 and 100 (that then require division by 100 to multiply them by other numbers). Rather use the percentage number immediately followed by the percent sign. 5%, for example, is treated just like 0.05 by all calculations. It’s totally fine to use percentages as long as they’re written this way, and no special care is needed to distinguish them from fractions.

A unit conversion column. Only the values from that column will be referenced by calculations, meaning that no in-calculation unit conversion will ever be needed.

Error Checking

When designing a calculator, keep any eye out for mistakes that the user might make, inconsistencies that are possible in inputs, or possible breakdowns of functions when inputs exceed a range. Use if statements to check values against limits.

Then include an error aggregation section where all errors from all over the sheet are combined together into a single string that can be viewed easily from the main page. The example spreadsheet includes this.

Local error check near calculation

Error aggregation section

Error display on main tab

Lookup Tables

Lookup tables are extremely useful, and also very failure prone when modifying their structure. I’ve developed an approach that circumvents the typical issues and makes it easy to update them over time.

First, to understand the problem, let’s look at the structure of the VLOOKUP function.

The first input is the search key to use to get corresponding values from the lookup table. The second is the entire range of the lookup table. The third is the index of the column from which we want to retrieve a value. The fourth is an indication of whether the table is sorted (and should almost always be false).

The problem is in the hard-coded column index (the number 3 from above refers to the third column from the left). As written above, if the user adds a column to the table, then the index in every single VLOOKUP function that references that table will need to be manually updated. If any are missed, those functions will just return incorrect values without necessarily showing any error. This is pretty dangerous.

Instead of hard-coding the index, we can add column numbers above the lookup table and refer to their values. Now when we add a column, we just need to remember to update the column numbers to be sequential again. We don’t need to do anything to the VLOOKUP functions that refer to the table.

The one additional improvement I’ve made is to add an "END” column. This column is never used to store data, but IS included in the lookup table range. This allows us to add columns to the ‘end’ of the table by putting them before the END column (between Price and END in this case). This prevents the need for any updates to the range used in VLOOKUP functions, even when columns are added.

Default Values

One potential source of spreadsheet error is a user modifying an input to test something out, and then forgetting to set it back. I get around this by making a second column of default values next to my inputs column. I then use conditional formatting to highlight in red any values that deviate from the default. This makes it easy to spot any inputs that have been modified from a baseline.

It’s somewhat non-obvious how to do this in the conditional formatting menu, so here are the instructions:

  1. Select the range of cells you wish to apply the formatting to (in this case the default values)

  2. Format : Conditional Formatting

  3. Set the “Format cells if…” pulldown to “Custom formula is”

  4. Set the custom formula to “=G13<>C13” (excluding the quotes) where G13 in this case refers to the first cell to be conditionally formatted (the first default value) and C13 refers to the value to compare it to (the corresponding input value). Note that you DO NOT specify this custom formula using ranges, but rather just the first item in each range.

Off-Tab References

Do not reference a value from a different tab in a calculation. Rather, set a cell in your current tab equal to the value from the different tab you wish to reference and set its color to the off-tab reference color. Then in calculations, refer to this local cell.

This makes it very easy to identify external dependencies, and it makes calculators portable. If at some point you want to reuse that calculator in a different spreadsheet, you can just copy the tab over and reconnect all of the external reference cells (rather than needing to dig through the contents of functions).

Off-tab references. The calculation cells (in gray) refer only to these cells and not to any cells on other tabs.

Previous
Previous

Fog Machines are Dumb

Next
Next

Mute the MacBook Microphone with the Dictation Key