Want to write clean code? Well here's your chance. The following set of examples are practices that I've come across during my times as a programmer and that I believe in. At least at the moment. Please feel free to add comments, criticise my thinking and contribute to everyone's knowledge. I'm trying to find a better platform for these, but for now a gist should do. I'm doing this solely to document, analyse and share my own thinking, which oftentimes is quite unstructured. These days I'm not too fussed about the syntactical structures of the code, but would rather concentrate on painting the largest guidelines and deciding the direction I want my programming style to move towards.
-
-
Save rikukissa/0dd8c7df3681c139c0f92cee3a0d5466 to your computer and use it in GitHub Desktop.
Know what you're returning and be explicit about it
If your intention is not to return any value from this file, it's important to show it explicitly. With early return, it's always tempting to place the return statement on the same row with the last operation. Don't do this without thinking, because unexpected return values do no good for the person using your function. It harms readability and makes code inconsistent.
Instead of this:
function upload(file) {
if (file.size >= 9999999) {
return;
}
if (file.name.length <= 5) {
return;
}
if (file.format === 'jpeg') {
return saveFile('jpegs/' + file.name);
}
return saveFile('others/' + file.name);
}
Do this:
function upload(file) {
if (file.size >= 9999999) {
return;
}
if (file.name.length <= 5) {
return;
}
if (file.format === 'jpeg') {
- return saveFile('jpegs/' + file.name);
+ saveFile('jpegs/' + file.name);
+ return
}
- return saveFile('others/' + file.name);
+ saveFile('others/' + file.name);
+ return
}
If your ternary condition is duplicated in either one of the branches, the whole ternary is useless
Bad code:
object.isHereSomething ? object.isHereSomething : 'default value';
Good code:
object.isHereSomething ||Β 'default value';
Explanation
The same condition is seen in 2 different places of this expression. This is your first clue, which often means that the expression can be simplified with boolean logic operators.
Simplified boolean logic
From this:
function exampleFunction(someConditional) {
if(someConditional === true /* 1. */) {
return true /* 2. */
} else {
return false
}
}
To this:
function exampleFunction(someConditional) {
return someConditional
}
If you find yourself with simplified code that looks as simple as this, please consider removing the function altogether. Indirection is always bad!
Explanation:
1. Never compare 2 booleans
function exampleFunction(someConditional) {
- if(someConditional === true /* 1. */) {
+ if(someConditional) {
return true /* 2. */
} else {
return false
}
}
2. If you're returning booleans from both branches of an if statement, it's usually not needed at all
There's rarely a need for returning true
from one side of an if statement and false
from the other.
In many cases the conditional statement can be used as the return value itself.
function exampleFunction(someConditional) {
+ return someConditional
- if(someConditional) {
- return true /* 2. */
- } else {
- return false
- }
}
On useless try-catches, being overly defensive, I/O boundaries and variable scope
I've started moving some of these to their own Gists
Only one type of return value per function
Even though this might feel like a minor thing, it certainly isn't. As with propagation of thrown exceptions and asynchronous operations, return values also cause an effect greater than what it initially looks like.
Bad
function findUsers(userId) {
const results = db.select('SELECT * FROM app_users WHERE id = ?', userId);
if(results.length === 1) {
return results[0]
}
return results
}
Good
function findUsers(userId) {
return db.select('SELECT * FROM app_users WHERE id = ?', userId);
}
Explanation
By returning values in 2 different formats, you force the caller to check in which type the received value is.
const userOrUsers = findUser(101)
if(Array.isArray(userOrUsers)) {
userOrUsers.forEach((user) => user.delete())
return
}
userOrUsers.delete()
Now as you can see, the caller has already taken 2 different routes for doing the same operation. Imagine if the proceeding operation would be even more complex, and would use calls to other functions with side effects. What a mess! Not cool at all, unlike this:
findUser(101).forEach((user) => user.delete())
The example above could also be written as
const userOrUsers = findUser(101)
[].concat(userOrUsers).forEach((user) => user.delete())
Which then again makes readability worse.
Don't return null / nil / None / void / undefined
Empty values are the magic sauce for making your developer console go all red.
They cause unpleasant surprises. They break up families. They are known to breed fruit flies in your rubbish.
In dynamic programming languages it's often very difficult to keep track of which functions can and which cannot return these buggers and in the worst case you end up complicating your code with null-checks that are not actually needed.
There are few cases however when returning null
is ~okay:
-
The value you would otherwise be returning is a singular value:
One user instance, a number, a string etc. For a function namedfindUser
it can be argued thatnull
is a valid return value when a user is not found. In the case offindUsers
that returns a list of found users the correct return value is an empty list. Now the consumer of your function only needs to write handling for a single type: a list. If the list is empty, the same logic still applies. -
You have a typed programming language where
null
is treated as a type, that forces you to explicitly define that a function returns either a value or null and makes check the value before being able to manipulate it (TypeScript, Swift...).- With languages like Java, returning null can lead into NullPointerExceptions on runtime because the compiler will not let you know about references to values that can potentially be null. With these kinds of languages it's better to use checked exceptions.
But mmmm, I'm using JavaScript / Python / Ruby ... ?
Cool, so am I. What I'd suggest is, that you use a functor data type instead. To summarise, consider what I mentioned before about returning an empty list when there are no values to return. What's so nifty about it? Well, I'd argue it's the fact that the actual value (or the non-value) is wrapped inside a container of sort that help you with manipulating the values inside of it.
Instead of using an array for wrapping a singular value, a more suitable option could be a Maybe type. Or an Optional type if you're using Java (or maybe not..). Sure there are plenty of alternatives for Python and Ruby as well.
Besides making your code more reliable it also helps everyone trying to understand what your code is doing: the reader can now glance over the code without having to dive deep into the actual function definitions. It's immensely tiring having to read through all code line by line just to figure out the possible outcomes of the function.
console.log(findUser('danieljharvey')) // { id: 2, city: 'London' }
You console.log
the return value of a function and it gives you a user object? Great, but how do I know it does it every time?
function findUser() {
if(Math.random() > 0.5) {
return { id: 2, city: 'London' };
}
return null; // lol fooled ya
}
What if it would give you a Maybe<User>
instead? At least now you know you have to be prepared for the worst.
console.log(findUser('danieljharvey')) // Maybe({ id: 2, city: 'London' })
It's an abstraction for return values that many programming languages lack for one reason of another.
Always prefer naming variables used as conditionals
Bad:
if(player.x + player.width > enemy.x && player.x < enemy.x + enemy.width) {
game.end()
}
Good:
const playerTouchesEnemy = player.x + player.width > enemy.x && player.x < enemy.x + enemy.width
if(playerTouchesEnemy) {
game.end()
}
IDEA: do not use undefined in your own code
IDEA: prefer passing primitives (react-intl generalised)
IDEA: backend: no http stuff to service layer, no database stuff to http layer (generalised)
Like "function A calculates api path", "function B does the fetch" rather than everything in a bag.
IDEA:
- no configuration objects to functions
- max 3 parameters per function
calculatePos(x1, y1, x2, y2)
can be simplified by creating aLine
type or aPoint
type - telegram convo 4.10
Shorter if
branch first
This works in most cases, however I wouldn't only leave it as the end result is, but would combine it with Define value boundaries early, keep things flat.
If statements blocks with more than few lines of code in them are usually a bad sign!
From:
if (something) {
/*
*
*
* Loads of stuff here so you need to scroll
*
*
* ...
*/
} else {
// Only one or few lines
}
To:
if (!something) {
// Only one or few lines
} else {
/*
*
*
* Loads of stuff here so you need to scroll
*
*
* ...
*/
}
Always avoid unnecessary data structures, stop creating garbage piles for your data!
Consider the following JSON response from an HTTP API:
// GET users/123
{
data: {
username: 'rikukissa',
level: 'noob',
info: {
age: 26,
email: '...'
}
}
}
There's at least 2 different things wrong here: First of all the data
wrapper is pretty much useless. Of course you could argue that there's need for meta data, error information or what ever, and that's a perfectly valid claim. In this case though, there's no information of that type included in my response. One could also argue, that error information should only be a part of erroneous responses and not valid ones so something like:
// GET users/123
{
error: false,
data: {
username: 'rikukissa',
level: 'noob',
info: {
age: 26,
email: '...'
}
}
}
is completely pointless.
Stick to convention, always
There are even wilder examples like { data: [], errors: [], meta: [] }
, and the question is: what kind of information would you store in the meta
field? I bet you a pint all suggestions you'll come up with would be better off transferred in some other part of the HTTP protocol like in the response headers. My worry is that these unconventional responses are in reality only used because of the lack of knowledge about best practices and the features HTTP provides by default. There are for instance very few error cases that couldn't be mediated with only HTTP response code.
Don't create places to dump code
Okay, let's move on. This is what we have now:
// GET users/123
{
username: 'rikukissa',
level: 'noob',
info: {
age: 26,
email: '...'
}
}
It looks fairly good, although the need for a separate info
section puzzles me. There are many names for these kinds of sections like meta
, info
, data
, recycle bin
, garbage
, utils
, common
. Let's face it: it's just a unstructured place where you can dump your new field without re-thinking the overall structure. If we were to model the response as a data type User
, we'd most likely not treat "age" any different from the rest of the fields
interface IUser {
username: string;
level: string;
age: number;
email: string;
}
You may have noticed that the name examples I gave such as utils
and common
aren't really related to this specific use case. And there's a reason for this: people tend to do this everywhere. A directory in your codebase called common
is just a telltale sign of bad design. There are very few modules that are "common" for all other modules of your codebase. I feel like I always say this, but to me it's the same as creating a file called functions.js
with all the functions of your app inside of it. Please don't.
IDEA: Prefer passing data instead of functions as parameters
- Special cases: high-level abstractions like
map
etc functions, callbacks.. - Logic upwards propagation argument
Example
From this:
getDynamicSelectOptions = (field: IField, values: IFormData) => {
if (field.dynamicOptions) {
switch (field.name) {
case "city":
return addresses[values.state].cities;
case "cityPermanent":
return addresses[values.statePermanent].cities;
case "addressLine4":
if (
addresses[values.state][values.city] &&
addresses[values.state][values.city].areas
) {
return addresses[values.state][values.city].areas;
} else {
return [];
}
default:
return [];
}
} else {
return [];
}
};
To this:
getDynamicSelectOptions = (field: IField, values: IFormData) => {
if (!field.dynamicOptions) {
return [];
}
switch (field.name) {
case "city":
return addresses[values.state].cities;
case "cityPermanent":
return addresses[values.statePermanent].cities;
case "addressLine4":
const hasAreas =
addresses[values.state][values.city] &&
addresses[values.state][values.city].areas;
if (!hasAreas) {
return [];
}
return addresses[values.state][values.city].areas;
default:
return [];
}
};
Combination of:
Define value boundaries early, keep things flat
Shorter if
branch first
The code's still quite difficult to read, but more about that later
Thanks for the useluf tips. I especially like the 'Define value boundaries early, keep things flat' tip! I'll try to remember that.
Oh, thanks a lot for your feedback and for taking time to read through some of it :)
How many levels of module dependency hierarchy is okay?
π Avoid creating dependency hierarchies deeper than 4 levels. Aim for 2-3.
Few times during my career I've come across projects with +10 levels of dependant modules. Most of them have been Java projects.
The issue is, code like that is extremely tiring to follow. I remember writing down the path I had taken whilst reading the code just so I won't forget how I ended up to the current module I'm reading. Not good.
I feel like the best way to approach this would be to try and describe the whole process from start to finish on the main level:
function getCake() {
oven = getPreheatOven()
pan = getGreasedPan()
mixture = mixIngredients(getEggs(), getFlour())
putMixtureIntoPan(pan, mixture)
return bakeForMinutes(40, oven, pan)
}
As the alternative (bad) approach I see functions that instead of being written to flow from top to bottom,
are written to flow deeper and deeper the module tree
function getCake() {
return getBakedCake(40)
}
function getBakedCake(minutes) {
oven = getPreheatOven()
pan = getPanWithMixture()
return bakeForMinutes(minutes, oven, pan)
}
function getPanWithMixture() {
pan = getGreasedPan()
mixture = mixIngredients(getEggs(), getFlour())
return putMixtureIntoPan(pan, mixture)
}
... and so on. The reader would have to go through 3 different files just to know what ingredients are used.
How to solve cyclic dependencies between modules
AKA "why am I getting
undefined
when I import something from my module
Hey! A classic situation with a very generic and bad example (TODO). You've written few modules and soon notice that they depend on each other for one reason or another. Albeit being a **very strong indicator of bad design**, this is a common situation and one that's easy to solve:
user.ts
import { calculatePoints } from './points'
interface User {
createdAt: number; // in milliseconds
level: number;
}
export function updateUserPoints(user) {
return {
...user,
points: calculatePoints(user) // dependency to points.ts
};
}
export function getUserRegisteredInYears(user) {
return Math.floor((Date.now() - user.age) / 1000 / 3600 / 24 / 365); // Please, don't to this in real life
}
points.ts
import getUserAge from './user'
export function calculatePoints(user) {
return getUserRegisteredInYears(user) * 3; // dependency to user.ts
}
There are few reasons why this happened in the first place:
- Responsibilities of each module are too loosely defined
- There are more concepts in play that you first imagined
And few solutions:
Responsibilities of each module are too loosely defined
Is there really a need for a point
module? If not, the user
module could look like this:
interface User {
createdAt: number; // in milliseconds
level: number;
}
function calculatePoints(user) {
return getUserRegisteredInYears(user) * 3;
}
export function updateUserPoints(user) {
return {
...user,
points: calculatePoints(user)
};
}
export function getUserRegisteredInYears(user) {
return getYears(Date.now() - user.age);
}
However, if you answered "yes", could it be that the points
module knows a bit too much about the internals of the user
module? Could we, for example, turn the signature of calculatePoints
into being:
- export function calculatePoints(user) {
+ export function calculatePoints(userAgeInYears) {
which would simplify the function to:
export function calculatePoints(userAgeInYears) {
return userAgeInYears * 3;
}
There are more concepts in play from what you first imagined
In this very simple example, the only thing that the getUserRegisteredInYears
function really provides us with is the conversion from milliseconds since the creation of the user to years. Could we add another module called time.ts
? That way we could drop the getUserRegisteredInYears
function altogether and refer to this new module from our points
module:
points.ts
import getYears from './time'
export function calculatePoints(user) {
return getYears(Date.now() - user.createdAt) * 3;
}
Stop repeating yourself!
π
interface User {
userCreatedAt: number;
userPoints: number;
}
π
interface User {
createdAt: number;
points: number;
}
Repeating the word user
creates extra fuzz around the words that are actually relevant, without adding any value. The reader should be able to infer the context even when it's few lines above.
π
it('should add two numbers together', ...)
it('should fail if divider is zero', ...)
π
it('adds two numbers together', ...)
it('fails if divider is zero', ...)
By rephrasing the test case description, we can keep it as descriptive while making it more concise. Someone could even argue, that a test not only verifies that your code should do something, but that it actually does the thing it was written for.
Always name your constants
Got this idea while writing the bit about cyclic dependencies. You might have come across time calculation like this before:
// Returns true if token is older than 3 days
export function hasTokenExpired(token) {
return (Date.now - token.createdAt) > 86400000 * 3;
}
The comment is pretty spot on, but we can do better:
export function hasTokenExpired(token) {
const day = 24 * 3600 * 1000;
return (Date.now - token.createdAt) > day * 3;
}
Not sure if you disagree, but to me it's much easier to read that something's greater than day * 3 (probably means 3 days) than that something's greater than 86400000 (possibly the mass of the Earth)
Great quote from clean-code-javascript:
We will read more code than we will ever write. It's important that the code we do write is readable and searchable. By not naming variables that end up being meaningful for understanding our program, we hurt our readers. Make your names searchable.
Define value boundaries early, keep things flat
I've started moving some of these to their own Gists