Indentation means that the contents of every block are offset from their containing environment by a consistent amount of whitespace. This makes the code easier to read and follow.

Code without indentation is harder to read and so should be avoided. The Wikipedia article lists several styles – pick one and follow it.

Some people call their variables “file”. However, file can mean either file handles, file names, or the contents of the file. As a result, this should be avoided and one can use the abbreviations “fh” for file handle, or “fn” for filenames instead.
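As a rough illustration of this naming convention, the sketch below keeps the file name in “fn”, the stream in “fh”, and gives the contents a descriptive name of their own. The helper names write_text and read_first_line are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write `text` to the file named `fn` (hypothetical helper for this demo).
void write_text(const std::string &fn, const std::string &text)
{
    std::ofstream fh(fn);   // "fh" is the handle, "fn" is the name
    assert(fh.is_open());
    fh << text;
}

// Return the first line of the file named `fn`.
std::string read_first_line(const std::string &fn)
{
    std::ifstream fh(fn);
    std::string first_line;   // the contents get their own descriptive name
    std::getline(fh, first_line);
    return first_line;
}
```

With these names, a reader can tell at a glance whether a given variable is a name, a handle, or data read from the file.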

In C++, class names should start with an uppercase letter (see the Wikipedia article about letter case); starting them with a lowercase letter is not recommended.

# Bad code
class my_class
{
    .
    .
    .
};

Instead, write:

class MyClass
{
    .
    .
    .
};

Your code should not include unnamed numerical constants, also known as “magic numbers” or “magic constants”. For example, there is one in this code to shuffle a deck of cards:

# Bad code


for (int i = 0; i < 52; i++)
{
    const int j = i + rand() % (52-i);
    swap(cards[i], cards[j]);
}

This code is bad because the meaning of 52 is not explained, and so it appears arbitrary. Better code would be:

const int deck_size = 52;

for (int i = 0; i < deck_size; i++)
{
    int j = i + rand() % (deck_size - i);
    swap(cards[i], cards[j]);
}

See the Wikipedia article about “The Law of Demeter” for more information. Namely, chaining many method calls, as in obj->get_employee("sophie")->get_address()->get_street(), is not advisable and should be avoided.

A better option would be to provide methods in the containing objects to access those methods of their contained objects. And an even better way would be to structure the code so that each object handles its own domain.
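The first option can be sketched roughly as follows. The classes Address, Employee and Company and their members are hypothetical, invented for this illustration: each containing object offers a delegating method, so the caller only talks to the object it already holds.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

class Address
{
public:
    explicit Address(std::string street) : street_(std::move(street)) {}
    const std::string &get_street() const { return street_; }
private:
    std::string street_;
};

class Employee
{
public:
    explicit Employee(Address address) : address_(std::move(address)) {}
    // Delegating accessor: callers need not know that an Address exists.
    const std::string &get_street() const { return address_.get_street(); }
private:
    Address address_;
};

class Company
{
public:
    void add_employee(const std::string &name, Employee employee)
    {
        employees_.emplace(name, std::move(employee));
    }
    // A single call on the object the caller already holds.
    // Throws std::out_of_range if no such employee was added.
    const std::string &get_employee_street(const std::string &name) const
    {
        return employees_.at(name).get_street();
    }
private:
    std::map<std::string, Employee> employees_;
};
```

A caller would then write company.get_employee_street("sophie") instead of walking through the employee and the address itself.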

As noted in Martin Fowler’s “Refactoring” book (but held as a fact for a long time beforehand), duplicate code is a code smell, and should be avoided. The solution is to extract duplicate functionality into subroutines, methods and classes.
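A minimal sketch of such an extraction, using a hypothetical Report struct: instead of repeating the same summing loop once for heights and once for weights, the shared logic is moved into a single average() function that both callers reuse.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Extracted helper: the averaging logic lives in exactly one place.
double average(const std::vector<double> &values)
{
    if (values.empty())
    {
        return 0.0;
    }
    return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
}

// Hypothetical data holder, used only for this illustration.
struct Report
{
    std::vector<double> heights;
    std::vector<double> weights;
};

// Both callers reuse the helper instead of duplicating the loop.
double average_height(const Report &report)
{
    return average(report.heights);
}

double average_weight(const Report &report)
{
    return average(report.weights);
}
```

If the handling of an empty list ever needs to change, there is now only one place to change it.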

Another common code smell is long subroutines and methods. The solution to these is to extract several shorter methods out, with meaningful names.

Some people use the ternary inline-conditional operator (? :) to choose between executing two different statements with side-effects, instead of using if and else. For example:

# Bad code


cond_var ? (hash["if_true"] += "Cond var is true")
         : (hash["if_false"] += "Cond var is false");

(This is assuming the ternary operator was indeed written correctly, which is not always the case).

However, the ternary operator is meant to be an expression that is a choice between two values and should not be used for its side-effects. To do the latter, just use if and else:

if (cond_var)
{
    hash["if_true"] += "Cond var is true";
}
else
{
    hash["if_false"] += "Cond var is false";
}

This is safer, and better conveys one’s intentions.
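For contrast, here is a small sketch of the kind of use the ternary operator is suited to: choosing between two values, with no side-effects in either branch. The function name describe is hypothetical.

```cpp
#include <cassert>
#include <string>

// The ternary operator only selects which value to use here;
// neither branch modifies any state.
std::string describe(bool cond_var)
{
    const std::string message =
        cond_var ? "Cond var is true" : "Cond var is false";
    return message;
}
```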

For more information, refer to a relevant thread on the Perl beginners mailing list (just make sure you read it in its entirety).

It is a good idea to avoid global variables or static variables inside functions; at least those that are not constant. This is because using such variables interferes with multithreading and re-entrancy, and prohibits instantiation. If you need to use several common variables, then define an environment struct or class and pass a pointer to it to each of the functions.
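A minimal sketch of the environment-struct approach, assuming a hypothetical CounterEnv that holds two counters which might otherwise have been globals. Every function receives a pointer to the environment it should work on, so several independent instances can coexist.

```cpp
#include <cassert>

// Hypothetical environment struct replacing what might otherwise have
// been the global variables `hits` and `misses`.
struct CounterEnv
{
    int hits = 0;
    int misses = 0;
};

// Update the counters of the given environment only.
void record(CounterEnv *env, bool was_hit)
{
    assert(env != nullptr);
    if (was_hit)
    {
        ++env->hits;
    }
    else
    {
        ++env->misses;
    }
}

// Total number of events recorded in the given environment.
int total(const CounterEnv *env)
{
    assert(env != nullptr);
    return env->hits + env->misses;
}
```

Because no state is shared implicitly, two CounterEnv objects can be updated separately, for instance one per thread, without interfering with each other.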

With many editors, it is common to write new code, or to modify existing code, in such a way that some lines end up containing trailing whitespace, such as spaces (ASCII 32 or 0x20) or tab characters. Such trailing whitespace normally does not cause much harm, but it is not needed, harms the code’s consistency, and may undermine analysis by patching/diffing and version control tools. Furthermore, it can usually be eliminated easily without harm.

Here is an example of trailing whitespace, demonstrated using the --show-ends flag of the GNU cat command:

> cat --show-ends toss-coins.pl
$
use strict;$
use warnings;$
$
my @sides = (0,0);$
$
my ($seed, $num_coins) = @ARGV;$
$
srand($seed);  $
$
for my $idx (1 .. $num_coins)$
{$
    $sides[int(rand(2))]++;$
    $
    print "Coin No. $idx\n";$
}$
$
print "You flipped $sides[0] heads and $sides[1] tails.\n";$
>

While you should not feel bad about having trailing whitespace, it is a good idea to search for it from time to time using a command such as ack '[ \t]+$' (in version 1.x it should be ack -a '[ \t]+$', see ack), and to get rid of it.

Some editors can also highlight trailing whitespace when it is present.

Finally, there are CPAN modules that can check for and report trailing whitespace.

You should add #include guards, or the less standard but widely supported #pragma once, to header files (“*.h”, “*.hpp”, or whatever) to prevent them from being included time and again by other “#include” directives. Otherwise, repeated inclusion may result in compiler warnings or errors.
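As a sketch, a guarded header might look like the following. The macro name DECK_CONFIG_H and the declarations inside it are hypothetical; the point is only the three guard lines that surround the contents.

```cpp
#ifndef DECK_CONFIG_H
#define DECK_CONFIG_H

// Everything between the guard lines is seen at most once per
// translation unit, however many times the header gets included.
constexpr int deck_size = 52;

constexpr int cards_left(int cards_dealt)
{
    return deck_size - cards_dealt;
}

#endif /* DECK_CONFIG_H */
```

With #pragma once, the three guard lines would be replaced by a single “#pragma once” line at the top of the header.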

You can sometimes see code like this:

# Bad code


int * my_array[NUM];

int * sub_array = malloc(sizeof(sub_array[0]) * SUB_NUM);
if (! sub_array)
{
    /* Handle out-of-memory */
}
for (int i = 0 ; i < NUM ; i++)
{
    populate_sub_array(i, sub_array);
    my_array[i] = sub_array;
}

The problem with code like this is that the same physical memory location is being used in all places in the array, and so they will always be synchronised to the same contents.

As a result, the code should be written like this instead, with a separate allocation for each element:

int * my_array[NUM];

for (int i = 0 ; i < NUM ; i++)
{
    int * sub_array = malloc(sizeof(sub_array[0]) * SUB_NUM);
    if (! sub_array)
    {
        /* Handle out-of-memory */
    }
    populate_sub_array(i, sub_array);
    my_array[i] = sub_array;
}
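
In C++, one way to get a separate sub-array per element without calling malloc directly is to use a std::vector of std::vectors. This swaps in a different technique from the C code above and is only a sketch; populate_row is a hypothetical stand-in for populate_sub_array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for populate_sub_array(): fills row `i` with `i`.
void populate_row(int i, std::vector<int> &row)
{
    for (int &cell : row)
    {
        cell = i;
    }
}

std::vector<std::vector<int>> make_rows(std::size_t num, std::size_t sub_num)
{
    // Constructs `num` separate vectors, each owning its own storage.
    std::vector<std::vector<int>> rows(num, std::vector<int>(sub_num));
    for (std::size_t i = 0; i < num; i++)
    {
        populate_row(static_cast<int>(i), rows[i]);
    }
    return rows;
}
```

Each row owns its own memory, so writing to one row does not affect the others, and the memory is released automatically when the outer vector goes out of scope.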

On various online forums, we often get asked questions like: “What is the speediest way to do task X?” or “Which of these pieces of code will run faster?”. The answer is that in this day and age of extremely fast computers, you should optimise for clarity and modularity first, and worry about speed when and if you find it becomes a problem. Professor Don Knuth had this to say about it:

The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

(Knuth reportedly attributed the exact quote to C.A.R. Hoare).

While you should be conscious of efficiency, and the performance sanity of your code and algorithms when you write programs, excessive and premature micro-optimisations are probably not going to yield a major performance difference.
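Before optimising anything, it helps to measure. The sketch below is a crude wall-clock timer built on std::chrono::steady_clock; the helper name time_call_us is hypothetical, and a real profiler will usually give far more reliable and detailed results than a single timed run.

```cpp
#include <cassert>
#include <chrono>

// Run `func()` once and return the elapsed wall-clock time in microseconds.
template <typename Func>
long long time_call_us(Func func)
{
    const auto start = std::chrono::steady_clock::now();
    func();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(
        stop - start).count();
}
```

Timing both variants of a piece of code on realistic input, several times over, shows whether the difference is worth the loss in clarity.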

If you do find that your program runs too slowly, refer to our resources about Optimising and Profiling code.

You should make sure that the HTML markup you generate is valid HTML and that it validates as XHTML 1.0, HTML 4.01, HTML 5.0, or a different modern standard. For more information, see the “Designing for Compatibility” section in a previous talk.

If you want to group a certain sub-expression in a regular expression, without the need to capture it (into the $1, $2, $3, etc. variables and related capture variables), then you should cluster it using (?: … ) instead of capturing it using a plain ( … ), or alternatively not group it at all if grouping is not needed. That is because a cluster is faster and cleaner, and conveys your intentions better than a capture.
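The same clustering syntax is available in C++ through std::regex, whose default ECMAScript grammar supports (?: … ). The sketch below uses a hypothetical pattern and function name: the alternation is clustered without capturing, so the digits end up in capture group 1.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return the digits that follow "foo-" or "bar-", or an empty string if
// the input does not match. (?: ... ) groups the alternation without
// capturing it, so the digits are capture group 1.
std::string extract_number(const std::string &input)
{
    const std::regex re("(?:foo|bar)-([0-9]+)");
    std::smatch match;
    if (std::regex_match(input, match, re))
    {
        return match[1].str();
    }
    return "";
}
```

Had the alternation been written as (foo|bar), the digits would have moved to group 2, and every reader would have to wonder whether group 1 is used anywhere.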

When passing a string that is not a literal constant as the format parameter of “printf()”, “sprintf()” and friends, one runs the risk of format string vulnerabilities (more information in the link). As a result, it is important to always use a literal constant string as the format. For example:

# Bad code


fgets(str,sizeof(str), stdin);
str[sizeof(str)-1] = '\0';
printf(str);

should be replaced with:

fgets(str,sizeof(str), stdin);
str[sizeof(str)-1] = '\0';
printf("%s", str);

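One can also use the relevant warning flags of GCC and compatible compilers, such as -Wformat -Wformat-security (optionally together with -Werror=format-security), to warn about such calls and possibly generate an error for them.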
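The same rule applies when formatting into a buffer. The following sketch passes untrusted text through a literal "%s" format using std::snprintf; the function name format_untrusted and the 256-byte buffer size are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format untrusted text through a literal "%s" format string, so that any
// '%' characters inside `user_input` are copied verbatim rather than
// being interpreted as conversion specifiers.
// Input longer than 255 characters is truncated by snprintf.
std::string format_untrusted(const std::string &user_input)
{
    char buf[256];
    std::snprintf(buf, sizeof(buf), "%s", user_input.c_str());
    return std::string(buf);
}
```

Even input such as "100%n%s" comes out unchanged, because it is treated as data rather than as a format.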

It is a very good idea for C and C++ code to use a good build and configuration system. There’s a page listing some prominent alternatives. For simple setups, a makefile may be suitable, but more complex tasks require a configuration and build system such as CMake.

It is important to use a bug tracking system to maintain a list of bugs and issues that need to be fixed in your code, and of features that you’d like to work on. Sometimes, a simple file kept inside the version control system would be enough, but at other times, you should opt for a web-based bug tracker.

This is a short list of the sources from which this advice was taken, which also contains material for further reading:

  1. A large part of this document is derived from a similar document written earlier for the Perl programming language.
  2. The book “Perl Best Practices” by Damian Conway – contains a lot of good advice and food for thought, but sometimes should be deviated from. Also see the “PBP Module Recommendation Commentary” on the Perl 5 Wiki.
  3. “Ancient Perl” on the Perl 5 Wiki.
  4. chromatic’s “Modern Perl” Book and Blog
  5. The book Refactoring by Martin Fowler – not particularly about Perl, but still useful.
  6. The book The Pragmatic Programmer: From Journeyman to Master – also not particularly about Perl, and I found it somewhat disappointing, but it is an informative book.
  7. The list “How to tell if a FLOSS project is doomed to FAIL”.
  8. Advice given by people on Freenode’s #perl channel, on the Perl Beginners mailing list, and on other Perl forums.
  9. Advice given by people on Freenode’s ##programming channel and on other forums.

Version Control Repository

This document is maintained in a GitHub repository, which one can clone or fork, and to which one can send pull requests and file issues. Note that all contributions are assumed to be licensed under the Creative Commons Attribution 4.0-and-above (CC-by) licence. Enjoy!