File layout in a large project
File layout in a large project benefits from regularity. Here are some sample rules for a C++ project.
Convention vs configuration
I’ve first encountered the convention approach systematically used in the Ruby community as opposed to the configuration approach.
Say for example you have a database with tables and columns and classes with members that map to the data in the database.
If you have the configuration approach then names of classes and members do not have a regular mapping to the tables and column. Additional configuration, in the form of attributes, configuration files or code, is required to perform the mapping.
If you have the convention approach then the classes have the same name as (or derived from) the names of the tables, and the members have the same name as (or derived from) the names of the columns.
Project scales
By large project I mean a project that generates a few executables, that a small team of programmers produces over timescales of a few months, to a few years. I believe such a project benefits from a set of rules on how to organise the code in a regular way.
For a small project (e.g. a programmer for a couple of days) such regularity is not required, but still beneficial.
For a huge project (e.g. large teams over many years) more rules than what I describe here are beneficial (as it would be splitting it into several large projects).
Regular conventions
The file layout for a large project benefits from regularity, i.e. lack of surprises in terms of location, naming convention and structure. The regularity derives from using convention instead of configuration approach.
It means that if you see auto src = cstdio::file::open(src_file_name, "rb");
you know that we’re talking about a function called open
that you find in
../cstdio_lib/file.(h|cpp)
Otherwise additional time is regularly wasted on:
- Finding where a function or class is implemented
- Where are members of a class defined inside a class
- Ensuring the right header are included
As another example: if all your test executables have names ending in
_test.exe
then the code to run them at the end of the build can be written
generically to run all executables ending in _test.exe
. If there is no
regular naming convention then you need to store the list of test executables
to run at the end of the build and update it every time a new test executable
is added to the project, this can be error prone.
Top level folders
Have a src
folder for the source code. Create binaries such as executables
into a bin
folder, use a int
folder for intermediate binaries such as
object files.
See fit RAII as example contents of the src
folder.
Component folders
In src
use a folder for each component (Visual Studio project). A component
can be: an executable, a static or dynamic library.
The folder name is the same as the component name. E.g. source for the
executable foo.exe
are in a folder foo
, source for a static library
bar_lib
are in a folder bar_lib
.
Executables are of two types: output executable and test executables for a static library.
Keep output executables and dynamic libraries thin. E.g. the source for a
output executable consists of a small main.cpp
using code from static
libraries. Most of the related code should be put into a static library e.g.
foo_lib
.
Test executable for a bar_lib
should be called bar_lib_test
, and it’s files
have derived names e.g.: bar_lib\foo.(h|cpp)
is tested in
bar_lib_test\foo_test.cpp
.
Source files: .h and .cpp
Code in libraries is wrapped in a namespace derived from the name of the
library. E.g. namespace bar
for code for a library bar_lib
Source files relate to a unit. You need to define what is a unit based on
selecting for single responsibility. A unit can be a class or a set of related
functions. The name of the files is the name of the unit (plus the extension).
E.g. a foo.h
would contain a class foo
or a namespace foo
with functions.
You would separate a struct
from it’s related JSON
serialization/deserialization functions in different units (there are usually
plenty of contexts to use a struct without caring about serialization).
But separation is not religious: it is recommended to have in the same file
things that are intimately related. See for example fit RAII
file_raii.h
containing the class defining the unit, but also the
file_raii_traits
that is closely related.
Headers inclusion
A .cpp file first includes it’s .h file (if any).
After this headers are grouped:
- headers from the same component
- headers form other components
- platform headers
- headers of external libraries (e.g. boost)
- standard library headers
Within each group headers are sorted alphabetically.
Include all (and only) the headers that a file depends on directly.
Class members order
Inside a class prefer a consistent order e.g.:
- first types
- then constants
- then member variables
- then constructor and destructor
- then copy related
- then move related
- other public functions
- other private functions
References
See John Lakos’ talks on the subject of project layout.