Idiom: C API wrapper
Wrapping C APIs in C++.
This article is part of a series of articles on programming idioms.
C APIs
I provided an overview of the typical C APIs using handles in a previous
article. A good way of wrapping it is a combination of the FIT
RAII techniques and techniques employed by std::filesystem
.
From the FIT RAII techniques
You need the raii_with_invalid_value
template written once. It’s similar to
std::unique_ptr
, but provides functionality that makes sense for handles, not
pointers e.g. it allows for a invalid_value
that is not nullptr
(as some C
APIs do).
Then you create a wrapper once for every handle type:
1
2
3
4
5
6
7
8
9
struct registry_handle_traits {
using handle = HKEY;
static constexpr auto invalid_value = nullptr;
static void close_handle(handle h) noexcept {
static_cast<void>(::RegCloseKey(h));
}
};
using registry_handle = raii_with_invalid_value<registry_handle_traits>;
This handle is as regular as it can be, but:
- it cannot be copied (only moved)
- it does not implement equality (does not make sense, can’t be copied)
- it does not implement order (no equality)
The traits are usually hidden in a details
namespace (or using an underscore
prefix). Note I’ve left out such details like namespaces for brevity.
Then you’ll have functions that return such a handle
1
2
3
registry_handle create_registry_key(...);
registry_handle open_registry_key(...);
Then you might want to have a type that can be used as a function argument
taking either a const &
to the RAII handle or a raw handle (e.g. like
HKEY_LOCAL_MACHINE
in the case of registry). This is based on handle_arg
that only has to be written once.
1
using registry_handle_arg = handle_arg<registry_handle>;
Then you have functions that use the handle:
1
2
3
4
5
6
7
std::wstring read_registry_string(
registry_handle_arg key,
const std::wstring & sub_key_path);
std::vector<std::wstring> read_registry_multistring(
registry_handle_arg key,
const std::wstring & sub_key_path);
Then you might have the situation where closing the handle could fail. That’s common for cases where writes are buffered and closing the handle flushes the content and might detect write failures. Throwing from the destructor is not a good option. You create a function that takes over the handle ownership and closes the handle, throwing for errors. The user has to remember to call this function to detect failures.
1
2
3
4
5
6
void close(file_raii & x) {
int result = std::fclose(x.release());
if (result != 0) {
error::throw_failed("fclose");
}
}
There are also common scenarios where there are soft failures due to the entry not being present.
At it’s simplest deleting a key might detect that the key does not exist. A boolean is used to communicate if the key was actually present. This can be used to issue a large number of delete key operations and only log for the ones that return true.
1
2
3
4
5
6
7
bool delete_registry_key(...);
void foo() {
if (delete_registry_key(...)) {
LOG << "Deleted key.";
}
}
While we are here, it’s worth mentioning that the C API wrapper themselves usually don’t log, letting the user do the logging as seen above.
Another case is trying to open a registry key that might or might not exist.
That would be similar to open_registry_key
, but the caller then has to check
if the handle is valid. An invalid handle indicates that the key was missing
and hence not opened.
1
2
3
4
5
6
7
8
9
10
11
registry_handle open_registry_key_if_exists(...);
void foo() {
auto key = open_registry_key_if_exists(...);
if (!key) {
LOG << "Key is missing.";
}
else {
// use key
}
}
A third situation is trying to read a registry string value that might or might
not exist. That would be similar to read_registry_string
, but we need to
return an std::optional<std::wstring>
to distinguish between registry value
not present (an nullopt
) and a registry value present, but an empty string.
1
2
3
4
5
6
7
8
9
10
11
std::optional<std::wstring> read_registry_string_if_exists(...);
void foo() {
auto value = read_registry_string_if_exists(...);
if (!value) {
LOG << "Value is missing.";
}
else {
// use *value string
}
}
Error handling std::filesystem style
The approach is to have two overloads: one that throws for all errors another
one that takes std::error_code & ec
as the last parameter. The one taking the
error code might throw std::bad_alloc
to indicate a out of memory error,
setting the error code for all other errors:
1
2
3
4
5
6
7
8
std::wstring read_registry_string(
registry_handle_arg key,
const std::wstring & sub_key_path,
std::error_code & ec);
std::wstring read_registry_string(
registry_handle_arg key,
const std::wstring & sub_key_path);
When functions have defaulted parameters, the error code should be the last function parameter that is not defaulted.
The error code contains an error number and an error category that is
eventually an object used to interpret the error number. To keep the error code
small, the error category is ultimately an object with a long scope (static
inside non-member function) and the error code keeps just a reference to such a
long lived object, rather than the object itself. You might need to invest into
understanding this mechanism to:
- provide custom error categories (e.g. for registry string blob size is not
multiple of
wchar_t
- improve upon
std::system_category
error string formatting (e.g. to ensure the language is the one you desire)
Implementation-wise you’ll want to actually implement the error code taking
function, then implement the exception throwing one in terms of the first,
throwing std::system_error
.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
foo bar(..., std::error_code & ec) {
ec = std::error_code{};
// implement functionality here
// on error set
// ec = std::error(code, my_error_category())
}
foo bar(...) {
std::error_code ec;
auto return_value = bar(..., ec);
if (ec) {
throw std::system_error(ec, "bar failed");
}
return return_value;
}
When an error code is set, the returned value is usually the default constructed one. This is not a contract for the caller. The caller should always check the error code first, before even checking the returned value.
Why this idiom works?
It practically solves the problem of resource leaks.
It has enough flexibility to cover a variety of C APIs.
It avoids repeatedly implementing low level error handling.
It presents a common, well understood way to wrap them so that generally a user can reason on the behaviour of the C++ wrapper based on the rules above and the documentation of the C APIs.
Sure we could have used a slight variation in the error handling, the
duplication of exception throwing vs. error code yielding is annoying, but the
experience of std::filesystem
was that once it was available codebases often
made a quick transition to it instead of calling the OS APIs directly.
What are it’s limits?
For most C APIs such implementation would be largely mechanical where the C++ wrapping function calls the C API and check for the error.
But for some it’s better to go for a slightly higher level. Behind a single C
API function, many higher level scenarios might be hidden: e.g. we’ve seen
read_registry_string
and read_registry_multistring
that are both build on
top of the single C API function ::RegQueryValueEx
. The advantage is that
read_registry_string
can do more than just call ::RegQueryValueEx
twice
(one to get the size followed by one to get the content), but also handle a
small value optimisation (a lot of values are short, meaning that a C API call
can be elided) and handle the case where the value size changes between the C
API calls.
Equally it’s easy to fall into the trap of creating a wrapper that’s too specific to the current usage in an application, too high level, and lacks generality and ability to reuse.
Choosing the right level can be tricky, requiring human judgement and experience.
For some cases you might need a RAII class different from
raii_with_invalid_value
to return from a function wrapping
CoInitializeEx(NULL, COINIT_MULTITHREADED)
for example. I recommend you
ensure such a class is default constructible to allow the error code overload
to return the same type.
Testing
Pure unit testing in the style we’ve seen for the regular idiom is not possible for the C API wrappers in the current form.
Just changing the form is not justified if it results in much more complex code and all is achieved is doing calls against mock C APIs that don’t actually test real functionality, just call expectations. Often the programmer’s expectations are made clear by checking the code.
Experience shows that most of the testing gains are from testing against the
actual C APIs. That exposes unexpected behaviour of the C API such as
::RegQueryValueEx
not returning the written data size for incorrectly sized
data, which is important when using the small value optimisation.
Testing against the actual C APIs might require tests to run as admin on a
separate machine/platform from the build machine and also requires that a
critical number of functions are wrapped, e.g. to test read_registry_string
you also have to implement write_registry_string
, create_registry_key
and
delete_registry_key
.
And to see what we can do to ensure that this testing difficulty does not extend to the whole application we’re going to look to another idiom.