+ - 0:00:00
Notes for current slide
Notes for next slide

Mircea Baja - 22 May 2020

Irregularity in Generic Programming

Part II (of three): Continuum

1

Agenda: Irregularity continuum

  • Linear search
  • Requirements
  • Detour: partition
  • Concepts as a thinking tool
  • What is regularity?
  • What is irregularity?
  • The irregularity continuum
  • The irregularity conjecture
  • Detour: concepts in C++ 20
2

Iterators: generalized pointers

Image

  • do not dereference the "end" position in an array (one past the last)
  • generalization: input/output/forward/bidirectional/random access iterators (e.g. position of node in a linked list)
  • dummy nodes in lists (usually) don't have values
3

Linear find with pointers - toy

int data[] = {2, 5, 7, 1, 22};
int * it = find(std::begin(data), std::end(data), 5);
if (it == std::end(data)) {
// not found ...
}
else {
// found ...
*it = 42;
}
int * find(int * f, int * l, int x) {
while ((f != l) && (*f != x)) ++f;
return f;
}
  • Note we compare f with l twice: inside the find and outside
  • In this case once with !=, the other time with ==
  • In well written code the safety issues do not propagate forever
4

Linear find canonical - toy

template<typename It, typename T>
It find(It f, It l, const T & x) {
while ((f != l) && (*f != x)) ++f;
return f;
}
  • When does it work?
  • It will work for T being int, short, long, unsigned etc.
  • Type It (for f and l) is pointer-like/position to sequences of integer-like values (e.g. T *)
  • e.g. It can also be an iterator in a linked list (dereferencing gives value, not node; advance follows next)
  • T and It are integer-like not in the addition/multiplication sense, but in the it holds a value that can be copied, compared for equality etc.
5

Requirements syntactic

  • We can compare for equality f and l
  • We can dereference f
  • The iterator type has an associated value type
  • Dereferencing f is reference to the value type
  • We can advance f (with ++)
  • These are easy to express
  • These apply to types, not values
6

Requirements semantic

  • Once f equals l, it stays that way, we can compare again and get the same result
  • And that's true even for a copy of f
  • Equality: reflexive, symmetric and transitive
  • We can reach l from f, l should not be dereferenced
  • These are harder to express
7

Linear find using concepts - toy

template<typename It>
using ValueType = typename It::value_type;
template<typename It>
concept InputIterator = requires(It a, It b)
{
{a == b} -> std::boolean;
{a != b} -> std::boolean;
typename ValueType<It>;
{*a} -> std::common_reference_with<ValueType<It>>;
++a;
};
template<typename It>
requires InputIterator<It>
It find(It f, It l, const ValueType<It> & x)
{
while ((f != l) && (*f != x)) ++f;
return f;
}
  • Still toy example (arrays?)
8

Linear find - industrial version

template<InputIterator It, Sentinel<It> S, class T, class Proj = ranges::identity >
requires IndirectRelation<ranges::equal_to<>, projected<It, Proj>, const T*>
It find(It f, S l, const T& x, Proj proj = Proj{})
{
while ((f != l) && (ranges::invoke(proj, *f) != x)) ++f;
return f;
}
  • Sentinel supports iterators stopping on file end, zero terminated strings. In general the case where checking for the end is different from equality of two iterators.
  • Projections support the case where only part of the value type is used
  • Type T is not necessarily the value type of the iterator
  • Overloaded to get a range e.g. the whole container
std::cout << find(people, "Alice", &person::first_name)->last_name;
9

Faster linear find

template<typename It, typename T>
// requires It is an ForwardIterator
// T is equality comparable with ValueType<It>,
// T is assignable to ValueType<It>
It find_by_setting_sentinel(It f, It l, const T & x) {
// precondition: l can be dereferenced to store value as sentinel
*l = x;
while (*f != x) ++f;
return f;
}
  • Trick known in 1974/Knuth
10

Detour: partition

Image

11

Partition semistable - toy code

template<typename It, typename Pred>
// requires It is an ForwardIterator,
// Pred is an unary predicate on ValueType(It)
It partition_semistable(It f, It l, Pred pred) {
f = std::find_if(f, l, pred);
if (f == l) return f;
for (It i = std::next(f); i != l; ++i) {
if(!pred(*i)) {
std::iter_swap(f, i);
++f;
}
}
return f;
}
  • NOTE: false range is first (EOP style, not std style)
12

Partition - options

  • Forward iterators - semistable
  • Bidirectional iterators - faster
  • Stable - more resources
  • But can be implemented stable for list - the list iterators have something in common after all (partition_linked)
  • Position variations
  • The predicate is only applied once (or the algorithm spec should say otherwise)
  • There are also 3-way partition algorithms
13

Algorithm naming quirks

  • We only looked at a particular idiom (eager algorithms using first/last)
  • There are a lot of algorithms (google "105 STL Algorithms in Less Than an Hour")
  • There are subtle differences (e.g. between accumulate and reduce with regards to the associativity of the operation)
  • Leads to cognitive load (transform_exclusive_scan anyone?)
  • A pragmatic solution: reduced vocabulary to repeatedly used algorithms in an area of code
  • A raw for/while loop is sometimes suitable, also more adaptable to changes area of code
14

Concepts

  • Specific: a particular way of implementing concepts in C++20 describing syntactic requirements (constraints)
  • General: tool of thinking about requirements on types (not on values), especially relations between types.
  • They (can) have a open/extensible character: not all concepts are orthogonal or hierarchical: the case of overlapping requirements identified after the type definition (especially important for built-in types)
15

Concept naming quirks

  • Non-intuitive synthetic names e.g. iterator, value type, sentinel, projection etc.
  • Not derived mechanically: no mechanic nested concept propagation
  • Based on observations on common behaviour
  • They are invented, taste and choice matters
  • Reduced vocabulary is a practical choice sometimes
  • Another choice is customisation points e.g. std::begin
  • Another practical choice is over-constraining to start with
  • Hyperbolic usage conjecture applies to concepts
  • Don't try to be too generic on custom/rarely used concepts
16

Regular concept - for data

  • Default constructible
  • Copyable (and movable)
  • Equality
  • Destructible
Some a;
a = b;
assert(a == b);
c = b;
assert(a == c);
  • Fundamental behaviour, relied on for correctness reasoning
  • Integer-like not in the addition/multiplication sense, but in the it holds a value that can be copied, compared for equality etc.
  • e.g. standard containers
17

Regular concept - for functions

  • Arguments and return types are regular data types
  • Replacing arguments with equal values results into equal return values
  • Pure functions with no side effects, and no internal state like random number generators
  • Can have side effects, but consistent results
  • e.g. predicates for standard algorithms
  • Different sounding definition compared with data regularity
  • Same outcome: fundamental behaviour, relied on for correctness reasoning
18

Irregularity continuum

  • For almost every characteristic of a regular data concept, there are useful types that violate it
19

Data irregularity examples

  • Move: a bit weird - move an integer?
  • Copy constructor throws - standard containers
  • Default constructor throws - std::list (also what state?)
  • Move constructor throws - std::list
  • Destructor throws - scope exit idiom
  • No default constructor - scope object type
  • No copy - move only resource types
  • Equality - of what? e.g. equality of remote content
  • Equality complexity - worst case quadratic for unordered containers
  • Equality for input iterator - does not make sense
  • Float Nan equality == not !=
  • But have not seen a sane case where == is different from not !=
20

Function irregularity examples

  • A lot of code is about side effects, order of operations matter - fully irregular
  • Pseudo-regular - algorithms where the function only gets applied once per value - e.g. partition example
21

Irregularity conjecture

  • Generic data structures and algorithms are diverse, quirky and are not universal
22

Irregularity conjecture

  • Generic data structures and algorithms are diverse, quirky and are not universal
  • Reduced vocabulary is a pragmatic option of handling complexity
  • Gains (efficiency/different behaviour) are available in special cases
  • Either by using a more refined vocabulary
  • Or by using human knowledge not available to the machine
23

Things we know - examples

  • Side effects are fine: pseudopredicate
  • Can open file: running as root/administrator
  • Sequence is sorted/ordered/partitioned for relation used: binary search can be used
  • Relation is transitive for values provided
  • push_back() won't throw: vector was previously resized
  • Permutations can avoid invalidating the moved from objects
  • Machines cover more ground, but there is always a gap
24

Detour: Concepts in C++20

25

Generic rationale

int min_int(int a, int b) { ... }
float min_float(float a, float b) { ... }
const std::vector<int> & min_vector_int(
const std::vector<int> & a, const std::vector<int> & b) { ... }
26

Generic rationale

int min_int(int a, int b) { ... }
float min_float(float a, float b) { ... }
const std::vector<int> & min_vector_int(
const std::vector<int> & a, const std::vector<int> & b) { ... }
template<typename T>
const T & min(const T & a, const T & b) {
// implementation here
}
27

Generic rationale

int min_int(int a, int b) { ... }
float min_float(float a, float b) { ... }
const std::vector<int> & min_vector_int(
const std::vector<int> & a, const std::vector<int> & b) { ... }
template<typename T>
const T & min(const T & a, const T & b) {
// implementation here
}
if (a < b) return a; else return b; // 1
if (a <= b) return a; else return b; // 2
if (a > b) return b; else return a; // 3
if (b < a) return b; else return a; // 4
28

C++ concept

  • named sets of requirements
template<typename T>
concept EqualityComparable =
requires (const T a, const T b) {
{ a == b } -> std::boolean;
{ a != b } -> std::boolean;
};
29

C++ concept

  • compile type predicates (&& and ||)
  • syntax vs. semantics
template<typename T>
concept StrictlyTotallyOrdered =
EqualityComparable<T>
&&
requires (const T a, const T b) {
{ a < b } -> std::boolean;
{ a > b } -> std::boolean;
{ a <= b } -> std::boolean;
{ a >= b } -> std::boolean;
};
30

Requires-expression

requires(T a, U b) {
a + b; // simple: expression can compile
typename T::sub_type; // type requirement
{ a + 1 } -> std::same_as<int>; // compile AND std::same_as<decltype(a+1), int>
}
31

Usage

// requires clause
template<typename T>
requires StrictlyTotallyOrdered<T>
const T & min(const T & a, const T & b) {
if (b < a) return b; else return a;
}
template<StrictlyTotallyOrdered T>
const T & min(const T & a, const T & b) {
if (b < a) return b; else return a;
}
// abbreviated function template
const StrictlyTotallyOrdered auto & min(
const StrictTotallyOrdered auto & a,
const StrictTotallyOrdered auto & b) {
if (b < a) return b; else return a;
}
32

Placeholder type deduction

requires(T a, U b) {
{ a + 1 } -> std::same_as<int>;
{ a == b } -> std::boolean
}
33

Placeholder type deduction

requires(T a, U b) {
{ a + 1 } -> std::same_as<int>;
{ a == b } -> std::boolean
}
void fn(std::same_as<int> x); // invent
fn(a + 1); // check if it compiles
34

Placeholder type deduction

requires(T a, U b) {
{ a + 1 } -> std::same_as<int>;
{ a == b } -> std::boolean
}
void fn(std::same_as<int> x); // invent
fn(a + 1); // check if it compiles
template<std::same_as<int> T>
void fn(T x);
35

Placeholder type deduction

requires(T a, U b) {
{ a + 1 } -> std::same_as<int>;
{ a == b } -> std::boolean
}
void fn(std::same_as<int> x); // invent
fn(a + 1); // check if it compiles
template<std::same_as<int> T>
void fn(T x);
template<typename T>
requires std::same_as<T, int>
void fn(T x);
36

All together now

template<InputIterator It, Sentinel<It> S, class T, class Proj = ranges::identity >
requires IndirectRelation<ranges::equal_to<>, projected<It, Proj>, const T*>
It find(It f, S l, const T& x, Proj proj = Proj{})
{
while ((f != l) && (ranges::invoke(proj, *f) != x)) ++f;
return f;
}
37

Questions?

38

Agenda: Irregularity continuum

  • Linear search
  • Requirements
  • Detour: partition
  • Concepts as a thinking tool
  • What is regularity?
  • What is irregularity?
  • The irregularity continuum
  • The irregularity conjecture
  • Detour: concepts in C++ 20
2
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow