Short report from code::dive 2015

Author: Wojciech Muła
Added on: 2015-11-15

A few days ago I attended code::dive 2015, an IT conference in Wrocław, Poland. It was a one-day conference with a great number of presentations. There were four stages and five sessions, 20 talks in total. An impressive number! But an attendee had to choose his own path of just five lectures. I think the decision was pretty difficult. Sometimes less is better.

The conference was organized by Nokia Poland for the second time, and these guys did the job very well. The location was perfect. The event was held in a big cinema near the city center. There was free food and water. You only had to register online.

I decided to watch the following presentations:

I'm not going to describe the presentations here in detail, because videos should eventually appear on the official YouTube channel. I'll just point out the things that interested me.

Writing Fast Code

Andrei performed a nice show, however some parts were... hm, confusing. For example, he claimed that the code x = x/10 emits a division instruction. This is not true: every compiler in so-called "release mode" will emit a multiplication by the reciprocal of the constant. Check it for yourself. Another big misunderstanding of the speaker was the cause of slow write operations on modern hardware. He claimed that after a write request the CPU loads a cache line, then modifies its contents according to the request, and finally writes the cache line back to memory. No, it simply doesn't work like this. The slowdown is caused mostly by the multicore architecture and the synchronization required among the cache subsystems. But this is not very important; I think people should remember a simple fact: fewer writes mean faster programs.
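The division-by-a-constant transformation is easy to see. Below is a minimal sketch of what it amounts to for unsigned 32-bit division by 10; the magic constant 0xCCCCCCCD is ceil(2^35 / 10), and the exact instruction sequence a particular compiler emits may of course differ:

    #include <cstdint>

    // Division by the constant 10, as written in the source code.
    uint32_t div10(uint32_t x) {
        return x / 10;
    }

    // Roughly what an optimizing compiler emits instead: a multiplication by a
    // precomputed "magic" reciprocal followed by a shift, no division at all.
    // 0xCCCCCCCD is ceil(2^35 / 10); shifting the 64-bit product right by 35
    // bits recovers the quotient for every 32-bit x.
    uint32_t div10_by_reciprocal(uint32_t x) {
        return static_cast<uint32_t>((static_cast<uint64_t>(x) * 0xCCCCCCCDull) >> 35);
    }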

Update 2015-11-22: my friend Piotr reminded me of another funny moment, a speculation on why floating-point division is faster than integer division. Andrei claimed that the reason is... the exponential format of floats: it's just a subtraction of exponents, a division of mantissas and voilà. I'm pretty sure that's not the real reason.

Andrei showed how he had optimized a procedure converting a number to its ASCII representation. He used a few tricks, and one of them is worth mentioning. He minimized the number of "real" conversions by introducing a specialized path for smaller values. Do the same in your programs: analyze your data and use an if statement to select a fast path. It usually works. Andrei gained a 3-5x speedup without much effort.
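To make the idea concrete, here is a minimal sketch of such a fast path (my illustration, not Andrei's actual code): single-digit values, assumed to be the common case, skip the generic digit loop entirely:

    #include <algorithm>
    #include <string>

    // A sketch of the "fast path for small values" trick.
    std::string to_ascii(unsigned value) {
        if (value < 10) {
            // fast path: a single digit, no loop, no reversal
            return std::string(1, char('0' + value));
        }

        // generic path: emit digits least-significant first, then reverse
        std::string out;
        while (value > 0) {
            out.push_back(char('0' + value % 10));
            value /= 10;
        }
        std::reverse(out.begin(), out.end());
        return out;
    }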

From the perspective of a programmer who has never worked on code optimization, Andrei's advice was very valuable. For example: never measure the timing of a debug build. Compare your program with good, standard and proven existing solutions. An optimization of one module could have a negative impact on the whole application. When you measure time, run the test many times and take the minimum measurement. Pretty obvious, but precious for newbies (the conference was full of students from local universities).
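The last piece of advice translates directly into code. A minimal "best of N runs" measurement sketch; the repetition count and the measured loop below are arbitrary placeholders:

    #include <algorithm>
    #include <chrono>
    #include <cstdio>

    // Run the callable several times and report the minimum wall-clock time.
    template <typename Fn>
    double measure_min_seconds(Fn fn, int repetitions = 10) {
        using clock = std::chrono::steady_clock;
        double best = 1e300;
        for (int i = 0; i < repetitions; i++) {
            const auto t0 = clock::now();
            fn();
            const auto t1 = clock::now();
            best = std::min(best, std::chrono::duration<double>(t1 - t0).count());
        }
        return best;
    }

    int main() {
        volatile unsigned sink = 0;
        const double t = measure_min_seconds([&] {
            for (unsigned i = 0; i < 1000000; i++) sink += i;
        });
        std::printf("best of 10 runs: %.6f s\n", t);
    }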

Constrain yourself

Andrzej's presentation was about building restricted data types that leave no room for misuse. The starting point of his talk was the observation that standard C++ data types, like int or char, are a kind of void pointer, i.e. they are transient, untyped entities. It's very true. Then he showed three real-life examples.

The first example was a type representing the number of minutes since midnight. Basically, it was a wrapped integer with a tight API. Surprisingly, the amount of code required to do this isn't very large. The constraints are checked at run time during instance construction, and this is the only disadvantage, because everything else is checked at compile time.
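A minimal sketch of what such a wrapped integer might look like; this is my illustration of the idea, not Andrzej's exact code:

    #include <stdexcept>

    // An integer wrapped in a dedicated type with a constrained,
    // intention-revealing API. A plain int would accept any value and mix
    // freely with unrelated integers; this type does not.
    class MinutesSinceMidnight {
    public:
        explicit MinutesSinceMidnight(int value) : value_(value) {
            // the only run-time check: the range constraint
            if (value < 0 || value >= 24 * 60)
                throw std::out_of_range("minutes since midnight must be in [0, 1440)");
        }

        int hours()   const { return value_ / 60; }
        int minutes() const { return value_ % 60; }
        int raw()     const { return value_; }

    private:
        int value_;
    };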

The second example was a short string implementation, where the maximum length of the string is given as a template parameter. First of all, the class conserves memory and avoids dynamic allocation; it also exposes a very basic set of methods compared to std::string. However, because of how the C++ type system works, to distinguish strings of the same length but different purposes a programmer has to use a fake traits class. I really dislike this approach. It's ugly, but it's the only solution.
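The fake-traits trick can be sketched like this; the class below is only an illustration of the idea, not the code from the talk:

    #include <array>
    #include <cstring>
    #include <stdexcept>

    // A fixed-capacity string with the maximum length as a template parameter
    // and a tag type used only to tell apart strings of different purposes.
    template <std::size_t MaxLen, typename Tag>
    class ShortString {
    public:
        ShortString(const char* text) : size_(std::strlen(text)) {
            if (size_ > MaxLen)
                throw std::length_error("string too long");
            std::memcpy(data_.data(), text, size_);
        }

        const char* data() const { return data_.data(); }
        std::size_t size() const { return size_; }

    private:
        std::array<char, MaxLen> data_{};
        std::size_t size_;
    };

    // The "fake traits" classes are empty tags: both aliases below hold at
    // most 16 characters, yet they cannot be mixed up in an interface.
    struct FirstNameTag {};
    struct CityTag {};

    using FirstName = ShortString<16, FirstNameTag>;
    using City      = ShortString<16, CityTag>;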

The last example in Andrzej's talk was another class, one hiding from the user the magic values meant to express "no value" or "not valid". The speaker compared this solution with boost::optional. The Boost template class can wrap virtually any type, but at the cost of additional memory. The magic-value approach has no extra memory footprint, however it is limited mostly to numeric types.
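For illustration, a sketch of the magic-value approach, with the assumption (mine, not from the talk) that the wrapped value is a non-negative index and -1 is reserved to mean "no value":

    #include <cstdint>
    #include <stdexcept>

    // Unlike boost::optional<int32_t>, no extra flag is stored: the "empty"
    // state is encoded in a value the public API never allows.
    class MaybeIndex {
    public:
        MaybeIndex() : value_(kEmpty) {}
        explicit MaybeIndex(int32_t value) : value_(value) {
            if (value < 0)
                throw std::invalid_argument("index must be non-negative");
        }

        bool has_value() const { return value_ != kEmpty; }

        int32_t value() const {
            if (!has_value())
                throw std::logic_error("no value");
            return value_;
        }

    private:
        static const int32_t kEmpty = -1;  // the hidden magic value
        int32_t value_;
    };

    static_assert(sizeof(MaybeIndex) == sizeof(int32_t),
                  "no memory overhead compared to a bare int32_t");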

The presentation reminded me of two sad facts:

Intelligent application configuration data management

Urlich showed a bit of the history of configuration files and the problems related to them. There are three major issues: the number of different configuration files (a global one in /etc, user dot-files and so on), versioning (and changing the values of a running application), and the last one: how to obtain the current settings used by a program. Urlich also mentioned that some aspects of a program can be controlled through environment variables, however he doesn't recommend it.

The first cure was to embed a Lua interpreter in our C++ program, as the interpreter is very small and the language is pretty powerful. Lua scripts give more flexibility and are easy to extend on the fly. Indeed a nice approach, but in my opinion it just moves the problems outside the C++ program.
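Embedding Lua for configuration takes surprisingly little code. A minimal sketch (my own, not from the talk); the file name "config.lua" and the settings "port" and "log_level" are made-up examples:

    // build against the Lua library, e.g.: g++ config.cpp -llua
    extern "C" {
    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>
    }
    #include <cstdio>

    int main() {
        lua_State* L = luaL_newstate();
        luaL_openlibs(L);

        // config.lua might contain:  port = 8080  log_level = "debug"
        if (luaL_dofile(L, "config.lua") != 0) {
            std::fprintf(stderr, "config error: %s\n", lua_tostring(L, -1));
            lua_close(L);
            return 1;
        }

        lua_getglobal(L, "port");
        lua_getglobal(L, "log_level");
        const int port = static_cast<int>(lua_tointeger(L, -2));
        const char* level = lua_tostring(L, -1);

        std::printf("port=%d log_level=%s\n", port, level ? level : "(unset)");
        lua_close(L);
        return 0;
    }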

The problem of obtaining the current configuration was solved in a really clever way. Urlich proposed to expose all parameters as a virtual file system, like procfs. He used the well-known FUSE, which works and doesn't require too much code. Thanks to the file system, an application can react instantly to a parameter write, and versioning is also possible. Moreover, an administrator can capture all parameters using standard tools.
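To give a feel for the approach, here is a minimal read-only sketch using the high-level libfuse 2.x API (my illustration, not Urlich's code); it exposes a single made-up parameter, "port", as a file. After mounting, a plain cat of that file prints the current value:

    #define FUSE_USE_VERSION 26
    #include <fuse.h>
    #include <sys/stat.h>
    #include <cerrno>
    #include <cstring>

    // the single exposed parameter and its current value
    static const char* param_name  = "/port";
    static const char* param_value = "8080\n";

    static int cfg_getattr(const char* path, struct stat* st) {
        std::memset(st, 0, sizeof(*st));
        if (std::strcmp(path, "/") == 0) {
            st->st_mode  = S_IFDIR | 0755;
            st->st_nlink = 2;
            return 0;
        }
        if (std::strcmp(path, param_name) == 0) {
            st->st_mode  = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size  = std::strlen(param_value);
            return 0;
        }
        return -ENOENT;
    }

    static int cfg_readdir(const char* path, void* buf, fuse_fill_dir_t filler,
                           off_t, struct fuse_file_info*) {
        if (std::strcmp(path, "/") != 0)
            return -ENOENT;
        filler(buf, ".", NULL, 0);
        filler(buf, "..", NULL, 0);
        filler(buf, param_name + 1, NULL, 0);  // skip the leading '/'
        return 0;
    }

    static int cfg_read(const char* path, char* buf, size_t size, off_t offset,
                        struct fuse_file_info*) {
        if (std::strcmp(path, param_name) != 0)
            return -ENOENT;
        const size_t len = std::strlen(param_value);
        if (static_cast<size_t>(offset) >= len)
            return 0;
        if (offset + size > len)
            size = len - offset;
        std::memcpy(buf, param_value + offset, size);
        return static_cast<int>(size);
    }

    int main(int argc, char* argv[]) {
        struct fuse_operations ops = {};
        ops.getattr = cfg_getattr;
        ops.readdir = cfg_readdir;
        ops.read    = cfg_read;
        return fuse_main(argc, argv, &ops, NULL);
    }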

The talk was interesting. However, the presented solution was strictly Linux-specific and IMHO too complex for simple/average applications. Maybe for large, complex applications dealing with a significant number of parameters this could be a good method.

I didn't like Urlich's slides; a wall of C++ code doesn't look good.

C++ vs. C: the embedded perspective

A slightly provocative title. But first of all I have to say that Bartosz was speaking fast. Really fast. Really, really fast. He could be an auctioneer.

The speaker compared the sizes of executables for different architectures (ARM, ATmega and something else) produced by C and C++ compilers. For me it was a bit surprising that the C++ code was smaller in most cases. The C++ code used templates in place of C macros, and thanks to that the compiler was able to optimize better. Simple CPUs have a simple architecture, and the number of instructions strictly determines the execution time. Thus a smaller executable means fewer instructions and, in fact, a faster program.
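The template-instead-of-macro point can be illustrated with a tiny sketch (mine, not from the talk): the macro is a textual substitution the compiler cannot reason about as a unit, while the template is a type-checked function the optimizer sees whole and inlines:

    // C style: a textual macro -- arguments may be evaluated twice and there
    // is no dedicated function for the compiler to analyze.
    #define CLAMP_MACRO(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

    // C++ style: a small template -- type-checked, each argument evaluated
    // once, inlined and fully visible to the optimizer.
    template <typename T>
    T clamp_value(T x, T lo, T hi) {
        return x < lo ? lo : (x > hi ? hi : x);
    }

    int main() {
        const int raw = 1234;
        const int a = CLAMP_MACRO(raw, 0, 1023);
        const int b = clamp_value(raw, 0, 1023);
        return a == b ? 0 : 1;
    }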

Of course, a C++ program on an embedded system would probably never use RTTI or exceptions, but at least a programmer can use classes and templates, which helps a lot.

Representing Memory-Mapped Devices as Objects

Another embedded-related presentation. Dan showed how to use classes to hide obscure hardware details, in this case serial port settings. He started from a C-like solution (really ugly and error-prone) and finished with a nice encapsulation in a class. However, overloading the new operator to initialize hardware is the most bizarre thing I've ever seen (and I've seen many weird things, believe me).
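The core of the technique can be sketched as follows; the register layout and the base address are placeholders of mine, not taken from Dan's talk:

    #include <cstdint>

    // A class whose data members are overlaid on a device's registers,
    // in hardware order; the methods hide the bit-twiddling details.
    class Uart {
    public:
        void enable()                { control |= 0x01; }
        void set_divisor(uint32_t d) { baud_divisor = d; }
        bool can_transmit() const    { return (status & 0x20) != 0; }
        void put(uint8_t byte)       { data = byte; }

    private:
        volatile uint32_t control;
        volatile uint32_t status;
        volatile uint32_t baud_divisor;
        volatile uint32_t data;
    };

    // The object is never constructed; it is mapped onto the device's
    // address (0x40001000 is a placeholder, not a real device).
    inline Uart& uart0() {
        return *reinterpret_cast<Uart*>(0x40001000);
    }

    void send(uint8_t byte) {
        Uart& u = uart0();
        while (!u.can_transmit()) {
            // busy-wait until the transmit register is free
        }
        u.put(byte);
    }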