• Multi-argument constructors with default parameters & explicit keyword

    Recently I was reviewing a snippet of code. It looked similar to the following.

    struct S {
        explicit S(int someParam, int otherParam = 0) {
            // Implementation...
        }
    };

    At a glance I spotted the explicit keyword and wondered whether it was needed. For one-argument constructors it obviously is, if we don't want the compiler to perform implicit conversions. But what happens when there is more than one parameter and the extra ones have default values?

    I dug into the C++ standard and found a relevant example in [N3242] § 12.3.1/1. In that example there is a constructor accepting const char* as the first argument and int as the second one. The second one has a default value of 0.

    So the rule is that a non-explicit multi-argument constructor can still be used for implicit conversions from a single argument if all but one of its parameters have default values. The explicit keyword in the snippet above is therefore not redundant.
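
    A minimal sketch of the difference, assuming two hypothetical structs (Implicit and Explicit) and two hypothetical helper functions that are not part of the reviewed code:

```cpp
#include <cassert>

// Hypothetical type: the constructor is callable with one argument, so it
// also acts as a converting constructor from int.
struct Implicit {
    Implicit(int someParam, int otherParam = 0) : sum(someParam + otherParam) {}
    int sum;
};

// Same signature, but marked explicit: no implicit conversion from int.
struct Explicit {
    explicit Explicit(int someParam, int otherParam = 0) : sum(someParam + otherParam) {}
    int sum;
};

int takeImplicit(Implicit s) { return s.sum; }
int takeExplicit(Explicit s) { return s.sum; }

inline int demo() {
    int a = takeImplicit(5);            // compiles: 5 is converted to Implicit(5, 0)
    // int b = takeExplicit(5);         // would not compile: the constructor is explicit
    int b = takeExplicit(Explicit(5));  // the conversion has to be spelled out
    return a + b;
}
```

    Uncommenting the takeExplicit(5) line should give a compilation error, which is exactly what the explicit keyword is there for.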
  • flint: Facebook's C++ lint - compilation

    Recently I read a post at Facebook Code about flint - yet another C++ linter. I thought I could run it on our project to see whether it would outperform CppCheck. It was a pain in the ass to compile this stuff. I hope this post will save you some time if you encounter similar problems.

    folly
    Folly is a C++ library used at Facebook. You can find details here. It quickly becomes clear that the compilation instructions aren't descriptive enough. If you are running Fedora 17 or Ubuntu 12.10, everything should compile as expected. Otherwise you have to pay attention to which versions of folly's dependencies you download.

    The first dependency is double-conversion. The instructions say that if you have scons, you have double-conversion. This was not true in my case, so I had to download it manually. I picked version 1.1.5, and apparently this is not the best one to pick. It doesn't compile straight off. To make it work, one must create a magic symbolic link.

    ln -s <double-conversion>/src/libdouble_conversion_pic.a <double-conversion>/src/libdouble-conversion.a

    This is not the only link needed. The second one concerns header files.

    ln -s <double-conversion>/src/ <double-conversion>/src/double-conversion

    Now the compilation should get further. The next issue on my system was related to the namespace change from google to gflags. I had downloaded the open-sourced gflags library, but folly was still referring to the google namespace. The following one-liner resolved this issue.

    find . -name "*.cpp" -print | xargs sed -i 's/google::ParseCommandLineFlags/gflags::ParseCommandLineFlags/g'

    flint
    Here you should be prepared to install a newer version of dmd, gdc, or ldc. The find-and-replace command above is needed here as well. One thing that is wrong on the flint GitHub page is a missing dot after -L.

    This way an executable gets built, but running it results in a segmentation fault. I figured out that the CXX executable works fine and gives reasonable output. The D version seems to be broken.

    If time permits I'll try to dig in and examine why. As always, you are welcome to share your ideas in comments.

    Flint output looks very clear. In our project it found several problems, mainly related to non-explicit constructors, that may be accidentally treated as conversion constructors. Also, it was able to find header files without include guards (nice one!).

    The last interesting thing that caught my eye is the following. It refers to SO ;)

    /home/szborows/****.hpp(12): Warning: Protected inheritance is sometimes not a good idea. Read http://stackoverflow.com/questions/6484306/effective-c-discouraging-protected-inheritance for more information.

    Update (10.06.2014, 01:32):
    I managed to build a valid D executable of flint by manually compiling the newest version of the dmd compiler and using it to build flint.
    There is no difference in output.

  • How it's made: C++ compilers - slides


    As I recently wrote, yesterday I gave a lecture about the internals of C++ compilers for the C++ Wrocław User Group. I'd like to share the slides with the world. You can access them here.

    Unfortunately, audio mixer settings were broken, so the screencast is not published.





  • Yet another interesting dark corner of C++

    Dear C++ experts ;>:

    Is the following code correct or is it ill-formed?

    typedef vector<int> Integers;

    Surely, it is correct. What about the following?

    vector<int> typedef Integers;

    Is it well-formed? It turns out that it is. I found this information here. I decided to dig in a little.
    It seems that defining the same type alias more than once is fine as long as every declaration names the same type. For example, the following snippet is not ill-formed:

    typedef vector<int> Integers;
    vector<int> typedef Integers;

    This looks suspicious, though. It is nearly the same as what the Standard Template Library (STL) does, e.g. in initializer_list:

    template<class _E>
    class initializer_list {
    public:
        typedef _E    value_type;
        typedef const _E&    reference;
        typedef const _E&    const_reference;
        // ...
    };

    Here we can see that initializer_list defines the reference and const_reference type aliases, but they are actually the same type. This happened earlier in std::set, where both iterator and const_iterator are constant (otherwise a programmer could interfere with std::set internals).

    The question is: in how many places in C++ can we expect similar (unfamiliar) syntactic possibilities? Please share your ideas in the comments. From my experience I can add the following.

    struct S {
        void static const meth(int const) {}
    };

    The first interesting thing is that the static keyword occurs in quite a weird place. The second thing, far more interesting, is the const void return type. There's no point in having such a type in C++. Or is there?
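
    The specifier orderings discussed in this post can be checked with a small compile-time sketch. ConstInt and countIntegers are hypothetical names added only for illustration; the rest mirrors the snippets above with std:: spelled out.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

typedef std::vector<int> Integers;
std::vector<int> typedef Integers;   // unusual order, same type: a valid redeclaration

int typedef const ConstInt;          // means the same as: typedef const int ConstInt;
static_assert(std::is_same<ConstInt, const int>::value,
              "the order of decl-specifiers does not change the type");

struct S {
    // static appears between the return type specifiers; compilers may warn
    // that the const qualifier on the void return type has no effect.
    void static const meth(int const) {}
};

inline Integers::size_type countIntegers() {
    Integers v(3);   // three value-initialized ints
    S::meth(0);      // callable like any other static member function
    return v.size();
}
```

    If the two typedef declarations named different types, the second one would be rejected as a conflicting declaration.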



  • C++11 quirk: Default c-tor & cv-qualified instances

    Recently I found an interesting C++11 quirk.

    http://jaredgrubb.blogspot.com/2013/05/c11-quirks-in-defaulted-functions.html

    According to the answers on Stack Overflow, a const instance of a class cannot be default-initialized unless the class has a user-provided default constructor. And it makes sense, as a const object cannot be changed later.

    The funny part is that when you use `= default` outside of the class body, everything is fine.

    Interestingly, the Clang compiler is again closer to the standard than gcc. It doesn't compile the following code, whereas gcc does.

    struct A { A() = default; };
    int main(int, char **) { const A a; }

    Thanks to Jonathan Wakely it turned out that Clang behavior is the proper one. The C++ standard states:

    "A special member function is user-provided if it is user-declared and not explicitly defaulted on its first declaration. A user-provided explicitly-defaulted function (i.e., explicitly defaulted after its first declaration) is defined at the point where it is explicitly defaulted; if such a function is implicitly defined as deleted, the program is ill-formed." [N3242, 8.4.2/4]

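    A minimal sketch of the rule quoted above, assuming two hypothetical structs with an int member added only to make the effect visible. It reflects the C++11 wording discussed in this post; later compilers and standard revisions may be more permissive for empty classes.

```cpp
#include <cassert>
#include <type_traits>

// Defaulted on its first declaration: the constructor is NOT user-provided.
struct A { A() = default; int value; };

// Declared first, defaulted later: the constructor IS user-provided.
struct B { B(); int value; };
B::B() = default;

static_assert(std::is_trivial<A>::value, "A keeps a trivial default constructor");
static_assert(!std::is_trivial<B>::value, "B's default constructor is user-provided");

// const A bad;       // rejected: default-initializing a const object of a class
//                    // type without a user-provided default constructor
const A a = A();      // fine: explicit initializer, value is zero-initialized
const B b;            // fine: B has a user-provided default constructor

inline int valueOfA() { return a.value; }
```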

  • How it's made: C++ compilers - Wrocław C++ User Group

    Please feel invited to my presentation on C++ compilers' internals.

    Place: University of Wrocław
    Time: 17.04.2014

    You can find more information here (PL).
  • Measuring C++ code quality - tools and techniques

    Another article of mine was published in the latest issue of the "Programista" journal. It is about tools and techniques that help measure C++ software quality. It covers many practical aspects like code coverage, cyclomatic complexity, dependencies, etc.

  • Practical aspects of software engineering

    Please feel invited to lectures that I'll conduct at Wrocław universities.

    C++ Templates, STL, and Boost
    University of Wrocław
    26.03.2014, 16:00 - 18:00

    Wrocław University of Technology
    31.03.2014, 17:00 - 19:30

    C++ Coder Dojo Workshop
    University of Wrocław
    19.05.2014, 16:00 - 18:00

    Wrocław University of Technology
    21.05.2014, 17:00 - 19:00
  • django animal captcha

    Some time ago I shared django animal captcha module on GitHub. Some of you may find it useful. Please pull-request/fork if you have ideas.

    https://github.com/szborows/django-animal-captcha

  • Wrocław C++ Users group - presentation on C++03 idioms

    It turned out that I'll be presenting idioms in C++03 this Thursday (17.10.2013). Feel free to come.
    You can find details here.

    See you!
  • Article in Polish developer journal "Programista"

    I'm glad to announce that my article "Tricks and idioms in C++" ("Sztuczki i idiomy w C++") was published in the Polish developer journal "Programista".
    The journal is available in paper and e-paper forms. Click here for a description.

  • Surprising "dark corner" in C++

    UPDATE 21.07.2013: The surprising dark corner turned out to be a compiler bug. Anyway, I think it is worth knowing how name lookup works in detail.

    Today was supposed to be a normal developer day. We found something extremely surprising regarding C++, though. At least for us. Maybe someone will eventually provide a reliable explanation on SO, as my Google-fu (actually, more duckduckgo-fu recently :-) ) seems to have worsened.

    This is an MWE of what I'm talking about:

    template <typename T> class Widget { };

    class Base
    {
    public:
        virtual void handle(Widget<float> widget) { }
        virtual void handle(Widget<int> widget) { }
    };

    class Derived : public Base
    {
    public:
        virtual void handle(Widget<int> widget) { }
    };

    int main(int, char **)
    {
        Derived * derived = new Derived();
        derived->handle(Widget<float>());
        delete derived;
    }

    The code is not complicated. We have an empty Widget template class that is used as an argument. Below it two classes are defined - Base and Derived. All methods in the base class are virtual. The Derived class does not provide an implementation for the method that accepts Widget<float> as an argument. In the main function we simply try to invoke this method.

    We expect that Base::handle(Widget<float> widget) is going to be called. However, before we could run this simple program, we got a compilation error.
    $ clang++ polym.cpp
    polym.cpp:20:21: error: no viable conversion from 'Widget<float>' to 'Widget<int>'
    derived->handle(Widget<float>());
    ^~~~~~~~~~~~~~~
    polym.cpp:1:29: note: candidate constructor (the implicit copy constructor)
    not viable: no known conversion from 'Widget<float>' to
    'const Widget<int> &' for 1st argument;
    template <typename T> class Widget { };
    ^
    polym.cpp:13:37: note: passing argument to parameter 'widget' here
    virtual void handle(Widget<int> widget) { }
    ^
    1 error generated.
    Wait, what has just happened? Did the compiler just try to perform an implicit conversion from Widget<float> to Widget<int>? Why? I'm afraid I can't give you a precise answer. This was tested on gcc-4.7.2 and clang-3.0.

    I suspect that the compiler finds other methods with similar signatures and tries to match the template parameter. Other lookup branches are dropped, so the signature is not found in the base class either. After it turns out that there is no exact match, implicit conversions are employed. Another option is that in such situations methods are shadowed, but that would be against polymorphism. Those are only my suspicions; do you have other ideas?

    The question is open. So far I'm aware of two solutions. The first one calls the method through a pointer to Base:

    int main(int, char **)
    {
        Derived * derived = new Derived();
        dynamic_cast<Base*>(derived)->handle(Widget<float>());
        delete derived;
    }

    The second one adds a using-declaration to Derived:

    class Derived : public Base
    {
    public:
        virtual void handle(Widget<int> widget) { }
        using Base::handle;
    };

    The first solution is rather straightforward. The other one is pretty curious, though.
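
    A compilable sketch of the using-declaration variant: the using-declaration brings every Base::handle overload into the scope of Derived, so all of them take part in overload resolution there. The int return values, the virtual destructor, and the callFloat/callInt helpers are assumptions added only to make the selected overload observable; the original methods return void.

```cpp
#include <cassert>

template <typename T> class Widget { };

class Base {
public:
    virtual ~Base() {}
    virtual int handle(Widget<float>) { return 1; }  // 1 = Base, float overload
    virtual int handle(Widget<int>)   { return 2; }  // 2 = Base, int overload
};

class Derived : public Base {
public:
    using Base::handle;                              // make both Base overloads visible
    virtual int handle(Widget<int>) { return 3; }    // 3 = Derived override
};

inline int callFloat(Derived& d) { return d.handle(Widget<float>()); }
inline int callInt(Derived& d)   { return d.handle(Widget<int>()); }
```

    Without the using-declaration, callFloat should fail to compile with the same "no viable conversion" error as in the MWE.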
  • Lectures at Wrocław University of Technology and University of Wrocław

    Next week I'm going to give two lectures about C++: "Templates, STL, and Boost". Don't hesitate to come if you are around.

    First lecture
    Place: University of Wrocław
    Date: 25.03.2013

    Second lecture
    Place: Wrocław University of Technology
    Date: 27.03.2013
  • Handler mechanisms: design patterns

    What really is a handler? What is the main handling mechanism? These two questions came to my mind when I started writing this post. Let me explain the meaning of these terms before I start, to avoid misunderstandings. A handler is a procedure (stored as a function, method, or functor) that performs an individual task according to given parameters. Usually there are many handlers that accept the same set of arguments and one main handling mechanism that decides which handler should handle a particular "request". The main handling mechanism is simply a procedure that stores references to all registered handlers and executes the respective one on demand. It should hide all unnecessary details from the end user, who does not care how things are really handled. In this post I'll try to depict designs that can be used to implement such a "design pattern".

    We start with C. The first thing that comes to mind is the switch statement. Let's see how the code actually looks by defining three different handler functions and one controller.
    #include <assert.h>

    int handlerA(int param) { return param + 1; }
    int handlerB(int param) { return param + 2; }
    int handlerC(int param) { return param + 3; }

    int handle(int code, int param)
    {
        switch (code)
        {
            case 0: return handlerA(param);
            case 1: return handlerB(param);
            case 2: return handlerC(param);
            default:
                assert(!"No handler registered for provided code");
                return -1; /* Reached only when assertions are disabled (NDEBUG). */
        }
    }
    Nothing really interesting happens here. When new handlers are needed, the switch statement must also be modified. On the other hand, the invocation overhead at run time is minimal (the mapping from codes to handlers is fixed at compile time). This approach may be good for small projects that have a small number of handlers.

    Moving forward we can discover another approach - array of function pointers.
    int handle(int code, int param)
    {
        static int (*handlers[])(int) = { handlerA, handlerB, handlerC };
        static const int numHandlers = sizeof(handlers) / sizeof(handlers[0]);
        /* Valid codes start at 0, so the lower bound must be inclusive. */
        assert(code >= 0 && code < numHandlers && "No handler registered for provided code");
        return handlers[code](param);
    }
    In terms of performance there is no meaningful difference between these two approaches. What we gain is fewer characters typed on the keyboard and a more concise (in my opinion) design. Note that we have to add strict bounds checking to avoid SIGSEGVs or other sorts of problems.

    We can easily redesign the function array approach to use methods. The code would be pretty similar, so I won't write it down here. However, to improve our design we can use the Boost library as shown below.
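    #include <vector>
    #include <boost/bind.hpp>
    #include <boost/function.hpp>
    #include <boost/ref.hpp>

    using boost::bind;
    using boost::function;
    using boost::ref;

    class Handlers
    {
    public:
        int HandlerA(int param) { return param + 1; }
        int HandlerB(int param) { return param + 2; }
        int HandlerC(int param) { return param + 3; }
    };

    int main(int ac, char ** av)
    {
        Handlers handlers;
        // Each entry is a member function bound to the same Handlers object.
        std::vector<function<int(int)> > fs;
        fs.push_back(bind(&Handlers::HandlerA, ref(handlers), _1));
        fs.push_back(bind(&Handlers::HandlerB, ref(handlers), _1));
        fs.push_back(bind(&Handlers::HandlerC, ref(handlers), _1));
        ...
    }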
    For a small project this approach could be overkill: the implementation requires an additional dependency. In bigger projects that already use Boost libraries (which is common), this is not a big deal.

    The boost::bind documentation says that not using boost::ref results in pass-by-value behavior. I tried that and noticed that a single bind produces eight calls to the copy constructor! Maybe my opinion could be considered premature optimization, but I advise using boost::ref anyway.

    In all previous approaches there was no possibility to add handlers at run time. With a vector of function objects we finally have it. On the other hand, performance decreases a little bit.
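
    A minimal sketch of run-time registration, under a few assumptions: it swaps boost::function for C++11 std::function to stay self-contained, it keys handlers by code in a std::map instead of a vector, and the Dispatcher class with its method names is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>

// Hypothetical dispatcher: handlers can be added or replaced at any time.
class Dispatcher {
public:
    void registerHandler(int code, std::function<int(int)> handler) {
        handlers_[code] = handler;
    }

    int handle(int code, int param) const {
        auto it = handlers_.find(code);
        if (it == handlers_.end())
            throw std::out_of_range("No handler registered for provided code");
        return it->second(param);
    }

private:
    std::map<int, std::function<int(int)> > handlers_;
};

inline int handlerA(int param) { return param + 1; }
```

    Usage could look like dispatcher.registerHandler(0, handlerA) followed by dispatcher.handle(0, 41). Free functions, lambdas, and bound member functions can all be registered, because std::function erases the concrete callable type.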

    Okay, I have just mentioned that new handlers can be "registered" at run time. However, it's hard (at least for me) to imagine in what kind of situation this could be helpful. Maybe you have some ideas? In the meantime let's concentrate on more practical approaches.

    Imagine a set of handlers that register themselves automatically within the main handling class. Adding a new handler doesn't require any changes in the main handling mechanism. Moreover, there is no central point where all handlers are defined, so we achieve a "distributed" system. Usually when we solve this kind of problem we end up with something similar to the code shown below.
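    #include <map>
    #include <utility>

    using namespace std;

    class IHandler
    {
    public:
        virtual ~IHandler() { }
        virtual void Handle(void * data) = 0;
    };

    class HandlerA : public IHandler
    {
    public:
        virtual void Handle(void * data)
        {
            // Handle data here...
        }
    };

    class MainHandler
    {
    public:
        void Handle(int code, void * data)
        {
            // operator[] would insert a null pointer for an unknown code,
            // so look the handler up explicitly.
            map<int, IHandler *>::iterator it = _handlers.find(code);
            if (it != _handlers.end())
                it->second->Handle(data);
        }

        void RegisterHandler(int code, IHandler * handler)
        {
            // Let make_pair deduce its types; explicit template arguments
            // do not compile with lvalue arguments in C++11.
            _handlers.insert(make_pair(code, handler));
        }

    private:
        map<int, IHandler *> _handlers;
    };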
    This approach forces the developer to add, somewhere, code that registers his class with the central point. How can it be done differently? Take a look at the following snippet.
    #include <cstddef>
    #include <map>
    #include <utility>

    using namespace std;

    class IHandler
    {
    public:
        virtual ~IHandler() { }
        virtual void Handle(void * data) = 0;
    };

    class MainHandler
    {
    public:
        static void Handle(int code, void * data)
        {
            // Look the handler up explicitly; operator[] would insert
            // a null pointer for an unknown code.
            map<int, IHandler *>::iterator it = _handlers.find(code);
            if (it != _handlers.end())
                it->second->Handle(data);
        }

        static void RegisterHandler(int code, IHandler * handler)
        {
            _handlers.insert(make_pair(code, handler));
        }

    private:
        static map<int, IHandler *> _handlers;
    };
    map<int, IHandler *> MainHandler::_handlers;

    template <typename T>
    class AutoRegister
    {
    public:
        AutoRegister()
        {
            MainHandler::RegisterHandler(T::HANDLED_CODE, new T());
        }
        void fun() const { return; }
        ~AutoRegister()
        {
            // OPT: Unregister here...
        }
    };

    class HandlerA : public IHandler
    {
    public:
        virtual void Handle(void * data)
        {
            // Handle data here...
        }

        static int HANDLED_CODE;

    private:
        static AutoRegister<HandlerA> _autoRegisterer;
    };
    int HandlerA::HANDLED_CODE = 1;
    // Constructing this static member registers HandlerA before main() starts.
    AutoRegister<HandlerA> HandlerA::_autoRegisterer;

    // Example usage:
    int main(int ac, char ** av)
    {
        MainHandler::Handle(1, NULL);
    }
    As you can see, there is no explicit registration call inside the handler class itself. Instead, we put two additional static fields into the class, and they automatically register the class in which they are defined with the central handling mechanism. In other words, the registration code is unified and a developer only adds a few lines in order to define a new handler. We could even employ the CRTP pattern here, so that defining a new handler is as simple as deriving from another type, but I'd prefer to stop before things become uncontrollable. We must keep in mind that static initialization as shown above has its own problems. Arseny Kapoulkine has written a great post about the dangers that come with this approach, "Death by static initialization", and I advise reading it before implementing anything.
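
    One common way to soften the initialization-order part of that problem is to keep the registry in a function-local static, so it is constructed on first use. The sketch below is only an illustration of that idea, not the approach from the linked post. The Find method, the int-based Handle signature, and the RegisterHandlerA helper are hypothetical.

```cpp
#include <cassert>
#include <map>

class IHandler {
public:
    virtual ~IHandler() {}
    virtual int Handle(int param) = 0;
};

class MainHandler {
public:
    static void RegisterHandler(int code, IHandler* handler) {
        handlers()[code] = handler;
    }

    // Returns nullptr when no handler is registered for the code.
    static IHandler* Find(int code) {
        auto it = handlers().find(code);
        return it == handlers().end() ? nullptr : it->second;
    }

private:
    // Constructed on first call, so it exists even if a registrar in another
    // translation unit runs before any other static of this class.
    static std::map<int, IHandler*>& handlers() {
        static std::map<int, IHandler*> instance;
        return instance;
    }
};

class HandlerA : public IHandler {
public:
    virtual int Handle(int param) { return param + 1; }
};

namespace {
// Hypothetical registrar object: its constructor runs before main().
struct RegisterHandlerA {
    RegisterHandlerA() {
        static HandlerA instance;
        MainHandler::RegisterHandler(1, &instance);
    }
} registerHandlerA;
}
```

    This does not remove every risk (destruction order and handlers used during start-up still need care), but the registry itself can no longer be used before it is constructed.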

    There is a variety of design patterns related to handling mechanisms. Choosing the right one strongly depends on multiple conditions. Switch- and array-based solutions have the advantage of being thread-safe and relatively fast. When handlers are supposed to be added at run time, we can pick an OO-based solution with e.g. a vector. For convenience some auto-registering functionality may be used, but it should be used with extreme care.

    At the end of this post I'd like to compare the C++ solutions with one Python implementation, just to see a dynamically-typed language in action. Take a look at the following three code snippets: the handler interface, an example handler, and the main handling mechanism, respectively.
    import abc
    import inspect
    import sys

    class BaseHandler(object):
        __metaclass__ = abc.ABCMeta

        @abc.abstractmethod
        def Handle(self, data): pass

        @abc.abstractproperty
        def code(self): pass

    class ConcreteHandler(BaseHandler):
        code = 1
        def Handle(self, data):
            pass # Handle here...

    class MainHandler(object):
        def __init__(self):
            # Only classes are inspected; issubclass() raises TypeError for other members.
            self.__handlers = dict([
                (x[1].code, x[1]())
                for x in inspect.getmembers(sys.modules[__name__], inspect.isclass)
                if (issubclass(x[1], BaseHandler) and x[1] != BaseHandler)
            ])

        def Handle(self, code, data):
            try:
                return self.__handlers[code].Handle(data)
            except KeyError:
                raise RuntimeError('No handler registered for code ' + str(code))

    That magic is, however, a natural consequence of a dynamically-typed language such as Python, which provides, in my opinion, a great balance of power and readiness :-).

    A natural continuation of this post would be a review of the possibilities in C++11. Do you have any remarks in this area? I'm curious how much it can help here. Maybe it can't improve the design very much, but I wouldn't be surprised by some killer C++11 feature either. Share your ideas in the comments!
  • A perfect continuous integration system

    One of the fundamental concepts of agile software development is Continuous Integration (CI). You can read more about it here, for instance. In this post I'll try to define the elementary things that, in my humble opinion, could form the foundation of a perfect continuous integration environment. Your comments on my concept will also be appreciated.

    For this post I assume a GNU environment and statically-compiled languages like C++. However, such a recipe can easily be adapted to other compilers, languages, and environments.

    1. CI server
    First things first. In order to put the concept of CI into practice we need some kind of automation software. There is a fairly large amount of such software, so we can pick one that fits our needs. The most popular is Jenkins (a fork of Hudson), as it is free and covers a lot of functionality. It is also easily extensible, thanks to over 400 plugins. On the other side we have commercial applications like, for instance, Bamboo by Atlassian. I can't compare these two, but I assume that such a comparison would be similar to comparing MS Word with OpenOffice.org. For a full list of well-known CI servers refer to this list.

    2. Unit & Module tests
    When we have our CI server running, we are able to start defining the jobs it will take care of. I do not mention the basic "build" job, as it is obvious. Everyone knows that good code should be tested as much as possible. We can distinguish between standard Unit Tests (UT), which are likely to be written by developers, and Module Tests (MT), which are preferably written by software testers. Personally I prefer to have one test runner to evaluate all UT and MT tests. This runner should output a report in a form that is consumable by the CI server.

    3. Lint for code and documentation
    Before development on a specific project starts, it is a good idea to establish some kind of coding convention and have a tool that checks whether the software being written follows the rules specified in that document. This tool can be combined with standard lint tools like, for instance, this one, so hard-to-find errors are reported in the early stages of software development. Examples of such errors are: an uninitialized variable, a missing assignment operator while a copy constructor is defined, and so on.
    We should also pick a documentation system like, for example, doxygen, and a tool that checks whether the documentation is well written. People tend to ignore that, which is sad, because undocumented code is kind of useless for future maintainers and developers.

    4. Coverage and profiling
    When we have UT and MT integrated into our automation process, we will likely want to know how much code is really tested and what the overall performance of this code is. For the former we have tools like gcov that help us determine the final test coverage in percent. For the latter there is gprof, whose output can be converted into a form that the CI server can handle, for example with gprof2dot.py. This simple way we are able to see call graphs directly on our CI site and find bottlenecks in the software we are developing.

    5. Memory leaks hunter
    Another thing that may be useful is tracing memory leaks (in the case of statically compiled languages). We can use a tool like valgrind to perform such checks. By using dedicated plugins for our CI server we are able to see the overall trend of memory leaked by the software at run time and figure out where those leaks come from.

    6. Cyclomatic complexity
    Cyclomatic complexity helps us determine how complex the software under development actually is. You can find more information about it here. I find this kind of tool very useful, and therefore it is included in this list.
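
    As a small worked example, for a function with a single entry and a single exit the cyclomatic complexity can be counted as the number of decision points plus one. The function below is hypothetical and exists only to show the counting; real tools may differ in details, for instance in how they treat && and || operators.

```cpp
#include <cassert>

// Decision points: for (1), if (2), else if (3).
// Cyclomatic complexity = 3 decision points + 1 = 4.
int countSigns(const int* values, int n) {
    int balance = 0;
    for (int i = 0; i < n; ++i) {   // decision point 1
        if (values[i] > 0)          // decision point 2
            ++balance;
        else if (values[i] < 0)     // decision point 3
            --balance;
    }
    return balance;
}
```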

    That's it! If you think that something crucial is missing here, do not hesitate to share your opinion in the comments!