Thursday, March 5, 2015

Some notes on shared_ptr atomicity and sharing configuration state

At PowerDNS, we've frequently run into this problem: a program has a complicated amount of state and configuration which determines how queries are processed, which happens non-stop. Meanwhile, occasionally we need to change this configuration, while everything is running.

The naive solution to this problem is to have a state which we access under a read/write lock. The state can in that case only be changed if no thread holds a read lock on it. This has at least two downsides. For one, locks aren't free. Even if they don't involve system calls, atomic operations cause inter-CPU communications and cache evictions. Secondly, if the worker threads hog the read lock (which they may need to do for consistency purposes), we can't guarantee that updates happen in a reasonable timeframe.

Effectively this means that a change in configuration might take a very long time, while we incur overhead every time we access the configuration, even if it isn't changing.

A very, very tempting solution is to keep the configuration in a shared_ptr, and to have threads access the configuration through this shared_ptr. This would give us unlocked access to a consistent configuration. And, if we read the C++ 2011 standard, it looks like this could work. It talks about how std::shared_ptr is thread safe under various scenarios. Simultaneously, the standard defines atomic update functions (the atomic_load etc family), which are sadly unimplemented in many modern compilers. This is a hint.

So here is what one would hope would work:
if(!g_config->acl.check(clientIP)) dropPacket();
Where the global g_config would be something like shared_ptr<Config>. If the user updates the ACL, we would do this to propagate it:
auto newConfig = make_shared<Config>(*g_config);
newConfig->acl=newACL;
g_config=newConfig;
And we would fervently hope that the last statement was atomic in nature, so that a user of g_config either gets the old copy, or the new copy, but never anything else. And this would be right in at least 999999 out of 1 million cases. And in that other case we crash. I know because I wrote a testcase for it this afternoon.

It turns out that internally, a shared_ptr consists of a pointer to the reference counts and a pointer to the actual object. And sadly, when we assign to a shared_ptr, these two get assigned separately, sequentially. A user of g_config above might thus end up with a shared_ptr in an inconsistent state.

By tweaking things a little bit, for example by utilizing swap(), you can increase the success rate of this mode of coding to the point where it almost never fails. You could fool yourself into thinking you solved the problem. Over at PowerDNS we thought that too, but then suddenly CPUs and compilers change, and it starts breaking again, leading to hard to debug crashes.

So, to summarise, whatever the C++ 2011 standard may or may not say about shared_ptr, as it stands in 2015, you can't atomically change a shared_ptr instance while someone tries to use it.

And of course we could add an RW-lock to our every use of g_config, but that would get us back to where we started, with heavy locking on everything we do.

Now, in general this problem (infrequent updates, non-stop access) is very well known, as is the solution: Read Copy Update. I'm not a big fan of software patents (to say the least), but I'll lovingly make an exception for RCU. IBM released the patent for use in GPL-licensed software, and unlike most patents, this one doesn't only prohibit other people from doing things, RCU also tells you exactly how to do it well. And RCU is sufficiently non-obvious that you actually need that help to do it well.

Now, the full glory of RCU may be a bit much, but it turns out we can very easily get most of its benefits:

  • Lock the g_config shared_ptr before changing it (this can be a simple mutex, not even an RW one, although it helps) 
  • Have the threads make a copy of this g_config ten times per second, fully locked. 
  • The threads actually only access this (private) copy
This means that if the configuration is changed, the operational threads will continue with the old configuration for at most 0.1 second. It also means that no matter how staggering the overhead of a lock is, we incur it only ten times per second. Furthermore, since the lock is only held very briefly for a copy, the updates will also happen very quickly.
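
The three steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PowerDNS code: Config, updateACL() and LocalConfig are made-up names, the ACL is reduced to a string, and "making a copy" is taken to mean copying the shared_ptr under the lock, which is enough as long as a published Config is never modified again.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical configuration type; a real one would hold ACLs and much more.
struct Config {
  std::string acl;
};

// The global configuration is only ever touched with g_configLock held.
static std::mutex g_configLock;
static std::shared_ptr<const Config> g_config = std::make_shared<Config>();

// Writer side: copy the current Config, change the copy, publish it under the lock.
void updateACL(const std::string& newACL)
{
  std::lock_guard<std::mutex> guard(g_configLock);
  auto newConfig = std::make_shared<Config>(*g_config);
  newConfig->acl = newACL;
  g_config = newConfig;
}

// Reader side: every worker thread owns one LocalConfig and never shares it.
class LocalConfig
{
public:
  explicit LocalConfig(std::chrono::milliseconds maxAge = std::chrono::milliseconds(100))
    : d_maxAge(maxAge)
  {
    refresh();
  }

  // Returns the private copy, re-fetching the global one once it is older than maxAge.
  const Config& get()
  {
    if (std::chrono::steady_clock::now() - d_fetched >= d_maxAge)
      refresh();
    assert(d_local != nullptr);
    return *d_local;
  }

private:
  // The only place where a reader takes the lock: one shared_ptr copy.
  void refresh()
  {
    std::lock_guard<std::mutex> guard(g_configLock);
    d_local = g_config;
    d_fetched = std::chrono::steady_clock::now();
  }

  std::shared_ptr<const Config> d_local;
  std::chrono::steady_clock::time_point d_fetched;
  std::chrono::milliseconds d_maxAge;
};
```

A worker would then write something like if(!localConfig.get().acl...) in its hot path. Because each thread holds its own shared_ptr, an old Config stays alive until the last thread has moved on to the new one.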

In this way, we don't rely on unimplemented atomic shared_ptr functions, but we do get all the benefits of almost completely unlocked operations. 

UPDATE: Many people have pointed out that instead of refreshing "10 times per second", one could do the update only if an atomic "generational" global counter no longer matches the local one. But some potential synchronisation issues linger in that case (you might miss a second very rapid change, for example). So while interesting, we do lose simplicity in this case.
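
To make the generational idea concrete, here is one possible sketch. It is an assumption about how such a scheme could look, not the code referred to in these updates: Config, publishConfig() and GenerationalConfig are hypothetical names. The counter is bumped while the lock is held, and the reader re-reads it under the same lock, which is one way to keep the pointer and the generation it belongs to consistent even if two changes follow each other rapidly.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical configuration type, reduced to a single field.
struct Config {
  std::string acl;
};

static std::mutex g_configLock;
static std::shared_ptr<const Config> g_config = std::make_shared<Config>();
static std::atomic<unsigned int> g_generation{0};

// Writer side: swap in the new configuration and bump the generation,
// both while holding the lock.
void publishConfig(std::shared_ptr<const Config> newConfig)
{
  assert(newConfig != nullptr);
  std::lock_guard<std::mutex> guard(g_configLock);
  g_config = std::move(newConfig);
  ++g_generation;
}

// Reader side: one instance per worker thread.
class GenerationalConfig
{
public:
  GenerationalConfig() { refresh(); }

  // Cheap path: a single atomic load. The lock is only taken after a change.
  const Config& get()
  {
    if (g_generation.load() != d_generation)
      refresh();
    return *d_local;
  }

private:
  void refresh()
  {
    std::lock_guard<std::mutex> guard(g_configLock);
    d_local = g_config;
    d_generation = g_generation.load();  // read under the lock, so it matches d_local
  }

  std::shared_ptr<const Config> d_local;
  unsigned int d_generation = 0;
};
```

Compared to the timed refresh, this still costs an atomic load on every access, which is the kind of inter-CPU traffic the timed variant avoids almost entirely.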

UPDATE has the code for this idea

Summarising: don't attempt to rely on potential shared_ptr atomic update behaviour. Instead, copy it infrequently: frequently enough that changes in configuration propagate swiftly, but not so frequently that the locking overhead matters.

Enjoy! And if you know about the implementation plans and status of the atomic_load etc family of functions for shared_ptr in the various popular compilers, please let me know!

UPDATE: Maik Zumstrull found this thread about the atomic shared_ptr operations in gcc.

Friday, February 13, 2015

Some notes on sendmsg()

This post is mostly so other people can save themselves the two days of pain PowerDNS just went through.

When sending or receiving datagrams with metadata, POSIX offers us sendmsg() and recvmsg(). These are complicated calls, but they can do quite magical things. They are the "where does this go" part of the socket API. And a lot went there.

For example, when you bind to 0.0.0.0 or ::, you receive datagrams sent to any address on the port you bound to. But if you reply, you need to know the right source address, because otherwise you might send back a response from a different IP address than the one that received the question. recvmsg() and sendmsg() can make this happen for you. We documented how this works here.

So, we learned two important things over the past few days.

Requesting timestamps
To request timestamp information, which is great when you want to plot the *actual* latency of the service you are providing, one uses setsockopt() to set the SO_TIMESTAMP option. This instructs the kernel to deliver packets with a timestamp describing when the packet hit the system. You get this timestamp via recvmsg() by going through the 'control messages' that came with the datagram.

On Linux, the type of the control message that delivers the timestamp is equal to SO_TIMESTAMP, just like the option we passed to setsockopt(). However, this is a lucky accident. The actual type of the message is SCM_TIMESTAMP.  And it only happens to be the case that SO_TIMESTAMP==SCM_TIMESTAMP on Linux. This is not the case on FreeBSD.

So: to retrieve the timestamp, select the message with type SCM_TIMESTAMP. If you select for SO_TIMESTAMP, you will get no timestamps on FreeBSD.
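
As an illustration of the advice above, the following sketch enables timestamps and then walks the control messages looking for SCM_TIMESTAMP. The function names enableTimestamps() and getPacketTimestamp() are made up for this example; the socket calls and CMSG_* macros are the standard ones, and the sketch assumes the classic struct timeval payload.

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <cassert>
#include <cstring>

// Asks the kernel to attach a receive timestamp to every datagram on fd.
// Note that the *option* is called SO_TIMESTAMP.
bool enableTimestamps(int fd)
{
  int on = 1;
  return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(on)) == 0;
}

// Scans the control messages of a msghdr filled in by recvmsg() and copies
// out the timestamp, if there is one. Returns true when one was found.
// Note that the *message type* is SCM_TIMESTAMP, not SO_TIMESTAMP.
bool getPacketTimestamp(msghdr* msgh, timeval* tv)
{
  assert(msgh != nullptr && tv != nullptr);
  for (cmsghdr* cmsg = CMSG_FIRSTHDR(msgh); cmsg != nullptr; cmsg = CMSG_NXTHDR(msgh, cmsg)) {
    if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_TIMESTAMP &&
        cmsg->cmsg_len >= CMSG_LEN(sizeof(*tv))) {
      std::memcpy(tv, CMSG_DATA(cmsg), sizeof(*tv));
      return true;
    }
  }
  return false;
}
```

The caller is expected to hand recvmsg() a control buffer of at least CMSG_SPACE(sizeof(timeval)) bytes, otherwise the kernel has nowhere to put the timestamp.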

Datagrams without control messages
Secondly, sendmsg() is not a very well specified system call. Even though RFC 2292 was written by that master of documentation Richard Stevens, it does not tell us all the things we need to know. For example, we discovered that if you use sendmsg() to send a packet without control messages, on FreeBSD it is not enough to set the length of the control message buffer to 0 (which suffices on Linux).

FreeBSD in addition demands that the control message buffer address is 0 too. FreeBSD checks that the control message buffer is long enough to hold at least one control message, unless the address of the control message buffer is 0.

So: if you use sendmsg() to send a datagram without any control messages, set both msg_control and msg_controllen to 0. This way you are portable.
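
A small sketch of what this looks like in practice. fillPlainMsgHdr() and sendPlainDatagram() are hypothetical helper names; the struct fields and sendmsg() itself are standard POSIX.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <cassert>
#include <cstring>

// Fills in a msghdr describing one datagram without any control messages.
// dest may be nullptr (with destlen 0) when the socket is connected.
void fillPlainMsgHdr(msghdr* msgh, iovec* iov, const void* data, size_t len,
                     const sockaddr* dest, socklen_t destlen)
{
  assert(msgh != nullptr && iov != nullptr);
  std::memset(msgh, 0, sizeof(*msgh));
  iov->iov_base = const_cast<void*>(data);
  iov->iov_len = len;
  msgh->msg_name = const_cast<sockaddr*>(dest);
  msgh->msg_namelen = destlen;
  msgh->msg_iov = iov;
  msgh->msg_iovlen = 1;
  msgh->msg_control = nullptr;  // the address must be 0 for FreeBSD
  msgh->msg_controllen = 0;     // the length alone suffices on Linux
}

// Sends a single datagram with no control messages attached.
ssize_t sendPlainDatagram(int fd, const void* data, size_t len,
                          const sockaddr* dest, socklen_t destlen)
{
  msghdr msgh;
  iovec iov;
  fillPlainMsgHdr(&msgh, &iov, data, len, dest, destlen);
  return sendmsg(fd, &msgh, 0);
}
```

The memset already zeroes both control fields; they are set again explicitly so that the intent survives later edits, for example when a reused msghdr still points at an old control buffer.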

We hope the above has been helpful for you.

Sunday, January 18, 2015

On C++2011, Quality of Implementation & Continuous Integration

Over the past years, I've done a few projects outside the main PowerDNS tree, and for all of them I've used C++2011. A wonderful mark of how big of an improvement C++2011 is, is how much pain you feel when you return to programming in 'regular' C++.

Recently, I've started to note that the new powers of C++ can translate either into better productivity (ie 'more functionality added/hour of work') or, perhaps more importantly, into higher quality of implementation ('QoI').

And this drove me to ponder the concept of QoI a bit, as I think it is underrated compared to writing fast (by some measure) and bug free code.

I recently had to pick a reasonable value as an estimate while writing C++03 code, and I found that my fingers considered it too much work to actually scan three vectors of objects to make a decent estimate. As a result, the code ended up with a hardcoded number which (for now) is reasonable.

This is not quality of implementation. For example, a low QoI implementation of a generally useful memory allocator functions well for the amount of memory the author used it for - say, 1 gigabyte. Unbeknownst to you, lots of the inner workings are overkill when on an embedded platform, for example an O(N) algorithm that is actually pretty slow for small N. Meanwhile, other parts of the library might scale badly to 2GB of memory arena.

A high QoI implementation of a generic memory allocator would find ways to scale itself to multiple domains of scale. It would not surprise you over the years as your project evolves. It would adapt.

We often hear (correctly) 'make it work, make it right, make it fast'. QoI is the part of making it right. It might in turn also make your code fast!

In my example, C++2011 would've allowed me to scan three different vectors like this:

for(const auto& vec : {vecA, vecB, vecC}) {
  for(const auto& entry : vec)
     totSize += entry.length();
}

Whereas the equivalent in C++03 is something like:
// assuming the vectors hold strings
unsigned int countVec(const vector<string>& vec)
{
    unsigned int ret=0;
    for(vector<string>::const_iterator iter = vec.begin(); iter!=vec.end(); ++iter)
       ret += iter->length();
    return ret;
}

... lots of code in between ... 
totSize = countVec(vecA);
totSize += countVec(vecB);
totSize += countVec(vecC);

You can see how the second variant might not happen ("100kb of entries is a good guess!").

If for this reason alone, I expect my C++2011 code to not only be more pleasing to read, but also to deliver higher quality of implementation.

It is therefore that it pains me to report that in 2015, I can't find a Continuous Integration provider taking C++2011 seriously (for free or for money).

Travis-CI, which I otherwise love dearly, uses an antiquated version of g++ that doesn't do C++2011 at all. If you modify the platform into installing g++-4.8, you find that the supplied version of Boost predates C++2011 and fails to compile. The deployed version of clang fares better, but can't do threads, and bails out the moment you #include anything thread related.

Meanwhile, Circle CI does ship with a slightly more recent gcc (but not recent enough), but for some reason uses a version of Ubuntu that can't install 'libboost-all-dev', or even 'libboost-serialization-dev'.

I spent some hours on it this morning, and I'm sure there are solutions that don't involve compiling Boost for every commit, but I haven't found them yet.

So pretty please, with sugar on top, could the CI platforms up their game a bit? If the goal of CI is to quickly find bugs and issues, they should surely feel motivated to support a language that offers ways to do this.


Sunday, December 7, 2014

Picking open source technology: pick the community too

A brief post on picking (open source) technology. When we need something for our project, say, a database, we have a large range of choices. There is old and staid software that may not be hip, but comes with a well known track record and lots of features. There are new shiny frameworks that are blazing trails and publishing impressive benchmarks, written in languages still being standardized.

What do we pick? First of course, our dependency has to meet (most of) our needs. Next, we can look at its current reputation - if a project is known to be a bloated mess, or is well known to crash if you wave at it, we can discount the project for now.

But, even then, we are left with a lot of choice. What I care about these days is the community around a project. Because it turns out that the community is the best predictor of the future of a project. We can’t actually predict the future, but we can be sure that we’ll have new needs for our dependencies. We can also be sure we’ll have questions and discover bugs.

And a healthy community almost guarantees that things will end up well. As a recent example, PowerDNS has recently become involved with a customer where we are helping set up a PowerDNS based anycast environment. For this we needed a BGP implementation. A quick consultation with our community reported that ExaBGP would be a very good choice, and indeed, it offered all the features we needed.

On deployment however, we found two small issues that were holding up our deployment. PowerDNS employee of the month Peter van Dijk worked with the ExaBGP people, fixed the two issues, and both fixes have now been merged by the ExaBGP project. They are happy, we are happy. The next release of ExaBGP won’t just meet our needs, it will suit the needs of many more people.

Last week, a developer reported that the most excellent Valgrind tool found some potential errors in the LMDB project, and I was shocked to see the LMDB lead developer tell the reporter to ‘learn how to use his tools’, and not address his report in any meaningful way.

I asked Howard Chu to reconsider his stance, given my belief that open source can’t be great without a functioning community, something you don’t build this way. Howard told me that how he treated the reporter was entirely intentional. Shortly afterwards, the following was posted on the LMDB list:

“if you post to this list and your message is deemed off-topic or insufficiently researched, you *will* be chided, mocked, and denigrated. There *is* such a thing as a stupid question. If you don't read what's in front of you, if you ignore the list charter, or the text of the welcome message that is sent to every new subscriber, you will be publicly mocked and made unwelcome.

“This is entirely intentional and by design.”

Now of course, everyone is free to run their project in their own way. But I’m also free to pick my dependencies, and I care about the development community being sane and inclusive. “Bitchslapping” reporters of potential valid issues isn’t that. Formalizing this behaviour most definitely isn’t. I won’t be picking LMDB for any future projects unless this changes.

As a parting gift to LMDB, I worked to document the (powerful) semantics of LMDB, and dragged details out of Howard. This document can be found on If you use LMDB, it may be helpful for you.

I spent this evening working with PowerDNS contributor Ruben Kerkhof to merge several of his fixes for PowerDNS issues. Ruben’s company Tilaa uses PowerDNS, and we are Tilaa customers. I’m truly proud of our community and what we achieve together, and I recommend that every open source project works to foster its community as well.

So in short.. before picking a technology to depend on, investigate how they deal with feature requests, bug reports and questions. What you will learn is the best predictor of how the project will serve you (and vice versa!) over the coming years.

Monday, November 3, 2014

Tin cans, can openers and solar power: explaining the snail’s pace of innovation

Innovation is fascinating. It brings us (by definition) all improvements in technology. Simultaneously, it proceeds at a snail’s pace. An example I love to share is the invention of the tin can, which was a breakthrough in food preservation (there were no refrigerators at the time, and tin cans allowed people to safely transport and store previously perishable goods).

Made practicable in the early 19th century, tin cans were sold bearing notices like “Cut round the top near the outer edge with a chisel and hammer”. And then.. it took over 50 years for the first home operable tin opener to arrive in 1870 (in the 1850s, an opener was available for use in stores, who would open your cans for you). The modern easy to use and safe opener was only invented in 1925.

Until the arrival of the opener we know today, opening tin cans was a dangerous exercise. In fact, it was a common cause of injuries and infections, highly dangerous before the advent of antibiotics. Yet it took decades before the problem was solved, even when the need for a solution stared people in the face!

This example is historical, and hard to come to grips with - why did no one invent the current can opener earlier? The mind boggles. Here’s a far more recent example and one that is still developing. It may help shed light on why innovation proceeds so slowly.

(It looks like this post is about how much I like solar, and while this might be the case, the point I’m trying to make is not that solar power is great. The point is to elaborate on why innovation operates so slowly, using a currently happening development. In fact, if you find yourself disagreeing with my description of the bright future of renewable energy, ponder where your disagreement is coming from. That very disagreement likely is what this post is about!)

Renewable energy

Renewable energy sources have long been treated with derision. “What about when there’s no wind and no sunlight?”. Industry professionals and engineers alike shared this disdain, because as seen through their eyes, solar power compares very badly with (for example) a coal powered plant. The coal powered plant delivers energy at low prices (perhaps 3-7 eurocents/kWh), and does so continuously within a short few hours of turning it on, which you can do at any time you desire.

If you hold this next to solar power, photovoltaics do come off very poorly. They roughly operate in three modes - no output (night), 10% output (overcast) or 80-100% output (direct sunlight). One can try to predict when they’ll deliver this output, but you’ll never get it quite right. And, to add insult to injury, the production cost per kWh currently is a lot higher than coal powered plants! So not only don’t you know if and when you’ll get energy, you have to pay more for it too. A mostly similar story applies to wind power.

The upshot of the intermittent nature of availability is that by the time you’ve built enough solar and wind farms to cover your average needs, you’ll have generated vast overcapacity for when the sun shines and it is windy. Your capital costs meanwhile have also been huge compared to boring old power plants.

In short, from the viewpoint of the seasoned energy professional, solar and wind power makes for very unreliable expensive capacity that generates imbalance between power needs and power production. The professional then points out that energy storage (which could average out the imbalance) is prohibitively expensive, and won’t save the day. Repeat this story for decades and decades, and you are in 2008 or so.

Personally, I was fully on board with this line of thinking. I advocated the development of large fusion or even fission plants as the (remote) future.

Then, politics happened

Then, through various plans that weren’t too well thought out (I think), all of a sudden whole countries started building heavily subsidized wind farms and suddenly cheap roof-top solar panels at a massive clip. Germany comes to mind. The Netherlands and other countries are now following behind, even with dwindled subsidies.

How come? Well.. the original energy professionals were thinking entirely on the production side (and who can blame them, that’s where they worked!). The reality is that in many countries, no consumer actually pays anywhere near 3-7 (euro)cents/kWh for electrical power. Transport costs and heavy taxes increase actual ‘grid costs’ to over 20 eurocents/kWh in most parts of Europe.

And since producing energy you use yourself is untaxed, domestic energy users compare the production costs of solar against the delivery and taxed costs of grid power. And guess what, since there is a factor 3 to 6 between these numbers, all of a sudden ‘grid parity’ arrived. Unsubsidised solar energy is now cheaper than grid power in many places.

Google trends for “grid parity”

Note how the concept of grid parity, or at least the term, burst on the scene in January 2008 (the Wikipedia page was created in June of that year). Before that date, the old view of solar power as a sort of retarded coal fired plant prevailed. Production parity was what mattered. Grid parity, which is what matters for consumers, was not a factor. (By the way, I checked - the flatline of the graph pre-2008 is real, and does not reflect a lack of data).

Grid parity is when solar is interesting for consumers - production parity is only relevant to power producers.

The new reality

What the staid electricity people thought unacceptable (and perhaps unimaginable) is now the new reality. The centralized generation of power has become an uneconomic and highly ungrateful enterprise to be in. If the sun shines and the wind blows, there is literally no need for the coal fired plant. Effective energy prices now routinely plummet to 0 cents, and have gone negative on occasion.

Note zero pricing around 03:00 and negative prices around 14:20

And while this is clearly unsustainable (nobody can generate power this way and stay in business), this reality isn’t going to go away, although some localities have so far succeeded in outlawing (Cyprus) or heavily taxing (Hawaii) grid connected solar panels.

The practical upshot though is that even today, substantially lower amounts of coal and gas are being shoveled into power plants.

Meanwhile, wily entrepreneurs have cottoned on to the new reality of highly variable energy prices, and are scheduling their business needs around them. Processes that lose money at 20 cents/kWh make a lot more sense at 3 cents/kWh! This is the future, and attempting to emulate the flat power prices of the past century is not.

In another shocking development, a decade ago, California suffered from brownouts and even rolling blackouts, mostly on hot days when lots of air conditioners were running. Now it turns out that a lot of this can be blamed on energy fraud by Enron, but the problem of peak energy use was real.

Welcome to the new reality - already in 2012 the peak was mostly gone, and in 2014 it will have become a dip. The new challenge is the ramp-up when the sun goes down. All of this because in 2012 photovoltaics met the concept of ‘grid parity’, a concept very rarely discussed before 2008.

Only now that minds have been liberated, new thinking has arrived. One could for example ponder potential mass adoption of electrical cars, cars that will need to be charged before their owners drive off with them, but conceivably could charge ‘on demand’ during the day. They could even deliver power during unexpected peak net-demand periods (!). Nearly free energy and “the internet of things” could make this a reality.

Lots of other things could happen too, for example electrical heating of houses currently heated by gas, electrical (on demand) heating of water reservoirs for residential use (‘peak shaving’), bulk energy storage centrally, regionally or even in homes, etc. It is a whole new world out there. How things will end up is hard to predict, but we can already be sure that everything will be cheaper and a lot friendlier to the environment. And this leaves undiscussed the geopolitical impact of a world far less reliant on oil and gas!

All this thinking was previously rejected out of hand, “because it did not act like a power plant we know”.

What this means for innovation

In hindsight, the commonly held belief that solar (or renewable energy in general) will never fly until it performs exactly like a traditional power plant was and is ludicrous. But you still hear it today. A lot. And when you hear it, ponder the decades it took to invent a reliable can opener. Innovation happens at a snail’s pace, and only appears obvious in hindsight.

This inability to see beyond the present concepts appears to be a feature (bug?) of human nature.

If you are personally trying to innovate, read up on the history of previous inventions, and see how they were rejected by experts and professionals. Read on until you get a feel for proper rejection (“the flying car” with present day technology) and failure-of-the-imagination rejection (British Post Office comment on phones “no need, we have lots of messenger boys”).

To innovate, we must free ourselves of dearly held convictions and concepts. Ironically, the people with the most knowledge and thus the best credentials to invent the future tend to cling fastest to previously held concepts. This is the challenge of innovation - using the knowledge out there, but without having it hold us back.

Good luck!
(Thanks to Nicholas Miell, Tsjoi Tsim and Jan van Kranendonk for editing & constructive criticism - all mistakes remain mine!)

Tuesday, August 19, 2014

The absolute minimum difficulty recipe with maximum impact: authentic chicken soup

I like cooking, and I like to share that joy. If  you can’t cook however, your first efforts are likely to be mediocre, which is not very encouraging. “Why bother?!”.

In this post, I’ll share something that can barely be called a recipe, that’s how simple it is. BUT! The results are entirely authentic and spectacular. You can take this with you to a home cooked dinner, and your friends will be very happy with your efforts. They will ask you for the recipe! Little will they know how easy it is. Also, it is almost impossible to mess this up. I dare you to try ;-)

Ingredients: 10-20 raw chicken wings (plain or spiced, NOT breaded!), 2-4 onions, a few large carrots (or more smaller ones), salt, pepper.
Required equipment: oven-proof container, oven, pan, stove, strainer or colander.
Time spent in the kitchen: 15 minutes

Pre-heat oven to 180C (350F). Meanwhile, put the chicken wings in the oven-proof container. If you got plain wings, add some salt and pepper.  If you have it, sprinkle some oil over the chicken wings.

Put container with chicken in oven (if at temperature). Then you can wait 25 minutes, you can wait 35 minutes, you can wait 45 minutes, and in every case you’ll have some pretty good chicken wings. If after 25 minutes they look ready to eat, they ARE. Do not eat unless thoroughly hot inside!

(if you don’t have an oven, you can also fry the chicken wings in a pan on a stove, works just as well, but you need to pay some closer attention to getting them browned all round and well done inside. Will require more oil).

Now for the surprising part. Eat as many of the chicken wings as you feel like. No need to finish them all. This was not your special dinner ;-)

Next, fill a pan with a few liters (quarts) of water, then throw in all your chicken wings, the ones you ate, the ones you didn’t. Peel onions, cut in large chunks, add to water. Chuck the carrots in there too. Bring water to a boil, turn down heat and leave to simmer for 3, 4, 5, hell 6 hours if you feel like it. If during the boiling you note brown stuff floating on the water, remove that with a spoon. Your house will smell lovely meanwhile.

Afterwards, use colander or strainer to filter out all the bones and pieces of meat. (You can add back the pieces of meat if you want.) Now taste the soup and add salt to taste. This part is not optional, it really needs salt.

That’s it! You can take the soup with you to a party and heat it there. In a fridge it will last for days.

Note: to improve on this soup, add more vegetables (doesn’t really matter which ones), or instead of chicken wings, use an entire chicken.

Monday, August 11, 2014

Some tips when travelling through France: toll & internet

Brief post so other people can find this. If you are going on holiday in France by car, I can highly recommend getting a 'telepeage' badge, which allows for contactless payments.  This means you can pay for the highway tolls without the hassle of tickets and credit cards at booths. The telepeage lanes go a lot quicker than the ones for people who need tickets. Many of them you can pass at 30km/hour even! In The Netherlands you can get a badge at the ANWB before you go. I'm also told you can buy them at the largest toll stations.

Secondly, if you want internet, Orange has several interesting options.  There is the "Let's Go" offer which is attractive. It is a WiFi box which can offer internet service to up to 5 devices. It does HSDPA. There's also a 4G offer, which is even cooler.

However, this being France, life is not made easy for you. When you buy the box, inside you find a form you have to fill out and send in with a copy of a photo ID. Until this has been processed, you can't top up your 500MB internet credit. In my experience, this processing takes at least a week.

It so turns out that the store where you buy the Let's Go offer can also verify your ID on the spot, but nobody reminds you of this. So, when you buy it, insist that they register you too. Even then it takes 48 hours to process.

Secondly, even though the "Let's Go" offer is geared for tourists you can only top up your internet credit.. using a France-domiciled credit card!! I'm not making this up. You can however buy credit at "Tabac" shops or Orange boutiques. Not all Tabac shops know their Orange terminal can do this though.

I would recommend that when you buy the Let's Go box, you take care of the identification, and also immediately buy a voucher to top up. For 35 euros, I got 3500MB of internet, which should suffice. (Unless you forget to turn off cloud backups of all photos you take..)

Finally, although many shops are called 'Orange', I found that the ones labeled as 'Partner' on the Orange website do not carry the Let's Go offer. So head to one that is actually Orange owned.

With the advice above.. you may save yourself the 6 visits to various Orange stores I needed to get functioning internet access!