bert hubert finally blogs

Monday, July 15, 2013

A common support anti-pattern: the stale issue that comes back to haunt you

So, here's a scenario if you are supporting users of (your) software or systems. An urgent issue is reported, and you get to work on addressing it. After a while, a workaround is discovered and for now, the problem has gone away. Or, what also happens frequently, the problem goes away by itself.

As a diligent supporting organization, you might 'ping' the user once to figure out if they are happy, and perhaps you still have some outstanding questions for them (log files, packet traces, versions installed etc). But otherwise, both user and vendor move on to more pressing issues, and we don't get to the bottom of it. It is not in most organizations' nature to focus on things that are not broken.

Time passes, and maybe a few months later, the customer is fuming, the issue is back, "and we reported this MONTHS ago, we have a 2 hour SLA, and it STILL isn't solved!" The blame is put squarely on the vendor, because the individual corporate employee most certainly isn't going to blame himself. It is just not done, and this is to be understood.

Meanwhile, you or your people dig out the old email exchange and note that "well yeah, but you didn't get back to us on X!", or the weaker variant "the workaround worked, and you went silent on it".

Escalation ensues, and it is noted that a more professional support organization would've kept nagging about the open question, or working on (what appeared to be) the low-priority remaining issue.

By now everybody is seriously pissed off at each other.

This anti-pattern is well known, and occurs everywhere. A common first-order approach to prevent it is for supporting organizations to attempt to proactively close issues that aren't progressing.

This sometimes works, but most often it makes the customer feel that their vendor is trying to artificially "solve" the issue, and not actually help.

Additionally, it doesn't feel good for people to have to agree that unsolved issues be marked 'closed', or even 'solved'. In corporate environments, such things might come back to haunt the employee ('why did you sign off on that?!').

So, often a low-level stalemate develops where the customer is unwilling to spend time with the vendor to get to the bottom of the issue, but also does not agree to close it. And a few months down the road, BOOM, "this problem STILL isn't solved, and we've been at it for MONTHS!".

Neither side wants this, but it keeps on happening, and it keeps on pissing people off. It is human nature and corporate realities working against us.

So - what is the solution? Clearly we need some indication that is acceptable to all sides, but saves a lot of shouting later on. One suggested way to achieve this is to add another status flag to an issue: 'Paused'. This does not in any way imply the issue is solved, or unimportant, or that anyone has agreed the fault is on their side.

It means what it says - this issue is paused. And if later on the problem becomes urgent again, it can be unpaused. Of course, the people that now shoulder more of the blame won't be too happy about it, but at least there is a reflection of the fact that *nobody* was working on it.

Supporting organizations meanwhile should remind supported users to respond to outstanding questions, and note that it is perfectly fine to agree to 'pause' the issue. This might even happen automatically after a few reminders.

So, summarizing: instead of angering people by closing issues the user is not actively working on, add a 'Paused' status. When the problem then resurfaces, we can all get to work faster, because we can skip the mutual screaming about issues being left unresolved for months 'while we have a 4 hour SLA with you!'

PS: And yes, if you really think this post is about you.. it might well be ;-)

Sunday, July 14, 2013

A "null result" bonus to improve science & science reporting

Every week we get at least one, but usually more, hype-filled press releases & news items about how certain foods, medicines or lifestyle choices will either kill or save you. The vast majority of these weekly claims don't turn out to hold water.

As examples lifted from this actual week, I offer:

Fish oil supplements linked to prostate cancer
Not quite: this was observation of people with existing prostate cancer, in all likelihood influenced by their earlier diagnosis. More analysis here.
Diet soda drinkers might get fatter and unhealthier than their 'regular' soda counterparts
Not quite: this was an 'opinion' piece full of theoretical ways this might be the case, not backed up by actual research. More analysis here.

If you actually spend time on the press releases and underlying papers (if they even exist!), you often discover that:

there is no actual (new) research to back up the claims, or
that the claims bear scant relation to what is in the paper, or
that the data has been massaged heavily until some correlation popped out (and massaged & weak correlation is pretty far from causation, and most often proof of no causation).

These days, the discerning internet user can find sites that take the time to debunk over-hyped claims, but the brave souls dissecting the research behind the headlines will always be 'late', and besides, they don't make Fox News or the New York Post.

So, the average person worried or interested in her health is bombarded by multiple confusing and conflicting headlines per week. This does nothing to improve our actual health, and in all likelihood worsens it ("forget that, the story changes every month").

What is behind this avalanche of weak or even bogus results in the news? It goes like this. Scientists perform expensive research, and very often, nothing spectacular comes out. Healthy people are healthier, people that exercise have lower blood pressure, folks that do things in moderation do lots better etc.

Scientists are people too, and they have to justify their work, so they start the first round of trawling the data. And if you've measured enough, some interesting correlation always pops up! To counter this, Bonferroni correction should be applied to the statistics, but not doing so is a common but helpful oversight. I mean, the research was expensive enough, something should come out!
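
To make the trawling problem concrete, here is a minimal sketch of the arithmetic, assuming the comparisons are independent of each other (real measurements usually are not, so treat the numbers as an illustration). The function names are made up for this example.

```c
#include <assert.h>

/* Bonferroni: to keep the whole family of m comparisons at level alpha,
   each individual comparison has to pass the stricter level alpha / m. */
double bonferroni_threshold(double alpha, int m)
{
    assert(m > 0);
    return alpha / m;
}

/* Chance that at least one of m independent comparisons comes out
   'significant' at level alpha even though there is no real effect at all. */
double chance_of_false_positive(double alpha, int m)
{
    assert(m >= 0);
    double all_clean = 1.0;              /* probability that no test fires */
    for (int i = 0; i < m; i++)
        all_clean *= 1.0 - alpha;
    return 1.0 - all_clean;
}
```

With 20 comparisons at the usual 0.05, chance_of_false_positive(0.05, 20) comes out at roughly 64%, while bonferroni_threshold(0.05, 20) demands p < 0.0025 from each individual comparison.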

So we have a claim, for example: 'Overweight post-menopausal women with pre-diabetes who eat fifth quintile amounts of avocados have lower insulin resistance'. This is typically what you'll find in a research paper, and a research paper is where such a claim (had it survived Bonferroni correction, which it likely would not have) actually is worth reporting. Meanwhile, the claim is flagged with 'p < 0.05', which means the result is statistically significant; in actual effect, the impact can still be clinically insignificant (and often is).

Next, the research institute also wants to look good, so its PR department takes the paper, speaks with the scientists and writes a press release: "Benefit of eating avocados on insulin resistance, preventing diabetes". Note that they lopped off all the qualifications, plus extrapolated the claim into preventing disease.

Finally, journalists fed this press release are eager for clicks on their articles, so they liven up the press release with some further human interest quotes and headline the piece: 'Scientists say: Eat avocados to ward off diabetes'.

And there we go - from an investigation with no really significant results, we end up with a pretty stonking headline with incorrect advice.

So what do we do?

Here's an odd idea. Zappos, an online shoe store, has a 'quit now' bonus for new hires. If after training you decide to leave, the company pays you $3000. The net effect of this is that people have an incentive to leave if they feel Zappos is not going to be a great place for them.

And, although I don't know how it works in practice, in theory this should be a big win - anyone who would have stuck around against their will, but is thus enticed to leave, will 1) not be a drag on Zappos 2) be able to move on to better pastures all the quicker.

The relevance to our scientists feeling pressured to publish should be obvious. Launch a fund, perhaps at department or institute level, or make it a national prize, for researchers honest enough to claim 'no significant results' from their research if there were none.

Compare the (at best misleading) headline 'Eat avocados to ward off diabetes' with 'Different levels of fruit consumption did not meaningfully change levels of diabetes among 3500 randomly selected staff of healthcare institutes'.

The latter headline would admittedly not make the evening news. But it would allow investigators to move on to new research, and not further confuse the public. And very importantly, it would also make sure that even negative or null results make it to (the academic) press.

As Ben Goldacre of www.alltrials.net often points out, not reporting unwelcome results leads to a statistical excess of positive results, thus "proving" that ineffective treatments actually work!

Now, I admit the details of this 'Zappos prize' would be daunting, and it would also require a significant fund to have any impact. It would need prestige too - scientists (who, as noted above, are people too) are less swayed by money than most.

But something has to change. Today, mediocre research grabs the headlines while researchers honest with themselves struggle to get their voices heard!

Your thoughts are more than welcome.

Thursday, June 20, 2013

How to give a decent presentation

Hi everybody,

I frequently attend conferences, and about as frequently give presentations there. Sadly, over the years, I've seen many smart and gifted people struggle to share their work with their audiences. Luckily, over time, watching & doing presentations has taught me a little bit about what makes a good presentation.

In general materials on presentations, there is a lot of emphasis on using the right fonts, maintaining eye contact with the audience and otherwise being "convincing". Such advice is of little use for the attendees & presenters at technical conferences though.

We want good content, not suave presentations! And that is a good thing since many of us in the tech community tend to be a lot better with computers than with being 'flashy'.

This year we'll again be seeing the four-yearly cycle of great hacker conferences in The Netherlands continue with OHM2013: Observe Hack Make. These events are volunteer organized, and as part of doing my bit, I thought I'd compile what I've learned about doing presentations. This will make me feel less guilty also when I see people digging trenches etc.

On http://tinyurl.com/decent-presentation you can find a Google document that contains a presentation on doing just that: giving a decent presentation. And, since slides can't and shouldn't tell the whole story, I've narrated this presentation here and here on YouTube.

This presentation outlines a process of getting to great content (and also touches on how to present that content well). This process starts with answering questions: WHY, WHO, WHAT and HOW. The WHY and WHO determine WHAT to tell, and at which level of knowledge your presentation should start.

The HOW tells you how to replicate your knowledge in the minds of the audience.

At OHM, I too will be presenting, and as an example, I'll go through these four questions here for my presentation on "What you need to know about what you eat: health & weight".

WHY: We all get more and more obese, even people perfectly following government advice on how to eat and exercise. Over the past decade, a new consensus has arisen on why we get fat, and we now know that the conventional wisdom has it all upside down, and is making us sick. I'm presenting because I want to share what I've learned in order to let everybody share this new knowledge, so we can save their health!
WHO: Hackers with shoddy exercise and eating habits. Many of us were at GHP and HEU over 20 years ago, and I can tell you, the hacker community.. is getting bigger (or at least larger). Especially us 'older' folks are starting to care about what we eat and do. The audience will care, but will not necessarily know the finer distinctions between cis- and trans-fats etc.
WHAT: We have one hour, so we can't explain the full modern nutritional theory. So, we'll be explaining basics, plus specific things people can do to improve their health. Also, pointers so people know where to go to learn more.
HOW: I have to build it up. If I just get on stage and start ranting about glucolipotoxicity, nobody will know what to make of the story. Introduction is my own story, and that of my family. This makes it personal and interesting. Then we demolish the conventional wisdom with powerful and horrifying graphs. Next we explain some basics that make it obvious current advice is all upside down. Then, once that is clear, clarify what does work. Finally, we round off with a highlight of the most interesting people, books & groups.

With these questions answered, I know what content I need to write, what pictures and graphs I need to gather, and how to keep people paying attention!

It is my sincere hope that if you present at OHM, or at any other geeky conference, you'll be able to benefit from the presentation, and that you'll be better able to get your ideas across!

Finally: to anyone aiming to present at OHM, please contact me if you think I could help with your presentation, for example, by brainstorming on WHO you'll be explaining to and WHAT!

Thursday, June 13, 2013

Some notes on medical statistics

Over the past year, I've been reading more and more about the causes of obesity and the (related) epidemic of diabetes, since both run in my family. In my readings, I've encountered a lot of dodgy statistics to bolster research claims.

Statistics allow us to make statements like 'the chance that these dice are unfair is less than 1%', based on throwing them n times and observing the results. We call such results 'significant', where the threshold for significance is often set at '5% chance of results being random and not because of some effect'.

(and for the statistics professionals, I know my terminology is sloppy. Have this comic to make up for it:)

http://xkcd.com/795/

The world of medical research also tries hard to do statistics, and by and large fails at this. Partially this is due to a misunderstanding of how statistics work, and partially this is a problem of language.

For example, a pill which causes a 1% relative reduction in the number of heart attacks in a population can easily result in a 'statistically significant effect'. This is because we might be *very* sure that the "Odds Ratio" of having a heart attack is 0.99 and not 1. "p < 0.05". This number is not clinically significant though, or more concretely, it is an irrelevant number.

Public relations departments, funding considerations and industry relations however just scream to turn this mathematical, statistical significance into a bold press release reporting an actual significant medical advance.

However, since heart attacks are rare, hundreds of people would spend decades taking this particular pill before a single actual heart attack would be prevented. And who knows how many side effects there would have been! So, statistical significance does not equal practical significance.

A far better metric is called The Number (of people) Needed To Treat (NNT) to get benefit. For example, the NNT of common painkillers for treating a normal headache is very close to 1, since they almost always work.

The NNT is far more powerful than "relative statistical significance". For example, although 25% of the over-45 population in the US is now prescribed statin pills, the NNT of statins for preventing a heart attack in people without prior heart disease is 300 person years. Described differently: if 60 of those people take statins for 5 years, 59 of them receive no benefit. All 60 are at risk for potential side effects however.

The NNT for preventing a *fatal* heart attack in this population is in fact immeasurably high ('infinite'). For people who have had a heart attack already, the NNT for preventing death is around 80 for 5 years.

There is also "the NNT for harm", which for statins is about 10 after 5 years. In other words, of those 60 people treated for 5 years, 6 of them would have a serious side effect.
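
The arithmetic behind these numbers is simple enough to sketch. The snippet below is only an illustration using the figures quoted above; the function names are invented for this example, and the 1/ARR relation (NNT as the inverse of the absolute risk reduction) is the standard definition rather than something derived in this post.

```c
#include <assert.h>

/* How many people must take the treatment for `years` years so that,
   on average, one of them benefits, given an NNT in person years. */
double people_needed(double nnt_person_years, double years)
{
    assert(years > 0);
    return nnt_person_years / years;
}

/* Expected number of people with a serious side effect among `treated`
   people, given the "NNT for harm" over the same period. */
double expected_harmed(double treated, double nnt_for_harm)
{
    assert(nnt_for_harm > 0);
    return treated / nnt_for_harm;
}

/* NNT from an absolute risk reduction, e.g. 0.01 for one percentage point. */
double nnt_from_arr(double absolute_risk_reduction)
{
    assert(absolute_risk_reduction > 0);
    return 1.0 / absolute_risk_reduction;
}
```

So people_needed(300, 5) gives the 60 people mentioned above, and expected_harmed(60, 10) gives the 6 of them who would have a serious side effect.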

The NNT & NNT for harm are medical statistics done right; and it is therefore no surprise these numbers are exceedingly unpopular in press releases and articles.

So next time you read about a medical breakthrough.. look beyond the reported statistical "significance" and see if you can find the NNT.

Some good links for further reading:

NNT numbers for many treatments & diagnostic tests can be found on the most excellent site www.thennt.com.
Wikipedia: http://en.wikipedia.org/wiki/Number_needed_to_treat
The Cholesterol Conundrum: http://www.saturdayeveningpost.com/2012/04/24/wellness/cholesterol-conundrum.html
Bad Pharma book: http://www.amazon.co.uk/dp/0007350740/ref=nosim?tag=bs0b-21

Monday, May 6, 2013

How to discover if an IP address is yours

A quick post - sometimes you need to know if an IP address is yours. One way of figuring this out is to ask the kernel to give you a list of all IP addresses it considers local, and go from there. This is pretty laborious however, and requires special processing for 127.0.0.0/8 for example, *all* of which is local.

Another way which I heard of uses getsockname(2), a call which determines the local address of a socket. If you set up a connection, the kernel will automatically pick the most appropriate source address for you. And should you be setting up a connection to yourself, the source address will be identical to the destination address!

This way, you can easily detect if you own an IP address. The initial downside is that this appears to require sending packets, but it turns out you can avoid this by connect(2)ing a connectionless datagram socket.

The final sequence is (minus error checking):

int s = socket(AF_INET, SOCK_DGRAM, 0);
// connect() on a datagram socket sends nothing, it only makes the kernel pick a source address
connect(s, (struct sockaddr*) &remote, sizeof(remote));
struct sockaddr_in local;
socklen_t socklen = sizeof(local);
getsockname(s, (struct sockaddr*) &local, &socklen);
close(s);
return local.sin_addr.s_addr == remote.sin_addr.s_addr;
// IPv6 variant: return memcmp(&local.sin6_addr.s6_addr, &remote.sin6_addr.s6_addr, 16)==0;

This trick is described in Stevens' Unix Network Programming volume one, section 8.14.
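
For completeness, here is a sketch of the same trick as a self-contained IPv4 function with the error checking put back in. The name is_local_ipv4 and the choice of port 53 are arbitrary assumptions for this example (no packet is sent, so the port hardly matters), and what comes out depends on which source address your kernel prefers for a given destination.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the kernel would use `dotted` itself as the source address
   when talking to it, 0 if not, and -1 on any error (bad input, no route). */
int is_local_ipv4(const char *dotted)
{
    assert(dotted != NULL);

    struct sockaddr_in remote;
    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(53);         /* arbitrary, nothing is sent */
    if (inet_pton(AF_INET, dotted, &remote.sin_addr) != 1)
        return -1;

    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in local;
    socklen_t socklen = sizeof(local);
    int result = -1;
    /* connect() only selects a route and source address for a UDP socket */
    if (connect(s, (struct sockaddr *) &remote, sizeof(remote)) == 0 &&
        getsockname(s, (struct sockaddr *) &local, &socklen) == 0)
        result = (local.sin_addr.s_addr == remote.sin_addr.s_addr);

    close(s);
    return result;
}
```

Calling is_local_ipv4("127.0.0.1") should return 1 on a machine with a working loopback interface, and a string that is not an IPv4 address returns -1.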

Wednesday, May 1, 2013

PowerDNS now has its own dedicated blog

Dear readers,

PowerDNS now has its own blog, which can be found on http://blog.powerdns.com. This blog will continue to sprout occasional programming oddities and observations.

But if you want to read about PowerDNS, head on to http://blog.powerdns.com!

Nothing else is changing, but we want to have a clear PowerDNS blog which can also be used by other PowerDNS employees & contributors.

Bert

Wednesday, April 10, 2013

Bitcoins explained for normal people -or- please get back to work

Bitcoins have been taking the world by storm, and what I’ve been reading has been making me angry. Consider this post my version of:

http://xkcd.com/386/

In short, I believe bitcoins are an interesting experiment, but that most people currently promoting bitcoins stand to profit from luring more folks into buying them and believing in them. In other words, they have a stake in convincing you to join in.

In this explanation, I hope to educate readers about how bitcoins:

cannot do what normal money does and
as an investment are a pyramid scheme and thus
are not “the new world order”.

I will first try to explain what bitcoins are and aren’t so you can make an informed decision if you want to partake in this gold rush.

A first stab: naive digital currency

Let’s start with a simple experiment. I want to start a currency, but not go through the hassle of actually minting coins or printing notes. So what I do is I take otherwise unremarkable pieces of paper and write numbers on them, say 0 through 21 million, and I tell everyone all those numbers are money.

And if people believe me, it will work, and you can then do payments with these pieces of paper. Interestingly, we don’t even need the pieces of paper: it is the number that is the money (there is no shortage of paper). This also makes it a breeze to do payments online - no need to ship paper, no bank involved! Instead of keeping large stacks of paper to prove our worth, we only keep the numbers in a file somewhere (if you delete the file, you lose your money though).

The problem now is that nothing stops people from spending their money twice. So I take my number 3141593 and use it to buy bread and simultaneously order some stuff online with it. Clearly, numbers as money are neat in theory, but soon the currency collapses since everybody has a copy of every number. People can just keep spending their numbers, and no one wants them anymore.

Second iteration: keeping track to prevent double spending

Within bitcoin, all transactions are recorded. So if you spend a bitcoin (which, mathematically, is a very large number), that transaction gets broadcast and stored in the network. If I then try to spend the same bitcoin again, I’ll find that the network refuses to register that transaction - because there already is a longer transaction chain recorded that includes the new owner. Thus, double spending is prevented.

This means two things though:

Each transaction is logged! So if you get your wages in bitcoins, and spend them at the local supplier of recreational drugs, that transaction is recorded forever. This might haunt you at a later stage. At the very least your employer now knows where you spend your money.
Recording & verifying the transaction takes time. Because the bitcoin network is fully distributed and has no trusted central hub, transactions are only assumed to be distributed if enough parts of the network have verified them. This takes around 10 minutes, and for absolute certainty, an hour is recommended. So forget about a quick shopping trip using bitcoins.
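
The record-keeping idea can be sketched in a few lines. This is a deliberately naive toy, assuming one central table of who owns which coin, which is exactly what bitcoin does *not* have (it distributes the record over the network); all names and the tiny coin count are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define COINS 8                      /* hypothetical tiny currency */

/* The record: for every coin, who owns it right now. */
typedef struct {
    int owner[COINS];
} Ledger;

/* Hand all coins to one initial owner. */
void ledger_init(Ledger *l, int first_owner)
{
    for (int i = 0; i < COINS; i++)
        l->owner[i] = first_owner;
}

/* Record a payment. It is refused when `from` is no longer the recorded
   owner, which is what happens on an attempt to spend a coin twice. */
bool ledger_spend(Ledger *l, int coin, int from, int to)
{
    assert(coin >= 0 && coin < COINS);
    if (l->owner[coin] != from)
        return false;                /* already spent: the record says so */
    l->owner[coin] = to;
    return true;
}
```

If person 1 pays coin 3 to person 2, a second attempt by person 1 to pay that same coin to somebody else is refused, because the record already lists person 2 as the owner. Note that the table also shows both downsides from the list above: every payment ends up in the record, and somebody has to check it before a payment counts.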

Third iteration: adding back privacy

This recording thing does not sit well with anyone and is an obvious flaw. A “solution” has been found however. When someone wants to send you money, you create a fresh identity for that transaction.

By exchanging these custom identities, each individual transaction gets a sheen of anonymity, but not much more than that. Whole transaction chains are publicly available, so even if individual transactions look anonymous, the moment you want to spend your coins you’ll link them together on the outgoing transaction anyhow.

Fourth iteration: money supply issues

So where do the bitcoins come from? In a normal currency, a central bank creates money, usually in line with the (intended) growth of the economy. As bitcoin has no central bank, means have been found to allow people to ‘mine’ for new coins at a predetermined rate.

There is a steady supply of new bitcoins, mathematically set at 150 bitcoins/hour up to 2017, at which point this will slow down to 75 bitcoins/hour. Eventually, there will be around 21 million bitcoins, and not ever more.
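
Where the 21 million comes from can be checked with a small sketch. The parameters below are not stated in this post; they are the bitcoin protocol's published values as far as I know them (50 coins per block at the start, a halving every 210,000 blocks, a block roughly every 10 minutes, amounts counted in units of 1/100,000,000 coin), so treat them as assumptions of this example.

```c
#include <assert.h>

/* Assumed protocol parameters, not taken from the post itself. */
#define INITIAL_SUBSIDY 5000000000ULL  /* 50 coins, in units of 1e-8 coin */
#define BLOCKS_PER_ERA  210000ULL      /* subsidy halves after this many */
#define BLOCKS_PER_HOUR 6ULL           /* one block per ~10 minutes */
#define UNITS_PER_COIN  100000000ULL

/* Whole coins created per hour in a given era (era 0 is the first one),
   rounded down. */
unsigned long long coins_per_hour(unsigned era)
{
    assert(era < 64);
    return (INITIAL_SUBSIDY >> era) * BLOCKS_PER_HOUR / UNITS_PER_COIN;
}

/* Total number of units ever created: add up the subsidy of every era
   until repeated halving has rounded it down to zero. */
unsigned long long total_supply_units(void)
{
    unsigned long long total = 0;
    for (unsigned long long s = INITIAL_SUBSIDY; s > 0; s >>= 1)
        total += s * BLOCKS_PER_ERA;
    return total;
}
```

Under these assumptions coins_per_hour(1) is the 150 coins/hour mentioned above, coins_per_hour(2) is the 75 coins/hour after the next halving, and total_supply_units() divided by UNITS_PER_COIN lands just under 21 million coins.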

This makes it impossible to look at bitcoins as ‘money’. Whole economies have been killed by setting the money creation rate wrong (too low in 1710 in France causing deflation, too high has happened to almost every currency, resulting in inflation). While we can be angry at how central banks did not stop the current ‘old school currency’ mess, bitcoins have only one money creation rate, and it is fully set in stone.

In short, the bitcoin economy may grow, but the number of bitcoins in circulation cannot be matched to that growth. As long as the interest in bitcoins grows faster than the creation rate, as is currently the case, the bitcoin shows heavy deflationary behaviour - each individual bitcoin becomes worth ever more ‘regular money’, making it in effect a very bad idea to actually *spend* your coins.

Under deflationary conditions, in nominal terms, things keep getting cheaper. Why buy a car now when you can be sure it will be cheaper next week? Such conditions have killed whole economies.

The issues in short

Bitcoin transactions are (way) slower than regular money transactions (10 minutes - 1 hour)
Every bitcoin transaction leaves a publicly visible trace that can only be obscured but not removed
Because the bitcoin supply rate is fixed, the regular money value of bitcoins will fluctuate wildly, making them unsuitable as normal currency

Many bitcoin adherents will agree with the points above, and offer two answers:

Bitcoin is not a regular currency, but an investment
Most of the problems can be solved by calculating the bitcoin value of a transaction against current conversion rates to regular money

This holds no water. If we look at bitcoins as an investment, this only works if we can convince people to join in and thus grow the bitcoin economy. But why would they join in? Why, because the value of the coin keeps increasing! This is known as a pyramid scheme, where people who join in first take the money of those that enter the game later. These in turn only make money if they entice even more people to take part. In the mid 1990s, pyramid schemes wrecked the Albanian (real) economy, leading to the 1997 Albanian revolution.

If we look at bitcoins as currency, but admit that we still need traditional money as an adjunct, then bitcoin is attempting to stage a revolution with the aid of the establishment, something that has rarely worked. Any of the purported advantages of bitcoins disappears if it needs regular currency as an adjunct before being useful!

Finally

So, before jumping on the bitcoin bandwagon, realize that as a currency, bitcoins are flawed, and as an investment, you are late to the game, and you are merely funding the folks that got in earlier. And before you know it, you'll find yourself enthusing about bitcoins at birthday parties because, you know, you are now part of the pyramid!

I wrote this page out of sheer frustration that many of my smartest friends are devoting enormous amounts of energy to bitcoin-related projects and not actually contributing to their own or society’s well-being.

Hence the second title of this rant: and now get back to work.