bert hubert finally blogs

Tuesday, March 26, 2013

A quick note on cable modems and "Serious Switches"

Posting this so someone else might find it and save a day of headbashing.

I have a Ziggo Motorola cable modem which acts as a bridge. Recently, I revamped our home routing infrastructure (because our old 4-port server died), and installed a most excellent HP 1810g switch.

Because this switch supports VLANs, I was able to configure a Raspberry Pi as a "router on a stick" that routed between 3 VLANs: our house LAN, the Ziggo cable and Telfort DSL.

I did note I had to reboot the cable modem a few times to get things to work, but then they did. When the Raspberry Pi was retired from routing & VoIP switching duty, it got replaced by a most excellent HP MicroServer N40L, but try as I might, I couldn't get a DHCP lease through the Ziggo cable modem bridge.

I did see ARP packets come in for other Ziggo IP addresses, but my DHCP requests would never get an answer. I rebooted the modem a few times and performed various other tricks, but nothing helped.

On the internet, some people noted the cable modem would only work for 1 MAC address at a time, so I changed my MAC address to that of the Raspberry, but still no dice.

Yesterday it dawned on me - my fancy switch itself generates LLDP packets! And once the cable modem has seen the switch MAC address, it considers that to be its friend, and thus blocks all my Linux server's DHCP requests!

I turned off LLDP, rebooted the modem, and was back in business.

Moral of this story - your fancy "enterprise" equipment may upset your consumer electronics.

Saturday, February 23, 2013

A horse, a donkey, a cow: a genetic diff

So, continuing the series 'if you are a hammer, everything looks like a nail', here we'll bridge the worlds of genetic sequencing and programming and show the diff between a horse, a cow and a donkey.

DNA is a lot like computer code, except that it is neither an imperative nor a functional programming language. DNA describes what amino acids end up in proteins, and these proteins have shapes and chemical properties which make them interact in a way that we call 'life'. For more context, see my earlier article 'DNA as seen through the eyes of a computer programmer'.

Like code, DNA evolves in the course of the development of life. Some code never changes, because it is so vital and tricky that any change immediately leads to a non-functioning organism. Other code is so uncritical or unimportant that it can (and does) change at a high clip, leading to many useful or perhaps detrimental mutations.

In between are pieces of DNA that are very consistent within species, but show remarkable change between them. Such code is used to fingerprint organisms, live, but mostly dead. Such a fingerprint (or better, a barcode) can quickly and reliably tell if we are eating horse, donkey or beef.

Huge databases have been established, one of which (BOLDSystems) can be queried here. This is called 'the barcode of life', and for animals, this has been standardized on the mitochondrial CO-1 gene, which encodes part of our aerobic metabolism, powering our cells.

So, what does this look like? Behold, the diff between a horse and a donkey:

As we can see, most DNA is identical, with variations mostly impacting individual nucleotides. In addition, there is one longer stretch that is different.

Now let's make a very current and relevant comparison: a horse and a cow:

Note that we still have lots of single mutations, but we also see a whole line that is mostly different! Clearly, a horse is not a cow. No matter how well you cook it!

If you want to make your own comparison, first look up the scientific (Latin) name of the desired animal. Next, look it up in the database, and from the list of sequences, pick two with the same CO-1 length (some barcodes contain more DNA than others).

Then use this tiny Python script to generate the nice html diffs you see above:

import sys, difflib, optparse

def main():
    usage = "usage: %prog [options] fromfile tofile"
    parser = optparse.OptionParser(usage)
    (options, args) = parser.parse_args()
    if len(args) != 2:
        parser.error("expected exactly two files to compare")

    fromfile, tofile = args

    # read both sequence files as lists of lines
    with open(fromfile) as f:
        one = f.readlines()
    with open(tofile) as f:
        theother = f.readlines()

    # HtmlDiff renders a side-by-side HTML table with the changes highlighted
    d = difflib.HtmlDiff()
    result = d.make_file(one, theother, fromfile, tofile)
    sys.stdout.write(result)

if __name__ == '__main__':
    main()
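
To put a number on such a comparison, one could also count the positions at which two equal-length sequences differ. This is only a minimal sketch and not part of the script above: the function name count_mismatches and the sample strings are made up for illustration, and they are not real CO-1 data.

```python
def count_mismatches(seq_a, seq_b):
    """Count positions where two equal-length nucleotide strings differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    # compare case-insensitively, position by position
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)

# made-up fragments for illustration only
print(count_mismatches("ACGTACGT", "ACGTTCGA"))
```

This only works for substitutions; insertions and deletions shift everything after them, which is why the diff above is the more forgiving tool.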

Good luck!

Sunday, January 20, 2013

A PowerDNS... WOK?!

From a letter, where you should know that 'Mok' is Dutch for 'Mug':

Dear PomerDNS management,

Recently me ordered, via a mebform or your mebsite, a so called PomerDNS Mok. but something ment terribly mrong since me received a PomerDNS Wok! Therefor me send this back to you in hope it can be changed for the proper article!

Mith friendly greetings,

A PomerDNS user

In the box, I found this... Wok:

Amazing proof that humor is not dead ;-)

Thanks to +Reinoud van Leeuwen for this wonderful letter, and I can assure you the wok is quite genuine - it will find a proud place in the office!

Friday, November 30, 2012

Adding new DNS record types to PowerDNS software

Our friends from NLNetLabs recently described how to add new record types to NSD, which I think is a great idea. Especially if this enables the community to add their favorite record types for us!

Here are the full descriptions on how we added the TLSA record type to all PowerDNS products, with links to the actual source code.

First, define the TLSARecordContent class in dnsrecords.hh:

class TLSARecordContent : public DNSRecordContent
{
public:
includeboilerplate(TLSA)

uint8_t d_certusage, d_selector, d_matchtype;
string d_cert;
};

The 'includeboilerplate(TLSA)' generates the four methods that do everything PowerDNS would ever want to do with a record:

read TLSA records from zonefile format
write out a TLSA record in zonefile format
read a TLSA record from a packet
write a TLSA record to a packet

The actual parsing code:

boilerplate_conv(TLSA, 52,
conv.xfr8BitInt(d_certusage);
conv.xfr8BitInt(d_selector);
conv.xfr8BitInt(d_matchtype);
conv.xfrHexBlob(d_cert, true);
)

This code defines the TLSA rrtype number as 52. Secondly, it says there are 3 eight bit fields for Certificate Usage, Selector and Match type. Next, it defines that the rest of the record is the actual certificate (hash). 'conv' methods are supplied for all DNS data types in use.
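
To make the field layout concrete, here is a rough Python sketch (not PowerDNS code) that takes TLSA record data as it appears in a packet and renders it in zonefile style. The function names are hypothetical; the only thing taken from the description above is the layout of three 8 bit fields followed by the certificate (hash) as a hex blob.

```python
import binascii
import struct

def parse_tlsa_rdata(rdata):
    """Split wire-format TLSA RDATA into (usage, selector, matchtype, cert)."""
    if len(rdata) < 3:
        raise ValueError("TLSA RDATA needs at least 3 bytes")
    # three unsigned 8-bit fields, the rest is the certificate (hash)
    usage, selector, matchtype = struct.unpack("!BBB", rdata[:3])
    return usage, selector, matchtype, rdata[3:]

def tlsa_to_zone_text(rdata):
    """Render TLSA RDATA in zonefile style: three numbers plus a hex blob."""
    usage, selector, matchtype, cert = parse_tlsa_rdata(rdata)
    hexblob = binascii.hexlify(cert).decode("ascii")
    return "%d %d %d %s" % (usage, selector, matchtype, hexblob)

# hypothetical record: usage 3, selector 1, match type 1, 4 bytes of data
print(tlsa_to_zone_text(b"\x03\x01\x01\xde\xad\xbe\xef"))
```

The point of the 'conv' approach in PowerDNS is that one field list drives all four directions; this sketch spells out just one of them by hand.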

Now add TLSARecordContent::report() to reportOtherTypes().

And that's it. For completeness, add TLSA and 52 to the QType enum in qtype.hh, which makes it easier to refer to the TLSA record in code if so required.

Please contact us to get your patch merged, or submit it via our GitHub page!

Wednesday, October 17, 2012

I'm a C++ dinosaur, but I'm OK

So here's a nice challenge. Let's say you have a list of member email addresses which you get from your account list. But you also have a list of email addresses that you have of your customers, addresses that your customers have used to communicate with you. And finally you have a list of email addresses of potential customers, but who are not yet signed up.

Now let's say someone decides to do a survey of future and existing customers, and sends out the survey to a list that purports to be that. And when the numbers of the survey come in.. mass confusion arises because the numbers don't add up.

So we now have a bunch of files, 'email addresses we mailed the survey to', 'main account email addresses', 'potential customer email addresses' and 'other customer email addresses'. And nobody knows why the last three don't add up to the first list.

Did I mention typos and polluted data? The pain.

So what would normal people do? Well, no idea, but we got on the case using 'diff'. This was tremendously helpful, but only up to a point. It is hard to answer questions like 'give me everybody not on list A but who is on B or C'. This is of course all very easy to do from SQL.

But I wasn't feeling like that. So, because C++ is what I do, I wrote a C++ program. I might be a dinosaur like that, but I feel compelled to point out that the program I present below will scale to literally billions and billions of customers. You know, in case you have more users than there are people on the planet.

So what does this small program do? Using slightly modern C++, it reads lines of text from several files. And once it is done, it will print for each unique line of text in which files it was found. This allows you to quickly see for each email address if it was part of 'main account email addresses', but perhaps not of 'potential customer addresses' etc.

Once it has printed that, it prints out a list of those same lines of text per 'configuration'. So you get groups of lines, and one group might represent 'on the list of addresses we emailed to, but not in any of the other files'. In other words, no clue why we emailed this address.

Another group might represent 'addresses we emailed and only present on the potential customer list'. And finally, and importantly, you might get the group 'we didn't email, but IS actually a customer'.

The output can be read by most spreadsheets, and they'll do the right thing.

All in all, this helped solve a tricky problem, and was actually implemented in less time than it would take to do it by hand. I hope ;-)

// read all input files, output for each line of text in which of the input files it was found
// public domain code by bert.hubert@netherlabs.nl
#include <string>
#include <vector>
#include <map>
#include <iostream>
#include <fstream>
#include <cstdlib>    // exit(), EXIT_FAILURE
#include <strings.h>  // strcasecmp()
#include <boost/dynamic_bitset.hpp>
#include <boost/algorithm/string.hpp>
#include <boost/foreach.hpp>

using namespace std;

// this allows us to make a Case Insensitive container
// (std::map only needs operator(); std::binary_function was removed in C++17)
struct CIStringCompare
{
  bool operator()(const string& a, const string& b) const
  {
    return strcasecmp(a.c_str(), b.c_str()) < 0;
  }
};

int main(int argc, char**argv)
{
  typedef map<string, boost::dynamic_bitset<>, CIStringCompare> presence_t;
  presence_t presence;
  
  string line;
  cout << '\t';
  for(int n = 1; n < argc; ++n) {
    cout << argv[n] << '\t';
    ifstream ifs(argv[n]);
    if(!ifs) {
      cerr<<"Unable to open '"<<argv[n]<<"' for reading\n"<<endl;
      exit(EXIT_FAILURE);
    }
    
    while(getline(ifs, line)) {
      boost::trim(line);
      if(line.empty())
        continue;
      presence_t::iterator iter = presence.find(line);
      if(iter == presence.end()) { // not present, do a very efficient 'insert & get location'
        iter = presence.insert(make_pair(line, boost::dynamic_bitset<>(argc-1))).first;  
      }
      iter->second[n-1]=1;
    }
  }
  cout << '\n';
  
  // this is where we store the reverse map, 'presence groups', so which lines were present in file1, but not file2 etc
  typedef map<boost::dynamic_bitset<>, vector<string> > revpresence_t;
  revpresence_t revpresence;
  
  BOOST_FOREACH(presence_t::value_type& val, presence) {
    revpresence[val.second].push_back(val.first);
    cout << val.first << '\t';
    for (boost::dynamic_bitset<>::size_type i = 0; i < val.second.size(); ++i) {
      cout << val.second[i] << '\t';
    }
    cout << endl;
  }
  
  cout << "\nPer group output\t\n";
  BOOST_FOREACH(revpresence_t::value_type& val, revpresence) {
    cout<<"\nGroup: \t";
    
    for (boost::dynamic_bitset<>::size_type i = 0; i < val.first.size(); ++i) {
      cout << val.first[i] << '\t';
    }
    
    cout << endl;
    
    BOOST_FOREACH(string& entry, val.second) {
      cout << entry << "\t\n";
    }
  }
}
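
For comparison, the same 'presence group' idea can be sketched in a few lines of Python. This is not a port of the program above, just an illustration under some assumptions: the function name presence_groups is made up, the input is passed as lists of lines instead of file names, and lowercasing stands in for strcasecmp() (so the output shows lowercased addresses, not the first spelling seen).

```python
from collections import defaultdict

def presence_groups(lists):
    """Map each presence pattern to the sorted lines that share it.

    `lists` holds one list of lines per input file. A pattern is a tuple of
    0/1 flags, one per input, like the bitsets in the C++ program.
    """
    presence = {}
    for n, lines in enumerate(lists):
        for line in lines:
            key = line.strip().lower()  # simple stand-in for strcasecmp()
            if not key:
                continue
            presence.setdefault(key, [0] * len(lists))[n] = 1
    groups = defaultdict(list)
    for key in sorted(presence):
        groups[tuple(presence[key])].append(key)
    return dict(groups)

# invented data: the first list is 'mailed', the second is 'accounts'
groups = presence_groups([["a@example.com", "B@example.com"],
                          ["b@example.com", "c@example.com"]])
for pattern, members in sorted(groups.items()):
    print(pattern, members)
```

The group with pattern (1, 0) is then 'mailed, but not an account', and (0, 1) is 'an account, but not mailed'.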

Monday, October 8, 2012

On binding datagram (UDP) sockets to the ANY addresses

This story goes back a long time. For around 10 years now, people have been requesting that PowerDNS learn how to automatically listen on all available IP addresses. And for slightly less than that time, we've been telling people we would not be adding that feature.

For one, if you run a nameserver, you should *know* what IP addresses you listen on! How else could people delegate to you, or rely on you to resolve their queries? Secondly, running services by default on 'all' IP addresses is a security risk. The PowerDNS Recursor for this reason binds to 127.0.0.1 by default.

But still, people wanted this feature, and we didn't do it. Because we knew it'd be hard work. There, the truth is out. But we finally bit the bullet and had to figure out how to do it. This page shares that knowledge, including the fact that the Linux manpages tell you to do the wrong thing.

There are two ways to listen on all addresses, one of which is to enumerate all interfaces, grab all their IP addresses, and bind to all of them. Lots of work, and non-portable work too. We really did not want to do that. You also need to monitor new addresses arriving.

Secondly, just bind to 0.0.0.0 and ::! This works very well for TCP and other connection-oriented protocols, but can fail silently for UDP and other connectionless protocols. How come? When a packet comes in on 0.0.0.0, we don't know which IP address it was sent to. And this is a problem when replying to such a packet - what would the correct source address be? Because we are connectionless (and therefore stateless), the kernel doesn't know what to do.

So it picks the most appropriate address, and that may be the wrong one. There are some heuristics that make some kernels do the right thing more reliably, but there are no guarantees.

When receiving packets on datagram sockets, we usually use recvfrom(2), but this does not provide the missing bit of data: which IP address the packet was actually sent to. There is no recvfromto(). Enter the very powerful recvmsg(2), which can deliver a boatload of extra parameters per datagram, as requested via setsockopt().

One of the parameters we can request is the original destination IP address of the packet.

IPv6

For IPv6, this is actually standardized in RFC 3542, which tells us to request parameter IPV6_RECVPKTINFO via setsockopt(), which will lead to the delivery of the IPV6_PKTINFO parameter when we use recvmsg(2).

This parameter is sent to us as a struct in6_pktinfo, and its ipi6_addr member contains the original destination IPv6 address of the query.

When replying to a packet from a socket bound to ::, we have the reverse problem: how to specify which *source* address to use. To do so, use sendmsg(2) and specify an IPV6_PKTINFO parameter, which again contains a struct in6_pktinfo.

And we are done!
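
The steps above can be sketched in Python, which exposes the same setsockopt(), recvmsg() and sendmsg() calls. This is a minimal sketch and not the PowerDNS implementation: all function names are made up, the 16 byte address plus unsigned interface index layout of struct in6_pktinfo is assumed as on Linux, and error handling is left out.

```python
import socket
import struct

def enable_ipv6_pktinfo(sock):
    """Ask the kernel to attach IPV6_PKTINFO to every received datagram."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)

def ipv6_destination(ancdata):
    """Return the original destination address from recvmsg() ancillary data.

    `ancdata` is the list of (level, type, data) tuples returned by
    socket.recvmsg(). Returns None if no IPV6_PKTINFO is present.
    """
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IPV6 and ctype == socket.IPV6_PKTINFO:
            # struct in6_pktinfo: 16-byte ipi6_addr, then ipi6_ifindex
            return socket.inet_ntop(socket.AF_INET6, data[:16])
    return None

def ipv6_source_ancdata(address, ifindex=0):
    """Build sendmsg() ancillary data that selects the source address."""
    pktinfo = socket.inet_pton(socket.AF_INET6, address) + struct.pack("@I", ifindex)
    return [(socket.IPPROTO_IPV6, socket.IPV6_PKTINFO, pktinfo)]

def echo_once(sock):
    """Receive one datagram and reply from the address it was sent to."""
    data, ancdata, flags, peer = sock.recvmsg(4096, socket.CMSG_SPACE(20))
    dest = ipv6_destination(ancdata)
    sock.sendmsg([data], ipv6_source_ancdata(dest) if dest else [], 0, peer)
```

To try it, create an AF_INET6 datagram socket, call enable_ipv6_pktinfo() on it, bind it to ::, and call echo_once() in a loop.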

To get this to work on OSX, please #define __APPLE_USE_RFC_3542, but otherwise this feature is portable across FreeBSD, OSX and Linux. (Please let me know about Windows, I want to make this page as valuable as possible).

IPv4

For IPv4 the situation is more complicated. Linux and the BSDs picked a slightly different way to do things, since they did not have an RFC to guide them. Confusingly, the Linux manpages document this incorrectly (I'll submit a patch to the manpages as soon as everybody agrees that this page describes things correctly).

For BSD, use a setsockopt() called IP_RECVDSTADDR to request the original destination address. This then arrives as an IP_RECVDSTADDR option over recvmsg(), which carries a struct in_addr. Note that this holds only the IP address, and NOT other details (like for example the destination port number).

For Linux, use the setsockopt() called IP_PKTINFO, which will get you a parameter over recvmsg() called IP_PKTINFO, which carries a struct in_pktinfo, which has a 4 byte IP address hiding in its ipi_addr field.

Conversely, for sending on Linux pass an IP_PKTINFO parameter using sendmsg() and make it contain a struct in_pktinfo, with the desired source address in its ipi_spec_dst field.

On FreeBSD, pass the IP_SENDSRCADDR option, and make it contain a struct in_addr. Again, this carries no source port, which would not make sense anyhow, as your socket is bound to exactly one port number (even if it covers many IP addresses).
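
For the Linux variant, a rough Python sketch could look like this. It is Linux-only and makes a few assumptions that are labeled in the comments: older Python versions do not export socket.IP_PKTINFO, so the Linux value 8 is used as a fallback, and struct in_pktinfo is assumed to be an int followed by two 4 byte addresses. The function names are made up.

```python
import socket
import struct

# assumption: older Python versions lack socket.IP_PKTINFO; 8 is the Linux value
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)

# struct in_pktinfo on Linux: int ipi_ifindex, then ipi_spec_dst, then ipi_addr
IN_PKTINFO = struct.Struct("@i4s4s")

def enable_ipv4_pktinfo(sock):
    """Ask the kernel to attach IP_PKTINFO to every received datagram."""
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)

def ipv4_destination(ancdata):
    """Return the original destination address from recvmsg() ancillary data."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            ifindex, spec_dst, addr = IN_PKTINFO.unpack(data[:IN_PKTINFO.size])
            return socket.inet_ntoa(addr)  # ipi_addr: where the packet was sent to
    return None

def ipv4_source_ancdata(address, ifindex=0):
    """Build sendmsg() ancillary data that selects the source address."""
    # when sending, the kernel takes the source address from ipi_spec_dst
    pktinfo = IN_PKTINFO.pack(ifindex, socket.inet_aton(address), b"\0" * 4)
    return [(socket.IPPROTO_IP, IP_PKTINFO, pktinfo)]
```

The list returned by ipv4_source_ancdata() goes in as the second argument of sock.sendmsg(), and ipv4_destination() takes the second element of what sock.recvmsg() returns.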

Binding to :: for IPv6 and IPv4 purposes

On Linux, one can bind to :: and get packets destined for both IPv6 and IPv4. The good news is that this combines well with the above, and Linux delivers an IPv4 IP_PKTINFO for IPv4 packets, and will also honour the IP_PKTINFO for outgoing IPv4 packets on such a combined IPv4/IPv6 socket.

On FreeBSD, and probably other BSD-derived systems, one should bind explicitly to :: and 0.0.0.0 to cover IPv4 and IPv6. This is probably better. To get this behaviour on Linux, use the setsockopt() IPV6_V6ONLY, or set /proc/sys/net/ipv6/bindv6only to 1.
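
As an illustration of the setsockopt() route, here is a small Python sketch that makes the choice explicit instead of relying on the system default. The helper name udp_socket_v6only is made up, and this is a sketch, not how PowerDNS does it.

```python
import socket

def udp_socket_v6only(port, v6only=True):
    """Create a UDP socket bound to :: with IPV6_V6ONLY set explicitly."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    try:
        # set before bind(): 1 means IPv6 only, 0 means also accept IPv4
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1 if v6only else 0)
        sock.bind(("::", port))
    except OSError:
        sock.close()
        raise
    return sock
```

With v6only=True, a second socket can then be bound to 0.0.0.0 on the same port to cover IPv4.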

Actual source code

To see all this in action, head over to http://wiki.powerdns.com/trac/browser/trunk/pdns/pdns/nameserver.cc - it contains the relevant setsockopt(), sendmsg() and recvmsg() calls.

Tuesday, August 28, 2012

A few quick notes on making an application FULLY IPv6 compliant

Over the past decade, PowerDNS has become ever more IPv6 compliant, and I think that as of a year or so ago, we have fixed every last issue.

So why did it take so long? Just creating an AF_INET6 socket and binding it shouldn't be that hard, right?

Here are some points to ponder:

IP addresses are used for more things than offering services! In other words, once your application can bind() to an IPv6 address, you are not done.

Filtering - if you offer the ability to restrict service to certain classes of IP addresses, your filtering needs to be IPv6 aware
Proxying - if your service can forward requests to somewhere else, that too needs to know what socket family to use
Databases and web services often supply data you need to do your work. And those underlying services too can live on IPv6!

So look not only at each call to bind() but to each call to connect() too!

IPv6 addresses are like IPv4 addresses.. but not quite

Scoped link local addresses. In some circles, it is recommended to bind services to locally scoped addresses, and such addresses look like fe80::92fb:a6ff:fe4a:51da%eth0. You must make sure you use the right API call to translate such a human address representation into a sockaddr_in6. I recommend getaddrinfo(), but pay close attention to its non-standard return codes!

Address lookup - if the operator specifies that queries need to be forwarded to 'downstream.yourcompany.com', did he mean the IPv4 address of that host? Or the IPv6 address? Or both? Or the first one which works? If there are multiple IPv4 and IPv6 addresses, what to do?

Also, be prepared to look up the address again. My Android phone tries to talk IPv6 over the IPv4-only cellular network immediately after leaving my home Wifi - which doesn't work!

Ports - for IPv4, it is common to use 1.2.3.4:25 to describe port 25 on host 1.2.3.4. But how to map this to IPv6, which already uses : internally? Commonly used patterns are [::1]:25 or ::1#25. Pick one, both for input and output. ::1:25 is ambiguous.
Some systems have no IPv6 at a very fundamental level. This includes security conscious FreeBSD and OpenBSD users. Make sure that your application can deal with a failure to bind to ::, and also with a failure to even *make* an AF_INET6 socket.
Some systems might not have an IPv4 address, preventing you from binding to 127.0.0.1 or even 0.0.0.0! This is a good test case to see if you really fixed everything.
When writing socket family agnostic code, be aware that some operating systems are more picky than others. For example, Linux will allow you to pass an AF_INET sockaddr with a socket length that is > sizeof(sockaddr_in), for example sizeof(sockaddr_in6). FreeBSD complains about this. So always supply the correct sockaddr length!
sockaddr_in6 contains a lot of fields, some of which are never used. This is unlike sockaddr_in which is completely specified by family, IP address and port. Be sure to zero your sockaddr_in6 before use, as it might start to fail silently later on if you don't.
On Linux, if you bind an IPv6 socket to the IPv6 address ::, by default, it will also listen on IPv4 0.0.0.0! This is unexpected and a bit weird, but it is what happens. If an IPv4 connection comes in over the IPv6 socket, the IPv4 address will be mapped to an IPv6 address that starts with ::ffff. The major issue with this is that you suddenly can't bind anything (else) to 0.0.0.0 anymore on the same port number. Use the IPV6_V6ONLY setsockopt() option to fix this.

If you still use such :: binding to IPv4 and IPv6, make sure that any access filters you have will correctly match ::ffff:1.2.3.4 to 1.2.3.4!
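
A filter that handles such mapped addresses could look roughly like the Python sketch below. It uses the standard ipaddress module; the function names and the example networks are made up, and this is an illustration of the matching problem, not the PowerDNS NetMask code.

```python
import ipaddress

def normalize_address(text):
    """Parse an address and unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) to IPv4."""
    addr = ipaddress.ip_address(text)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return addr.ipv4_mapped
    return addr

def allowed(text, networks):
    """Check a peer address against a list of allowed networks."""
    addr = normalize_address(text)
    # only compare against networks of the same address family
    return any(addr.version == net.version and addr in net for net in networks)

acl = [ipaddress.ip_network("1.2.3.0/24"), ipaddress.ip_network("2001:db8::/32")]
print(allowed("::ffff:1.2.3.4", acl))   # mapped form of 1.2.3.4
print(allowed("::ffff:10.0.0.1", acl))  # mapped form of 10.0.0.1
```

Without the normalization step, ::ffff:1.2.3.4 would never match the 1.2.3.0/24 entry, and the filter would quietly reject (or worse, with a deny list, accept) the wrong clients.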

Finally, for PowerDNS it took ages to remove the last 'IPv4-only' vestiges. We only were done once we had audited *each* and *every use* of AF_INET in our source tree.

It is highly recommended to use an abstraction that allows you to specify, pass and print addresses without having to worry about them being IPv4 or IPv6. The PowerDNS ComboAddress might serve as inspiration. Code can be found here. From there you can also find code that helps match IPv4 and IPv6 addresses to NetMasks and NetMaskGroups, for easy filtering.
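
To give an idea of what such an abstraction involves, here is a very small Python sketch that parses and prints address/port pairs for both families, using the [::1]:25 notation. It is only loosely inspired by the idea and shares no code with ComboAddress; the function names and the default port of 53 are assumptions for the example.

```python
import ipaddress

def parse_endpoint(text, default_port=53):
    """Parse '1.2.3.4:25', '1.2.3.4', '[::1]:25' or '::1' into (address, port)."""
    if text.startswith("["):
        host, sep, rest = text[1:].partition("]")
        if not sep:
            raise ValueError("missing closing ']' in %r" % text)
        if rest and not rest.startswith(":"):
            raise ValueError("expected ':port' after ']' in %r" % text)
        port = int(rest[1:]) if rest else default_port
    elif text.count(":") == 1:
        # exactly one colon: IPv4 address followed by a port
        host, _, port_text = text.partition(":")
        port = int(port_text)
    else:
        # no colon (bare IPv4) or several colons (bare IPv6): no port given
        host, port = text, default_port
    if not 0 <= port <= 65535:
        raise ValueError("port out of range in %r" % text)
    return ipaddress.ip_address(host), port

def format_endpoint(address, port):
    """Print an endpoint in the same notation that parse_endpoint() accepts."""
    if address.version == 6:
        return "[%s]:%d" % (address, port)
    return "%s:%d" % (address, port)

print(format_endpoint(*parse_endpoint("[::1]:25")))
print(format_endpoint(*parse_endpoint("::1:25")))
```

Note how the second call treats ::1:25 as a bare IPv6 address with the default port, which is exactly the ambiguity mentioned earlier.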

Finally, unless you truly audit each and every socket operation, you will miss certain corner cases, so do expect some fallout down the road.
