When we pick technologies (languages, compilers, CPUs, computers, routing platforms, databases etc.), we need ones that can deal with the size or scale of our problem. Now, what is this scale? And does it matter?
If you pick wrong, you will find out. If you pick an operating system, open a million files, and find that the kernel bogs down because it has lots of linear searches over open file descriptors, you know the designers of the operating system weren't thinking of your use case. And perhaps in this specific example, you could work around the issue (you don't actually _need_ a million open files at once), but you are pretty much guaranteed to run into other linear walks. They just weren't thinking at your scale.
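To make the linear-walk problem a bit more concrete, here is a minimal Python sketch. It is not how any real kernel manages file descriptors: `LinearTable`, `IndexedTable` and `total_steps` are hypothetical names made up for this illustration. All it does is count the steps needed to look up every entry once, first by walking a list and then via a dictionary.

```python
class LinearTable:
    """Hypothetical descriptor table that finds entries by walking a list."""

    def __init__(self):
        self.entries = []  # list of (fd, name) pairs
        self.steps = 0     # comparisons performed so far

    def open(self, fd, name):
        self.entries.append((fd, name))

    def lookup(self, fd):
        # Linear walk: cost grows with the number of open entries.
        for entry_fd, name in self.entries:
            self.steps += 1
            if entry_fd == fd:
                return name
        raise KeyError(fd)


class IndexedTable:
    """Hypothetical descriptor table that finds entries via a dict."""

    def __init__(self):
        self.entries = {}  # fd -> name
        self.steps = 0     # lookups performed so far

    def open(self, fd, name):
        self.entries[fd] = name

    def lookup(self, fd):
        # One hash lookup, counted as a single step regardless of table size.
        self.steps += 1
        return self.entries[fd]


def total_steps(table, n):
    """Open n entries, look each one up once, and return the step count."""
    for fd in range(n):
        table.open(fd, f"file-{fd}")
    for fd in range(n):
        table.lookup(fd)
    return table.steps


if __name__ == "__main__":
    for n in (1_000, 10_000):
        print(n, total_steps(LinearTable(), n), total_steps(IndexedTable(), n))
```

In this toy model, going from 1,000 to 10,000 entries multiplies the work of the linear table by roughly a hundred, while the indexed table's work grows only tenfold. That is the kind of squeak you only hear once you are well past the scale the designers had in mind.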
So what determines the 'natural scale' for which a technology is suited? Mostly, it turns out to be determined by all the individual parts that break when scaling to larger volumes, larger data rates, larger change rates etc. And to scale up a technology, you need to address all these individual squeaking parts, of which there might be many.
If you settled on something that scales to 'X', and then your use grows to '10X', you might find yourself being held back by a myriad small things that slow you down. And this will not change quickly - you picked your infrastructure as, well, infrastructure. It is there. It is a given. It will change, but only on multi-year timescales. So it matters, since now you'll be toast for years to come. Pick correctly!
Now, to drive home this point, and this is the real excuse for this post, here's a fair-use busting quote from the most wonderful Cryptonomicon by Neal Stephenson. You should read it. Many times. The first 100 pages are a bit slow, but THEN it delivers. Here is one of the book's heroes, Bobby Shaftoe, brilliantly explaining the concept of a technology's natural scale:
"Now when Bobby Shaftoe had gone through high school, he’d been slotted into a vocational track and ended up taking a lot of shop classes. A certain amount of his time was therefore, naturally, devoted to sawing large pieces of wood or metal into smaller pieces. Numerous saws were available in the shop for that purpose, some better than others. A sawing job that would be just ridiculously hard and lengthy using a hand saw would be accomplished with a power saw.
Likewise, certain cuts and materials would cause the smaller power saws to overheat or seize up altogether and therefore called for larger power saws. But even with the biggest power saw in the shop, Bobby Shaftoe always got the sense that he was imposing some kind of stress on the machine. It would slow down when the blade contacted the material, it would vibrate, it would heat up, and if you pushed the material through too fast it would threaten to jam."
I'd like to focus a bit on this feeling - "imposing stress on the machine". In the computing world, we recognize this. We have an intuitive feeling that our MySQL database will become severely unhappy with a billion rows, but will zoom with 50 million rows, for example. That Bobby Shaftoe talks about this same feeling indicates it might have broader, more universal, technological roots.
"But then one summer he worked in a mill where they had a bandsaw. The bandsaw, its supply of blades, its spare parts, maintenance supplies, special tools and manuals occupied a whole room. It was the only tool he had ever seen with infrastructure. It was the size of a car. The two wheels that drove the blade were giant eight-spoked things that looked to have been salvaged from steam locomotives. Its blades had to be manufactured from long rolls of blade-stuff by unreeling about half a mile of toothed ribbon, cutting it off, and carefully welding the cut ends together into a loop.
When you hit the power switch, nothing would happen for a little while except that a subsonic vibration would slowly rise up out of the earth, as if a freight train were approaching from far away, and finally the blade would begin to move, building speed slowly but inexorably until the teeth disappeared and it became a bolt of pure hellish energy stretched taut between the table and the machinery above it. Anecdotes about accidents involving the bandsaw were told in hushed voices and not usually commingled with other industrial-accident anecdotes.
Anyway, the most noteworthy thing about the bandsaw was that you could cut anything with it and not only did it do the job quickly and coolly but it didn’t seem to notice that it was doing anything. It wasn’t even aware that a human being was sliding a great big chunk of stuff through it. It never slowed down. Never heated up."
And again, we can recognize this feeling. Set up a million routes on your big BGP router? It'll just smile at it. It is doing what it was meant to do. It is not impressed. Reload them all you want. Do something like that on your Raspberry Pi, however, and you know you are "imposing on the machine".
Bobby now expands a bit on the concept:
"In Shaftoe’s post-high-school experience he had found that guns had much in common with saws. Guns could fire bullets all right, but they kicked back and heated up, got dirty, and jammed eventually. They could fire bullets in other words, but it was a big deal for them, it placed a certain amount of stress on them, and they could not take that stress forever."
The phrase "it was a big deal for them" is one to recognize and compare with your feelings about the technology you are considering. Is it a big deal for it to do what you want?
"But the Vickers in the back of this truck was to other guns as the bandsaw was to other saws. The Vickers was water-cooled. It actually had a f*cking radiator on it. It had infrastructure, just like the bandsaw, and a whole crew of technicians to fuss over it. But once the damn thing was up and running, it could fire continuously for days as long as people kept scurrying up to it with more belts of ammunition. After Private Mikulski opened fire with the Vickers, some of the other Detachment 2702 men, eager to pitch in and do their bit, took potshots at those Germans with their rifles, but doing so made them feel so small and pathetic that they soon gave up and just took cover in the ditch and lit up cigarettes and watched the slow progress of the Vickers’ bullet-stream across the roadblock.
Then he ceased firing at last. Shaftoe felt like he should make an entry in a log book"
Again, note the awe the technology inspires, and how it is pitted against the rifles, which simply operate at a very different class of scale.
As a case in point, over at PowerDNS we've been taking LMDB for a spin. And I can tell you, it is a water-cooled Vickers with a radiator. We've thrown everything we've had at it, and it just zooms along, like it is enjoying the challenge. It chomps on the zones like they aren't even there.
So, wrapping up this brief post - whenever you select new technology for a project, be it a compiler, a language, a computer, a router, a database - try to feel its scale. The feelings of 'imposing a strain', 'being a big deal' and 'not even noticing the work' described above may well guide you to pick the right technology. Good luck!