Serverless Computing - A Paradigm Shift

January 20, 2018

In the beginning

When I first started programming (on the ZX Spectrum), you would typically write a program such as this (apologies if it’s syntactically incorrect, but I don’t have a Speccy interpreter to hand):


10 PRINT "Hello World"
20 GOTO 10

You could type that into a computer at Tandy’s and chuckle as the shop assistants tried to work out how to stop the program (sometimes you might not even use “Hello World”, but something more profane).

However, no matter how long it took them to work out how to stop the program, they only paid for the electricity the computer used while it was on. Further, there would only ever be a finite and, presumably (I never tried), predictable number of “Hello World” messages produced in an hour.

The N-Tier Revolution

Fast forward a few years, and everyone’s talking about N-Tier computing. We’re writing programs that run on servers. Some of those servers are big and expensive, but pretty much the same statements hold true. No matter how big and complex the program you run on the server, it’s your server (well, actually, it probably belongs to a company that you work for in some capacity). For example, if you have a poorly written SQL Server stored procedure that scans an entire table, the same two statements are still true: no matter how long it takes to run, the price of execution is consistent, and the amount of time it takes to run is predictable (although you may decide that if it’s slow, speeding it up is a better use of your time than calculating exactly how slow it is).

Using Other People’s Computers

And now we come to cloud computing… or do we? The chronology is a little off here, and that’s mainly because everyone keeps forgetting what cloud computing actually is: you’re renting time on somebody else’s computer. If I were twenty years older, I might have started this post by saying that this was, in a manner of speaking, what was done back in the 70s and 80s. But I’m not, so we’ll jump back to the mid-to-late 90s and SETI. Anyone who had a computer back in those days of dial-up connections and 14.4K modems will remember that SETI (the Search for Extra-Terrestrial Intelligence) was distributing a program for everyone to run on their computers instead of toaster screensavers*.

Wait - isn’t that what cloud computing is?

Yes - but SETI did it in reverse. You would dial up, download a chunk of data, and your machine would process it as a screensaver.

Everyone was happy to do that for free, because we all want to find aliens. But what if there had been a cost associated with each byte of data processed? Clearly something similar was on Amazon’s mind when they started down this road.

Back to the Paradigm Shift

So, the title and gist of this post: the two statements that have held true right through, from programs written on the ZX Spectrum, to programs written in Turbo C and Pascal, to programs written in C# running on a dedicated server, have now changed. Here are the four big changes brought about by the cloud**:

1. Development and Debugging

If you write a program, you can no longer be sure of the cost, or the execution time. This isn’t a scare post: both are broadly predictable; but the paradigm has changed. If you’re in the middle of writing an Azure Function, or a GCP BigQuery query, and it doesn’t work, you can’t just close your laptop and go for dinner while you have a think, because while you do, nodes are lighting up all over the world trying to complete your task. The lights are dimming in Seattle while your Azure Function crashes again and again.

2. Scale and Accessibility

The second big change is the way that your code is scaled. For example, you might be used to parallelising slow code so that you can make use of all available threads; however, in our new world, if you do that, you may actually be making it harder for your cloud platform of choice to scale your code.
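
To make that concrete, here’s a minimal sketch (assuming the 2018-era Azure Functions C# class-library bindings - QueueTrigger and TraceWriter - and made-up names such as the “order-batches” and “orders” queues) contrasting the two approaches: parallelising a batch yourself inside one invocation, versus handling one message per invocation and letting the platform decide how many copies to run.

using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public static class OrderFunctions
{
    // Approach 1: fan out yourself. All the work is pinned to the one instance
    // running this invocation, so the platform can't spread it out for you.
    [FunctionName("ProcessOrderBatch")]
    public static void ProcessOrderBatch(
        [QueueTrigger("order-batches")] string batch,
        TraceWriter log)
    {
        Parallel.ForEach(batch.Split(';'), order => Process(order, log));
    }

    // Approach 2: one small thing per message. The platform is free to run two,
    // three, or a million copies of this function side by side.
    [FunctionName("ProcessOrder")]
    public static void ProcessOrder(
        [QueueTrigger("orders")] string order,
        TraceWriter log)
    {
        Process(order, log);
    }

    private static void Process(string order, TraceWriter log)
    {
        log.Info($"Processing order: {order}");
        // ... the actual work goes here ...
    }
}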

Because you pay per minute of computing time (which is typically more expensive than storage), code that is unnecessarily slow or inefficient may not cause your system to slow down - what it will probably do is cost you more money to run at the same speed.
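
As a back-of-the-envelope illustration (the prices here are invented purely for the arithmetic; they are not any provider’s real rate card): suppose a function runs a million times a month, and an inefficient query makes each run take three seconds instead of one. At a notional £0.00001 per second of compute, that’s £30 a month instead of £10. Nobody notices anything is slow - you just pay three times as much for the same result.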

In some cases, you might find it’s more efficient to do the opposite of what you have thus far believed to be accepted industry best practice. For example, it may prove more cost-efficient to denormalise data that you need to access.
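
As a sketch of what that might look like (the Order types here are made up for illustration), a denormalised document keeps the fields you read together in one place, so a single cheap read replaces a join or a second lookup:

// Normalised: displaying an order means a join, or a second read, for the customer.
public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public decimal Total { get; set; }
}

// Denormalised: the customer fields shown with the order are copied onto it,
// trading a little duplicated storage for one cheaper read.
public class OrderDocument
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public string CustomerName { get; set; }
    public string CustomerEmail { get; set; }
    public decimal Total { get; set; }
}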

3. Logging

This isn’t exactly new: you should always have been logging in your programs - it’s just common sense. However, there’s a new emphasis here: you’re not running this on your own server (you’re not even running it on a customer’s server) - it’s somebody else’s. That means that if it crashes, there’s a limited amount of investigation that you can do. As a result, you need to log profusely.
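
As a rough sketch of what “logging profusely” might look like in an Azure Function (again assuming the 2018-era TraceWriter API, with a hypothetical “import-customers” queue):

using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public static class ImportFunctions
{
    [FunctionName("ImportCustomer")]
    public static void Run(
        [QueueTrigger("import-customers")] string payload,
        TraceWriter log)
    {
        log.Info($"ImportCustomer started; payload length: {payload?.Length ?? 0}");

        try
        {
            // ... parse and save the customer here ...
            log.Info("ImportCustomer completed");
        }
        catch (Exception ex)
        {
            // When the only machine involved is somebody else's, this log entry
            // may be the only post-mortem you ever get.
            log.Error($"ImportCustomer failed; payload: {payload}", ex);
            throw;
        }
    }
}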

4. Microservices and Message Busses

IMHO, there are two reasons why the microservice architecture has become so popular in the cloud world. The first is that it’s easy: spinning up a new Azure Function endpoint takes minutes.

Secondly, it’s more scalable. Microservices make your code more scalable because it’s easy for the cloud provider to instantiate two instances of your program, and then three, and then a million. If your program does one small thing, then only that small thing needs to be instantiated. If your program does twenty different things, it can still scale, but it’ll cost more, because it will require more processing power.

Finally, instead of simply calling the service that you need, there is now the option to place a message on a queue; apart from separating your program into definable areas of responsibility, this means that, when your cloud provider of choice scales your service out, each instance can pick up a message and deal with it.
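
For example, here’s roughly what placing a message on an Azure Storage queue looked like with the 2018-era WindowsAzure.Storage SDK (the connection string and the “orders” queue name are placeholders); any of the scaled-out consumer instances can then pick the message up:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class OrderPublisher
{
    // Instead of calling the order service directly, drop a message on a queue
    // and let whichever instance is free deal with it.
    public static void PublishOrder(string connectionString, string orderJson)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudQueueClient client = account.CreateCloudQueueClient();
        CloudQueue queue = client.GetQueueReference("orders");

        queue.CreateIfNotExists();
        queue.AddMessage(new CloudQueueMessage(orderJson));
    }
}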

Summary

So, what’s the conclusion: is cloud computing good, or bad? In five years’ time, the idea that someone in your organisation will know the spec of the machine your business-critical application runs on will seem a little far-fetched. The days of having a trusted server that has a load of hugely important stuff on it, but nobody really knows what, and that has been running since 1973, are numbered.

Obviously, there’s a price to pay for everything. In the case of the cloud, it’s complexity: not of the code, and not really of the combined system, but of the sheer number of moving parts you introduce. If you decide to segregate your database, you might find that you have several databases; tens, or even hundreds, of independent processes and endpoints; you could even spread all of that across multiple cloud providers. It’s not that hard to lose track of where these pieces live and what they are doing.

Footnotes

* If you don’t get this reference then you’re probably under 30.

** Clearly this is an opinion piece.


