Creating our new product has given me the opportunity to explore, evaluate and implement newer approaches to building an IoT cloud backend.
Go Vs TypeScript
For a start, it gave me the chance to learn Go (a relatively new programming language). I don’t want to start a holy war over JS/TypeScript vs Go, but I’ve run into scaling issues with TS in the past, whereas Go appears to scale vertically and maintain its performance as load increases, much like the findings in the excellent YouTube comparison by Ben Davis - Tech.
I’d also recommend picking up Go after you’ve used/mastered TS, then choosing whichever is appropriate for what you’re trying to achieve. In our case, we want highly performant, scalable data processing pipelines with backend APIs/RPC services and perhaps some CLI tools, so Go is an obvious choice for us. Before you ask: yes, Rust was also considered in these discussions, but the initial learning curve was a bit of a turn-off (...maybe in V2).
Selecting a Message Broker
Given our platform needs to deal with distributing vast numbers of different types of packets from different sources between cloud services, the choice of message broker is a particularly crucial one.
Our customers need to be able to send and access high-throughput, real-time data, so we need a scalable, reliable and low-latency message broker for our data processing pipeline. As with practically all cloud frameworks and protocols, there seems to be a vast array of viable options. I’ve previously used RabbitMQ, so I’m familiar with its functionality (and shortcomings). But after filtering the long list of brokers down to production-ready options that let us remain cloud-provider agnostic (so no Azure Service Bus, GCP Pub/Sub, AmazonMQ, etc.), the general consensus seemed to be either RabbitMQ or Apache Kafka.
RabbitMQ Vs Kafka
There are a number of useful articles that have already done a thorough comparison of the two:
https://www.upsolver.com/blog/kafka-versus-rabbitmq-architecture-performance-use-case
https://www.pubnub.com/blog/kafka-vs-rabbitmq-choosing-the-right-messaging-broker/
But the main takeaways for me, considering Embeint’s use cases, were:
Kafka offers higher performance, scalability and throughput than RabbitMQ with fewer resources. It focuses less on message routing and delivering customised message queues for consumers, and more on storing and retaining large logs of data, allowing each consumer to control which messages it has processed.
RabbitMQ excels at complex message routing, allowing individual messages to be routed and fanned out to different exchanges and queues based on the message’s routing key. RabbitMQ also guarantees reliable message delivery, with consumers pulling and removing data from the queue as it is processed. However, it does not offer the same throughput or scalability as Kafka.
In our use case (a real-time, high-throughput device data pipeline where every millisecond of latency and processing time matters), we are less concerned with complex, dynamic message routing and much more concerned with low latency, scalability and robustness.
So in the end, Kafka seemed the better choice for our message broker. I’m only a few weeks into using it, but so far, so good. Right now I’m having a good look at confluent.io, which seems to host a pretty solid managed instance used by many. But it’s still early days, so I’d love any comments/feedback on advantages or downsides of either option that I might have missed. Send me a message over on LinkedIn.