"Try it yourself" "very quickly wanted to get real-time streaming for more"
My experience is the opposite.
You think you need streaming, so you "try it out" and build something incredibly complex with Kafka that needs round-the-clock maintenance to monitor congestion in every pipeline.
And 10x more expensive because your servers are always up.
And some clever (expensive) engineers who figure out how watermarks, out-of-order data and streaming joins really work, and how to implement them in a parallel way without SQL.
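To make the watermark point concrete: here is a toy sketch (every name and number invented, nothing like a real Flink job) of the mechanism those engineers end up re-deriving: a watermark trails the max event time seen by an allowed lateness, windows only close once the watermark passes their end, and anything later than that is silently dropped.

```python
# Toy watermark/windowing sketch. ALLOWED_LATENESS and WINDOW_SIZE are
# arbitrary illustration values, not defaults from any real framework.
from collections import defaultdict

ALLOWED_LATENESS = 2   # time units the watermark lags behind max event time
WINDOW_SIZE = 10       # tumbling window width

def process(events):
    """events: (event_time, value) pairs in arrival order (not time order)."""
    windows = defaultdict(int)   # window start -> running sum
    closed = {}                  # window start -> final sum, once emitted
    watermark = float("-inf")
    for ts, value in events:
        watermark = max(watermark, ts - ALLOWED_LATENESS)
        window = (ts // WINDOW_SIZE) * WINDOW_SIZE
        if window + WINDOW_SIZE <= watermark:
            continue  # too late: that window already closed, data is dropped
        windows[window] += value
        # emit every window the watermark has now passed
        for w in [w for w in windows if w + WINDOW_SIZE <= watermark]:
            closed[w] = windows.pop(w)
    return closed, windows

# The event at time 3 arrives after time 12 pushed the watermark to 10,
# so it is dropped; window [0, 10) closes with a sum of 2, not 3.
done, pending = process([(1, 1), (4, 1), (12, 1), (3, 1), (25, 1)])
print(done)     # {0: 2, 10: 1}
print(pending)  # {20: 1}
```

And that is the easy single-threaded version; the batch equivalent is a GROUP BY.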
And of course a Renovate bot to keep your fancy (but half-baked) framework (Flink) on the latest version.
And you want to tune your logic? Luckily the last 3 hours of data are stored in Kafka, so all you have to do is reset all consumer offsets, clean your pipelines and restart your job, and the input data will hopefully be almost the same as the last time you ran it. (Compare that to changing a parameter and re-running an SQL query.)
When all your business case really needed was a monthly report. And that you can achieve with pub/sub and an SQL query.
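For the sake of illustration, a hypothetical monthly report over events you have landed somewhere queryable (SQLite here as a stand-in; the table and column names are made up) really can be the whole "pipeline", and tuning it means editing one query and running it again:

```python
# Minimal sketch of the batch alternative: events landed in a table,
# one aggregation query as the entire monthly report.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2024-01-05", 10.0), ("2024-01-20", 5.0), ("2024-02-03", 7.5)],
)

# The whole report: group by month (first 7 chars of an ISO date),
# re-runnable at will with different parameters.
report = conn.execute(
    "SELECT substr(ts, 1, 7) AS month, SUM(amount) AS total "
    "FROM events GROUP BY month ORDER BY month"
).fetchall()
print(report)  # [('2024-01', 15.0), ('2024-02', 7.5)]
```

No offsets to reset, no pipelines to drain; out-of-order arrival is a non-issue because the query only runs after the month has landed.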
In my experience the need for live data rarely comes from a business case, but from a desire to see your data live.
And if it indeed comes from a business case, you are still better off prototyping with something simple and see if it really flies before you "try it out".
I think I get the analogy, something like: you can append to the record but everything is still record-based? And what do garden gnomes have to do with it :-)