
> Usually that was in the form of long-running network calls.

That's why doing a network call to another service in the middle of a request is pretty much banned from the monolith. Anything that doesn't have a strict SLO is done from background jobs that can take longer, be retried etc.

Now, you mention FedEx, so I presume you were working on the shipping service, which is in essence a kind of API bridge, so that's probably why it was deemed acceptable there. But that's a far cry from what a typical Rails app looks like, except maybe in companies that do a lot of micro-services and are forced to do inline requests.

> that's not a use case that Ruby/Rails are built to elegantly handle.

I'd argue the contrary, the elegant way to handle this is to perform these calls from background jobs. Look at shipit [0] for instance, it's syncing with the GitHub API constantly but doesn't do a single API call from inside a request cycle, every single API call is handled asynchronously from background jobs.

[0] https://github.com/Shopify/shipit-engine/
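A minimal, framework-free sketch of the pattern being described (defer the API call to a job instead of making it inside the request cycle). All names here are hypothetical and the queue is a toy in-memory one; in a real Rails app this would be an ActiveJob subclass enqueued with `perform_later`:

```ruby
# Toy in-memory queue standing in for a persisted job backend (Sidekiq, etc.)
JOB_QUEUE = Queue.new

class SyncShipmentJob
  def self.perform_later(shipment_id)
    JOB_QUEUE << [self, shipment_id] # enqueue only: fast, no network I/O
  end

  def self.perform(shipment_id)
    # The slow external API call lives here, outside the request cycle.
    # (Stubbed so the sketch is self-contained.)
    "synced #{shipment_id}"
  end
end

# Inside the request cycle: enqueue and return immediately.
def handle_request(shipment_id)
  SyncShipmentJob.perform_later(shipment_id)
  { status: 202 } # respond without waiting for the external service
end

# A worker process drains the queue asynchronously.
def run_worker_once
  job_class, arg = JOB_QUEUE.pop
  job_class.perform(arg)
end
```

The request handler's latency is now independent of the external API's latency, which is the property being argued for.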



I was actually referring to background job workers in my comment. Queuing is still queuing, whether it's request queuing due to a shortage of web workers or job queuing due to a shortage of job workers.

Yes, there are more levers one can pull for job workers, and it's probably easier to horizontally scale those workers than web workers for various reasons. But regardless of which workers are performing the long-running I/O, there's still a hard bottleneck imposed by the number of available workers. They're still going to inefficiently block while waiting for the I/O to complete. The bottleneck hasn't been truly eliminated; it's just been punted somewhere else in the application architecture where it can be better mitigated.
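The bottleneck claim above can be illustrated with a toy simulation: with a fixed pool of blocking workers, total time is roughly (jobs / workers) × per-job I/O time, regardless of whether the pool is web workers or job workers. The numbers here are invented and `sleep` stands in for blocking network I/O:

```ruby
def process_with_pool(jobs, workers, io_time)
  queue = Queue.new
  jobs.times { |i| queue << i }
  workers.times { queue << :stop } # one sentinel per worker

  start = Time.now
  threads = Array.new(workers) do
    Thread.new do
      while queue.pop != :stop
        sleep io_time # stand-in for blocking network I/O
      end
    end
  end
  threads.each(&:join)
  Time.now - start
end

# 8 jobs / 2 workers => 4 sequential rounds of 0.05s each, ~0.2s total.
elapsed = process_with_pool(8, 2, 0.05)
```

Doubling the worker count roughly halves the wall time, which is the "more levers to pull" point: the bottleneck is mitigated by scaling workers, not eliminated.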

Background jobs may be the most elegant solution for handling long-running I/O in a typical Rails app, but that's still less elegant than simply performing those requests inline and not having to worry about all the additional moving parts that the jobs entail.


> that's still less elegant than simply performing those requests inline and not having to worry about all the additional moving parts that the jobs entail.

I strongly disagree here. Going through a persisted queue gives you a lot of tools to manage that queuing.

If you were to just spawn a goroutine or some similar async construct, you'd lose persistence, a lot of control over retries, resiliency by isolation, etc. When you have "in process jobs", re-deploying the service becomes a nightmare, as it becomes extremely muddy how long a request can legitimately take.

Whereas if you properly defer these slow IOs to a queue, and only allow fast transactional requests, you can then have a very strict request timeout, which is really key for the reliability of the service.
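A sketch of that "fast transactional request + strict timeout" idea, using Ruby's stdlib `Timeout`. The deadline value and method names are illustrative, not from the thread; `sleep` stands in for an inline external call:

```ruby
require "timeout"

REQUEST_DEADLINE = 0.5 # seconds; viable only because slow I/O is deferred

def handle(enqueue_only:)
  Timeout.timeout(REQUEST_DEADLINE) do
    if enqueue_only
      :enqueued   # fast path: just persist a job record and return
    else
      sleep 2     # an inline external call blows the strict deadline
      :synced
    end
  end
rescue Timeout::Error
  :timed_out
end
```

With every slow call deferred, the deadline can be set aggressively low, which is what makes the timeout enforceable in the first place.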


Those are all fair points. My only counterargument is that for a certain class of requests, the simplicity of not needing to worry about a separate background jobs queue outweighs the benefits that the job queue provides. There's some fuzzy line where those benefits become worth it. And you're probably going to cross that line earlier with a Rails app than with an evented one. There are lots of cases in Rails where problems are solved via background jobs that would most likely just stay in the parent web request in an IO-friendlier environment.


Rails or not, on any platform/framework, if you do an inline API call that can take up to time m inside a request whose own work takes up to time N, then the total request can take up to N+m. So if you don't want requests to take >N, you don't do API calls inline.

What am I missing, how does Rails make this especially bad?

Or is it that in another platform/framework, it's easier to allow requests to take the longer N+m to return if you want? True, that would be easier in, say, an evented environment (like most/all JS back-ends), but... you still don't want your user-facing requests taking 5 or 10 seconds to return a response to the user, do you? In what non-Rails circumstances would you do long-running I/O inline in a web request/response?


> Or is it that in another platform/framework, it's easier to allow requests to take the longer N+m to return if you want? True, that would be easier in, say, an evented environment (like most/all JS back-ends), but... you still don't want your user-facing requests taking 5 or 10 seconds to return a response to the user, do you? In what non-Rails circumstances would you do long-running I/O inline in a web request/response?

Yeah, that's what I'm getting at. It's true that even in evented backends there's a line beyond which it's probably better to put the long-running stuff in a background queue, but it's a higher bar than in Rails. I've run pretty high-throughput Node and Go apps that had to do a lot of 1-5s requests to external hosts (p95's probably up to 10s) and they didn't really have any issues. In my opinion, it wouldn't have been worth it to add a separate background queue; the frontline servers were able to handle that load just fine without the additional indirection.
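As a toy contrast with the fixed-pool case: when each in-flight request gets its own lightweight unit of concurrency (as in an evented runtime), slow calls overlap and wall time stays near a single call's latency. This sketch uses Ruby threads as a stand-in for an event loop, and the numbers are invented:

```ruby
def process_concurrently(jobs, io_time)
  start = Time.now
  # One lightweight unit of concurrency per in-flight request;
  # all the simulated slow calls overlap instead of queuing.
  Array.new(jobs) { Thread.new { sleep io_time } }.each(&:join)
  Time.now - start
end

# All 8 simulated 0.05s calls overlap, so elapsed is ~0.05s, not 8 * 0.05s.
elapsed = process_concurrently(8, 0.05)
```

This is the sense in which an evented (or cheap-concurrency) runtime raises the bar before a separate background queue becomes worth its overhead.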

byroot makes good points in a sibling comment about retries and more explicit queue control being advantages of a job queue pattern regardless of whether you're evented or not. I just think that those advantages have a higher "worth it" bar to clear in an evented runtime in order to justify their overhead (vs Rails).



