(Someone correct me if I'm wrong!) I think about differential dataflow as the solution to "I can't batch data operations, because I don't know when my various inputs will land."
If everything exists at 7am, and/or you don't need the freshest computed values, this is not the solution you need.
If data A is ready between 2 and 4am, data B at noon, and data C sometime between 8am and 6pm, differential dataflow lets you abstract that uncertainty into code, then let the system solve it on a daily basis.
This is not a problem everyone has. But it is a problem most people working with inventory or events have! And it's usually a problem people feeding things to ML have.
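To make the "inputs land whenever" idea concrete, here is a rough toy sketch in plain Python. It is not the differential dataflow API (that is a Rust library). It only illustrates the general idea of feeding in changes as they arrive and updating a joined result incrementally instead of recomputing a batch. `IncrementalJoin`, the `"left"`/`"right"` side names, and the SKU data are all made up for illustration.

```python
from collections import defaultdict


class IncrementalJoin:
    """Toy incremental inner join of two keyed multisets (hypothetical helper)."""

    def __init__(self):
        # side -> key -> value -> multiplicity
        self.state = {
            "left": defaultdict(lambda: defaultdict(int)),
            "right": defaultdict(lambda: defaultdict(int)),
        }
        # (key, left_value, right_value) -> multiplicity
        self.output = defaultdict(int)

    def update(self, side, key, value, diff):
        """Apply one change (+1 insert, -1 retract) and return the output changes."""
        other = "right" if side == "left" else "left"
        deltas = []
        # A change on one side only affects rows that match the same key
        # on the other side, so that is all we need to look at.
        for other_value, mult in self.state[other][key].items():
            if mult == 0:
                continue
            if side == "left":
                record = (key, value, other_value)
            else:
                record = (key, other_value, value)
            deltas.append((record, diff * mult))
        self.state[side][key][value] += diff
        for record, d in deltas:
            self.output[record] += d
            if self.output[record] == 0:
                del self.output[record]
        return deltas


join = IncrementalJoin()

# ~3am: stock levels (data A) land. Nothing to join against yet.
early = join.update("left", "sku-1", 10, +1)

# noon: prices (data B) land. The joined row appears now.
noon = join.update("right", "sku-1", 5, +1)

# ~4pm: a late stock correction. Retract the old value, insert the new one.
join.update("left", "sku-1", 10, -1)
late = join.update("left", "sku-1", 8, +1)

print(dict(join.output))
```

The code never says when each input is due. It only says how outputs depend on inputs, and each arrival (or correction) touches just the rows it affects.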