One might reasonably ask a frontier model how to generate the source code for an agent-based system that exhibits initiative, emotion, creativity, curiosity, opinions, beliefs, self-reflection, or even logical reasoning.
I believe at that point we would have to seriously ask ourselves what "reasoning" or "intelligence" actually mean. Humans have an intuitive understanding of those terms; LLMs don't. Would an LLM be able to evaluate the output of another LLM, or would we have to involve a "human in the loop"[1]?
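The mechanics of having one LLM judge another are easy enough to sketch; whether the verdict means anything is exactly the open question. A minimal sketch, assuming a hypothetical llm_complete helper standing in for whatever model API is available (none of these names come from a real library):

```python
def llm_complete(prompt: str) -> str:
    # Stub standing in for a real model call; wire this to an actual
    # LLM client in practice. Returns canned text so the sketch runs.
    if "You are an evaluator" in prompt:
        return "PASS: the conclusion follows from the premises."
    return "All squares are rectangles, so this square is a rectangle."

def generate_and_judge(task: str) -> tuple[str, str]:
    # One model produces an answer...
    answer = llm_complete(f"Task: {task}\nAnswer:")
    # ...and a second call asks a model to sit in judgment of it,
    # playing the role a human reviewer would otherwise fill.
    verdict = llm_complete(
        "You are an evaluator. Does the following answer exhibit sound "
        "reasoning? Reply PASS or FAIL with a one-line justification.\n"
        f"Task: {task}\nAnswer: {answer}"
    )
    return answer, verdict

if __name__ == "__main__":
    answer, verdict = generate_and_judge("Is a square a rectangle?")
    print("answer: ", answer)
    print("verdict:", verdict)
```

The circularity is plain even in the sketch: the judge is the same kind of system as the judged, so a PASS tells us only that two models agree, not that either one reasoned.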