Instead of replying to your comments individually, let me answer them all at once here, because I agree you are correct in principle but think you are still missing my point.

Modern tracing JIT engines indeed work by (heavy) specialization, often using multiple underlying representations for a single runtime type. I think V8 has at least four Array representations? With enough specializations it is possible to get comparable performance even for Python. The question, however, is how many.

For a long time, most dynamically typed languages and implementations didn't even try to do JIT because of its high upfront cost. The cost is much lower today (though still not low enough to call JIT a no-brainer), but that was not obvious 20 years ago. Ruby was one of them too, and YJIT was only possible thanks to Shopify's initial work. Working under the assumption that JIT was not feasible, both CPython developers and users did a lot of things that further complicate an eventual JIT implementation. The C API is one, and it is indeed one of the major concerns for CPython, but highly customized user classes are another. Herein lies the problem:

> Magic methods are not that "hard" to optimize (as long as you don't overload the add,etc operators of f.ex. the Number class in JS).

Indeed, it is very unusual to subclass `Number` in JS. It is less unusual to subclass `int` in Python, however, because it is allowed and Python made it convenient. I still think the majority of `int` values will be instances of the built-in class and not of subclasses, but if that were the only concern, Psyco [1] should have been much more popular when it came out, because it should have handled such cases perfectly. In reality Psyco was not enough, hence PyPy.

[1] https://psyco.sourceforge.net/introduction.html
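To make the `int` subclass point concrete, here is a minimal sketch. `Meters` and `total` are hypothetical names of mine, not from any real codebase. The same `+` call site in `total` ends up dispatching to different code depending on what the caller passes, so any fast path specialized for plain `int` has to be guarded by an exact type check:

```python
class Meters(int):
    """Hypothetical int subclass that overrides addition."""

    def __add__(self, other):
        # Delegate to int arithmetic, then re-wrap so the unit survives.
        return Meters(int(self) + int(other))

    def __repr__(self):
        return f"Meters({int(self)})"


def total(values):
    # One `+` call site, but the operand types depend on the caller.
    acc = values[0]
    for v in values[1:]:
        acc = acc + v
    return acc


plain = total([1, 2, 3])                        # built-in int.__add__
wrapped = total([Meters(1), Meters(2), Meters(3)])  # Meters.__add__
mixed = total([1, Meters(2)])                   # left operand is a plain int
```

Here `plain` is an ordinary `int`, `wrapped` is a `Meters`, and `mixed` falls back to a plain `int` because `Meters` does not override `__radd__`. A specializer has to tell all three situations apart at the same bytecode.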

At this point I want to clarify that magic methods in Python are much more than mere operator overloading. For example, properties in JS are more or less direct (`Object.defineProperty` and nowadays native syntax), but in Python they are implemented via descriptors, which are nested objects with yet more dunder methods. For example, this implements the `Foo.bar` property:

    class Foo:
        class Bar:
            def __get__(self, obj, objtype=None): return 42
        bar = Bar()

In reality everyone will use `bar = property(lambda self: 42)` or equivalent instead, but that's how it works underneath. And the nested object can do absolutely anything. You can specialize for well-known descriptor types like `property`, but that wouldn't be enough for complex Python codebases. This is why...

> This is how as JS engine handles that + behaves differently between numbers, strings, BigInt's, Date object's,etc.

...is not the only thing JS engines do. They also have hidden classes (aka shapes) that are recognized and created at runtime, and I think that was one of the innovations pioneered by V8 (outside of PL academia, of course). Hidden classes in Python would be more complex than those in JS because of this added flexibility and the uses that resulted from it. And JS hidden classes are not even that simple to implement.
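A small sketch of the kind of flexibility I mean, with hypothetical `Point` and `Loud` classes. A data descriptor (one that defines both `__get__` and `__set__`) on the class takes precedence over the instance `__dict__`, and it can be installed after instances already exist. So a shape that says "`x` lives in slot 0 of the instance" is only valid as long as nobody has touched the class:

```python
class Point:
    def __init__(self, x):
        self.x = x  # stored in the instance __dict__


class Loud:
    """Hypothetical data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return "from descriptor"

    def __set__(self, obj, value):
        obj.__dict__["x"] = value


p = Point(1)
before = p.x              # plain instance attribute lookup

Point.x = Loud()          # patch the class after instances already exist
after = p.x               # the data descriptor on the type now wins

stored = p.__dict__["x"]  # the old value is still there, just shadowed
```

After the patch, `p.x` goes through `Loud.__get__` even though `p.__dict__` still holds the original value, so any cached lookup for `p.x` has to be invalidated when the class is mutated.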

After decades with no JIT in sight, and given the non-trivial amount of work needed to get a working JIT even after that, it is not unreasonable that CPython didn't try to build a JIT for a long time and that the current JIT work is still quite conservative (it uses copy-and-patch compilation to reduce the upfront cost). CPython did do lots of the optimizations that are possible in interpreters, though. Many of the things mentioned above are internally cached for performance. One can correctly argue that such optimizations did not arrive quickly enough: for example, adaptive opcodes in 3.11 are something Java HotSpot used to do more than 10 years ago.