Explanation: Python is a programming language. NumPy is a library for Python that makes it possible to run large computations much faster than in native Python. To make that possible, it keeps its own set of data types that are different from Python’s native data types, which means you now have two different bool types and two different sets of True and False. Lovely.
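
The "two different sets of True and False" point can be seen in a few lines. This is only a sketch: the array and the variable names are made up for illustration, and the printed name of the NumPy type differs between NumPy versions, so no output is shown for it.

```python
import numpy as np

# Hypothetical data, just to get a NumPy boolean back from a reduction.
readings = np.array([1, 2, 3])
result = (readings > 1).any()

# The reduction hands back NumPy's own boolean scalar, not the builtin bool.
print(type(result))

# It is not an instance of Python's bool, and it is not the True singleton...
print(isinstance(result, bool))   # False
print(result is True)             # False

# ...but it compares equal to True and converts cleanly with bool().
print(bool(result == True))       # True
print(bool(result) is True)       # True
```

So the values behave alike in an `if` statement, while the classes stay distinct, which is the distinction a static type checker cares about.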

Mypy is a type checker for Python (Python supports type annotations, but doesn’t enforce them at runtime). Mypy treats NumPy’s bool_ and Python’s native bool as incompatible types, leading to the asinine error message above. Mypy is “technically” correct, since they are two completely different classes. But in practice, there is little functional difference between bool and bool_. So you have to do dumb workarounds like declaring every bool value as bool | np.bool_ or casting bool_ down to bool. Ugh. Both NumPy and mypy declared this issue a WONTFIX. Lovely.
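
Roughly, the two workarounds look like this. It is a minimal sketch: `any_above` and `is_enabled` are hypothetical helper names, and whether mypy actually complains about the unconverted version depends on the NumPy and mypy versions in use.

```python
from typing import Union

import numpy as np


def any_above(values: np.ndarray, threshold: float) -> bool:
    # Workaround 1: cast down. The reduction yields a NumPy boolean,
    # so wrap it in bool() to match the declared return type.
    return bool((values > threshold).any())


# Workaround 2: widen the annotation. On Python 3.10+ this can be
# spelled `bool | np.bool_` directly in the signature.
BoolLike = Union[bool, np.bool_]


def is_enabled(flag: BoolLike) -> bool:
    # Accept either kind of boolean and normalise to the builtin one.
    return bool(flag)


print(any_above(np.array([1.0, 2.0]), 1.5))  # True
print(is_enabled(np.False_))                 # False
```

Casting at the boundary keeps NumPy types out of the rest of the code base; widening the annotation avoids the conversion but spreads `np.bool_` through every signature that touches the value.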

  • macniel
    41 points · 2 months ago

    Well yeah just because they kinda mean the same thing it doesn’t mean that they are the same. I can wholly understand why they won’t “fix” your inconvenience.

      • nickwitha_k (he/him)
        13 points · 2 months ago

        They are two different data types with potentially different in-memory representations.

        • Ephera
          0 points · 2 months ago

          Well, yeah, but they do mean the exact same thing, hopefully: true or false

          Although thinking about it, someone above mentioned that the numpy bool_ is an object, so I guess that is really: true or false or null/None

          • nickwitha_k (he/him)
            14 points · 2 months ago

            In an abstract sense, they do mean the same things but, in a technical sense, the one most relevant to programming, they do not.

            The standard Python bool type is a subclass of the integer type. In CPython that makes every bool a full heap object with a reference count and a type pointer (True and False are singletons), so it is far bigger than a single byte: sys.getsizeof(True) reports 28 bytes on a typical 64-bit build.

            The numpy.bool_ type is something closer to a native C boolean: inside an array, each element is stored in 1 byte.

            So, memory-wise, the two cannot share a layout. NumPy code that reads a packed buffer of 1-byte values cannot treat a Python bool object as one of its elements in place, and Python code that expects a bool object cannot be handed a raw byte. Something has to translate between the two representations.

            What about converting on the fly? Well, that can be done but will come at a performance cost as every function that can accept a numpy.bool_ now has to perform additional type checking, validation, and conversion on every single function call. That adds up quick when processing data on scales where numpy is called for.