Explanation: Python is a programming language. NumPy is a library for Python that makes it possible to run large computations much faster than in native Python. To make that possible, it needs to keep its own set of data types that are different from Python’s native data types, which means you now have two different bool types and two different sets of True and False. Lovely.

Mypy is a type checker for Python (Python supports static typing, but doesn’t actually enforce it). Mypy treats NumPy’s bool_ and Python’s native bool as incompatible types, leading to the asinine error message above. Mypy is “technically” correct, since they are two completely different classes. But in practice, there is little functional difference between bool and bool_. So you have to do dumb workarounds like declaring every bool value as bool | np.bool_ or casting bool_ down to bool. Ugh. Both NumPy and mypy declared this issue a WONTFIX. Lovely.
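For anyone who hasn't run into this, here is a minimal sketch of the two workarounds mentioned above. The function names are made up for illustration, and whether mypy actually complains without them depends on the NumPy and mypy versions installed, so treat this as an assumption-laden example rather than a reproduction of the exact error.

```python
from __future__ import annotations  # keeps "bool | np.bool_" valid on older Pythons

import numpy as np


def is_positive_union(x: float) -> bool | np.bool_:
    # Workaround 1: widen the annotation so either boolean type is accepted.
    return np.float64(x) > 0


def is_positive_cast(x: float) -> bool:
    # Workaround 2: convert NumPy's scalar down to the built-in bool.
    return bool(np.float64(x) > 0)


raw = is_positive_union(3.0)
print(isinstance(raw, np.bool_), isinstance(raw, bool))
print(type(is_positive_cast(3.0)) is bool)
```

The comparison on a NumPy scalar hands back NumPy's boolean, which is not an instance of the built-in bool, and that mismatch is what the type checker is pointing at.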

  • nickwitha_k (he/him)
    7 months ago

    They are two different data types with potentially different in-memory representations.

    • Ephera
      7 months ago

      Well, yeah, but they do mean the exact same thing, hopefully: true or false

      Although thinking about it, someone above mentioned that the numpy bool_ is an object, so I guess that is really: true or false or null/None

      • nickwitha_k (he/him)
        7 months ago

        In an abstract sense, they do mean the same things but, in a technical sense, the one most relevant to programming, they do not.

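        The standard Python bool type is a subclass of the integer type. This means that it is stored as a full integer object, which takes at least 4 bytes (int32) or 8 bytes (int64) and, in CPython, more than that once the object header is counted.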

        The numpy.bool_ type is something closer to a native C boolean and is stored in 1 byte.
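A quick way to check the two claims above, as a rough sketch. The array name `flags` is just for illustration, and the size of a standalone Python bool object is implementation-specific, so it is printed rather than assumed.

```python
import sys

import numpy as np

# bool is a subclass of int, so True behaves like the integer 1.
bool_is_int = issubclass(bool, int)
print(bool_is_int, True + True)

# np.bool_ is a separate class and does not inherit from the built-in bool.
numpy_bool_is_bool = issubclass(np.bool_, bool)
print(numpy_bool_is_bool)

# Inside an array, each np.bool_ element occupies one byte.
flags = np.zeros(1000, dtype=np.bool_)
print(flags.itemsize, flags.nbytes)

# A standalone Python bool is a full object; its size depends on the interpreter.
print(sys.getsizeof(True))
```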

        So, memory-wise, one could store a numpy.bool_ in a Python bool, but that leaves at least 3-7 extra bytes unused in the variable. This introduces not just unnecessary memory usage but also potential space for malicious data injection or extraction. Conversely, if one tries to store a Python bool in a numpy.bool_ and the interpreter or OS doesn’t throw an error and kill the process, you now have a buffer overflow/illegal memory access problem.

        What about converting on the fly? Well, that can be done but will come at a performance cost as every function that can accept a numpy.bool_ now has to perform additional type checking, validation, and conversion on every single function call. That adds up quick when processing data on scales where numpy is called for.
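For reference, explicit conversion in both directions looks roughly like this. It is only a sketch with made-up sample data, and it says nothing about how expensive the conversion is at scale; it just shows that the conversion is an extra step that someone has to write and run.

```python
import numpy as np

values = np.array([0.5, -1.2, 3.0])
mask = values > 0  # an array whose elements are np.bool_

# One-off conversion of a single element to the built-in bool.
first = bool(mask[0])

# Bulk conversion: tolist() builds a list of plain Python bools.
as_python = mask.tolist()

# Going the other way: pack Python bools back into a NumPy boolean array.
back = np.asarray(as_python, dtype=np.bool_)

print(type(first) is bool, as_python, back.dtype == np.bool_)
```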