I just came across this strange behaviour of numpy.sum:
    >>> import numpy
    >>> ar = numpy.array([1,2,3], dtype=numpy.uint64)
    >>> gen = (el for el in ar)
    >>> lst = [el for el in ar]
    >>> numpy.sum(gen)
    6.0
    >>> numpy.sum(lst)
    6
    >>> numpy.sum(iter(lst))
    <listiterator object at 0x87d02cc>
According to the documentation, the result should have the same dtype as the iterable. Why, then, is a numpy.float64 returned in the first case instead of a numpy.uint64? And how come the last example neither returns any kind of sum nor raises an error?
In general, NumPy functions don't always do what you might expect when working with generators. To create a NumPy array, you need to know its size and type before creating it, and this isn't possible for generators. So many NumPy functions either don't work with generators at all or, as in your first example, fall back on Python builtins, which do not preserve NumPy dtypes.
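As a hedged illustration of where the float can come from: Python's built-in sum starts from the integer 0, so the first addition mixes a signed integer with a numpy.uint64. Assuming that mixed addition follows NumPy's ordinary dtype promotion rules (this depends on the NumPy version, and newer releases treat plain Python integers differently), the relevant promotion can be inspected directly:

```python
import numpy

# No integer dtype can represent every int64 and every uint64 value,
# so NumPy promotes this pair to a floating-point dtype.
promoted = numpy.promote_types(numpy.int64, numpy.uint64)
print(promoted)
```

This only shows the promotion rule for the two dtypes; it is not a trace of what numpy.sum does internally.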
However, for the same reason, generators often aren't that useful in NumPy contexts. There's no real advantage to making a generator from a NumPy object, because the entire NumPy object already has to be in memory anyway. If you need the types to stay as you specified them, just don't wrap your NumPy objects in generators.
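If the data really does arrive as a generator, one option is to materialise it into an array with an explicit dtype before summing. A minimal sketch, reusing the names from the question (the variable names materialised and total are just for illustration):

```python
import numpy

ar = numpy.array([1, 2, 3], dtype=numpy.uint64)
gen = (el for el in ar)

# fromiter consumes the generator into a 1-D array of the requested dtype.
materialised = numpy.fromiter(gen, dtype=numpy.uint64)

# Summing a real array keeps the unsigned integer dtype.
total = numpy.sum(materialised)
print(total, total.dtype)
```

Because the dtype is given up front, numpy.fromiter does not have to guess the element type, which is exactly the information a bare generator cannot provide.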
Some more info: Technically, the argument to np.sum is supposed to be an "array-like" object, not an iterable. Array-like is defined in the documentation as:
An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
The array interface is documented in the NumPy reference. Basically, arrays have to have a fixed shape and a uniform type.
Generators don't fit this protocol and so aren't really supported. Many numpy functions are nice and will accept other sorts of objects that don't technically qualify as array-like, but a strict reading of the docs implies you can't rely on this behavior. The operations may work, but you can't expect all the types to be preserved perfectly.
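To make the distinction concrete, here is a small sketch of how numpy.asarray treats a nested sequence versus a generator. The names from_list and from_gen are only illustrative:

```python
import numpy

# A nested sequence is array-like: shape and element type can be inferred.
from_list = numpy.asarray([[1, 2], [3, 4]])

# A generator is not array-like: NumPy wraps the generator object itself
# in a zero-dimensional object array instead of iterating over it.
from_gen = numpy.asarray(el for el in [1, 2, 3])

print(from_list.shape, from_list.dtype)
print(from_gen.shape, from_gen.dtype)
```

The generator ends up as a single opaque element rather than as three numbers, which would be consistent with the last example in the question handing the iterator back unchanged instead of a sum.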