This is the third and final part of a three-part series of NumPy tutorials.

PART 1:
1. What is a NumPy array?
2. How to create and inspect NumPy arrays?

PART 2:
3. Array indexing/Slicing
4. Array Operations

PART 3:
5. What is broadcasting?
6. Speed test: Lists Vs NumPy array

In this part, we will learn:

What broadcasting is, with simple examples

How much faster arithmetic computations are on arrays than on lists

Broadcasting:

In general, arrays of different shapes cannot be added or subtracted element-wise. NumPy has a smart way to overcome this problem: it treats the smaller array as if it were duplicated to the size of the larger array and then performs the operation.

Ex: Suppose we want to add array([3]) to array([1,2,3]). Given array([3]) + array([1,2,3]), NumPy understands that the intent is to add 3 to every element of [1,2,3]. It treats the value [3] as if it were repeated to match the larger array, in this case array([3,3,3]), and then performs the addition.
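
The example above can be tried directly. The sketch below is only an illustration: the variable names are assumptions, and np.broadcast_to is used just to show the "duplicated" form that NumPy works with conceptually.

```python
import numpy as np

small = np.array([3])        # the smaller array
large = np.array([1, 2, 3])  # the larger array

# NumPy broadcasts `small` across `large` automatically.
result = small + large
print(result)  # [4 5 6]

# The conceptually "duplicated" version of the smaller array.
stretched = np.broadcast_to(small, large.shape)
print(stretched)  # [3 3 3]

# Adding the stretched array by hand gives the same answer.
explicit = stretched + large
```

Both routes give the same result, which is why the duplication can be thought of as happening behind the scenes.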

To simplify, broadcasting is the name given to the method that NumPy uses to allow arithmetic between arrays with different shapes or sizes. To quote the exact definition from the NumPy documentation on scipy.org:

“The term broadcasting describes how numpy treats arrays with different shapes during arithmetic

operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so

that they have compatible shapes.”

We can take three types of examples to convey the concept of broadcasting. Let us see the examples first and then interpret them.

In the first cell, we added a scalar (a single value) to a vector (a 1-D array). As discussed earlier, the scalar ‘b’ is duplicated so that both operands have the same shape, and then the addition is performed. Observe the output: each value is incremented by ‘b = 2’.
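
A minimal sketch of the scalar-plus-vector case described above. Only ‘b = 2’ comes from the text; the name ‘a_1d’ and its values are assumptions chosen for illustration.

```python
import numpy as np

a_1d = np.array([1, 2, 3])  # assumed example vector (1-D array)
b = 2                       # scalar from the text

# The scalar is broadcast to every element of the vector.
vector_sum = a_1d + b
print(vector_sum)  # [3 4 5]
```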

Similarly, in the second cell, we added a scalar to a 2-D array. So far, the smaller operand has been a single value, so broadcasting works without any restrictions. But the real purpose of broadcasting is to combine two arrays of different shapes, each with more than one element. Let us see how that works and what the limitations are.
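
The scalar-plus-2-D-array case can be sketched the same way. The values of ‘a_2d’ below are assumptions; the scalar ‘b = 2’ is reused from the first example.

```python
import numpy as np

a_2d = np.array([[1, 2, 3],
                 [4, 5, 6]])  # assumed example 2-D array, shape (2, 3)
b = 2

# The scalar is broadcast across both rows and columns.
matrix_sum = a_2d + b
print(matrix_sum)
# [[3 4 5]
#  [6 7 8]]
```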

The first example here is row-wise broadcasting. This means the 1-D array ‘b_1d’ is added to each row of ‘a_2d’. This happens because ‘b_1d’ is a single row vector. Similarly, the second example is column-wise broadcasting, where a single column vector is added to each column.
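
A hedged sketch of both cases follows. The names ‘a_2d’ and ‘b_1d’ come from the text, but their values are assumptions, and ‘b_col’ is a hypothetical name for the column vector.

```python
import numpy as np

a_2d = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])           # shape (3, 3)

# Row-wise: a 1-D array of shape (3,) is added to every row.
b_1d = np.array([10, 20, 30])
row_wise = a_2d + b_1d
print(row_wise)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]

# Column-wise: a column vector of shape (3, 1) is added to every column.
b_col = np.array([[10], [20], [30]])
col_wise = a_2d + b_col
print(col_wise)
# [[11 12 13]
#  [24 25 26]
#  [37 38 39]]
```

The only difference between the two cases is the shape of the smaller array: (3,) lines up with the columns of each row, while (3, 1) lines up with the rows.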

Broadcasting is an interesting concept, but it comes with its own limitations. NumPy compares the shapes of the two arrays dimension by dimension, starting from the last one, and each pair of dimensions must either be equal or contain a 1. Let us check by giving incompatible dimensions.

If we observe carefully, neither the row nor the column dimensions are equal, and none of them is 1, hence the ‘ValueError’.
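
The failure can be reproduced with a small sketch. The shapes (3, 3) and (2, 2) are assumptions picked so that no dimension matches and none is 1.

```python
import numpy as np

a = np.ones((3, 3))
b = np.ones((2, 2))

try:
    a + b                 # shapes (3, 3) and (2, 2) are incompatible
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```

Changing ‘b’ to shape (1, 3) or (3, 1) would make the addition succeed, because a dimension of size 1 can be stretched.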

Broadcasting is definitely a handy shortcut, but it imposes some rules on which operations are allowed. The examples we covered here are very basic and are intended for beginners.

You can explore these articles to learn more:

Broadcasting Scipy.org

Array broadcasting in NumPy

Speed test: Lists Vs Arrays:

We will keep it extremely simple.

I took two sequences of 10 lakh (1,000,000) numbers: ‘i’ holds plain integers and ‘j’ holds the squares of ‘i’. I divided ‘j’ by ‘i’ element-wise (j/i).

I recorded the time taken to do the same operation on lists and on arrays and computed the ratio of the two. In my run, NumPy performed about 22 times faster than lists.
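
A rough sketch of how such a comparison could be set up is shown below. It is not the exact code behind the 22x figure: the use of time.perf_counter, the variable names, and starting the sequence at 1 (to avoid dividing by zero) are all assumptions, and the ratio you see will vary with your machine and NumPy version.

```python
import time
import numpy as np

n = 1_000_000  # 10 lakh numbers

# Lists: 'i' is 1..n (starting at 1 is an assumption), 'j' is its squares.
i_list = list(range(1, n + 1))
j_list = [x ** 2 for x in i_list]

# Arrays: int64 is requested explicitly so the squares cannot overflow
# on platforms where the default integer is 32-bit.
i_arr = np.arange(1, n + 1, dtype=np.int64)
j_arr = i_arr ** 2

# Time the element-wise division on lists.
start = time.perf_counter()
list_result = [y / x for x, y in zip(i_list, j_list)]
list_time = time.perf_counter() - start

# Time the same division on arrays.
start = time.perf_counter()
array_result = j_arr / i_arr
array_time = time.perf_counter() - start

print(f"lists : {list_time:.4f} s")
print(f"arrays: {array_time:.4f} s")
print(f"ratio : {list_time / array_time:.1f}x")
```

For steadier numbers, the timeit module can repeat each measurement several times; a single run like this one is only indicative.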

Now, I leave it to your imagination what it would be like to run computations as heavy as ML algorithms on lists instead of arrays.

Some reasons for this difference in speed are:

NumPy is largely written in C, so array operations run as compiled code behind the scenes

NumPy arrays are more compact than lists, i.e. they take much less storage space than lists
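
The compactness point can be checked with a small sketch. It assumes CPython on a 64-bit platform, where a list stores pointers to separate integer objects; the byte counts for the list will differ between Python versions, so none are quoted here.

```python
import sys
import numpy as np

n = 1000
numbers = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers; each integer is a separate Python object.
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(x) for x in numbers)

# The array stores raw 8-byte integers in one contiguous buffer.
# nbytes counts only that data buffer, not the small array object itself.
array_bytes = arr.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```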

For further reading and some more examples, go through these discussions:

Why are numpy arrays so fast?

Why numpy instead of python lists?

Well! This is all the NumPy you need to kick-start your journey through Python for data science.

Congratulations! You are one step closer.

Next in line is a series on the Pandas library, a data pre-processing library built on top of NumPy. We will discuss more data science examples there.