This is the third and final part of a 3-part series of NumPy tutorials.

PART 1:
1. What is a NumPy array?
2. How to create and inspect NumPy arrays?
PART 2:
3. Array indexing/slicing
4. Array operations
PART 3:
5. What is broadcasting?
6. Speed test: Lists vs NumPy arrays

In this part, we will learn:
 What is broadcasting, with a simple example?
 How much faster are arithmetic computations on arrays than on lists?
Broadcasting:
In general, arrays of different shapes cannot be added or subtracted element by element. NumPy has a
smart way to overcome this problem: it conceptually stretches the smaller array to the size of the
larger one (without actually copying the data) and then performs the operation.
Ex: Suppose we want to add array([3]) to array([1,2,3]). When we simply write array([3]) + array([1,2,3]),
NumPy understands that the idea is to add 3 to every element of [1,2,3]. It treats the value [3] as if
it were repeated as many times as there are elements in the larger array, in this case array([3,3,3]),
and then performs the addition.
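The example above can be tried directly. This is a minimal sketch; the variable names `small`, `large`, and `stretched` are chosen here for illustration, and `np.broadcast_to` is used only to make the stretched form visible.

```python
import numpy as np

small = np.array([3])
large = np.array([1, 2, 3])

# NumPy broadcasts the one-element array across the larger one.
result = small + large
print(result)      # [4 5 6]

# np.broadcast_to shows what the smaller array looks like once stretched.
# It returns a read-only view, so no data is actually duplicated.
stretched = np.broadcast_to(small, large.shape)
print(stretched)   # [3 3 3]
```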
To simplify, broadcasting is the name given to the method that NumPy uses to allow arithmetic
between arrays with different shapes or sizes. To quote the exact definition from scipy.org:
“The term broadcasting describes how numpy treats arrays with different shapes during arithmetic
operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so
that they have compatible shapes.”
Three types of examples convey the concept of broadcasting well. Let us look at the examples
first and then interpret them.

In the first cell, we added a scalar value (a single value) to a vector (a 1-D array). As we
discussed earlier, the scalar ‘b’ is stretched so that both operands have the same shape, and then
the addition is performed. Observe the output to see that each value is incremented by ‘b = 2’.
Similarly, in the second cell, we added a scalar value to a 2-D array. So far, the smaller operand
has been a single value, so broadcasting works without any limitations. The more interesting use of
broadcasting is combining two arrays that each hold more than one value but have different shapes.
Let us see how that works and what the limitations are.
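The two cells described above might look like the sketch below. Only ‘b = 2’ comes from the text; the array names `a` and `a_2d` and their values are assumptions made here for illustration.

```python
import numpy as np

b = 2                          # scalar, as in the text

# Scalar + 1-D array (values are illustrative)
a = np.array([1, 2, 3])
vec_result = a + b
print(vec_result)              # [3 4 5]

# Scalar + 2-D array (values are illustrative)
a_2d = np.array([[1, 2, 3],
                 [4, 5, 6]])
mat_result = a_2d + b
print(mat_result)
# [[3 4 5]
#  [6 7 8]]
```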

The first example here is row-wise broadcasting: the 1-D array ‘b_1d’ is added to each row of
‘a_2d’. This happens because ‘b_1d’ is treated as a single row vector. Similarly, the second example is
column-wise broadcasting.
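A sketch of row-wise and column-wise broadcasting follows. The names `a_2d` and `b_1d` come from the text; the values, and the column vector `b_col`, are assumptions made here for illustration.

```python
import numpy as np

a_2d = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])            # shape (3, 3)

# Row-wise: b_1d has shape (3,), so it is added to every row of a_2d.
b_1d = np.array([10, 20, 30])
row_wise = a_2d + b_1d
print(row_wise)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]

# Column-wise: b_col has shape (3, 1), so it is added to every column of a_2d.
b_col = np.array([[10], [20], [30]])
col_wise = a_2d + b_col
print(col_wise)
# [[11 12 13]
#  [24 25 26]
#  [37 38 39]]
```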

Broadcasting is an interesting concept but it comes with its own limitations. NumPy compares the
shapes of the two arrays dimension by dimension, starting from the last one; each pair of dimensions
must either be equal or one of them must be 1. Let us check by giving shapes that break this rule.

If we observe carefully, neither the row nor the column dimensions match, hence the ValueError.
Broadcasting is definitely a handy shortcut, but it imposes some rules on which operations are
allowed. The examples we covered here are very basic and are intended for beginners.
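A failing case can be reproduced with a sketch like the one below. The shapes are chosen here for illustration and are not necessarily the ones from the original example. The last part shows, for contrast, that shapes (3, 1) and (1, 4) do broadcast, because each mismatched dimension has size 1.

```python
import numpy as np

a_2d = np.arange(9).reshape(3, 3)    # shape (3, 3)
b_1d = np.array([10, 20])            # shape (2,)

failed = False
try:
    a_2d + b_1d                      # trailing dimensions 3 and 2 do not match
except ValueError as err:
    failed = True
    print("ValueError:", err)

# Compatible even though no dimension is equal: each mismatch involves a 1.
col = np.arange(3).reshape(3, 1)     # shape (3, 1)
row = np.arange(4).reshape(1, 4)     # shape (1, 4)
combined = col + row
print(combined.shape)                # (3, 4)
```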
You can go ahead and further explore these articles to know more:
 Broadcasting Scipy.org
 Array broadcasting in NumPy
Speed test: Lists Vs Arrays:
We will keep it extremely simple.

I took 10 lakh (1 million) numbers in two sequences: ‘i’ holds plain integers and ‘j’ holds the
squares of ‘i’. I divided ‘j’ by ‘i’ element-wise (j/i).

I recorded the time taken to do the same operation on lists and on arrays and computed the ratio of
the two. In this run, NumPy was about 22 times faster than lists.
Now, I leave it to your imagination what it would be like to do computations as heavy as ML
algorithms on lists rather than on arrays.
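The timing comparison described above can be sketched as follows. This is an assumed reconstruction, not the original cell: the variable names, the use of `time.perf_counter`, and starting ‘i’ at 1 (to avoid division by zero) are choices made here. The ratio you see will depend on your machine and NumPy version, so it may differ from the figure quoted above.

```python
import time
import numpy as np

n = 1_000_000  # 10 lakh elements, as in the text

# Plain Python lists: i = 1..n, j = squares of i
i_list = list(range(1, n + 1))
j_list = [x * x for x in i_list]

# The same data as NumPy arrays
i_arr = np.arange(1, n + 1, dtype=np.float64)
j_arr = i_arr ** 2

# Element-wise division on lists needs an explicit Python-level loop.
start = time.perf_counter()
list_result = [j / i for i, j in zip(i_list, j_list)]
list_time = time.perf_counter() - start

# On arrays, a single expression divides all elements in compiled code.
start = time.perf_counter()
arr_result = j_arr / i_arr
arr_time = time.perf_counter() - start

print(f"lists:  {list_time:.4f} s")
print(f"arrays: {arr_time:.4f} s")
print(f"ratio:  {list_time / arr_time:.1f}x")
```

For more stable numbers, the standard-library `timeit` module can repeat each measurement several times.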

Some reasons for such difference in speed are:
 NumPy's core routines are implemented in C, so the loop over the elements runs in compiled code
behind the scenes rather than in the Python interpreter
 NumPy arrays are more compact than lists, i.e. they take much less storage space than
lists
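The storage difference can be estimated with a rough sketch like this one. The element count and the `int64` dtype are assumptions made here, and the list figure is only an approximation obtained by adding the size of the list itself to the sizes of the integer objects it points to.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate Python int objects, so count both.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values in one contiguous buffer: 8 bytes per int64.
array_bytes = np_arr.nbytes

print("list (approx.):", list_bytes, "bytes")
print("array data:    ", array_bytes, "bytes")
```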

Go through these discussions if you want to read further:
 For some more examples:
 Why are numpy arrays so fast?
 Why numpy instead of python lists?

Well! This is all the NumPy you need to kick-start your journey through Python for data science.
Congratulations! You are one step closer.

The next series in line covers the Pandas library, a data pre-processing library that is built on top
of NumPy. We will discuss more data science examples there.
