What is Numpy?
Introduction
If you have ever wondered how to kick-start a career in data science, data analysis, machine learning, or a related field, you should know that NumPy is one of the best libraries to start with, and it has become a mandatory skill in the field. Even though this powerful library comes with some trade-offs, it is worth reminding any young and eager student that nobody wants to reinvent the wheel. For instance, if you have a built-in structure monkey with plenty of under-the-hood features that occupies 30 bits, nobody will ever rewrite a gigantic project just for the sake of a new monkey structure that occupies 28 bits. So feasibility wins over raw performance in most cases, and this is where NumPy comes into action.
Performance
Below is a performance comparison between pure Python and NumPy:
Vectorized operations
import numpy as np
import random
import time

def func(x):
    return 7 * x**2 + 3*x + 9

if __name__ == '__main__':
    # Pure Python: build the data, then apply the polynomial in a comprehension
    t0 = time.time()
    py_list = [random.randint(0, 35) for i in range(2000000)]
    py_list = [7 * x**2 + 3*x + 9 for x in py_list]
    t1 = time.time()
    print(f"Time taken by pure Python: {t1 - t0:.4f} seconds")

    # NumPy: apply the same function through np.vectorize
    t0 = time.time()
    nd = np.random.randint(0, 35, size=2000000)
    vectorized_func = np.vectorize(func)
    nd = vectorized_func(nd)
    t1 = time.time()
    print(f"Time taken by Numpy: {t1 - t0:.4f} seconds")
An example of output:
Time taken by pure Python: 1.8844 seconds
Time taken by Numpy: 0.8967 seconds
Hence, we can observe that vector operations and iteration are handled much more efficiently by NumPy than by pure Python. The speed-up does not come from itertools: it comes from NumPy running the loop in compiled C code over a contiguous, fixed-type array instead of interpreting Python bytecode for every element. Note also that np.vectorize itself is essentially a convenience wrapper around a Python-level loop; applying the arithmetic directly to the array is faster still.
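To illustrate, here is a minimal sketch of the fully vectorized form, where the polynomial is applied directly to the array so the entire loop runs in compiled code (the variable names are mine, not from the snippet above):

```python
import numpy as np

# One vectorized expression: NumPy evaluates the polynomial over the whole
# array in C, with no Python-level loop and no np.vectorize wrapper.
nd = np.random.randint(0, 35, size=2_000_000)
result = 7 * nd**2 + 3 * nd + 9
print(result.shape)  # (2000000,)
```

In informal timings this direct form tends to beat np.vectorize by a wide margin, since np.vectorize still calls the Python function once per element.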
Matrix multiplication
from random import randint
from time import time
import numpy as np

if __name__ == '__main__':
    n = 500
    a = [[randint(0, 100) for i in range(n)] for j in range(n)]  # The first matrix
    b = [[randint(0, 100) for i in range(n)] for j in range(n)]  # The second matrix
    c = [[0 for i in range(n)] for j in range(n)]                # The result matrix

    # Pure Python: triple nested loop
    t0 = time()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    t1 = time()
    print(f'matrix multiplication in Python: {t1-t0:.3f} seconds')

    # NumPy: generate the matrices first, so only the multiplication is timed
    nd_a = np.random.randint(0, 100, size=(n, n))
    nd_b = np.random.randint(0, 100, size=(n, n))
    t0 = time()
    nd_c = nd_a.dot(nd_b)
    t1 = time()
    print(f'matrix multiplication in Numpy: {t1-t0:.3f} seconds')
An example of output:
matrix multiplication in Python: 36.331 seconds
matrix multiplication in Numpy: 0.064 seconds
Element-wise product
The element-wise product (also called the Hadamard product) of two vectors of the same size is obtained by multiplying them element by element, which produces another vector of the same size:
import numpy as np
from time import time

# Pure Python
a = list(range(10000000))
b = list(range(10000000))
t0 = time()
c = [x * y for x, y in zip(a, b)]
t1 = time()
print(f'Python element-wise product: {t1 - t0:.4f} seconds')

# NumPy: same number of elements for a fair comparison
nd_a = np.arange(10000000)
nd_b = np.arange(10000000)
t0 = time()
nd_c = nd_a * nd_b
t1 = time()
print(f'NumPy element-wise product: {t1 - t0:.4f} seconds')
Python element-wise product: 0.6340 seconds
NumPy element-wise product: 0.010 seconds
Scalar product
import numpy as np
from time import time

# Pure Python
a = list(range(10000000))
b = list(range(10000000))
t0 = time()
c = sum(x * y for x, y in zip(a, b))
t1 = time()
print(f'Python scalar product: {t1 - t0:.4f} seconds')

# NumPy: same number of elements for a fair comparison
nd_a = np.arange(10000000)
nd_b = np.arange(10000000)
t0 = time()
nd_c = nd_a.dot(nd_b)
t1 = time()
print(f'NumPy scalar product: {t1 - t0:.4f} seconds')
An example of output:
Python scalar product: 1.1081 seconds
NumPy scalar product: 0.0010 seconds
The code here is very similar to the element-wise product, yet the performance gap widens further: the pure Python version slows down because the reduction feeds sum() one boxed integer at a time through a generator, while NumPy's dot() delegates the entire multiply-and-accumulate to an optimized, compiled routine.
Under the hood
Low-level C and Fortran optimization
This data science and linear algebra framework uses low-level C and Fortran implementations to optimize operations like matrix multiplication. For instance, when storing a matrix in memory you may choose between row-major (C-like) and column-major (Fortran-like) order. Another example of improvement concerns the nested i, j, and k loops in the matrix-multiplication code above: with row-major storage, swapping the j loop and the k loop means the inner loop walks along rows of the second matrix, so consecutive accesses fall within the same cache line. If instead you step down the columns of the second matrix, the processor has to fetch data from memory on nearly every access, which is far less efficient than adding a[1][1]*b[1][1] to c[1][1], then a[1][1]*b[1][2] to c[1][2], and so on up to a[1][1]*b[1][n] added to c[1][n], reading b row by row.
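The loop swap described above can be sketched as follows (the function name matmul_ikj is mine, introduced for illustration; it assumes square matrices like the snippet earlier):

```python
# Cache-friendly loop order for row-major (C-style) nested lists:
# with i-k-j ordering, the inner loop walks b[k] and c[i] along rows,
# which are contiguous in memory, instead of jumping down a column of b.
def matmul_ikj(a, b):
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]      # loaded once per inner loop
            row_b = b[k]       # a whole contiguous row of b
            row_c = c[i]       # the matching row of the result
            for j in range(n):
                row_c[j] += aik * row_b[j]
    return c
```

The result is identical to the i-j-k version; only the memory access pattern changes, which is exactly the kind of optimization NumPy's compiled kernels apply far more aggressively.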
Iterators
One of the inefficiencies of a high-level language like regular Python is its iteration model. Every value produced by a loop or comprehension is a full Python object that must be allocated, reference-counted, and dispatched through the interpreter, whereas NumPy stores its elements as raw machine numbers in a single contiguous buffer and iterates over them in C.
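A rough way to see this per-object overhead is to compare memory footprints; the exact byte counts vary by interpreter, so treat the numbers as indicative only:

```python
import sys
import numpy as np

n = 1_000_000
py_list = list(range(n))
nd = np.arange(n, dtype=np.int64)

# A Python list stores pointers to individually boxed int objects;
# a NumPy array stores the raw 8-byte integers in one contiguous block.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = nd.nbytes
print(list_bytes, array_bytes)  # the list needs several times more memory
```

Less memory per element also means more elements per cache line, which compounds the speed advantage of the compiled loops.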
Parallel computing
As the code snippets above show, there is a huge performance gap between the two approaches. Much of it comes from vectorization alone, and for operations such as dot products NumPy can additionally dispatch to optimized BLAS libraries that exploit SIMD instructions and multiple cores, which NumPy brings to the table for any data scientist, data analyst, or machine learning developer.
Numpy in action
Constructors
- Constructors based on other structures
import numpy as np

if __name__ == "__main__":
    li = [[1, 2, 3], [4, 5, 6]]
    list_based_nd = np.array(li)  # 2-d array
    print(list_based_nd)
    # [[1 2 3]
    #  [4 5 6]]

    t = (13, 42, 57)
    tuple_based_nd = np.array(t)  # 1-d array
    print(tuple_based_nd)
    # [13 42 57]

    n = 13
    int_based_nd = np.array(n)  # 0-d array; basically a scalar
    print(int_based_nd)
    # 13

    new_li = [li[0], t, li[1]]
    nested_nd = np.array(new_li)  # 2-d array built from nested lists and tuples
    print(nested_nd)
    # [[ 1  2  3]
    #  [13 42 57]
    #  [ 4  5  6]]

    ones_matrix = np.ones((3, 3), np.int16)  # 3x3 array of 2-byte short ints
    print(ones_matrix)
    # [[1 1 1]
    #  [1 1 1]
    #  [1 1 1]]

    zero_matrix = np.zeros((2, 2), np.float64)  # 2x2 matrix of 8-byte floats
    print(zero_matrix)
    # [[0. 0.]
    #  [0. 0.]]
The n-dimensional arrays may go on and on using the built-in constructors based on lists, tuples, or nested lists and tuples that encapsulate each other recursively.
Reshaping
You can easily reshape and manipulate a NumPy array: change its data type, change its shape tuple, or fill it with specified values or even special patterns.
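A small sketch of these reshaping and filling operations (the variable names are mine, chosen for illustration):

```python
import numpy as np

a = np.arange(12)                # the integers 0..11 as a 1-D array
m = a.reshape(3, 4)              # view the same data as a 3x4 matrix
t = m.T                          # transposed view, shape (4, 3)
flat = m.ravel()                 # flatten back to 1-D
filled = np.full((2, 3), 7.5)    # 2x3 matrix filled with a given value
eye = np.eye(3, dtype=np.int16)  # 3x3 identity matrix as a special pattern
print(m.shape, t.shape, filled.dtype)  # (3, 4) (4, 3) float64
```

Note that reshape and T return views of the same underlying buffer rather than copies, which keeps these manipulations cheap.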
Rich built-in features
You have plenty of built-in linear algebra features, such as QR decomposition, eigenvalue and eigenvector extraction, Singular Value Decomposition, orthogonal decompositions, and more.
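A quick tour of a few of these np.linalg routines on a small symmetric matrix (the matrix itself is just an arbitrary example of mine):

```python
import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 2.0]])

q, r = np.linalg.qr(a)       # QR decomposition: a == q @ r
w, v = np.linalg.eigh(a)     # eigenvalues/eigenvectors of a symmetric matrix
u, s, vt = np.linalg.svd(a)  # Singular Value Decomposition: a == u @ diag(s) @ vt
print(w)  # the eigenvalues of [[2, 1], [1, 2]] are 1 and 3
```

All of these delegate to LAPACK under the hood, so they are both numerically robust and fast.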
Further reading
Some cool material that I found while browsing, and truly recommend for diving deeper, is listed below:
https://chelseatroy.com/2018/11/07/code-mechanic-numpy-vectorization/