NumPy is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on them. This post is a cheat sheet for the NumPy library in Python. It covers the functions and submodules that come up day to day in data science and machine learning.
Merits of NumPy over Python lists:
An easy way to install NumPy is with pip. Create and activate a virtual environment (or activate an existing one), then run the install command:
pip install numpy
Once it is installed, import the library:
import numpy
NumPy is usually imported under the alias np, though any name you like will work:
import numpy as np
syntax: np.array(list_of_elements)
Creates a NumPy array from the list passed as the parameter. A nested list gives a multi-dimensional array.
var=np.array([[2,3],[3,2]])
print(var)
[[2 3]
[3 2]]
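syntax: np.arange(start, stop, step)
Works like range(), but returns a NumPy array of evenly spaced numbers in the specified range. With a single argument, as below, the values run from 0 up to (but not including) that number.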
var=np.arange(100)
print(var)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
96 97 98 99]
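Since np.arange mirrors range(), it may help to see the start, stop and step arguments together. The snippet below is a small sketch; the variable names evens and halves are just illustrative.

```python
import numpy as np

# np.arange(start, stop, step): stop itself is excluded, as with range()
evens = np.arange(0, 10, 2)
print(evens)        # [0 2 4 6 8]

# Unlike range(), a non-integer step is accepted
halves = np.arange(0, 2, 0.5)
print(halves)       # values 0.0, 0.5, 1.0, 1.5
```

For non-integer steps, np.linspace is often the safer choice, because the number of elements is stated explicitly instead of depending on floating-point rounding.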
syntax: np.diag(list_of_values)
Creates a square array with the values from the list on the main diagonal and zeros everywhere else.
var=np.diag(range(4))
print(var)
[[0 0 0 0]
[0 1 0 0]
[0 0 2 0]
[0 0 0 3]]
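np.diag also works in the other direction: given a 2-D array, it reads the diagonal back out. A short sketch of both uses, plus the optional k offset (the names matrix and shifted are illustrative):

```python
import numpy as np

matrix = np.diag([1, 2, 3])       # build a 3x3 diagonal matrix
print(np.diag(matrix))            # [1 2 3], the diagonal read back

# k=1 places the values one step above the main diagonal,
# so two values need a 3x3 array
shifted = np.diag([1, 2], k=1)
print(shifted.shape)              # (3, 3)
```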
syntax: np.linspace(value1, value2, number_of_elements)
Returns a NumPy array of number_of_elements equally spaced values from value1 to value2, with both endpoints included.
For example, with value1 = 1, value2 = 3 and 5 elements we get
[1, 1.5, 2, 2.5, 3]
var=np.linspace(1,2,5)
print(var)
[1. 1.25 1.5 1.75 2. ]
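The 1-to-3 example from the description can be checked directly. This sketch also shows two optional arguments, retstep and endpoint, which are part of np.linspace but were not covered above:

```python
import numpy as np

points = np.linspace(1, 3, 5)
print(points)                 # values 1.0, 1.5, 2.0, 2.5, 3.0

# retstep=True also returns the spacing between neighbouring values
points, step = np.linspace(1, 3, 5, retstep=True)
print(step)                   # 0.5

# endpoint=False leaves value2 out of the result
open_ended = np.linspace(1, 3, 4, endpoint=False)
print(open_ended)             # values 1.0, 1.5, 2.0, 2.5
```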
syntax: np.random.randn(dimensions)
Generates an array of the given dimensions filled with random numbers drawn from the standard normal distribution (mean 0, variance 1).
var=np.random.randn(2,2,3)
print(var)
[[[ 0.62417698 -0.53171972 -0.25723222]
[ 1.36357605 -0.43352711 0.19280647]]
[[ 0.36747904 1.20299115 0.95669774]
[ 0.13572886 0.4454223 -1.07577104]]]
syntax: np.random.normal(size=(dimensions))
Generates an array of random numbers drawn from a normal (Gaussian) distribution. With no other arguments the mean is 0 and the standard deviation is 1.
var=np.random.normal(size=(2,2))
print(var)
[[ 0.09112449 0.9688921 ]
[ 1.09627332 -1.0526527 ]]
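Random output changes on every run, which makes results hard to reproduce. One way around that, sketched here, is NumPy's Generator API (np.random.default_rng) with a fixed seed. The seed 42 and the loc and scale values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)            # arbitrary seed, chosen for the example
samples = rng.normal(loc=10.0, scale=2.0, size=(1000,))
print(samples.shape)                       # (1000,)
print(samples.mean(), samples.std())       # close to 10 and 2, not exactly equal

# The same seed gives the same numbers again
rng_again = np.random.default_rng(42)
repeat = rng_again.normal(loc=10.0, scale=2.0, size=(1000,))
print(np.array_equal(samples, repeat))     # True
```

Here loc is the mean and scale is the standard deviation; np.random.normal accepts the same two arguments.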
syntax: np.zeros((dimension))
Generates an array of the specified dimensions with every element equal to 0.
var=np.zeros((2,2))
print(var)
[[0. 0.]
[0. 0.]]
syntax: np.ones((dimension))
Generates an array of the specified dimensions with every element equal to 1.
var=np.ones((2,2))
print(var)
[[1. 1.]
[1. 1.]]
syntax: np.identity(dimension)
Generates a square identity matrix with the specified number of rows and columns.
var=np.identity(2)
print(var)
[[1. 0.]
[0. 1.]]
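The outputs above contain floats (0. and 1.) because these constructors default to a floating-point dtype. The sketch below shows the optional dtype argument and checks the defining property of the identity matrix; the names int_zeros, ones and identity are illustrative.

```python
import numpy as np

# dtype overrides the default float64
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros)
print(int_zeros.dtype.kind)      # 'i' for a signed integer type

# Multiplying by the identity matrix leaves a matrix unchanged
ones = np.ones((2, 2))
identity = np.identity(2)
print(np.array_equal(ones @ identity, ones))   # True
```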
For the examples that follow, let var be:
var= np.array([[2,3],[4,5]])
print(var)
[[2 3]
[4 5]]
ndim gives the number of dimensions (axes) of the array
var.ndim
2
itemsize gives the size in bytes of each element of the array
var.itemsize
4
dtype gives the data type of the elements (int32 here; the default integer type depends on the platform and NumPy version)
var.dtype
dtype('int32')
size gives the total number of elements in the array
var.size
4
shape gives the dimensions of the array as a tuple
var.shape
(2, 2)
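These attributes are related: the memory taken by the data is size multiplied by itemsize, which NumPy exposes as nbytes. In this sketch the dtype is set to np.int32 explicitly (an assumption made so the numbers are the same on every platform).

```python
import numpy as np

# dtype fixed to int32 so that itemsize does not depend on the platform
var = np.array([[2, 3], [4, 5]], dtype=np.int32)

print(var.itemsize)                       # 4 bytes per element
print(var.size)                           # 4 elements
print(var.nbytes)                         # 16, i.e. size * itemsize

# Converting to a wider type changes itemsize, not size
wide = var.astype(np.float64)
print(wide.itemsize)                      # 8
```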
reshape changes the dimensions of the array to the desired ones, provided the new shape is compatible: the product of the dimensions must be the same before and after. For example, an array of shape (2,2) has 2x2=4 elements, so it can be reshaped into (1,4), since 1x4=4.
a=var.reshape((1,4))
print(a)
[[2 3 4 5]]
An incompatible shape raises a ValueError:
a=var.reshape((1,2))
print(a)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-20-62745c373b92> in <module>
----> 1 a=var.reshape((1,2))
2 print(a)
ValueError: cannot reshape array of size 4 into shape (1,2)
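Two things that may be handy with reshape: passing -1 for one dimension lets NumPy work out its value from the total size, and the ValueError can be caught like any other exception. A minimal sketch (the names row and column are illustrative):

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])

# -1 asks NumPy to infer that dimension from the number of elements
row = var.reshape((1, -1))
print(row.shape)          # (1, 4)
column = var.reshape((-1, 1))
print(column.shape)       # (4, 1)

# An incompatible shape raises ValueError, which can be handled
try:
    var.reshape((1, 2))
except ValueError as err:
    print("reshape failed:", err)
```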
np.sqrt finds the square root of each element of the array
print(np.sqrt(var))
[[1.41421356 1.73205081]
[2. 2.23606798]]
np.sin finds the sine of each element of the array (values are read as radians)
print(np.sin(var))
[[ 0.90929743 0.14112001]
[-0.7568025 -0.95892427]]
np.cos finds the cosine of each element of the array
print(np.cos(var))
[[-0.41614684 -0.9899925 ]
[-0.65364362 0.28366219]]
np.tan finds the tangent of each element of the array
print(np.tan(var))
[[-2.18503986 -0.14254654]
[ 1.15782128 -3.38051501]]
np.log finds the natural logarithm of each element of the array
print(np.log(var))
[[0.69314718 1.09861229]
[1.38629436 1.60943791]]
np.log10 finds the base-10 logarithm of each element of the array
print(np.log10(var))
[[0.30103 0.47712125]
[0.60205999 0.69897 ]]
np.exp finds the exponential (e raised to the power) of each element of the array
print(np.exp(var))
[[ 7.3890561 20.08553692]
[ 54.59815003 148.4131591 ]]
np.std returns the standard deviation of the elements of the array
print(np.std(var))
1.118033988749895
np.mean returns the mean of the elements of the array
print(np.mean(var))
3.5
np.var returns the variance of the elements of the array
print(np.var(var))
1.25
np.max returns the maximum value along the specified axis (axis=None means the whole array)
print(np.max(var,axis=None))
5
np.sum returns the sum of the elements along the specified axis (axis=None means the whole array)
print(np.sum(var,axis=None))
14
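The examples above only use axis=None. To make the axis argument concrete, here is a small sketch on the same var: axis=0 works down the columns and axis=1 works across the rows. The variable names are illustrative.

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])

col_max = np.max(var, axis=0)   # maximum of each column
row_max = np.max(var, axis=1)   # maximum of each row
print(col_max)                  # [4 5]
print(row_max)                  # [3 5]

col_sum = np.sum(var, axis=0)   # sum of each column
row_sum = np.sum(var, axis=1)   # sum of each row
print(col_sum)                  # [6 8]
print(row_sum)                  # [5 9]
```

The same axis argument is accepted by np.mean, np.std and np.var.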
syntax: np.vstack((arrays_to_stack_vertically))
Stacks all the arrays passed in the tuple vertically, one below the other.
a=np.zeros(shape=var.shape)
vertical=np.vstack((var,a))
print(vertical)
[[2. 3.]
[4. 5.]
[0. 0.]
[0. 0.]]
syntax: np.hstack((arrays_to_stack_horizontally))
Stacks all the arrays passed in the tuple horizontally, side by side.
a=np.zeros(shape=var.shape)
horizontal=np.hstack((var,a))
print(horizontal)
[[2. 3. 0. 0.]
[4. 5. 0. 0.]]
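A quick way to keep the two stacking functions apart is to look at the resulting shapes. The sketch below also compares them with np.concatenate, which for 2-D arrays gives the same result along axis 0 and axis 1 respectively.

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])
a = np.zeros(shape=var.shape)

vertical = np.vstack((var, a))
horizontal = np.hstack((var, a))
print(vertical.shape)      # (4, 2): rows are added
print(horizontal.shape)    # (2, 4): columns are added

# For 2-D arrays these match concatenation along axis 0 and axis 1
print(np.array_equal(vertical, np.concatenate((var, a), axis=0)))     # True
print(np.array_equal(horizontal, np.concatenate((var, a), axis=1)))   # True

# The integers in var are promoted to float to match a
print(vertical.dtype)      # float64
```

That promotion is why the stacked outputs show 2. and 3. instead of 2 and 3.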
syntax: np.ravel(array)
Flattens a NumPy array into a single one-dimensional row.
flat=np.ravel(var)
print(flat)
[2 3 4 5]
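One detail worth knowing: np.ravel returns a view of the original data when it can, while the flatten method always returns a copy. The following sketch shows the difference on a freshly created array like var (names such as flat_view are illustrative).

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])

flat_view = np.ravel(var)     # a view for a contiguous array like this one
flat_copy = var.flatten()     # always an independent copy

print(np.shares_memory(var, flat_view))   # True
print(np.shares_memory(var, flat_copy))   # False

# Writing through the view changes the original array
flat_view[0] = 99
print(var[0, 0])              # 99
print(flat_copy[0])           # 2, the copy is unaffected
```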
syntax: np.dot(array1,array2)
Finds the dot product of the two arrays passed as parameters. For 2-D arrays this is matrix multiplication.
a=np.array([[3,3],[5,2]])
dot_product=np.dot(var,a)
print(dot_product)
[[21 12]
[37 22]]
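The dot product is easy to confuse with the * operator, which multiplies element by element. A short sketch of the difference, using the @ operator as an equivalent of np.dot for 2-D arrays:

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])
a = np.array([[3, 3], [5, 2]])

matrix_product = var @ a      # same result as np.dot(var, a) for 2-D arrays
elementwise = var * a         # multiplies matching positions only

print(matrix_product)         # [[21 12]
                              #  [37 22]]
print(elementwise)            # [[ 6  9]
                              #  [20 10]]
```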
syntax: np.transpose(array)
Finds the transpose of the specified array, swapping its rows and columns.
np.transpose(var)
array([[2, 4],
[3, 5]])
syntax: np.linalg.inv(array)
Finds the inverse of the square matrix passed as the parameter.
np.linalg.inv(var)
array([[-2.5, 1.5],
[ 2. , -1. ]])
syntax: np.linalg.det(array)
Returns the determinant of the square matrix passed as the parameter.
np.linalg.det(var)
-2.0
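Because these results are floating-point, it is safer to check them with a tolerance than with ==. The sketch below verifies the inverse and the determinant of var that way. The singular matrix at the end is a made-up example of an input that has no inverse.

```python
import numpy as np

var = np.array([[2, 3], [4, 5]])

inverse = np.linalg.inv(var)
# A matrix times its inverse should give the identity, up to rounding
print(np.allclose(var @ inverse, np.identity(2)))   # True

det = np.linalg.det(var)
print(np.isclose(det, -2.0))                         # True

# Hypothetical singular matrix: the second row is twice the first
singular = np.array([[1, 2], [2, 4]])
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)
```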
So that's what you need to get started with everyday data science work. There are a lot of other features, but these can be learned as you progress on your journey. All the best!