Overview

Here are some notes I took while watching Keith Galli’s video providing an introduction to NumPy.

Colab Notebook

What is NumPy

  • A multi-dimensional array library

How are List different from NumPy?

  • Lists are very slow
    • Lists are dynamically typed
    • Lists need to store a lot more information to account for unfixed data types
      • Needs to keep track of the following information for single Integer
        • Size: 4 bytes
        • Reference Count: 8 bytes
        • Object Type: 8 bytes
        • Object Value: 8 bytes
      • Does not use contiguous memory
        • Different array elements are scattered in different parts of memory
  • NumPy is very fast
    • NumPy uses fixed types
      • Don’t need to do type checking
    • Default type is Int32 (4 bytes)
    • Faster to read less bytes of memory
    • Can specify specific data types (e.g. Int16, Int8)
    • Uses contiguous memory
      • Data for an array is in the same chunk of memory
      • faster to access
      • lower CPU overhead
      • Can leverage SIMD Vector Processing
        • Single Instruction Multiple Data
          • Can perform operations on all elements simultaneously
      • Effective CPU cache utilization
      • Lot’s more functionality
        • Example: array multiplication arrayA*arrayB

Applications of NumPy

  • MATLAB replacement
    • SciPy has even more mathematical capability
  • Plotting (Matplotlib)
  • Backend (Pandas, Digital Photography)
  • Machine Learning (Tensors)

Install NumPy

  • pip install numpy
  • conda install numpy

Import NumPy

import numpy as np

The Basics

Initialize a 1D array

# Initialize a 1D array
a = np.array([1,2,3])
a 
array([1, 2, 3])

Initialize a 2D array of floats

# Initialize a 2D array of floats
b = np.array([[9.0,8.0,7.0],[6.0,5.0,4.0]])
b
array([[9., 8., 7.],
    [6., 5., 4.]])

Get Dimension

# Get Dimension
a.ndim
1

Get Shape

# Get Shape
b.shape
(2, 3)

Get Type

# Get Type
a.dtype
dtype('int64')

Specify data type

# Specify data type
a = np.array([1,2,3], dtype='int16')
a.dtype
dtype('int16')

Get Size

# Get Size: the number of bytes per array element
a.itemsize
2 (for int16)

Get total size

# Get total size: number of elements times the number of bytes per element
a.size * a.itemsize
# or
a.nbytes
6 (for 3 int16 elements)

Accessing and Changing Arrays

Indexing - NumPy v1.13 Manual

a = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])
print(f'Values: {a}')
print(f'Shape: {a.shape}')

# Get a specific element [r, c]
a[1, 5]
Values: [[ 1  2  3  4  5  6  7]
         [ 8  9 10 11 12 13 14]]
Shape: (2, 7)
13

Get a specific row

# Get a specific row
a[0, :]
array([1, 2, 3, 4, 5, 6, 7])

Get a specific column

# Get a specific column
a[:, 2]
array([ 3, 10])
# Getting a little more fancy [startindex:endindex:stepsize]
a[0, 1:6:2]
# or
a[0, 1:-1:2]
array([2, 4, 6])

Change elements

# Change elements
a[1,5] = 20
a
array([[ 1,  2,  3,  4,  5,  6,  7],
      [ 8,  9, 10, 11, 12, 20, 14]])

Change column index 2

# Change column index 2 
a[:, 2] = 5
a
array([[ 1,  2,  5,  4,  5,  6,  7],
      [ 8,  9,  5, 11, 12, 20, 14]])

Change colum with two different numbers

# Change colum with two different numbers
# Needs to be the same shape as the part you want to modify
# Two elements in each column means a lenght of 2
a[:, 2] = [1,2]
a
array([[ 1,  2,  1,  4,  5,  6,  7],
      [ 8,  9,  2, 11, 12, 20, 14]])

3D Example

# 3D Example
b = np.array([[[1,2],[3,4]], [[5,6],[7,8]]])
b
array([[[1, 2],
       [3, 4]],

      [[5, 6],
       [7, 8]]])

Get specific element

# Get specific element (work outside in)
# [first_dim, second_dim, third_dim]
b[0, 1, 1]
4

Get Specific Element

# Get specific element (work outside in)
b[:,1,:]
array([[3, 4],
      [7, 8]])

Replace values

# Replace
# New value needs to be the same dimensions as what is being replaced
b[:,1,:] = [[9,9],[8,8]]
b
array([[[1, 2],
        [9, 9]],

       [[5, 6],
        [8, 8]]])

Initialize Different Types of Arrays

Array creation routines - NumPy v1.21 Manual

All 0s matrix

# All 0s matrix
print(f'1D: {np.zeros(5)}')
print(f'2D: {np.zeros((2,3))}')
print(f'3D: {np.zeros((2,3,4))}')
1D: [0. 0. 0. 0. 0.]
2D: [[0. 0. 0.]
 [0. 0. 0.]]
3D: [[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]

All 1s matrix

# All 1s matrix
print(f'1D: {np.ones(5)}')
print(f'2D: {np.ones((2,3))}')
print(f'3D: {np.ones((2,3,4))}')
1D: [1. 1. 1. 1. 1.]
2D: [[1. 1. 1.]
 [1. 1. 1.]]
3D: [[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]

Any number

# Any other number
np.full((2,2), 99)
array([[99, 99],
       [99, 99]])

Any other number with the same shape as another array

# Any other number (full_like)
# Use the same shape as the provided array
np.full_like(a, 55)
array([[55, 55, 55, 55, 55, 55, 55],
       [55, 55, 55, 55, 55, 55, 55]])

Random decimal numbers between 0 and 1

# Random decimal numbers between 0 and 1
# shape of (4,2)
np.random.rand(4,2)
array([[0.90796667, 0.18775268],
       [0.36853663, 0.82186396],
       [0.75724737, 0.09608278],
       [0.5953758 , 0.57110868]])

Random decimal number from shape

# Random decimal number from shape
np.random.random_sample(a.shape)
array([[0.96539982, 0.72943229, 0.10863575, 0.84796304, 0.09610215,
        0.88132328, 0.56848496],
       [0.27198747, 0.2295634 , 0.40931032, 0.99669531, 0.90768254,
        0.1626064 , 0.80310083]])

Random integer values

# Random integer values
# Max value (exclusive) and shape
np.random.randint(7, size=(3,3))
array([[6, 2, 0],
       [4, 2, 4],
       [4, 0, 4]])

Random integer values in a range

# Random integer values
# Range of values (exclusive) and shape
np.random.randint(4,7, size=(3,3))
array([[5, 4, 6],
       [4, 5, 6],
       [4, 6, 6]])

Identity matrix

# Identity matrix
np.identity(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Repeat array

# Repeat array
arr = np.array([1,2,3])
# Repeat arr 3 times element-wise
r1 = np.repeat(arr,3)
r1
array([1, 1, 1, 2, 2, 2, 3, 3, 3])

Repeat 2D array

# Repeat 2D array
arr = np.array([[1,2,3]])
# Repeat arr 3 times element-wise
r1 = np.repeat(arr,3, axis=0)
r1
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

Recreate this array

[1, 1, 1, 1, 1]
[1, 0, 0, 0, 1]
[1, 0, 9, 0, 1]
[1, 0, 0, 0, 1]
[1, 1, 1, 1, 1]
c = np.ones((5,5), dtype='int32')
c[1:-1, 1:-1] = 0
c[2,2] = 9
c
array([[1, 1, 1, 1, 1],
       [1, 0, 0, 0, 1],
       [1, 0, 9, 0, 1],
       [1, 0, 0, 0, 1],
       [1, 1, 1, 1, 1]], dtype=int32)

Be careful when copying arrays!!!

# Shallow copy
a = np.array([1,2,3])
b = a
b[0] = 100
a
array([100,   2,   3])
# Deep copy
a = np.array([1,2,3])
b = a.copy()
b[0] = 100
a
array([1, 2, 3])

Mathematics

Mathematical functions - NumPy v1.21 Manual

a = np.array([1,2,3,4])
a
array([1, 2, 3, 4])

Add

a + 2
array([3, 4, 5, 6])

Subtract

a - 2
array([-1,  0,  1,  2])

Multiply

a * 2
array([2, 4, 6, 8])

Divide

a / 2
array([0.5, 1. , 1.5, 2. ])

Shorthand

a += 2
a
array([3, 4, 5, 6])

Add Arrays

b = np.array([1,0,1,0])
a + b
array([4, 4, 6, 6])

Exponents

a ** 2
array([ 9, 16, 25, 36])

Sine

# Take the sin
np.sin(a)
array([ 0.14112001, -0.7568025 , -0.95892427, -0.2794155 ])

Cosine

# Take the cosine
np.cos(a)
array([-0.9899925 , -0.65364362,  0.28366219,  0.96017029])

Linear Algebra

Linear algebra (numpy.linalg) - NumPy v1.21 Manual

a = np.ones((2,3))
a
array([[1., 1., 1.],
       [1., 1., 1.]])
b = np.full((3,2),2)
b
array([[2, 2],
       [2, 2],
       [2, 2]])

Matrix multiplication

# Matrix multiplication
np.matmul(a,b)
array([[6., 6.],
       [6., 6.]])

Find the determinant

# Find the determinant
c = np.identity(3)
np.linalg.det(c)
1.0

Statistics

stats = np.array([[1,2,3],[4,5,6]])
stats
array([[1, 2, 3],
       [4, 5, 6]])

Get lowest value in array

# Get lowest value in array
np.min(stats)
1

Get lowest value in array along specific axis

# Get lowest value in array along specific axis
# axis=0: min values in each column
np.min(stats, axis=0)
array([1, 2, 3])

Get lowest value in array along specific axis

# Get lowest value in array along specific axis
# axis=1: min values in each row
np.min(stats, axis=1)
array([1, 4])

Get highest value in array

# Get highest value in array
np.max(stats)
6

Sum up values in array

# Sum up values in array
np.sum(stats)
21

Sum up values in array across axis

# Sum up values in array across axis
# axis=0: sum values in each column
np.sum(stats, axis=0)
array([5, 7, 9])

Sum up values in array across axis

# Sum up values in array across axis
# axis=1: sum values in each row
np.sum(stats, axis=1)
array([ 6, 15])

Reorganizing Arrays

Note: New shape needs to maintain the same number of values

before = np.array([[1,2,3,4],[5,6,7,8]])
print(before)
print(f'Shape: {before.shape}')
[[1 2 3 4]
 [5 6 7 8]]
Shape: (2, 4)

**Reshape from (2,4) to (8,1) **

# Reshape array
# Reshape from (2,4) to (8,1) 
after = before.reshape((8,1))
after
array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])

Reshape from (2,4) to (4,2)

# Reshape array
# Reshape from (2,4) to (4,2) 
after = before.reshape((4,2))
after
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

Reshape from (2,4) to (2,2,2)

# Reshape array
# Reshape from (2,4) to (2,2,2) 
after = before.reshape((2,2,2))
after
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

Vertically stacking vectors

# Vertically stacking vectors
v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])

np.vstack([v1,v2])
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Stack vectors multiple times

# Stack vectors multiple times
np.vstack([v1,v2,v2,v1])
array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [5, 6, 7, 8],
       [1, 2, 3, 4]])

Horizontal stacks

# Horizontal stacks
np.hstack([v1, v2])
array([1, 2, 3, 4, 5, 6, 7, 8])

Combining Horizontal and Vertical Stacks

np.hstack([np.vstack([v1,v2,v2,v1]), np.vstack([v1,v2,v2,v1])])
array([[1, 2, 3, 4, 1, 2, 3, 4],
       [5, 6, 7, 8, 5, 6, 7, 8],
       [5, 6, 7, 8, 5, 6, 7, 8],
       [1, 2, 3, 4, 1, 2, 3, 4]])

Miscellaneous

Load data from text file

# Load data from text file
# Pass in file path and the delimiter character that separates values
# Casts values to float
filedata = np.genfromtxt('data.txt', delimiter=',')

Cast array values to specific type

# Cast array values to specific type
filedata = filedata.astype('int32')

Boolean Masking and Advanced Indexing

stats = np.array([[10,2,3],[-4,5,6]])
stats
array([[10,  2,  3],
       [-4,  5,  6]])

Boolean mask for values greater than 3

# Boolean mask for values greater than 3
stats > 3
array([[ True, False, False],
       [False,  True,  True]])

Index array using a boolean mask

# Index array using a boolean mask
stats[stats > 3]
array([10,  5,  6])

Index with a list

# Index with a list
a = np.array([1,2,3,4,5,6,7,8,9])
# List of indices
a[[1,2,8]]
array([2, 3, 9])

Check if any values in array return true for a boolean

# Check if any values in array return true for a boolean
np.any(a > 3, axis=0)
True

Check if all values in array return true for a boolean

# Check if all values in array return true for a boolean
np.all(a > 3, axis=0)
False

Use multiple conditions

# Use multiple conditions
((a > 3) & (a < 7))
array([False, False, False,  True,  True,  True, False, False, False])

Use multiple conditions with negation

# Use multiple conditions with negation
(~((a > 3) & (a < 7)))
array([ True,  True,  True, False, False, False,  True,  True,  True])

Test Array

test_array = np.arange(36).reshape(6, -1)
test_array
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

Range of indices

# 2:4: range of rows to index
# 0:2: range of columns to index 
test_array[2:4, 0:2]
array([[12, 13],
       [18, 19]])

List of indices

# [0,1,2,3,4]: list of rows
# [1,2,3,4,5]: list of indexes for each row
test_array[[0,1,2,3,4], [1,2,3,4,5]]
array([ 1,  8, 15, 22, 29])

Combine range and list of indices

# [0,4,5]: The list of rows
# Columns 3 and later
test_array[[0,4,5], 3:]
array([[ 3,  4,  5],
       [27, 28, 29],
       [33, 34, 35]])

References: