Matrix Calculation with NumPy
Disclaimer: This post has been translated to English using a machine translation model. Please, let me know if you find any mistakes.
1. Summary
Let's take a brief look at the matrix calculation library NumPy. This library is designed for all kinds of matrix calculations, so we will focus only on the parts that are useful for understanding the calculations inside neural networks, and leave out other interesting topics such as using the library for linear algebra.
2. What is NumPy?
NumPy is a Python library designed for matrix calculations. Matrix calculations are widely used in science in general and in data science in particular, so it is necessary to have a library that does this very well.
Its name stands for Numerical Python. Its primary object is the ndarray, which encapsulates n-dimensional arrays of a homogeneous data type, unlike Python lists, which can hold data of different types.
NumPy aims to perform matrix calculations much faster than with Python lists, but how is this possible?
- NumPy uses compiled code, while Python uses interpreted code. The difference is that Python has to interpret, compile, and execute the code at runtime, while NumPy is already compiled, so it runs faster
- ndarrays have a fixed size, unlike Python lists, which are dynamic. If you want to change the size of an array in NumPy, a new one is created and the old one is deleted
- All elements of an ndarray are of the same data type, unlike Python lists, which can hold elements of different types
- Part of the NumPy code is written in C/C++ (much faster than Python)
- The data in an array is stored contiguously in memory, unlike Python lists, which makes it much faster to manipulate
NumPy lets you write code that is simple to write and read, but that runs as precompiled C, which makes it much faster.
Suppose we want to multiply two matrices element by element. In C this would be done in the following way
for (i = 0; i < rows; i++) {
    for (j = 0; j < columns; j++) {
        c[i][j] = a[i][j] * b[i][j];
    }
}
NumPy executes this kind of loop underneath, but lets us write it in a much simpler and more readable way
c = a * b
NumPy offers vectorized code, which means you don't have to write loops, but they are being executed underneath in optimized and precompiled C code. This has the following advantages:
- The code is easier to write and read
- Since fewer lines of code are needed, there is less likelihood of introducing errors
- The code is more similar to mathematical notation
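As a minimal sketch of what vectorization means in practice, the following compares an explicit double loop with the vectorized expression. The array values are illustrative assumptions, not taken from the post; both versions should produce the same result.

```python
import numpy as np

# Illustrative 2x2 inputs (values chosen only for this example)
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[10.0, 20.0], [30.0, 40.0]])

# Explicit loops, mirroring the C version
c_loop = np.empty_like(a)
rows, columns = a.shape
for i in range(rows):
    for j in range(columns):
        c_loop[i, j] = a[i, j] * b[i, j]

# Vectorized version: the loops run inside NumPy's compiled code
c_vec = a * b

print(np.array_equal(c_loop, c_vec))
```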
2.1. NumPy as np
NumPy is usually imported with the alias np
import numpy as np
print(np.__version__)
1.18.1
3. Speed of NumPy
As explained, NumPy performs calculations much faster than Python lists. Let's see an example in which the element-wise product of two matrices is computed using Python lists and ndarrays.
from time import time

# Dimension of the matrices
dim = 1000
shape = (dim, dim)

# Create two NumPy ndarrays of size dim x dim
ndarray_a = np.ones(shape=shape)
ndarray_b = np.ones(shape=shape)

# Create two Python lists of size dim x dim from the ndarrays
list_a = list(ndarray_a)
list_b = list(ndarray_b)

# Create the ndarray and the Python list where the results will be stored
ndarray_c = np.empty(shape=shape)
list_c = list(ndarray_c)

# Element-wise product of two Python lists
t0 = time()
for fila in range(dim):
    for columna in range(dim):
        list_c[fila][columna] = list_a[fila][columna] * list_b[fila][columna]
t = time()
t_listas = t - t0
print(f"Time to compute the element-wise product of two Python lists of size {dim}x{dim}: {t_listas:.4f} s")

# Element-wise product of two NumPy ndarrays
t0 = time()
ndarray_c = ndarray_a * ndarray_b
t = time()
t_ndarrays = t - t0
print(f"Time to compute the element-wise product of two NumPy ndarrays of size {dim}x{dim}: {t_ndarrays:.4f} s")

# Time comparison
print(f"\nThe calculation with Python lists takes {t_listas/t_ndarrays:.2f} times longer than with NumPy ndarrays")
Time to compute the element-wise product of two Python lists of size 1000x1000: 0.5234 s
Time to compute the element-wise product of two NumPy ndarrays of size 1000x1000: 0.0017 s

The calculation with Python lists takes 316.66 times longer than with NumPy ndarrays
4. Matrices in NumPy
In NumPy, a matrix is an ndarray object
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
[1 2 3 4 5]
<class 'numpy.ndarray'>
4.1. How to Create Matrices
With the array() function, ndarrays can be created from Python lists (as in the previous example) or tuples.
arr = np.array((1, 2, 3, 4, 5))
print(arr)
print(type(arr))
[1 2 3 4 5]
<class 'numpy.ndarray'>
With the zeros() function, you can create matrices filled with zeros
arr = np.zeros((3, 4))
print(arr)
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
The zeros_like(A) function returns an array with the same shape as the array A, but filled with zeros
A = np.array((1, 2, 3, 4, 5))
arr = np.zeros_like(A)
print(arr)
[0 0 0 0 0]
With the ones() function, you can create matrices filled with ones
arr = np.ones((4, 3))
print(arr)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
The ones_like(A) function returns a matrix with the same shape as matrix A, but filled with ones.
A = np.array((1, 2, 3, 4, 5))
arr = np.ones_like(A)
print(arr)
[1 1 1 1 1]
With the empty() function, we can create arrays with the desired dimensions, but their contents are not initialized: they hold whatever values happened to be in that memory.
arr = np.empty((6, 3))
print(arr)
[[4.66169180e-310 2.35541533e-312 2.41907520e-312]
 [2.14321575e-312 2.46151512e-312 2.31297541e-312]
 [2.35541533e-312 2.05833592e-312 2.22809558e-312]
 [2.56761491e-312 2.48273508e-312 2.05833592e-312]
 [2.05833592e-312 2.29175545e-312 2.07955588e-312]
 [2.14321575e-312 0.00000000e+000 0.00000000e+000]]
The empty_like(A) function returns an array with the same shape as the array A, but with uninitialized contents.
A = np.array((1, 2, 3, 4, 5))
arr = np.empty_like(A)
print(arr)
[4607182418800017408 4611686018427387904 4613937818241073152
 4616189618054758400 4617315517961601024]
With the arange(start, stop, step) function, you can create arrays within a specified range. It is similar to Python's range() function.
arr = np.arange(10, 30, 5)
print(arr)
[10 15 20 25]
When arange is used with floating-point arguments, it is generally not possible to predict the number of elements obtained, because floating-point precision is finite.
For this reason, it is usually better to use the linspace(start, stop, n) function, which takes the number of elements we want as an argument instead of the step size.
arr = np.linspace(0, 2, 9)
print(arr)
[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
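As a small sketch of how linspace lets us control the number of elements, the retstep and endpoint parameters of np.linspace can be used to recover the step size or to leave out the stop value. The ranges below are illustrative choices.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples
arr, step = np.linspace(0, 2, 9, retstep=True)
print(len(arr), step)

# endpoint=False excludes the stop value, similar to arange's half-open range
half_open = np.linspace(0, 2, 8, endpoint=False)
print(half_open)
```

In both cases the number of elements is exactly the one requested, regardless of floating-point rounding.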
Finally, if we want to create matrices with random numbers, we can use the random.rand function, passing the size of each dimension as a separate argument
arr = np.random.rand(2, 3)
print(arr)
[[0.32726085 0.65571767 0.73126697]
 [0.91938206 0.9862451  0.95033649]]
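For reproducible random matrices, a possible alternative is NumPy's Generator interface, available through np.random.default_rng. The sketch below assumes a seed of 0 purely for illustration; note that here the shape is passed as a tuple.

```python
import numpy as np

# A seeded generator produces the same numbers on every run
rng = np.random.default_rng(seed=0)
arr = rng.random((2, 3))          # uniform values in [0, 1)

# A second generator with the same seed reproduces the same matrix
rng_again = np.random.default_rng(seed=0)
arr_again = rng_again.random((2, 3))

print(arr.shape)
print(np.array_equal(arr, arr_again))
```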
4.2. Dimensions of matrices
In NumPy, we can create matrices of any dimension. To get the number of dimensions of an array, we use the ndim attribute.
Matrix of dimension 0, which is equivalent to a number
arr = np.array(42)
print(arr)
print(arr.ndim)
42
0
Matrix of dimension 1, which is equivalent to a vector
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(arr.ndim)
[1 2 3 4 5]
1
Matrix of dimension 2, which is equivalent to a matrix in the mathematical sense
arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10]])
print(arr)
print(arr.ndim)
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
2
Matrix of dimension 3
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
print(arr)
print(arr.ndim)
[[[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]
3
N-dimensional matrix. When creating ndarrays, the minimum number of dimensions can be set using the ndmin parameter.
arr = np.array([1, 2, 3, 4, 5], ndmin=6)
print(arr)
print(arr.ndim)
[[[[[[1 2 3 4 5]]]]]]
6
4.3. Size of the matrices
If instead of the number of dimensions of the matrix we want to see its size along each dimension, we can use the shape attribute.
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
print(arr.shape)
(2, 2, 5)
5. Data Type
The data types that NumPy arrays can store are the following:
- i - Integer
- b - Boolean
- u - Unsigned integer
- f - Float
- c - Complex float
- m - Timedelta
- M - DateTime
- O - Object
- S - String
- U - Unicode string
- V - Fixed memory fragment for another type (void)
We can check the data type of an array using dtype
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
int64
<U6
We can also create arrays with the data type we want by using the dtype parameter
arr = np.array([1, 2, 3, 4], dtype='i')
print("Integers:")
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='f')
print("\nFloat:")
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype=complex)  # complex128
print("\nComplex:")
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='S')
print("\nString:")
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='U')
print("\nUnicode string:")
print(arr)
print(arr.dtype)

arr = np.array([1, 2, 3, 4], dtype='O')
print("\nObject:")
print(arr)
print(arr.dtype)
Integers:
[1 2 3 4]
int32

Float:
[1. 2. 3. 4.]
float32

Complex:
[1.+0.j 2.+0.j 3.+0.j 4.+0.j]
complex128

String:
[b'1' b'2' b'3' b'4']
|S1

Unicode string:
['1' '2' '3' '4']
<U1

Object:
[1 2 3 4]
object
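The data type of an existing array can also be changed after creation. The following is a minimal sketch using the astype method, which returns a new array and leaves the original untouched; the values are illustrative.

```python
import numpy as np

arr = np.array([1.7, 2.2, 3.9])

# Converting float to integer truncates the decimal part
as_int = arr.astype(np.int32)

# Converting to boolean maps 0 to False and any other value to True
as_bool = np.array([0, 1, 2]).astype(bool)

print(as_int, as_int.dtype)
print(as_bool)
print(arr.dtype)  # the original array keeps its type
```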
6. Mathematical operations
6.1. Basic Operations
Matrix operations are performed element-wise. For example, if we add two matrices, the elements in the same position of each matrix are added, just as in the mathematical addition of two matrices.
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
print(f"Matrix A: size {A.shape}\n{A}\n")
print(f"Matrix B: size {B.shape}\n{B}\n")

C = A + B
print(f"Matrix C: size {C.shape}\n{C}\n")

D = A - B
print(f"Matrix D: size {D.shape}\n{D}")
Matrix A: size (3,)
[1 2 3]

Matrix B: size (3,)
[1 2 3]

Matrix C: size (3,)
[2 4 6]

Matrix D: size (3,)
[0 0 0]
However, if we multiply two matrices with *, the elements in the same position are multiplied as well (element-wise product), which is not the usual matrix product
A = np.array([[3, 5], [4, 1]])
B = np.array([[1, 2], [-3, 0]])
print(f"Matrix A: size {A.shape}\n{A}\n")
print(f"Matrix B: size {B.shape}\n{B}\n")

C = A * B
print(f"Matrix C: size {C.shape}\n{C}\n")
Matrix A: size (2, 2)
[[3 5]
 [4 1]]

Matrix B: size (2, 2)
[[ 1  2]
 [-3  0]]

Matrix C: size (2, 2)
[[  3  10]
 [-12   0]]
To perform the matrix product as it is taught in mathematics, you need to use the @ operator or the dot method.
A = np.array([[3, 5], [4, 1], [6, -1]])
B = np.array([[1, 2, 3], [-3, 0, 4]])
print(f"Matrix A: size {A.shape}\n{A}\n")
print(f"Matrix B: size {B.shape}\n{B}\n")

C = A @ B
print(f"Matrix C: size {C.shape}\n{C}\n")

D = A.dot(B)
print(f"Matrix D: size {D.shape}\n{D}")
Matrix A: size (3, 2)
[[ 3  5]
 [ 4  1]
 [ 6 -1]]

Matrix B: size (2, 3)
[[ 1  2  3]
 [-3  0  4]]

Matrix C: size (3, 3)
[[-12   6  29]
 [  1   8  16]
 [  9  12  14]]

Matrix D: size (3, 3)
[[-12   6  29]
 [  1   8  16]
 [  9  12  14]]
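The matrix product only works when the inner dimensions agree: an (m, n) matrix times an (n, p) matrix gives an (m, p) matrix. A short sketch of this rule, with shapes chosen only for illustration:

```python
import numpy as np

A = np.ones((3, 2))
B = np.ones((2, 4))

# (3, 2) @ (2, 4): inner dimensions match (2 and 2), result is (3, 4)
C = A @ B
print(C.shape)

# (2, 4) @ (3, 2): inner dimensions differ (4 and 3), NumPy raises ValueError
mismatch_detected = False
try:
    B @ A
except ValueError as error:
    mismatch_detected = True
    print(f"Cannot multiply: {error}")
```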
If instead of creating a new array you want to modify an existing one, you can use the operators +=, -= or *=
A = np.array([[3, 5], [4, 1]])
B = np.array([[1, 2], [-3, 0]])
print(f"Matrix A: size {A.shape}\n{A}\n")
print(f"Matrix B: size {B.shape}\n{B}\n")

A += B
print(f"Matrix A after addition: size {A.shape}\n{A}\n")

A -= B
print(f"Matrix A after subtraction: size {A.shape}\n{A}\n")

A *= B
print(f"Matrix A after multiplication: size {A.shape}\n{A}\n")
Matrix A: size (2, 2)
[[3 5]
 [4 1]]

Matrix B: size (2, 2)
[[ 1  2]
 [-3  0]]

Matrix A after addition: size (2, 2)
[[4 7]
 [1 1]]

Matrix A after subtraction: size (2, 2)
[[3 5]
 [4 1]]

Matrix A after multiplication: size (2, 2)
[[  3  10]
 [-12   0]]
Operations can be applied to all the elements of a matrix at once. This is thanks to a property called broadcasting, which we will discuss in more detail later.
A = np.array([[3, 5], [4, 1]])
print(f"Matrix A: size {A.shape}\n{A}\n")

B = A * 2
print(f"Matrix B: size {B.shape}\n{B}\n")

C = A ** 2
print(f"Matrix C: size {C.shape}\n{C}\n")

D = 2*np.sin(A)
print(f"Matrix D: size {D.shape}\n{D}")
Matrix A: size (2, 2)
[[3 5]
 [4 1]]

Matrix B: size (2, 2)
[[ 6 10]
 [ 8  2]]

Matrix C: size (2, 2)
[[ 9 25]
 [16  1]]

Matrix D: size (2, 2)
[[ 0.28224002 -1.91784855]
 [-1.51360499  1.68294197]]
6.2. Functions on Arrays
As you can see in the last calculation, NumPy provides functions that operate on arrays. There are many of them: mathematical functions, logical functions, linear algebra functions, etc. Below we show some.
A = np.array([[3, 5], [4, 1]])
print(f"A\n{A}\n")
print(f"exp(A)\n{np.exp(A)}\n")
print(f"sqrt(A)\n{np.sqrt(A)}\n")
print(f"cos(A)\n{np.cos(A)}\n")
A
[[3 5]
 [4 1]]

exp(A)
[[ 20.08553692 148.4131591 ]
 [ 54.59815003   2.71828183]]

sqrt(A)
[[1.73205081 2.23606798]
 [2.         1.        ]]

cos(A)
[[-0.9899925   0.28366219]
 [-0.65364362  0.54030231]]
There are also functions that return information about arrays, such as the mean
A = np.array([[3, 5], [4, 1]])
print(f"A\n{A}\n")
print(f"A.mean()\n{A.mean()}\n")
A
[[3 5]
 [4 1]]

A.mean()
3.25
However, we can obtain this information along each axis through the axis parameter: if it is 0, it is computed over each column, whereas if it is 1, it is computed over each row.
A = np.array([[3, 5], [4, 1]])
print(f"A\n{A}\n")
print(f"A.mean() columns\n{A.mean(axis=0)}\n")
print(f"A.mean() rows\n{A.mean(axis=1)}\n")
A
[[3 5]
 [4 1]]

A.mean() columns
[3.5 3. ]

A.mean() rows
[4.  2.5]
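The effect of the axis parameter is easier to see with a non-square matrix, because the shape of the result tells us which axis was collapsed. A small sketch with illustrative values, using sum and max instead of mean:

```python
import numpy as np

A = np.array([[3, 5, 7],
              [4, 1, 0]])     # shape (2, 3)

# axis=0 collapses the rows: one value per column, shape (3,)
col_sum = A.sum(axis=0)

# axis=1 collapses the columns: one value per row, shape (2,)
row_max = A.max(axis=1)

print(col_sum, col_sum.shape)
print(row_max, row_max.shape)
```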
6.3. Broadcasting
Matrix operations can be performed with matrices of different dimensions. In this case, NumPy detects it and stretches (broadcasts) the smaller matrix to match the larger one.
This is a great feature of NumPy, which allows calculations to be performed on arrays without having to worry about matching their dimensions.
A = np.array([1, 2, 3])
print(f"A\n{A}\n")

B = A + 5
print(f"B\n{B}\n")
A
[1 2 3]

B
[6 7 8]
A = np.array([1, 2, 3])
B = np.ones((3,3))
print(f"A\n{A}\n")
print(f"B\n{B}\n")

C = A + B
print(f"C\n{C}\n")
A
[1 2 3]

B
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

C
[[2. 3. 4.]
 [2. 3. 4.]
 [2. 3. 4.]]
A = np.array([1, 2, 3])
B = np.array([[1], [2], [3]])
print(f"A\n{A}\n")
print(f"B\n{B}\n")

C = A + B
print(f"C\n{C}\n")
A
[1 2 3]

B
[[1]
 [2]
 [3]]

C
[[2 3 4]
 [3 4 5]
 [4 5 6]]
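To make the stretching explicit, the sketch below uses np.broadcast_to to build the expanded arrays by hand and checks that the result matches the automatic broadcast. It also shows that shapes which cannot be matched raise an error. The arrays are the ones from the last example; the variable names are assumptions for illustration.

```python
import numpy as np

A = np.array([1, 2, 3])          # shape (3,)
B = np.array([[1], [2], [3]])    # shape (3, 1)

# Manually stretch both arrays to the common shape (3, 3)
A_stretched = np.broadcast_to(A, (3, 3))   # A repeated along the rows
B_stretched = np.broadcast_to(B, (3, 3))   # B repeated along the columns

print(np.array_equal(A + B, A_stretched + B_stretched))

# Shapes (3,) and (2,) cannot be matched, so NumPy raises ValueError
incompatible = False
try:
    np.array([1, 2, 3]) + np.array([1, 2])
except ValueError:
    incompatible = True
print(incompatible)
```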
7. Array Indexing
Array indexing is done the same way as with Python lists
arr = np.array([1, 2, 3, 4, 5])
arr[3]
4
When the array has more than one dimension, the index in each of them must be indicated.
arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10]])
arr[1, 2]
8
Negative indexing can be used
arr[-1, -2]
9
If the index of one of the axes is not given, that axis is taken in full.
arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10]])
arr[1]
array([ 6,  7,  8,  9, 10])
7.1. Slices of arrays
When indexing, we can select parts of arrays just as we do with Python lists.
Remember that it is done as follows:
start:stop:step
where the range goes from start (inclusive) to stop (exclusive) in increments of step. If step is not specified, it defaults to 1
For example, if we want the items of the second row from the second to the fourth column:
- We select the second row with a 1 (since counting starts from 0)
- We select from the second to the fourth column using 1:4: the 1 indicates the second column and the 4 indicates the fifth, which is where the slice ends without including that column. Both numbers take into account that counting starts from 0
print(arr)
print(arr[1, 1:4])
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
[7 8 9]
We can take from one position to the end
arr[1, 2:]
array([ 8, 9, 10])
From the beginning to a position
arr[1, :3]
array([6, 7, 8])
Set the range with negative numbers
arr[1, -3:-1]
array([8, 9])
Choose the step
arr[1, 1:4:2]
array([7, 9])
7.2. Iteration over Arrays
Iteration over multidimensional arrays is done with respect to the first axis
M = np.array([[[  0,   1,   2],
               [ 10,  12,  13]],
              [[100, 101, 102],
               [110, 112, 113]]])
print(f'Matrix of shape: {M.shape}\n')

i = 0
for fila in M:
    print(f'Row {i}: {fila}')
    i += 1
Matrix of shape: (2, 2, 3)

Row 0: [[ 0  1  2]
 [10 12 13]]
Row 1: [[100 101 102]
 [110 112 113]]
However, if what we want is to iterate over each item, we can use the flat attribute
i = 0
for fila in M.flat:
    print(f'Element {i}: {fila}')
    i += 1
Element 0: 0
Element 1: 1
Element 2: 2
Element 3: 10
Element 4: 12
Element 5: 13
Element 6: 100
Element 7: 101
Element 8: 102
Element 9: 110
Element 10: 112
Element 11: 113
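If the position of each item is also needed while iterating, one option is np.ndenumerate, which yields the index tuple together with the value. This is a minimal sketch with an illustrative 2x3 matrix; the visited list is only there to collect the results.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])

# ndenumerate yields (index_tuple, value) pairs in row-major order
visited = []
for index, value in np.ndenumerate(M):
    visited.append((index, int(value)))
    print(index, value)
```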
8. Copying Arrays
In NumPy we have two ways to copy arrays: copy, which makes a new copy of the array, and view, which makes a view of the original array.
The copy owns its data: any changes made to the copy will not affect the original array, and any changes made to the original array will not affect the copy.
The view does not own the data: any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
8.1. Copy
arr = np.array([1, 2, 3, 4, 5])
copy_arr = arr.copy()
arr[0] = 42
copy_arr[1] = 43
print(f'Original: {arr}')
print(f'Copy: {copy_arr}')
Original: [42  2  3  4  5]
Copy: [ 1 43  3  4  5]
8.2. View
arr = np.array([1, 2, 3, 4, 5])
view_arr = arr.view()
arr[0] = 42
view_arr[1] = 43
print(f'Original: {arr}')
print(f'View: {view_arr}')
Original: [42 43  3  4  5]
View: [42 43  3  4  5]
8.3. Data Owner
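When in doubt about whether we have a copy or a view, we can use the base attribute: it is None for an array that owns its data, and it points to the original array for a view
arr = np.array([1, 2, 3, 4, 5])
copy_arr = arr.copy()
view_arr = arr.view()
print(copy_arr.base)
print(view_arr.base)
None
[1 2 3 4 5]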
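The distinction matters in practice because basic slicing also returns a view, not a copy. The following sketch, with illustrative values, shows that writing to a slice changes the original array, and that calling copy on the slice avoids it.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Basic slicing returns a view of the original data
part = arr[1:4]
part[0] = 99
print(arr)                # the original array has changed as well
print(part.base is arr)   # the slice does not own its data

# An explicit copy of the slice is independent of the original
safe = arr[1:4].copy()
safe[0] = -1
print(arr)
```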
9. Shape of Matrices
We can find out the shape of a matrix using the shape attribute. It returns a tuple whose length is the number of dimensions of the matrix, and each element of the tuple indicates the number of items along the corresponding dimension.
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
print(arr)
print(arr.shape)
[[[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]
(2, 2, 5)
9.1. Reshape
We can change the shape of arrays to whatever we want using the reshape method.
For example, the previous array, which has a shape of (2, 2, 5), can be reshaped to (5, 4)
arr_reshape = arr.reshape(5, 4)
print(arr_reshape)
print(arr_reshape.shape)
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]]
(5, 4)
Keep in mind that to reshape a matrix, the new shape must have the same number of items as the original shape.
That is, in the previous example, the first matrix had 20 items (2x2x5), and the new matrix has 20 items (5x4). What we cannot do is reshape it to a matrix of shape (3, 4), as there would be a total of 12 items.
arr_reshape = arr.reshape(3, 4)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-29e85875d1df> in <module>()
----> 1 arr_reshape = arr.reshape(3, 4)

ValueError: cannot reshape array of size 20 into shape (3,4)
9.2. Unknown Dimension
If we want to change the shape of a matrix and we don't care about one of the dimensions, or we don't know it, we can have NumPy calculate it for us by passing -1 as that parameter
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
arr_reshape = arr.reshape(2, -1)
print(arr_reshape)
print(arr_reshape.shape)
[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]]
(2, 10)
Keep in mind that the known dimensions cannot be arbitrary numbers. The number of items in the original matrix must be a multiple of the product of the known dimensions.
In the previous example, the array has 20 items, which is a multiple of 2, the known dimension we passed. We could not have used 3 as a known dimension, since 20 is not a multiple of 3 and there is no number for the unknown dimension that would give a total of 20 items.
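The arithmetic behind the -1 can be sketched as follows: the unknown dimension is the total number of items divided by the known one. The variable names known and inferred are assumptions for this example.

```python
import numpy as np

arr = np.arange(1, 21)      # 20 items

# The unknown dimension is the total size divided by the known dimension
known = 2
inferred = arr.size // known
print(arr.reshape(known, -1).shape == (known, inferred))

# 20 is not a multiple of 3, so no value works for the unknown dimension
reshape_failed = False
try:
    arr.reshape(3, -1)
except ValueError as error:
    reshape_failed = True
    print(f"Cannot reshape: {error}")
```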
9.3. Array Flattening
We can flatten arrays, that is, convert them to a single dimension, using reshape(-1). In this way, regardless of the dimensions of the original array, the new one will always have a single dimension.
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
arr_flatten = arr.reshape(-1)
print(arr_flatten)
print(arr_flatten.shape)
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
(20,)
Another way to flatten an array is by using the ravel() method
arr = np.array([[[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]],
                [[11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 20]]])
arr_flatten = arr.ravel()
print(arr_flatten)
print(arr_flatten.shape)
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
(20,)
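NumPy also has a flatten() method, not covered above. As a sketch of the difference: ravel() returns a view when the memory layout allows it (as with the contiguous array below), while flatten() always returns a copy. np.shares_memory is used here only to check this.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

flat_view = arr.ravel()     # a view for this contiguous array
flat_copy = arr.flatten()   # always a new, independent array

print(np.shares_memory(arr, flat_view))
print(np.shares_memory(arr, flat_copy))

# Writing through the view changes the original; the copy is unaffected
flat_view[0] = 100
print(arr[0, 0], flat_copy[0])
```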
9.4. Transposed Matrix
You can obtain the transpose of a matrix using the T attribute. Transposing a matrix means swapping its rows and columns.
arr = np.array([[1, 0, 4],
                [0, 5, 0],
                [6, 0, -9]])
arr_t = arr.T
print(arr_t)
print(arr_t.shape)
[[ 1  0  6]
 [ 0  5  0]
 [ 4  0 -9]]
(3, 3)
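With a non-square matrix the swap of rows and columns is visible in the shape. The sketch below uses illustrative values and also shows that transposing a 1-dimensional array leaves it unchanged.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)

A_t = A.T                     # shape (3, 2)
print(A_t.shape)

# Element (i, j) of A becomes element (j, i) of its transpose
print(A_t[2, 0] == A[0, 2])

# A 1-dimensional array has a single axis, so T does nothing
v = np.array([1, 2, 3])
print(v.T.shape)
```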
10. Stacking Arrays
10.1. Vertical stacking
Matrices can be stacked vertically (joining rows) using the vstack() function.
a = np.array([[1, 1, 1],
              [2, 2, 2],
              [3, 3, 3]])
b = np.array([[4, 4, 4],
              [5, 5, 5],
              [6, 6, 6]])
c = np.vstack((a,b))
c
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4],
       [5, 5, 5],
       [6, 6, 6]])
If the matrices have more than 2 dimensions, vstack() stacks along the first dimension.
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
b = np.array([[[5, 5],
               [6, 6]],
              [[7, 7],
               [8, 8]]])
c = np.vstack((a,b))
c
array([[[1, 1],
        [2, 2]],

       [[3, 3],
        [4, 4]],

       [[5, 5],
        [6, 6]],

       [[7, 7],
        [8, 8]]])
10.2. Horizontal Stacking
Arrays can be stacked horizontally (joining columns) using the hstack() function
a = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])
b = np.array([[4, 5, 6],
              [4, 5, 6],
              [4, 5, 6]])
c = np.hstack((a,b))
c
array([[1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6]])
If the matrices have more than 2 dimensions, hstack() stacks along the second dimension
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
b = np.array([[[5, 5],
               [6, 6]],
              [[7, 7],
               [8, 8]]])
c = np.hstack((a,b))
c
array([[[1, 1],
        [2, 2],
        [5, 5],
        [6, 6]],

       [[3, 3],
        [4, 4],
        [7, 7],
        [8, 8]]])
Another way to add columns to a matrix is through the column_stack() function.
a = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])
b = np.array([4, 4, 4])
c = np.column_stack((a,b))
c
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
10.3. Stacking in depth
Arrays can be stacked in depth (third dimension) using the dstack() function
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
b = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
c = np.dstack((a,b))
print(f"c: {c}\n")
print(f"a.shape: {a.shape}, b.shape: {b.shape}, c.shape: {c.shape}")
c: [[[1 1 1 1]
  [2 2 2 2]]

 [[3 3 3 3]
  [4 4 4 4]]]

a.shape: (2, 2, 2), b.shape: (2, 2, 2), c.shape: (2, 2, 4)
If the matrices have more than 3 dimensions, dstack() still stacks along the third dimension
a = np.array([1, 2, 3, 4, 5], ndmin=4)
b = np.array([1, 2, 3, 4, 5], ndmin=4)
c = np.dstack((a,b))
print(f"a.shape: {a.shape}, b.shape: {b.shape}, c.shape: {c.shape}")
a.shape: (1, 1, 1, 5), b.shape: (1, 1, 1, 5), c.shape: (1, 1, 2, 5)
10.4. Custom Stacking
Using the concatenate() function, you can choose the axis along which you want to stack the matrices.
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
b = np.array([[[5, 5],
               [6, 6]],
              [[7, 7],
               [8, 8]]])
conc0 = np.concatenate((a,b), axis=0)  # concatenation along the first axis
conc1 = np.concatenate((a,b), axis=1)  # concatenation along the second axis
conc2 = np.concatenate((a,b), axis=2)  # concatenation along the third axis
print(f"conc0: {conc0}\n")
print(f"conc1: {conc1}\n")
print(f"conc2: {conc2}")
conc0: [[[1 1]
  [2 2]]

 [[3 3]
  [4 4]]

 [[5 5]
  [6 6]]

 [[7 7]
  [8 8]]]

conc1: [[[1 1]
  [2 2]
  [5 5]
  [6 6]]

 [[3 3]
  [4 4]
  [7 7]
  [8 8]]]

conc2: [[[1 1 5 5]
  [2 2 6 6]]

 [[3 3 7 7]
  [4 4 8 8]]]
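All the functions above join arrays along an axis that already exists. When the goal is to join arrays along a new axis, np.stack can be used instead. A brief sketch comparing the resulting shapes, with illustrative 1-dimensional inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# concatenate joins along an existing axis: the result is still 1-dimensional
joined = np.concatenate((a, b))

# stack creates a new axis: the result is 2-dimensional
stacked_rows = np.stack((a, b), axis=0)   # a and b become rows
stacked_cols = np.stack((a, b), axis=1)   # a and b become columns

print(joined.shape, stacked_rows.shape, stacked_cols.shape)
```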
11. Splitting Matrices
11.1. Split Vertically
Matrices can be split vertically (separating rows) using the vsplit() function
a = np.array([[1.1, 1.2, 1.3, 1.4],
              [2.1, 2.2, 2.3, 2.4],
              [3.1, 3.2, 3.3, 3.4],
              [4.1, 4.2, 4.3, 4.4]])
[a1, a2] = np.vsplit(a, 2)
print(f"a1: {a1}\n")
print(f"a2: {a2}")
a1: [[1.1 1.2 1.3 1.4]
 [2.1 2.2 2.3 2.4]]

a2: [[3.1 3.2 3.3 3.4]
 [4.1 4.2 4.3 4.4]]
If the matrices have more than 2 dimensions, vsplit() splits along the first dimension
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
[a1, a2] = np.vsplit(a, 2)
print(f"a1: {a1}\n")
print(f"a2: {a2}")
a1: [[[1 1]
  [2 2]]]

a2: [[[3 3]
  [4 4]]]
11.2. Split horizontally
Matrices can be split horizontally (separating columns) using the hsplit() function.
a = np.array([[1.1, 1.2, 1.3, 1.4],
              [2.1, 2.2, 2.3, 2.4],
              [3.1, 3.2, 3.3, 3.4],
              [4.1, 4.2, 4.3, 4.4]])
[a1, a2] = np.hsplit(a, 2)
print(f"a1: {a1}\n")
print(f"a2: {a2}")
a1: [[1.1 1.2]
 [2.1 2.2]
 [3.1 3.2]
 [4.1 4.2]]

a2: [[1.3 1.4]
 [2.3 2.4]
 [3.3 3.4]
 [4.3 4.4]]
If the arrays have more than 2 dimensions, hsplit() splits along the second dimension
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
[a1, a2] = np.hsplit(a, 2)
print(f"a1: {a1}\n")
print(f"a2: {a2}")
a1: [[[1 1]]

 [[3 3]]]

a2: [[[2 2]]

 [[4 4]]]
11.3. Custom Splitting
Using the array_split() function, you can choose the axis along which to split the arrays.
a = np.array([[[1, 1],
               [2, 2]],
              [[3, 3],
               [4, 4]]])
[a1_eje0, a2_eje0] = np.array_split(a, 2, axis=0)
[a1_eje1, a2_eje1] = np.array_split(a, 2, axis=1)
[a1_eje2, a2_eje2] = np.array_split(a, 2, axis=2)
print(f"a1_eje0: {a1_eje0}\n")
print(f"a2_eje0: {a2_eje0}\n")
print(f"a1_eje1: {a1_eje1}\n")
print(f"a2_eje1: {a2_eje1}\n")
print(f"a1_eje2: {a1_eje2}\n")
print(f"a2_eje2: {a2_eje2}")
a1_eje0: [[[1 1]
  [2 2]]]

a2_eje0: [[[3 3]
  [4 4]]]

a1_eje1: [[[1 1]]

 [[3 3]]]

a2_eje1: [[[2 2]]

 [[4 4]]]

a1_eje2: [[[1]
  [2]]

 [[3]
  [4]]]

a2_eje2: [[[1]
  [2]]

 [[3]
  [4]]]
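One more property of array_split() is that it accepts a number of parts that does not divide the array evenly, whereas np.split raises an error in that case. A small sketch with an illustrative array of 7 items:

```python
import numpy as np

arr = np.arange(7)

# 7 items into 3 parts: array_split makes the first parts one item longer
parts = np.array_split(arr, 3)
sizes = [len(part) for part in parts]
print(sizes)

# np.split requires an exact division and raises ValueError otherwise
split_failed = False
try:
    np.split(arr, 3)
except ValueError:
    split_failed = True
print(split_failed)
```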
12. Matrix Search
If you want to search for a value within a matrix, you can use the where() function, which returns the positions where the matrix has the value we are looking for.
arr = np.array([1, 2, 3, 4, 5, 4, 4])
ids = np.where(arr == 4)
ids
(array([3, 5, 6]),)
Any condition can be used to search. For example, we can find the positions where the values are even.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
ids = np.where(arr % 2 == 0)
ids
(array([1, 3, 5, 7]),)
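Besides returning positions, where() also accepts two extra arguments: the value to use where the condition is true and the value to use where it is false. A minimal sketch of this three-argument form, with illustrative data:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Choose a label for each element depending on the condition
labels = np.where(arr % 2 == 0, "even", "odd")
print(labels)

# Replace the values greater than 3 with 0 and keep the rest
clipped = np.where(arr > 3, 0, arr)
print(clipped)
```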
13. Sorting Arrays
Using the sort() function we can sort arrays
arr = np.array([3, 2, 0, 1])
arr_ordenado = np.sort(arr)
arr_ordenado
array([0, 1, 2, 3])
If what we have are strings, it sorts them alphabetically
arr = np.array(['banana', 'apple', 'cherry'])
arr_ordenado = np.sort(arr)
arr_ordenado
array(['apple', 'banana', 'cherry'], dtype='<U6')
And it also sorts boolean arrays
arr = np.array([True, False, True])
arr_ordenado = np.sort(arr)
arr_ordenado
array([False,  True,  True])
If the matrix has more than one dimension, each row is sorted independently: in a 2-dimensional matrix, the numbers in the first row are sorted among themselves and the numbers in the second row are sorted among themselves.
arr = np.array([[3, 2, 4], [5, 0, 1]])
arr_ordenado = np.sort(arr)
arr_ordenado
array([[2, 3, 4],
       [0, 1, 5]])
By default, sorting is done along the last axis (each row, in a 2-dimensional matrix), but if sorting along another dimension is desired, it must be specified through the axis parameter.
arr = np.array([[3, 2, 4], [5, 0, 1]])
arr_ordenado0 = np.sort(arr, axis=0)  # Sorted along the first dimension
arr_ordenado1 = np.sort(arr, axis=1)  # Sorted along the second dimension
print(f"arr_ordenado0: {arr_ordenado0}\n")
print(f"arr_ordenado1: {arr_ordenado1}\n")
arr_ordenado0: [[3 0 1]
 [5 2 4]]

arr_ordenado1: [[2 3 4]
 [0 1 5]]
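Sometimes we need the order itself rather than the sorted values, for example to reorder a second array in the same way. For that, np.argsort returns the indices that would sort the array. The names and scores below are made up for illustration.

```python
import numpy as np

scores = np.array([30, 10, 20])
names = np.array(['ana', 'bob', 'eva'])

# Indices that sort the scores in ascending order
order = np.argsort(scores)
print(order)

# The same indices reorder any array of the same length
print(scores[order])
print(names[order])
```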
14. Filters in Arrays
NumPy offers the possibility of selecting certain elements of an array and creating a new array with them.
It does this by using a matrix of boolean indices, that is, a matrix that indicates which positions of the original matrix we keep and which we do not.
Let's see an example of a boolean index array
arr = np.array([37, 85, 12, 45, 69, 22])
indices_booleanos = [False, False, True, False, False, True]
arr_filter = arr[indices_booleanos]
print(f"Original array: {arr}")
print(f"Boolean indices: {indices_booleanos}")
print(f"Filtered array: {arr_filter}")
Original array: [37 85 12 45 69 22]
Boolean indices: [False, False, True, False, False, True]
Filtered array: [12 22]
As you can see, the filtered array (arr_filter) has only kept the elements of the original array (arr) at the positions where the indices_booleanos array is True
We can also see that only the even elements have been kept, so now let's see how to keep the even elements of a matrix without having to write the boolean indices by hand as we did in the previous example.
arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10]])
indices_booleanos = arr % 2 == 0
arr_filter = arr[indices_booleanos]
print(f"Original array: {arr}\n")
print(f"Boolean indices: {indices_booleanos}\n")
print(f"Filtered array: {arr_filter}")
Original array: [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

Boolean indices: [[False  True False  True False]
 [ True False  True False  True]]

Filtered array: [ 2  4  6  8 10]
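Several conditions can be combined into a single filter with the element-wise operators & (and) and | (or). Each condition needs its own parentheses because these operators bind more tightly than the comparisons. A short sketch with illustrative values:

```python
import numpy as np

arr = np.array([37, 85, 12, 45, 69, 22])

# Keep the elements that are even OR greater than 60
even_or_large = (arr % 2 == 0) | (arr > 60)
print(arr[even_or_large])

# Keep the elements that are greater than 20 AND smaller than 50
in_range = (arr > 20) & (arr < 50)
print(arr[in_range])
```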