NumPy, short for "Numerical Python", is an open-source Python library used for scientific computing and data analysis. It provides support for large, multi-dimensional arrays and matrices, as well as a large collection of mathematical functions to operate on these arrays. NumPy is an essential tool for any Python programmer working with data, and it is widely used in fields such as data science, machine learning, and scientific computing.
NumPy's predecessor, Numeric, was first released in 1995 by Jim Hugunin. Travis Oliphant later combined Numeric with features of the competing Numarray package to create NumPy, and NumPy 1.0 was released in 2006. NumPy is now maintained by a team of developers and has a large and active community of users.
One of the main advantages of NumPy is its ability to handle large datasets efficiently. The library is designed to work with arrays of any dimensionality, and provides efficient algorithms for manipulating these arrays. NumPy also provides a range of mathematical functions that operate on these arrays, making it an ideal tool for scientific computing.
Installing NumPy
Before we begin, we need to make sure that NumPy is installed on our system. To install NumPy, we can use the following command in the terminal:
pip install numpy
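To check that the installation worked, a short script along these lines can be run. It is only a minimal smoke test and assumes nothing beyond NumPy importing under its conventional alias np.

```python
import numpy as np

# Print the installed version to confirm the import works
print(np.__version__)

# Tiny smoke test: build an array and sum it
check = np.array([1, 2, 3]).sum()
print(check)  # 6
```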
Creating NumPy Arrays
The core of NumPy is the ndarray, or n-dimensional array. An ndarray is a collection of elements of the same type that can be accessed and manipulated efficiently. NumPy provides several functions for creating ndarrays, including:
- numpy.array(): Convert Python lists or tuples to ndarrays.
- numpy.zeros(): Create an array filled with zeros.
- numpy.ones(): Create an array filled with ones.
- numpy.full(): Create an array filled with a specific value.
- numpy.arange(): Create an array with a range of values.
- numpy.linspace(): Create an array with a specified number of evenly spaced values.
import numpy as np
# Create a one-dimensional array
a = np.array([1, 2, 3, 4, 5])
print(a)
# Create a two-dimensional array
b = np.array([[1, 2, 3], [4, 5, 6]])
print(b)
# Create an array filled with zeros
c = np.zeros((3, 4))
print(c)
# Create an array filled with ones
d = np.ones((2, 2))
print(d)
# Create an array filled with a specified value
e = np.full((3, 3), 7)
print(e)
# Create an array with a range of values
f = np.arange(0, 10, 2)
print(f)
# Create an array with evenly spaced values
g = np.linspace(0, 1, 5)
print(g)
# Create an array with random values between 0 and 1
h = np.random.rand(3, 3)
print(h)
# Create an array with random values from a normal distribution
i = np.random.randn(2, 2)
print(i)
# OUTPUT (the two random arrays will differ from run to run)
# [1 2 3 4 5]
# [[1 2 3]
# [4 5 6]]
# [[0. 0. 0. 0.]
# [0. 0. 0. 0.]
# [0. 0. 0. 0.]]
# [[1. 1.]
# [1. 1.]]
# [[7 7 7]
# [7 7 7]
# [7 7 7]]
# [0 2 4 6 8]
# [0. 0.25 0.5 0.75 1. ]
# [[0.13732157 0.38519747 0.94311832]
# [0.20336453 0.18802563 0.83540154]
# [0.31365463 0.77575852 0.75865154]]
# [[-0.83527489 0.41275766]
# [-0.0156634 -0.22911079]]
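Because every ndarray holds elements of a single type, it can help to inspect an array's shape and dtype after creating it. The following is a small sketch of that idea; the variable names mixed and zeros_int are illustrative and not part of the example above.

```python
import numpy as np

# Every ndarray has a number of dimensions and a shape
a = np.array([1, 2, 3, 4, 5])
b = np.array([[1, 2, 3], [4, 5, 6]])
print(a.ndim, a.shape)   # 1 (5,)
print(b.ndim, b.shape)   # 2 (2, 3)

# Mixing integers and floats upcasts every element to float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)       # float64

# The dtype can also be requested explicitly
zeros_int = np.zeros((2, 2), dtype=np.int32)
print(zeros_int.dtype)   # int32
```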
Manipulating NumPy Arrays:
Once an ndarray is created, it can be manipulated using a variety of functions. NumPy provides several functions for manipulating arrays, including:
- Indexing and Slicing: Accessing elements or a subset of an array.
- Reshaping: Changing the shape of an array.
- Concatenation: Joining multiple arrays together.
- Splitting: Dividing an array into multiple smaller arrays.
import numpy as np
# Create an array
a = np.array([[1, 2], [3, 4]])
# Reshape the array
b = a.reshape(1, 4)
print(b)
# Transpose the array
c = a.transpose()
print(c)
# Flatten the array
d = a.flatten()
print(d)
# Concatenate two arrays
e = np.array([[5, 6]])
f = np.concatenate((a, e), axis=0)
print(f)
# Split an array
g = np.array_split(f, 2, axis=0)
print(g)
# Sort an array
h = np.array([3, 1, 4, 2, 5])
i = np.sort(h)
print(i)
# Filter an array
j = np.array([1, 2, 3, 4, 5])
k = j[j > 3]
print(k)
# Perform element-wise multiplication
l = np.array([[1, 2], [3, 4]])
m = np.array([[5, 6], [7, 8]])
n = l * m
print(n)
# Perform matrix multiplication
o = np.dot(l, m)
print(o)
# OUTPUT
# [[1 2 3 4]]
# [[1 3]
# [2 4]]
# [1 2 3 4]
# [[1 2]
# [3 4]
# [5 6]]
# [array([[1, 2],
# [3, 4]]), array([[5, 6]])]
# [1 2 3 4 5]
# [4 5]
# [[ 5 12]
# [21 32]]
# [[19 22]
# [43 50]]
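Indexing and slicing, the first item in the list above, can be sketched as follows. The 3x3 array and the names first_row, last_col, and block are chosen purely for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Single element: row 1, column 2 (indices start at 0)
print(a[1, 2])        # 6

# First row and last column
first_row = a[0]      # [1 2 3]
last_col = a[:, -1]   # [3 6 9]
print(first_row, last_col)

# Sub-block: rows 0-1, columns 1-2
block = a[:2, 1:]
print(block)
# [[2 3]
#  [5 6]]

# Basic slices are views: writing to the slice changes the original
block[0, 0] = 20
print(a[0, 1])        # 20
```

Since basic slices share memory with the original array, calling .copy() on a slice is the usual way to get an independent array.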
Mathematical Operations on NumPy Arrays:
Basic Operations:
import numpy as np
# Create two arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
# Addition
c = a + b
print(c)
# Subtraction
d = b - a
print(d)
# Multiplication
e = a * b
print(e)
# Division
f = b / a
print(f)
# Exponentiation
g = np.array([[2, 3], [4, 5]])
h = np.array([[3, 2], [1, 4]])
i = np.power(g, h)
print(i)
# OUTPUT
# [[ 6 8]
# [10 12]]
# [[4 4]
# [4 4]]
# [[ 5 12]
# [21 32]]
# [[5. 3. ]
# [2.33333333 2. ]]
# [[ 8 9]
# [ 4 625]]
Aggregation Functions:
import numpy as np
# Create an array
a = np.array([[1, 2], [3, 4]])
# Sum
b = np.sum(a)
print("Sum:", b)
# Mean
c = np.mean(a)
print("Mean:", c)
# Standard deviation
d = np.std(a)
print("Standard Deviation:", d)
# Computing on a specific axis
e = np.array([[1, 2], [3, 4], [5, 6]])
print("Original Array:")
print(e)
# Sum along axis 0 (one total per column)
f = np.sum(e, axis=0)
print("Sum of Each Column:")
print(f)
# Mean along axis 1 (one mean per row)
g = np.mean(e, axis=1)
print("Mean of Each Row:")
print(g)
# Standard deviation along axis 0 (one value per column)
h = np.std(e, axis=0)
print("Standard Deviation of Each Column:")
print(h)
# OUTPUT
Sum: 10
Mean: 2.5
Standard Deviation: 1.118033988749895
Original Array:
[[1 2]
[3 4]
[5 6]]
Sum of Each Column:
[ 9 12]
Mean of Each Row:
[1.5 3.5 5.5]
Standard Deviation of Each Column:
[1.63299316 1.63299316]
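The axis argument names the dimension that is collapsed by the aggregation, which is easy to get backwards. The sketch below reuses the 3x2 array from the example; the names col_sums, row_sums, and row_means are illustrative.

```python
import numpy as np

e = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

# axis=0 collapses the rows, leaving one value per column
col_sums = e.sum(axis=0)
print(col_sums, col_sums.shape)    # [ 9 12] (2,)

# axis=1 collapses the columns, leaving one value per row
row_sums = e.sum(axis=1)
print(row_sums, row_sums.shape)    # [ 3  7 11] (3,)

# keepdims=True keeps the collapsed axis with length 1
row_means = e.mean(axis=1, keepdims=True)
print(row_means.shape)             # (3, 1)
```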
Universal Functions:
import numpy as np
# Create an array
a = np.array([1, 2, 3, 4])
# Square root
b = np.sqrt(a)
print("Square root of", a, "is", b)
# Exponential
c = np.exp(a)
print("Exponential of", a, "is", c)
# Logarithm
d = np.log(a)
print("Logarithm of", a, "is", d)
# Trigonometric functions
e = np.sin(a)
print("Sine of", a, "is", e)
f = np.cos(a)
print("Cosine of", a, "is", f)
g = np.tan(a)
print("Tangent of", a, "is", g)
# OUTPUT
Square root of [1 2 3 4] is [1. 1.41421356 1.73205081 2. ]
Exponential of [1 2 3 4] is [ 2.71828183 7.3890561 20.08553692 54.59815003]
Logarithm of [1 2 3 4] is [0. 0.69314718 1.09861229 1.38629436]
Sine of [1 2 3 4] is [ 0.84147098 0.90929743 0.14112001 -0.7568025 ]
Cosine of [1 2 3 4] is [ 0.54030231 -0.41614684 -0.9899925 -0.65364362]
Tangent of [1 2 3 4] is [ 1.55740772 -2.18503986 -0.14254654 1.15782128]
Linear Algebra:
import numpy as np
# Create arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
# Matrix multiplication
c = np.matmul(a, b)
print("Matrix Multiplication:")
print(c)
# Matrix determinant
d = np.linalg.det(a)
print("Matrix Determinant of A:")
print(d)
# Inverse matrix
e = np.linalg.inv(a)
print("Inverse Matrix of A:")
print(e)
# Eigenvalues and eigenvectors
f, g = np.linalg.eig(a)
print("Eigenvalues of A:")
print(f)
print("Eigenvectors of A:")
print(g)
# OUTPUT
# Matrix Multiplication:
# [[19 22]
# [43 50]]
# Matrix Determinant of A:
# -2.0000000000000004
# Inverse Matrix of A:
# [[-2. 1. ]
# [ 1.5 -0.5]]
# Eigenvalues of A:
# [-0.37228132 5.37228132]
# Eigenvectors of A:
# [[-0.82456484 -0.41597356]
# [ 0.56576746 -0.90937671]]
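One way to sanity-check results like these is to verify the defining identities numerically. The following sketch assumes the same matrix a as above; the right-hand side rhs is a made-up example, not taken from the original program.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Each column of `vecs` is an eigenvector paired with the same-index eigenvalue
vals, vecs = np.linalg.eig(a)
for k in range(len(vals)):
    v = vecs[:, k]
    # a @ v should equal lambda * v up to floating-point rounding
    print(np.allclose(a @ v, vals[k] * v))   # True

# A matrix times its inverse gives the identity (within rounding)
print(np.allclose(a @ np.linalg.inv(a), np.eye(2)))   # True

# Solve a @ x = rhs directly instead of forming the inverse
rhs = np.array([5, 11])
x = np.linalg.solve(a, rhs)
print(np.allclose(x, [1, 2]))   # True
```

Comparisons use np.allclose rather than == because floating-point results such as the determinant -2.0000000000000004 are rarely exact.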
Broadcasting:
import numpy as np
# Create arrays
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([10, 20, 30])
# Addition with broadcasting
c = a + b
print("Addition with broadcasting:")
print(c)
# Multiplication with broadcasting
d = a * b
print("Multiplication with broadcasting:")
print(d)
# Broadcasting with scalar
e = a * 2
print("Broadcasting with scalar:")
print(e)
# Output:
# Addition with broadcasting:
# [[11 22 33]
# [14 25 36]
# [17 28 39]]
# Multiplication with broadcasting:
# [[ 10 40 90]
# [ 40 100 180]
# [ 70 160 270]]
# Broadcasting with scalar:
# [[ 2 4 6]
# [ 8 10 12]
# [14 16 18]]
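This program demonstrates broadcasting with addition, multiplication, and a scalar. Broadcasting lets NumPy combine arrays of different but compatible shapes: the shapes are compared dimension by dimension, starting from the last one, and two dimensions are compatible when they are equal or when one of them is 1. The smaller array is then treated as if it were repeated along the missing or length-1 dimensions, without the data actually being copied. Here the one-dimensional array b is combined with every row of a.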
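As a further sketch of how shapes are matched, the example below combines a column and a row, then shows a pair of shapes that cannot be broadcast. The names col, row, grid, and compatible are illustrative only.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)

# Shapes are compared from the right: (3, 1) and (3,) broadcast to (3, 3)
grid = col + row
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

# Incompatible trailing dimensions (3 vs 2) raise a ValueError
try:
    np.ones((3, 3)) + np.ones(2)
    compatible = True
except ValueError:
    compatible = False
print(compatible)   # False
```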
Performance Optimization:
Vectorization:
import numpy as np
# Create arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Vector addition
c = a + b
print("Vector Addition:")
print(c)
# Vector multiplication
d = a * b
print("Vector Multiplication:")
print(d)
# Vector dot product
e = np.dot(a, b)
print("Vector Dot Product:")
print(e)
# Vector cross product
f = np.cross(a, b)
print("Vector Cross Product:")
print(f)
# Output:
# Vector Addition:
# [5 7 9]
# Vector Multiplication:
# [ 4 10 18]
# Vector Dot Product:
# 32
# Vector Cross Product:
# [-3 6 -3]
This program demonstrates vectorized operations in NumPy: vector addition, element-wise multiplication, the dot product, and the cross product. Vectorization means expressing a computation as operations on whole arrays rather than as explicit Python loops over individual elements. The looping then happens inside NumPy's compiled code, which is usually much faster and also makes the program shorter and easier to read.
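To make the contrast concrete, here is a minimal sketch that computes the same result once with an explicit Python loop and once with a single array expression. The formula x**2 + 1 and the array size are arbitrary choices for illustration.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Loop version: one Python-level operation per element
looped = np.empty_like(values)
for idx in range(len(values)):
    looped[idx] = values[idx] ** 2 + 1.0

# Vectorized version: a single expression over the whole array
vectorized = values ** 2 + 1.0

# Both approaches produce the same numbers
print(np.allclose(looped, vectorized))   # True
```

Timing the two versions, for example with the standard timeit module, typically shows the vectorized form to be considerably faster, although the exact ratio depends on the machine and the array size.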
Cython:
# example.pyx
import numpy as np
cimport numpy as np
def add_arrays(np.ndarray[np.int_t, ndim=1] a, np.ndarray[np.int_t, ndim=1] b):
    # Typed result buffer and loop index let Cython generate a plain C loop
    cdef np.ndarray[np.int_t, ndim=1] result = np.zeros_like(a)
    cdef Py_ssize_t i
    for i in range(len(a)):
        result[i] = a[i] + b[i]
    return result
# setup.py
# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize
import numpy
setup(
    ext_modules=cythonize("example.pyx"),
    include_dirs=[numpy.get_include()]
)
# main.py
import numpy as np
import example
# Create arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Call the Cython function
c = example.add_arrays(a, b)
# Print the result
print(c)
#Output:
[5 7 9]
This program demonstrates how to use Cython with NumPy to speed up a function that adds two arrays element-wise. Cython is a superset of Python that compiles code to C, and the type declarations in example.pyx let it turn the loop into plain C array accesses. The extension has to be built before main.py can import it, for example with python setup.py build_ext --inplace. The Cython function is then called from an ordinary Python script, and the result is printed to the console. For a simple element-wise addition like this one, the built-in a + b is already vectorized; Cython pays off mainly for loops that cannot be expressed with NumPy's array operations.
Advanced Topics:
Masked Arrays:
import numpy as np
from numpy import ma
# Create an array with missing values
a = np.array([1, 2, -999, 4, -999, 6])
# Create a mask for the missing values
mask = a == -999
# Create a masked array
b = ma.masked_array(a, mask)
# Print the masked array
print("Masked Array:")
print(b)
# Apply an operation to the masked array
c = b * 2
print("Result of operation on Masked Array:")
print(c)
# Output:
# Masked Array:
# [1 2 -- 4 -- 6]
# Result of operation on Masked Array:
# [2 4 -- 8 -- 12]
This program demonstrates how to use masked arrays in NumPy to handle missing or invalid data. A masked array pairs the data with a boolean mask that marks certain elements as invalid or missing, and those elements are skipped in operations on the array. In this example, we build a mask for the placeholder value -999 in the original array and use it to create a masked array. We then multiply the masked array by 2, and the masked entries stay masked in the result instead of being computed. This makes it easy to work with data that may contain missing or invalid values.
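The effect of the mask is easiest to see with an aggregation. The sketch below uses ma.masked_equal as a shortcut for building the mask by hand, which is a substitution for the two-step approach in the example above, and reuses the same -999 placeholder.

```python
import numpy as np
from numpy import ma

a = np.array([1, 2, -999, 4, -999, 6])

# masked_equal builds the mask and the masked array in one step
b = ma.masked_equal(a, -999)

# Aggregations skip the masked entries: (1 + 2 + 4 + 6) / 4
print(b.mean())     # 3.25
print(b.count())    # 4 unmasked values

# filled() returns a plain ndarray with a replacement for masked entries
print(b.filled(0))  # [1 2 0 4 0 6]
```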
Structured Arrays:
import numpy as np
# Define the data types for the structured array
dt = np.dtype([('name', 'S10'), ('age', np.int32), ('salary', np.float64)])
# Create an empty structured array with three elements
a = np.empty(3, dtype=dt)
# Fill the structured array with data
a['name'] = ['Alice', 'Bob', 'Charlie']
a['age'] = [25, 30, 35]
a['salary'] = [50000.0, 60000.0, 70000.0]
# Print the structured array
print("Structured Array:")
print(a)
# Access a single element of the structured array
print("Accessing a single element:")
print(a[1])
# Access a field of a single element
print("Accessing a field of a single element:")
print(a[1]['name'])
# Output:
# Structured Array:
# [(b'Alice', 25, 50000.) (b'Bob', 30, 60000.) (b'Charlie', 35, 70000.)]
# Accessing a single element:
# (b'Bob', 30, 60000.)
# Accessing a field of a single element:
# b'Bob'
This program demonstrates how to create a structured array in NumPy. In a structured array, every element is a record made up of named fields that can have different data types, similar to a row in a table or spreadsheet. In this example, we define the fields with np.dtype and then create an empty structured array with three elements. We then fill each field with data and print the array to the console. Because the name field uses the byte-string type 'S10', the names are printed as bytes such as b'Bob'. We also demonstrate how to access a single element of the structured array and how to access a field of a single element. Structured arrays make it convenient to work with simple tabular data without leaving NumPy.
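Fields can also be used for filtering, aggregating, and sorting. The following sketch rebuilds the same records with a 'U10' (Unicode) name field instead of 'S10' so the names print as ordinary strings; that change and the variable names people, over_28, and by_salary are assumptions made for illustration.

```python
import numpy as np

# 'U10' stores Unicode text (up to 10 characters) instead of bytes
dt = np.dtype([('name', 'U10'), ('age', np.int32), ('salary', np.float64)])
people = np.array(
    [('Alice', 25, 50000.0), ('Bob', 30, 60000.0), ('Charlie', 35, 70000.0)],
    dtype=dt,
)

# A field behaves like an ordinary array, so it can drive a boolean filter
over_28 = people[people['age'] > 28]
print(over_28['name'])            # ['Bob' 'Charlie']

# Aggregate a single field
print(people['salary'].mean())    # 60000.0

# Sort the records by a field, then reverse for descending salary
by_salary = np.sort(people, order='salary')[::-1]
print(by_salary['name'][0])       # Charlie
```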
Integration with Other Libraries:
Conclusion
NumPy is a powerful library that provides support for large, multi-dimensional arrays and matrices, as well as a wide range of mathematical functions to operate on them. In this blog post, we explored array creation, reshaping, concatenation and splitting, arithmetic operations, aggregation functions, universal functions, linear algebra, broadcasting, vectorization, Cython, masked arrays, and structured arrays. These operations are the building blocks for more advanced work in NumPy and are essential for scientific computing.