Building a Neural Network using PyTorch
Have you ever wondered how neural networks learn to make predictions or classify data? Neural networks serve as the foundation of artificial intelligence, allowing machines to recognize patterns and make informed decisions. In this tutorial, we will guide you through the process of constructing and training a simple neural network using PyTorch.
What are neural networks?
Neural networks are computational models inspired by the human brain’s structure and function. They are built from layers of interconnected artificial neurons that process and transform information through weighted connections. Each neuron receives inputs, applies mathematical transformations, and passes signals to connected neurons in subsequent layers. During training, the network automatically adjusts these connection weights through algorithms like backpropagation, learning to recognize patterns and make accurate predictions from data.
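To make the idea of a neuron concrete, here is a minimal sketch of a single artificial neuron computing a weighted sum of its inputs and applying an activation. The input, weight, and bias values are hypothetical and chosen only for illustration; ReLU is used as the activation because it appears later in this tutorial.

```python
import torch

# Hypothetical values, chosen only for illustration.
inputs = torch.tensor([1.0, 2.0])      # two input features
weights = torch.tensor([0.5, -0.25])   # one weight per input connection
bias = torch.tensor(0.1)

# Weighted sum of the inputs plus a bias term: 1*0.5 + 2*(-0.25) + 0.1
weighted_sum = torch.dot(inputs, weights) + bias

# ReLU activation: negative values become 0, positive values pass through.
activation = torch.relu(weighted_sum)

print(f"weighted sum: {weighted_sum.item():.4f}")  # 0.1000
print(f"activation:   {activation.item():.4f}")    # 0.1000
```

Training amounts to adjusting values like weights and bias so that the outputs move closer to the desired targets.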
Basic architecture
A neural network typically has three types of layers:
- Input Layer: Receives the data.
- Hidden Layers: Process the data using weights and biases.
- Output Layer: Produces the final prediction.
This architecture enables neural networks to tackle complex tasks like image recognition and language processing by developing sophisticated internal representations of the input data.
If you want a deeper understanding of how neural networks work, be sure to check out the "What are Neural Networks?" article.
PyTorch makes building and training these networks straightforward with its nn.Module class, which allows us to define custom architectures.
Build a neural network using PyTorch
To get started, we need to import the required libraries. PyTorch provides everything we need to build neural networks, define loss functions, and train models. Here’s how we can import the necessary modules:
import torch
import torch.nn as nn
import torch.optim as optim
Here, torch is the core PyTorch library. The torch.nn module provides tools to define and work with neural networks, while torch.optim offers various optimization algorithms for training models. With these in place, we’re ready to define our neural network.
Define the model
In PyTorch, a neural network is defined as a class that inherits from nn.Module. This allows us to define the network’s architecture and how data flows through it. Below is the implementation of a simple feedforward neural network:
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(2, 5)
        self.relu = nn.ReLU()  # Activation function
        self.fc2 = nn.Linear(5, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
In this network, the nn.Linear module represents a fully connected (dense) layer. The first layer, fc1, transforms an input of size 2 into a representation of size 5. The ReLU activation function is then applied to introduce non-linearity, which is essential for the network to learn complex patterns. The final layer, fc2, reduces the representation size to 1, generating the model’s output.
Now, let’s initialize this model and inspect its structure:
model = SimpleNN()
print(model)
When we run the code above, we’ll see the network’s architecture printed out, showing each layer and its configuration:
SimpleNN(
  (fc1): Linear(in_features=2, out_features=5, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=5, out_features=1, bias=True)
)
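As a rough check of the layer sizes described above, the sketch below pushes a random batch through two standalone layers with the same dimensions as fc1 and fc2 and counts their trainable parameters. The batch size of 3 and the random input are assumptions made for illustration.

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(2, 5)   # same sizes as fc1 in SimpleNN
fc2 = nn.Linear(5, 1)   # same sizes as fc2 in SimpleNN

x = torch.randn(3, 2)          # assumed batch: 3 samples, 2 features each
hidden = torch.relu(fc1(x))    # shape (3, 5), all values >= 0 after ReLU
out = fc2(hidden)              # shape (3, 1)

print(x.shape, hidden.shape, out.shape)

# Each Linear layer holds a weight matrix and a bias vector:
# fc1: 2*5 weights + 5 biases = 15, fc2: 5*1 weights + 1 bias = 6
all_params = list(fc1.parameters()) + list(fc2.parameters())
n_params = sum(p.numel() for p in all_params)
print(n_params)  # 21
```

Even this small network has 21 adjustable values, all of which the optimizer updates during training.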
Training the model
Training a neural network involves defining a loss function to measure how well the model is performing and an optimizer to adjust the model’s weights based on this loss. Here’s how we set them up:
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
The MSELoss() function calculates the average squared difference between predicted and actual values. Because the differences are squared, large errors are penalized far more heavily than small ones, which pushes the model to reduce its biggest discrepancies first.
For example, if the predicted output is far from the target value, MSELoss amplifies this difference, guiding the optimizer to make larger adjustments to the weights.
The SGD optimizer iteratively updates the weights to reduce the error, with the lr parameter controlling the learning rate, that is, the size of each update step.
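The following sketch computes the mean squared error by hand and compares it with nn.MSELoss. The prediction values are hypothetical and chosen so that one error is much larger than the others.

```python
import torch
import torch.nn as nn

# Hypothetical predictions and targets, chosen only for illustration.
predictions = torch.tensor([[4.0], [7.5], [6.0]])
targets = torch.tensor([[5.0], [7.0], [9.0]])

# Squared errors: (4-5)^2 = 1.0, (7.5-7)^2 = 0.25, (6-9)^2 = 9.0
squared_errors = (predictions - targets) ** 2
manual_mse = squared_errors.mean()   # (1.0 + 0.25 + 9.0) / 3

builtin_mse = nn.MSELoss()(predictions, targets)

print(squared_errors.flatten().tolist())  # [1.0, 0.25, 9.0]
print(torch.isclose(manual_mse, builtin_mse).item())  # True
```

Here the single error of 3 contributes 9.0 of the 10.25 total, which illustrates how squaring makes the largest error dominate the loss.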
- Preparing the Data
Let’s create some dummy data for our training:
inputs = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
targets = torch.tensor([[5.0], [7.0], [9.0]])
Here, inputs represent the input features, and targets are the corresponding output values we want the network to learn to predict.
- Training Loop
The training process involves multiple iterations (epochs) over the dataset. In each epoch, we compute the predictions, calculate the loss, perform a backward pass to compute gradients, and update the model’s weights. Let’s implement this:
for epoch in range(5):  # Training for 5 epochs
    optimizer.zero_grad()  # Clear previous gradients
    outputs = model(inputs)  # Forward pass
    loss = criterion(outputs, targets)  # Calculate loss
    loss.backward()  # Backward pass to compute gradients
    optimizer.step()  # Update weights
    print(f'Epoch [{epoch + 1}/5], Loss: {loss.item():.4f}')
During each epoch, the model makes predictions (outputs), calculates how far off they are from the actual values (loss), computes gradients to understand how to adjust weights (loss.backward()), and finally updates the weights (optimizer.step()).
Running this loop produces an output like:
Epoch [1/5], Loss: 53.3861
Epoch [2/5], Loss: 48.7173
Epoch [3/5], Loss: 41.8362
Epoch [4/5], Loss: 30.5981
Epoch [5/5], Loss: 15.2334
Notice how the loss decreases with each epoch as the model learns to make better predictions.
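To see what optimizer.step() does in the training loop, the sketch below applies one plain SGD update by hand (parameter minus learning rate times gradient) and compares the result with the optimizer. It uses a single nn.Linear(2, 1) layer as a stand-in model, and the names layer and manual_layer are introduced here only for illustration; this reflects basic SGD without momentum or weight decay.

```python
import copy
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
lr = 0.01

layer = nn.Linear(2, 1)               # a tiny stand-in model
manual_layer = copy.deepcopy(layer)   # identical starting weights

inputs = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
targets = torch.tensor([[5.0], [7.0], [9.0]])
criterion = nn.MSELoss()

# Version 1: let the optimizer apply the update.
optimizer = optim.SGD(layer.parameters(), lr=lr)
optimizer.zero_grad()
criterion(layer(inputs), targets).backward()
optimizer.step()

# Version 2: apply the same rule by hand: param <- param - lr * grad.
criterion(manual_layer(inputs), targets).backward()
with torch.no_grad():
    for p in manual_layer.parameters():
        p -= lr * p.grad

print(torch.allclose(layer.weight, manual_layer.weight))
print(torch.allclose(layer.bias, manual_layer.bias))
```

Both versions should end up with matching weights, which shows that the learning rate simply scales how far each parameter moves against its gradient.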
Evaluating the Model
After training, it’s time to evaluate the model’s performance on new, unseen data. Let’s test it with a new input:
test_data = torch.tensor([[4.0, 5.0]])
prediction = model(test_data)
print(f'Prediction: {prediction.item():.4f}')
The output might look like:
Prediction: 10.8516
This result indicates the model’s prediction for the input [4.0, 5.0]. While not perfect, it’s reasonably close to the expected value of 4 + 5 + 2 = 11 (based on the pattern in the training data).
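The expected value can be checked with a short sketch. It assumes the rule target = x1 + x2 + 2; the three training points are consistent with this rule, though they do not prove it is the only rule that fits.

```python
import torch

inputs = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
targets = torch.tensor([[5.0], [7.0], [9.0]])

# Assumed rule: target = x1 + x2 + 2
rule_outputs = inputs.sum(dim=1, keepdim=True) + 2
print(torch.equal(rule_outputs, targets))  # True

# Apply the same rule to the test input [4.0, 5.0].
test_data = torch.tensor([[4.0, 5.0]])
expected = test_data.sum(dim=1, keepdim=True) + 2
print(expected.item())  # 11.0
```

Comparing the model’s prediction with this value gives a simple measure of how well the network has picked up the relationship.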
Note: The outputs will change from run to run because the network’s weights are randomly initialized. Larger training setups often add further randomness, for example by shuffling the data between epochs.
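One common way to make runs repeatable is to fix the random seed before creating the model. The sketch below uses torch.manual_seed for this; the helper first_layer_weights and the seed values 42 and 7 are hypothetical and exist only for this demonstration.

```python
import torch
import torch.nn as nn

def first_layer_weights(seed):
    """Create a fresh Linear(2, 5) layer after seeding and return its weights."""
    torch.manual_seed(seed)
    return nn.Linear(2, 5).weight.detach().clone()

same_a = first_layer_weights(42)
same_b = first_layer_weights(42)
different = first_layer_weights(7)

print(torch.equal(same_a, same_b))      # same seed, same initial weights
print(torch.equal(same_a, different))   # different seed, different weights
```

With the seed fixed, the initial weights are identical across runs, so the training results in this tutorial become much easier to compare.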
Conclusion
In this tutorial, we’ve walked through building and training a simple neural network using PyTorch. Here are the key takeaways:
- PyTorch offers a robust framework for developing neural networks through the nn.Module class.
- The neural network architecture includes an input layer for data, hidden layers for processing with weights and biases, and an output layer for generating predictions.
- Training a neural network involves crucial steps: defining a loss function (like MSELoss) to measure model performance, choosing an optimizer (like SGD) to adjust weights, and iterating through multiple epochs to improve predictions.
- The training loop consists of a forward pass for predictions, followed by loss calculation, a backward pass for computing gradients, and weight updates via the optimizer. These components work together to progressively reduce the model’s error.
- Model evaluation is important after training, as testing with unseen data confirms whether the network has learned underlying patterns rather than just memorizing the training data.
Remember, this tutorial demonstrates only the basic concepts. Neural networks can be made much more complex to tackle sophisticated real-world problems. Take the free course Intro to PyTorch and Neural Networks to learn how to build more complex networks.
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'