Effective Transfer Learning - A Guide to Feature Extraction and Fine-Tuning Techniques
Transfer learning is a machine learning technique in which a model trained on one task is reused as the starting point for a related task. Because the model has already learned to recognize useful patterns in one dataset, it can be adapted to a different but related problem, learning faster and with less data than a model trained from scratch.
There are two main ways to perform transfer learning:
- Feature extraction: In this approach, you take a pre-trained model, remove the last layers (the ones responsible for making the final prediction), and add new layers on top. The pre-trained model has already learned useful features from its original training data, so by reusing these features you can train a new classifier with less data. This is useful when you have a small dataset and want to leverage the knowledge of a pre-trained model.
- Fine-tuning: In this approach, you take a pre-trained model, unfreeze some of the layers near the top of the network (the ones closest to the output), and retrain them together with the new output layers on your dataset. This allows the model to adapt its pre-trained features to the new data, which can lead to better performance on the new task. This is useful when you have a larger dataset and want to adjust the pre-trained model to work better for your specific task.
Feature Extraction using Keras
Here's an example of how you can use Keras to perform feature extraction using the ResNet50 model:
from keras.applications import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# Load the ResNet50 model with pre-trained ImageNet weights, without its classification head
base_model = ResNet50(weights='imagenet', include_top=False)

# Freeze all layers of the base model so their weights are not updated during training
for layer in base_model.layers:
    layer.trainable = False

# Pool the convolutional feature maps into a single vector, then add new layers for the output
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# Create a new model that takes the base_model's input and produces the new output layer
model = Model(inputs=base_model.input, outputs=predictions)

# Compile and train the model (X_train and y_train hold your images and one-hot labels)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32)
In this example, the ResNet50 model is loaded with pre-trained weights and all of its layers are frozen. A global average pooling layer condenses the convolutional output into a single feature vector, and a new fully connected classifier is added on top. Only these new layers are trained on the X_train data and y_train labels; the pre-trained features are reused as-is.
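The example assumes X_train and y_train already exist. A minimal sketch of how they might be prepared, using hypothetical raw arrays (raw_images and raw_labels are placeholder names, not part of the original example):

import numpy as np
from keras.applications.resnet50 import preprocess_input
from keras.utils import to_categorical

# Hypothetical raw data: 1000 RGB images of size 224x224 and integer labels in [0, 9]
raw_images = np.random.randint(0, 256, size=(1000, 224, 224, 3)).astype('float32')
raw_labels = np.random.randint(0, 10, size=(1000,))

# Apply the same per-channel preprocessing that ResNet50 was trained with
X_train = preprocess_input(raw_images)

# One-hot encode the labels to match the 10-unit softmax output and categorical_crossentropy loss
y_train = to_categorical(raw_labels, num_classes=10)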
Fine-tuning using Keras
Here's an example of how you can use Keras to fine-tune the ResNet50 model:
from keras.applications import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# Load the ResNet50 model with pre-trained ImageNet weights, without its classification head
base_model = ResNet50(weights='imagenet', include_top=False)

# Freeze the first 15 layers and unfreeze the rest so they can be fine-tuned
for layer in base_model.layers[:15]:
    layer.trainable = False
for layer in base_model.layers[15:]:
    layer.trainable = True

# Pool the convolutional feature maps into a single vector, then add new layers for the output
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# Create a new model that takes the base_model's input and produces the new output layer
model = Model(inputs=base_model.input, outputs=predictions)

# Compile and train the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32)
In the above code:
- The base_model variable holds the ResNet50 model with weights pre-trained on the ImageNet dataset. Setting include_top=False removes the last layers of ResNet50 (the ones responsible for making the final prediction) so that new layers can be added on top.
- The next block uses for loops to iterate over the layers of the base model. The first 15 layers are frozen, while the remaining layers are set to be trainable. This is done with the trainable attribute of each layer, which controls whether the weights of that layer are updated during training.
- Next, the pooled output of the base model is fed into a new fully connected layer, x, which has 1024 units and applies a ReLU activation function.
- Then another dense layer, predictions, is added on top of x. It applies a softmax activation function and has 10 units, since this is a classification problem with 10 classes.
- Finally, a new model is created with base_model's input as input and the new output layer as output, by instantiating the Model class.
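In practice, fine-tuning is often done in two phases to avoid destroying the pre-trained weights: first train only the new head with the base frozen, then unfreeze the later layers and continue with a much lower learning rate. This refinement is not part of the original example; a minimal sketch, assuming the model built above (the learning rates and epoch counts are illustrative):

from keras.optimizers import Adam

# Phase 1: train only the new head while the entire base model stays frozen
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer=Adam(learning_rate=1e-3), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32)

# Phase 2: unfreeze the later layers and fine-tune with a much lower learning rate
# (the model must be recompiled for the trainable change to take effect)
for layer in base_model.layers[15:]:
    layer.trainable = True
model.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32)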
Author: Sadman Kabir Soumik