Use tqdm to monitor model training progress
When training a deep learning model, it helps to have a progress bar that estimates how long the process will take to complete. The Python library tqdm provides exactly that. In this post, we will use tqdm to show a progress bar while loading data in the training loop.
Installation
You can install the tqdm package with
pip3 install tqdm
Basic Usage
How to import a tqdm object
- use from tqdm import tqdm if you're running your script in a terminal
- use from tqdm.notebook import tqdm if you're using a Jupyter Notebook
Commonly used parameters in a tqdm object
- total: the total number of expected iterations, if it cannot be inferred automatically or needs to be overridden, e.g. 300
- desc: the description shown next to your progress bar, e.g. "My Progress Bar"
- disable: set this to True if you want to disable the progress bar
Example 1
from tqdm.notebook import tqdm
for i in tqdm(range(int(10e6)), desc="My Progress Bar"):
    pass
We will create a DataLoader with the training split of the ag_news dataset.
from datasets import load_dataset
agnews = load_dataset('ag_news')
train_dataset = agnews['train']
The dataset has two data fields, text and label.
train_dataset
Now, we can initiate a DataLoader with the ag_news training data:
from torch.utils.data import DataLoader
train_dataloader = DataLoader(train_dataset, batch_size=64)
If you need more detailed information about loading datasets and using DataLoader, you can check
from tqdm.notebook import tqdm
for batch_index, data in tqdm(enumerate(train_dataloader),
                              total=len(train_dataloader),
                              desc="My Progress Bar"):
    text = data['text']
    label = data['label']
    # print batch information every 700 batches
    if batch_index % 700 == 0:
        print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')
The same loop works in a terminal; just use the top-level import instead:
from tqdm import tqdm
for batch_index, data in tqdm(enumerate(train_dataloader),
                              total=len(train_dataloader),
                              desc="My Progress Bar"):
    text = data['text']
    label = data['label']
    # print batch information every 700 batches
    if batch_index % 700 == 0:
        print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')
You can see that if you print something within the for loop, the progress bar will show on a new line in each iteration.
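If you do need to print from inside the loop, tqdm provides tqdm.write, which prints the message above the bar instead of corrupting it. A minimal sketch:

```python
from tqdm import tqdm

for i in tqdm(range(1000), desc="My Progress Bar"):
    if i % 500 == 0:
        # tqdm.write prints the message without
        # pushing the progress bar onto a new line
        tqdm.write(f"Reached iteration {i}")
```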
If you are using a GPU cloud platform, such as Colab, you may have to run your Python script from a Jupyter notebook. In such a case, you can:
- use from tqdm.auto import tqdm to import tqdm
- use %run instead of !python
Example
Suppose we have a file called example.py with the below code:
# example.py
from tqdm.auto import tqdm
for i in tqdm(range(int(10e6)), desc="My Progress Bar"):
    pass
Here's what we get if we use the !python command to run example.py:
!python example.py
You can see that the progress bar does not display, and a new line is printed in every iteration.
To fix this, we can use the %run command instead of !python to run example.py:
%run example.py
Now the progress bar displays as expected.
trange is a shortcut for tqdm(range(args), **kwargs).
Example
Using tqdm
from tqdm.notebook import tqdm
from time import sleep
for i in tqdm(range(10), desc="Text you want"):
    sleep(.1)
Using trange
from tqdm.notebook import trange
from time import sleep
for i in trange(10, desc="Text you want"):
    sleep(.1)
You can see that the outputs are the same.
We often train a deep learning model for more than one epoch. In this case, we can use trange to keep track of the progress across epochs.
Example
from tqdm.notebook import trange, tqdm
for i in trange(3, desc='Epoch'):
    print('\nEpoch', i)
    for batch_index, data in tqdm(enumerate(train_dataloader),
                                  total=len(train_dataloader),
                                  desc="Text You Want",
                                  # disable=True,
                                  # file=sys.stdout,
                                  # initial=1000
                                  ):
        text = data['text']
        label = data['label']
        # print batch information every 1000 batches
        if batch_index % 1000 == 0:
            print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')
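If you also want the bar's label to change as training progresses, tqdm objects (including the one returned by trange) expose set_description. A small sketch with hypothetical loss values, independent of the DataLoader above:

```python
from tqdm import trange

# Hypothetical per-epoch loss values, just for illustration
losses = [0.9, 0.5, 0.2]
epochs = trange(3, desc='Epoch')
for i in epochs:
    # update the text shown next to the bar each epoch
    epochs.set_description(f'Epoch {i} (loss {losses[i]:.2f})')
```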