
Tuning Hyperparameters and Visualizing on TensorBoard

Hyperparameter tuning is one of the most crucial steps in a machine learning or deep learning workflow. Hyperparameters are configurations of a machine learning model that are not learned from the data but are set before the training process begins. They control the overall behavior of the model.

While training a machine learning model, you may have to experiment with different hyperparameters such as the learning rate, batch size, dropout rate, optimizer, etc. in order to achieve the best accuracy. Performing these experiments one by one can be a tedious and time-consuming process: you start a training run with one combination of hyperparameters, then repeat the procedure with a different set, and so on.

TensorFlow allows you to run experiments with different sets of hyperparameters in a single execution and visualize the metrics on the HParams dashboard in TensorBoard. This lets you efficiently determine the best combination of hyperparameters for training your model. Let’s see how it is done.

Here, since a significant amount of data is readily available, I have chosen the prebuilt MNIST dataset rather than the animal-building dataset referenced in the earlier posts. It’s already available in the TensorFlow libraries.
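A minimal sketch of loading it (scaling the pixel values to the 0–1 range is my assumption here; the original notebook may prepare the data differently):

```python
import tensorflow as tf

# Load the prebuilt MNIST dataset (60,000 training and 10,000 test images of handwritten digits)
# and scale pixel values from [0, 255] down to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
```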



Setting up the experiments

You should first identify which hyperparameters you want to experiment on. Here, I have experimented with the optimizer and the learning rate, trying different values and their possible combinations.
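Below is a sketch of how those hyperparameters can be declared with the HParams API. The value grid (Adam/SGD, .001/.002) is taken from the combinations listed later in this post, the log directory matches the one passed to TensorBoard at the end, and registering the metric up front with hparams_config is the usual pattern from the TensorBoard HParams tutorial.

```python
from tensorboard.plugins.hparams import api as hp

# Hyperparameters under experiment: optimizer and learning rate.
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.Discrete([0.001, 0.002]))
METRIC_ACCURACY = 'accuracy'

# Register the hyperparameters and the metric once, so the HParams dashboard
# knows which columns to display.
with tf.summary.create_file_writer('C:/cnn-models/hparam_tuning').as_default():
    hp.hparams_config(
        hparams=[HP_OPTIMIZER, HP_LEARNING_RATE],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
    )
```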


Now, I have created a simple model with two dense layers and a dropout layer in between them. I am not hardcoding the optimizer and learning rate; instead, they are read from the hparams dictionary defined for each run and used throughout the training.
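A sketch of such a model is below. The layer sizes (128 units, dropout rate 0.2) are placeholders of my choosing; the key point is that the optimizer and learning rate are looked up in the hparams dictionary rather than hardcoded, and the function trains for a single epoch and returns the test accuracy for logging.

```python
def train_test_model(hparams):
    # Two dense layers with a dropout layer in between.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

    # The optimizer and its learning rate come from the hparams dictionary,
    # not from hardcoded values.
    if hparams[HP_OPTIMIZER] == 'adam':
        optimizer = tf.keras.optimizers.Adam(learning_rate=hparams[HP_LEARNING_RATE])
    else:
        optimizer = tf.keras.optimizers.SGD(learning_rate=hparams[HP_LEARNING_RATE])

    model.compile(optimizer=optimizer,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # One epoch only, to keep the demonstration quick.
    model.fit(x_train, y_train, epochs=1)
    _, accuracy = model.evaluate(x_test, y_test)
    return accuracy
```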


I have used only one epoch (epochs=1 in the sketch above), purely for demonstration purposes.


Log an hparams summary with the hyperparameter values for each run, together with the resulting accuracy. This run function will be called for every combination of hyperparameters under experiment.
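A sketch of that run function, assuming the train_test_model helper from above: each run writes to its own log subdirectory, records its hyperparameter values with hp.hparams, and logs the resulting accuracy as a scalar.

```python
def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)  # record the hyperparameter values used in this run
        accuracy = train_test_model(hparams)
        tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)
```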


Execute runs and log them all in a directory

You can perform multiple experiments, each with a different set of hyperparameters. I have used all possible combinations of the two hyperparameters (optimizer and learning rate) listed here; the loop that executes them is sketched after this list.

  • Adam & .001
  • Adam & .002
  • SGD & .001
  • SGD & .002
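A minimal sketch of that loop over all combinations (the run-N naming of the subdirectories is just a convention I am assuming here):

```python
session_num = 0
for optimizer in HP_OPTIMIZER.domain.values:
    for learning_rate in HP_LEARNING_RATE.domain.values:
        hparams = {
            HP_OPTIMIZER: optimizer,
            HP_LEARNING_RATE: learning_rate,
        }
        run_name = f'run-{session_num}'
        print(f'--- Starting trial: {run_name}')
        print({h.name: hparams[h] for h in hparams})
        # Each run is logged in its own subdirectory of the hparam_tuning folder.
        run('C:/cnn-models/hparam_tuning/' + run_name, hparams)
        session_num += 1
```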


Visualize the metrics in TensorBoard HParams tab

You can visualize the outcomes in the TensorBoard HParams tab. You need to run TensorBoard against the directory where you have recorded all the runs: tensorboard --logdir=C:/cnn-models/hparam_tuning


You can see that the best accuracy is achieved with the ‘Adam’ optimizer and a learning rate of .002, making that the best of the combinations tried here for this model.

Try experimenting with different hyperparameters and comment your results below.
