Know About EfficientNet & Implementation from Scratch Using PyTorch

Sahil -
4 min read · Apr 16, 2024


Hi guys! In this blog, I will share what I learned after reading the EfficientNet research paper and what it is all about!

Abstract

  • Study — model scaling is studied, and the paper identifies that carefully balancing network depth, width, and resolution can lead to better performance.
  • Propose — a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective ‘compound coefficient’.
  • Use — the method is used to design a new baseline network and scale it up, obtaining a family of models called ‘EfficientNets’, which achieve better accuracy and efficiency.
  • Result — EfficientNet-B7 achieves state-of-the-art top-1 accuracy on ImageNet while being 8.4x smaller and 6.1x faster on inference than earlier ConvNet models.

Introduction

  • The process of scaling up ConvNets has never been well understood, and there are many ways to do it.
  • The most common way is to increase the depth or the width.
  • Another, less common, way is to scale up models by image resolution.
  • Though it is possible to scale two or three dimensions arbitrarily, arbitrary scaling requires tedious manual tuning and still often yields sub-optimal accuracy and efficiency.
  • A question arises:

Is there a principled method to scale up ConvNets that can achieve better accuracy and efficiency?

An empirical study shows that it is critical to balance all dimensions of network width/depth/resolution, and that this balance can be achieved by simply scaling each of them with a constant ratio.

(Figure: model scaling with a fixed compound coefficient.) Intuitively, if the input image is bigger, the network needs more layers to increase the receptive field and more channels to capture the finer-grained patterns in the bigger image.

So, this paper is the first to empirically quantify the relationship among all three dimensions of network width, depth, and resolution.

Compound Model Scaling

Problem Formulation

In this section, the paper formalizes the mathematics of how a ConvNet is computed.

The output of each ConvNet layer i is Yᵢ = Fᵢ(Xᵢ), where Fᵢ is the layer's operator, Xᵢ its input tensor, and Yᵢ its output tensor.

A ConvNet N is then the composition of its layers: N = Fₖ ⊙ Fₖ₋₁ ⊙ … ⊙ F₁ (X₁).

Since ConvNet layers are in practice grouped into stages where each stage repeats the same layer, the network can be defined as N = ⊙_{i=1…s} Fᵢ^{Lᵢ} (X_{⟨Hᵢ, Wᵢ, Cᵢ⟩}), where Fᵢ^{Lᵢ} denotes layer Fᵢ repeated Lᵢ times in stage i, and ⟨Hᵢ, Wᵢ, Cᵢ⟩ is the shape of stage i's input tensor.
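This composition maps naturally onto PyTorch. Here is a small sketch of my own (not the paper's code) where each stage repeats its layer Lᵢ times and the network is the outer composition:

```python
import torch
import torch.nn as nn

# Sketch: a stage F_i^{L_i} repeats the same layer type L_i times.
# Layer choices here (plain Conv2d + ReLU) are illustrative only.
def make_stage(in_ch: int, out_ch: int, num_layers: int) -> nn.Sequential:
    layers = []
    for i in range(num_layers):
        layers.append(nn.Conv2d(in_ch if i == 0 else out_ch, out_ch,
                                kernel_size=3, padding=1))
        layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# The network N is the composition (the paper's ⊙) of its stages.
net = nn.Sequential(
    make_stage(3, 16, num_layers=1),
    make_stage(16, 32, num_layers=2),
)

x = torch.randn(1, 3, 32, 32)
y = net(x)  # shape: (1, 32, 32, 32)
```

`nn.Sequential` plays the role of the composition operator: each stage's output becomes the next stage's input.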

Now, the paper proposes that the network should be scaled based on width (w), depth (d), and resolution (r).

Objective: maximize accuracy under resource constraints:

max_{d,w,r} Accuracy(N(d, w, r))
s.t. N(d, w, r) = ⊙_{i=1…s} F̂ᵢ^{d·L̂ᵢ} (X_{⟨r·Ĥᵢ, r·Ŵᵢ, w·Ĉᵢ⟩})
Memory(N) ≤ target_memory
FLOPS(N) ≤ target_flops

Scaling Dimensions

The idea is to scale one multiplier of the baseline model at a time while keeping the others constant. For example, increase the width multiplier and keep the other two fixed at 1.

Based on these experiments, the graphs show that accuracy quickly saturates after a certain value of the multiplier (width, depth, or resolution).
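As a concrete sketch of scaling a single dimension, here is an illustrative width-multiplier helper. Rounding channel counts to a multiple of 8 is a common trick in reference implementations of this model family; the exact rounding rule below is my assumption, not taken from the paper:

```python
def round_filters(filters: int, width_mult: float, divisor: int = 8) -> int:
    """Scale a channel count by the width multiplier, rounding to a multiple
    of `divisor` so the result stays hardware-friendly."""
    filters *= width_mult
    new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
    # Don't let rounding shrink the channel count by more than 10%.
    if new_filters < 0.9 * filters:
        new_filters += divisor
    return int(new_filters)

# Width-only scaling: depth and resolution stay at their baseline values.
print(round_filters(32, 1.0))  # 32
print(round_filters(32, 1.2))  # 40
```

Analogous helpers would round the layer count for depth scaling and the image size for resolution scaling.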

Compound Scaling

Constraints on the multipliers:

depth: d = α^φ, width: w = β^φ, resolution: r = γ^φ
s.t. α · β² · γ² ≈ 2, with α ≥ 1, β ≥ 1, γ ≥ 1

Here φ is the compound coefficient controlling how many more resources are available, while α, β, γ (found by a small grid search) decide how to assign those resources to depth, width, and resolution. Under this constraint, total FLOPS grow roughly by 2^φ.

EfficientNet Architecture

EfficientNet-B0 baseline architecture:

| Stage | Operator | Resolution | #Channels | #Layers |
|---|---|---|---|---|
| 1 | Conv3x3 | 224×224 | 32 | 1 |
| 2 | MBConv1, k3x3 | 112×112 | 16 | 1 |
| 3 | MBConv6, k3x3 | 112×112 | 24 | 2 |
| 4 | MBConv6, k5x5 | 56×56 | 40 | 2 |
| 5 | MBConv6, k3x3 | 28×28 | 80 | 3 |
| 6 | MBConv6, k5x5 | 14×14 | 112 | 3 |
| 7 | MBConv6, k5x5 | 14×14 | 192 | 4 |
| 8 | MBConv6, k3x3 | 7×7 | 320 | 1 |
| 9 | Conv1x1 & Pooling & FC | 7×7 | 1280 | 1 |
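The baseline table is convenient to keep as a Python structure when building the model. The tuple layout and the per-stage strides below are my own convention, with strides inferred from where the resolution halves between stages:

```python
# EfficientNet-B0 MBConv stages:
# (operator, kernel, stride, expand_ratio, out_channels, num_layers)
# Strides are my assumption, inferred from the resolution column of the table.
EFFICIENTNET_B0 = [
    ("MBConv1", 3, 1, 1,  16, 1),
    ("MBConv6", 3, 2, 6,  24, 2),
    ("MBConv6", 5, 2, 6,  40, 2),
    ("MBConv6", 3, 2, 6,  80, 3),
    ("MBConv6", 5, 1, 6, 112, 3),
    ("MBConv6", 5, 2, 6, 192, 4),
    ("MBConv6", 3, 1, 6, 320, 1),
]

total_blocks = sum(cfg[-1] for cfg in EFFICIENTNET_B0)  # 16 MBConv blocks in B0
```

A model builder can then loop over this list, repeating each block `num_layers` times (only the first repeat uses the stride).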
MBConv Block (Info from Google)

In the paper, the MBConv block is described only briefly, so I explored Google and ChatGPT for details. This is where the idea comes from.
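Based on what I gathered, a minimal MBConv sketch in PyTorch looks like this: a 1x1 expansion, a depthwise convolution, squeeze-and-excitation, a linear 1x1 projection, and a residual connection when shapes allow. The SiLU (Swish) activation and the SE reduction ratio are the commonly used choices; treat the details as assumptions rather than the paper's exact code:

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-Excitation: channel-wise attention on pooled features."""
    def __init__(self, channels: int, reduced: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, reduced, 1), nn.SiLU(),
            nn.Conv2d(reduced, channels, 1), nn.Sigmoid(),
        )
    def forward(self, x):
        return x * self.fc(self.pool(x))

class MBConv(nn.Module):
    """Mobile inverted bottleneck: expand -> depthwise -> SE -> project."""
    def __init__(self, in_ch, out_ch, kernel, stride, expand_ratio):
        super().__init__()
        mid = in_ch * expand_ratio
        self.use_residual = (stride == 1 and in_ch == out_ch)
        layers = []
        if expand_ratio != 1:  # 1x1 expansion (skipped in MBConv1)
            layers += [nn.Conv2d(in_ch, mid, 1, bias=False),
                       nn.BatchNorm2d(mid), nn.SiLU()]
        layers += [
            # Depthwise conv: groups == channels, so each channel is filtered alone.
            nn.Conv2d(mid, mid, kernel, stride, kernel // 2, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.SiLU(),
            SqueezeExcite(mid, max(1, in_ch // 4)),  # SE ratio: my assumption
            nn.Conv2d(mid, out_ch, 1, bias=False),   # linear projection, no activation
            nn.BatchNorm2d(out_ch),
        ]
        self.block = nn.Sequential(*layers)
    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

x = torch.randn(1, 16, 56, 56)
block = MBConv(16, 16, kernel=3, stride=1, expand_ratio=6)
y = block(x)  # same shape as x, so the residual connection applies
```

The "inverted" part is that the block is wide in the middle (expanded channels) and narrow at both ends, the opposite of a classic ResNet bottleneck.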

Code

In the above code, you may have noticed that gamma is not used as a resolution multiplier. During training, we provide the resolution multiplier (r) ourselves and, according to r, adjust the width multiplier (w) and depth multiplier (d), making sure they satisfy the constraint on the multipliers.
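The adjustment described above can be sketched as follows: given a chosen resolution multiplier r = γ^φ, recover φ, then set d and w accordingly. The α, β, γ values are the paper's reported search results; the helper itself is my own illustration:

```python
import math

ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # reported in the paper for the B0 search

def multipliers_from_resolution(r: float):
    """Given resolution multiplier r = gamma**phi, derive phi and the
    matching depth (alpha**phi) and width (beta**phi) multipliers."""
    phi = math.log(r) / math.log(GAMMA)
    d, w = ALPHA ** phi, BETA ** phi
    # Sanity check: the compound constraint must still hold for these bases.
    assert abs(ALPHA * BETA**2 * GAMMA**2 - 2) < 0.1
    return d, w, phi

d, w, phi = multipliers_from_resolution(1.15)  # phi == 1, i.e. B1's multipliers
```

This keeps d, w, and r tied together through the single coefficient φ instead of tuning each one by hand.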

Results

Comparison of EfficientNet with earlier ConvNet models
Inference latency
Comparison on different datasets of pre-trained EfficientNet models against other ConvNet models

I tried my best to build the model from scratch so that the parameter counts come close to the numbers mentioned in the research paper. They do not match exactly, and the paper does not specifically mention where the authors tweaked parameters to reach those numbers.

Fixed α, β, γ and increasing φ (the compound coefficient)

Please do let me know if I missed anything, or share your comments so we can learn from each other.

That’s all folks!

Thanks for reading it. Happy Learning! :D

Here is my LinkedIn profile.
