In learning rate (“LR”) scheduling with an exponential schedule, i.e. decaying the learning rate exponentially over iterations, we have the following closed-form expression for the LR at time step (iteration) $t$, denoted $\mathrm{LR}_t$, given the base LR (i.e. the LR at time step zero) $\mathrm{LR}_0$:
$$\mathrm{LR}_t = \mathrm{LR}_0 \cdot \gamma^t$$
(Reminder: we’re using zero-based indexing here, so we start at $t=0$.)
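As a quick sanity check of the closed form, here is a minimal Python sketch. The function name `exponential_lr` and the example values for `base_lr` and `gamma` are illustrative assumptions, not taken from any particular library or from the values used in this post.

```python
def exponential_lr(base_lr: float, gamma: float, t: int) -> float:
    """Closed-form exponential schedule: LR_t = LR_0 * gamma**t (t is zero-based)."""
    if t < 0:
        raise ValueError("t must be a non-negative iteration index")
    return base_lr * gamma ** t


# Illustrative values only (assumptions made for this sketch).
base_lr, gamma = 1e-3, 0.9
schedule = [exponential_lr(base_lr, gamma, t) for t in range(5)]
print(schedule)
```

At `t = 0` this returns the base LR unchanged, since `gamma ** 0` is 1; each later step multiplies the previous value by `gamma`.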
Rearranging, we get the expression for the scaling factor, $\gamma$, given a target learning rate $\mathrm{LR}_T$ that we would like to reach after $T$ time steps, which for us will be $10^{-5}$ (or 1e-5):