Responsive exponential backoff in Go
using exponentially scaling up and down the sleep time

Rate Limiting

  • Often when we use various cloud services we end up seeing throttling or rate limiting from the service
  • Various cloud services are often shared across multiple tenants and rate limiting mechanisms are used to prevent one tenant consuming much more resources than allocated/provisioned limits

Client side throttling

  • In order to handle the service throttling, client needs to add a delay between subsequent calls
  • This can achieved via a linear back off or an exponential back off

Linear back off

  • Simplest approach is to start with a base sleep time and add sleep time for each failure
  • Say 1 secs as base sleep time. For every request that is failed due to rate limiting, we add another 1 seconds
  • So, it will go like this 1s, 2s, 3s, 4s, 5s, ... and stops growing once request is successful


// for each failure we add interval to current delay
func sleepTime(currentDelay, interval, maxInterval int64) int64 {
    currentDelay += interval
    if currentDelay > maxInterval {
        currentDelay = maxInterval
    return currentDelay

Exponential back off

  • In this case, instead of increasing the sleep time by a constant time, it is multiplied by a factor
  • If this multiplication factor is 2 and initial sleep is 1 second
  • Then, it will grow exponentially as 1s, 2s, 4s, 8s, 16s, ...
  • This will be the better than linear back off as it can get successful delay with less iterations


// for each failure we multiple the current delay by a multiplication factor
func sleepTime(currentDelay, initialDelay, maxInterval int64, multiplier float64) int64 {
    if currentDelay == 0 {
       return initialDelay
    currentDelay = int64(float64(currentDelay) * multiplier)
    if currentDelay > maxInterval {
        return maxInterval
    return currentDelay

Rate Limiting in DynamoDB

  • DynamoDB is a Fast, flexible NoSQL database service from AWS
  • DynamoDB provides two capacity modes for each table: on-demand and provisioned.
  • Irrespective of the capacity mode, DynamoDB will reject incoming write requests when it exceeds WCUs (Write Capacity Units)
  • I built this responsive exponential back off to handle this rate limiting
  • But, same can be used for handling rate limiting from any service
  • As part of a job, multiple workers are writing to the DynamoDB in parallel

Responsive Exponential back off

  • In addition to exponential back off, this one is responsive in nature
  • As we see more and more requests succeeding it will start decreasing the delay time
  • Down multiplier factor is used once down multiplier threshold is met. When we see N successful requests, down multiplier is applied
  • Additionally a RandomizationFactor is applied while scaling up and down.
  • Say RandomizationFactor=0.2, current delay is 1s and up multiplier 2, on next failure it can become (1 x 2) +/- 0.2 * (1 x 2) = 1.6 to 2.4
  • RandomizationFactor aids in getting slightly different new interval when applying multipliers


Following configurations are used for the sleeper.

type AutoSleeper struct {
  InitialInterval         time.Duration  // Used for sleeping first time
  MaxInterval             time.Duration  // Max interval for sleeping
  MaxRandomization        time.Duration  // Max randomization interval
  UpMultiplier            float64        // Multiplied for increasing the sleep time
  DownMultiplier          float64        // Multiplied for decreasing the sleep time
  RandomizationFactor     float64        // Randomize the new sleep value 
  DownMultiplierThreshold int            // Threshold for triggering sleep time reduction

Go Code


package main

import (

const (
  DefaultMaxInterval             = 15 * time.Minute
  DefaultInitialInterval         = 500 * time.Millisecond
  DefaultRandomizationFactor     = 0.3
  DefaultMaxRandomization        = 2 * time.Minute
  DefaultUpMultiplier            = 1.5
  DefaultDownMultiplier          = 0.9
  DefaultDownMultiplierThreshold = 10

func NewAutoSleeper() *AutoSleeper {
  return &AutoSleeper{
    MaxInterval:             DefaultMaxInterval,
    InitialInterval:         DefaultInitialInterval,
    RandomizationFactor:     DefaultRandomizationFactor,
    MaxRandomization:        DefaultMaxRandomization,
    UpMultiplier:            DefaultUpMultiplier,
    DownMultiplier:          DefaultDownMultiplier,
    DownMultiplierThreshold: DefaultDownMultiplierThreshold,

type AutoSleeperMetrics struct {
  TotalInvocation int
  TotalWentUp     int
  TotalWentDown   int
  TotalSlept      int
  TotalSleepTime  time.Duration

type AutoSleeper struct {
  InitialInterval         time.Duration
  MaxInterval             time.Duration
  MaxRandomization        time.Duration
  UpMultiplier            float64
  DownMultiplier          float64
  RandomizationFactor     float64
  DownMultiplierThreshold int

  metrics         AutoSleeperMetrics
  currentInterval time.Duration
  currentSuccess  int

func (s *AutoSleeper) GetMetrics() AutoSleeperMetrics {
  return s.metrics

func (s *AutoSleeper) SleepOnFailure() {
  s.metrics.TotalInvocation += 1

func (s *AutoSleeper) SleepOnSuccess() {
  s.metrics.TotalInvocation += 1
  if s.currentInterval == 0 {
  s.currentSuccess += 1
  if s.currentSuccess == s.DownMultiplierThreshold {
    s.currentSuccess = 0

func (s *AutoSleeper) sleep() {
  s.metrics.TotalSleepTime += s.currentInterval
  s.metrics.TotalSlept += 1

func (s *AutoSleeper) goDown() {
  s.metrics.TotalWentDown += 1
  interval := float64(s.currentInterval) * s.DownMultiplier
  random := getNextRandomInterval(interval, s.RandomizationFactor, float64(s.MaxRandomization))
  if random < float64(s.InitialInterval) {
    s.currentInterval = 0
  s.currentInterval = time.Duration(random)

func (s *AutoSleeper) goUp() {
  s.metrics.TotalWentUp += 1
  if s.currentInterval == 0 {
    s.currentInterval = s.InitialInterval
  interval := float64(s.currentInterval) * s.UpMultiplier
  random := getNextRandomInterval(interval, s.RandomizationFactor, float64(s.MaxRandomization))
  if random > float64(s.MaxInterval) {
    s.currentInterval = s.MaxInterval
  s.currentInterval = time.Duration(random)

func getNextRandomInterval(currentInterval, randomizationFactor, maxRandomization float64) float64 {
  if randomizationFactor == 0 {
    return currentInterval
  delta := randomizationFactor * currentInterval
  if delta > maxRandomization {
    delta = maxRandomization
  randomization := 2 * delta * rand.Float64()
  minInterval := currentInterval - delta
  return minInterval + randomization


s := NewAutoSleeper()
s.SleepOnFailure() // Uses up multiplier
s.SleepOnSuccess() // Uses down multiplier on N threshold

Graph of delay generated

For the following configuration,

InitialInterval = 1 millisecond
MaxInterval = 15 minutes
MaxRandomization = 2 minutes
UpMultiplier = 1.5
DownMultiplier = 0.6
RandomizationFactor = 0.0
DownMultiplierThreshold = 5

  • Y axis represents the delay time generated for each request
  • Responses succeeded are shown as blue and failed ones are shown in red
  • Initially, requests were failing till it reached 291.9 ms
  • Later point, delay of greater than 60 ms were succeeding
  • As shown in the graph, responsive exponential back off quickly generates suitable delay
  • It goes up and down as the requests were failing and suceeding respectively
  • Setting an appropriate value for DownMultiplierThreshold controls how frequently we want to reduce the delay
  • Setting an appropriate value for DownMultiplier controls how fast we want to reduce the delay
  • Setting an appropriate values for RandomizationFactor and MaxRandomization controls the noise introduced while scaling up/down

Usage of Responsive Exponential back off

  • In Auto scaling DynamoDB use case, DynamoDB will increase the capacity as it sees more load, but this will take time, not instantaneous
  • Each job is run by multiple workers. Workers count increase as the job started and decrease when individual tasks are completed
  • So this back off, initially increases the delay and tries to succeed
  • Once requests are succeeding, it will try to reduce the delay
  • This will be auto balancing as with the increase/decrease of workers as well as DynamoDB WCUs


Great!! You read till this point, just go ahead and share this post to your followers, collegues and friends. Thanks!

About Author

Sakthi Priyan H
Passionate Programmer

  • I am passionate about building excellent teams, processes and systems.
  • Primarily I use Java, Scala and Python for building various systems and tools.
  • Building API services, Big data processing and Machine Learning systems in Crayon Data.
  • Also, interested in Golang and building web apps using Javascript ecosystem.
  • I wrote my first program in BASIC in 1998, Passionate about computers since then.