300+ instances in 1 minute


GCP has your back when you need it.

The first question I often get when I tell my peers about our application built on top of Firestore is: how do you manage scalability? And also, how do you keep the costs low? Let's talk about the first one.


The short answer is that it depends on the workload a given feature needs. I ran a few lab tests to show you, my dear friend, what this looks like in practice, so the next time someone asks you about it, you are covered.

I regularly write about Cloud Functions, Firebase and GCP in general. If you are down to keep learning more, subscribe to my newsletter and you won't miss a thing.

I set up a few labs:

  • Save 10000 events sequentially on Firestore to simulate a queue of events. Each event triggers an onCreate cloud function that runs an expensive operation and saves the result.
  • Restore a 10000 record collection into a Firebase project and run the same expensive operation (but this time all at once).
  • Just for fun, hit a callable function 10000 times, where the function only saves a record on Firestore.

The results were really interesting and I hope they help you understand how scalability works behind the scenes. Let's see it.

The setup

The configuration details look like this:

  • Functions were set with the default 256 MB of RAM.
  • Each one ran a bcrypt hash with 15 salt rounds, which took 8 seconds on average to complete.

In broad strokes, the function looks like this:
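
For context on why this makes a good CPU-heavy workload: the second argument to bcrypt is a cost factor, and the algorithm runs 2 to that power iterations of its key setup, so each extra round roughly doubles the hashing time. Here is a minimal sketch of that arithmetic, taking the 8-second average above as the baseline for cost 15. The helper names are hypothetical, and real timings depend on the CPU allocated to the function.

```typescript
// bcrypt runs 2^cost iterations of its key setup, so the work doubles per round.
const bcryptIterations = (cost: number): number => 2 ** cost

// Hypothetical helper: scale a measured baseline to another cost factor.
const estimateSeconds = (cost: number, baselineCost = 15, baselineSeconds = 8): number =>
  baselineSeconds * (bcryptIterations(cost) / bcryptIterations(baselineCost))

for (const cost of [10, 12, 15]) {
  console.log(`cost ${cost}: ${bcryptIterations(cost)} iterations, ~${estimateSeconds(cost)} s`)
}
```

Under that assumption, dropping from 15 to 12 rounds would cut the work per event by a factor of 8, which is worth keeping in mind when comparing these labs with lighter workloads.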

import * as admin from 'firebase-admin'
import * as functions from 'firebase-functions'
import * as moment from 'moment'
import { report } from './logging'
const short = require('short-uuid')
const bcrypt = require('bcrypt')
const Chance = require('chance')
const chance = new Chance()

// Initialize the Admin SDK before touching Firestore
admin.initializeApp()

const orderRef = admin.firestore().collection(`orders`)

// The expensive operation: a bcrypt hash with 15 salt rounds
const execHeavyProccess = async (myPlaintextPassword: string): Promise<string> => bcrypt.hash(myPlaintextPassword, 15)

// Generated once per cold start, so it identifies the instance serving each event
const instance = short.generate()

export const heavyProcess = functions.firestore.document(`queue/{id}`).onCreate(async (snap: any) => {
  const eventId = short.generate()
  try {
    const password = chance.string({ length: 10 })
    const hash = await execHeavyProccess(password)
    const body = snap.data()
    const timestamp = moment().format('YYYY-MM-DD hh:mm:ss')
    await orderRef.doc(eventId).set({
      _id: eventId,
      timestamp: timestamp,
      instance: instance,
    })
    await report(`orders_completed`)({ instance, body, timestamp, eventId })
  } catch (error: any) {
    await report(`orders_completed_error`)({ instance, error: error.message, eventId })
  }
})
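
The `instance` id is generated at module scope, so it is created once per cold start and shared by every invocation that lands on that instance. That makes it possible to count instances afterwards from the saved orders. Below is a minimal sketch of that counting step, using made-up in-memory records instead of a real Firestore query. The `OrderRecord` type and the `countByInstance` helper are assumptions for illustration, not part of the original lab.

```typescript
// Shape of the documents the function writes to `orders`.
interface OrderRecord { _id: string; timestamp: string; instance: string }

// Hypothetical helper: how many events each instance processed.
const countByInstance = (records: OrderRecord[]): Map<string, number> => {
  const counts = new Map<string, number>()
  for (const { instance } of records) {
    counts.set(instance, (counts.get(instance) ?? 0) + 1)
  }
  return counts
}

// Made-up sample data, only to show the shape of the result.
const sample: OrderRecord[] = [
  { _id: 'a', timestamp: '2020-01-01 10:00:00', instance: 'inst-1' },
  { _id: 'b', timestamp: '2020-01-01 10:00:08', instance: 'inst-1' },
  { _id: 'c', timestamp: '2020-01-01 10:00:03', instance: 'inst-2' },
]

const counts = countByInstance(sample)
console.log(`instances used: ${counts.size}`)
```

The size of the map is the number of distinct instances, and the per-instance counts show how evenly the events were spread.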

The Trigger

A small script to stress the instances:

const orders = 10000

import * as admin from 'firebase-admin'
const serviceAccount = require(`./private-key.json`)

admin.initializeApp({
  credential: admin.credential.cert(serviceAccount)
})

const Chance = require('chance')
const chance = new Chance()
const moment = require('moment-timezone')
const short = require('short-uuid')

const firestore = admin.firestore();
const queueRef = firestore.collection(`queue`);

(async () => {
  for (let i = 1; i <= orders; i++) {
    const id = short.generate()
    const timestamp_create = moment().subtract({ hours: 1 }).format('YYYY-MM-DD hh:mm:ss')
    const payload = {
      name: chance.name(),
      email: chance.email(),
      _id: id,
      timestamp_two_created: timestamp_create,
      ref: i
    }
    // Awaiting each write keeps the events sequential
    await queueRef.doc(id).set(payload)
      .then((res: any) => {
        console.log(`proccesed ${i}`)
      })
      .catch((err: any) => {
        console.log(`error processing ${err.message}`)
      })
  }
})()

The Results

It took 26 minutes to deliver all 10000 events, and a total of 69 minutes and 52 instances to process them all. I know you guys like data visualizations, so let's see them.


It took 10 minutes to process all the orders with the 52 instances:



During the first 3 minutes there was a scaling burst of 45 instances. After that, the supply stayed constant up to the 10-minute mark, which makes perfect sense.


It seems to me that, when you stress these babies, it takes around 5 seconds to make a scaling decision, which is really nice for the vast majority of use cases.

This was a CPU-heavy workload and it had 0 (zero) losses: every process completed successfully and not a single operation went missing.
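
A quick back-of-the-envelope check helps when reading these numbers. Assuming each instance handles one event at a time and each event needs about 8 seconds of bcrypt work, the best case for draining the queue is events × seconds ÷ instances. Here is a minimal sketch with a hypothetical `idealDrainMinutes` helper. It ignores cold starts and ramp-up time, so real runs will be slower.

```typescript
// Hypothetical helper: lower bound on the time needed to drain a queue,
// assuming one event at a time per instance and all instances available from the start.
const idealDrainMinutes = (events: number, secondsPerEvent: number, instances: number): number =>
  (events * secondsPerEvent) / instances / 60

// Case one: 10000 events, ~8 s each, 52 instances.
const caseOne = idealDrainMinutes(10000, 8, 52)
console.log(`ideal drain time: ~${caseOne.toFixed(1)} minutes`)
```

With these inputs the estimate comes out at roughly 25.6 minutes, in the same range as the 26 minutes reported above.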

Case 2 - 10000 events at once 🤯

OK, for this case the events were not sequential, they came all at once. Same process, same configuration, different methodology. These are the results.
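
The difference between the two methodologies comes down to how many writes are in flight at the same time. Here is a minimal sketch of that idea, with a hypothetical in-memory writer standing in for Firestore so it runs without any project. The sequential loop awaits each write, while the burst starts them all and waits with `Promise.all`.

```typescript
// Hypothetical stand-in for a Firestore writer that records peak concurrency.
const makeWriter = () => {
  let inFlight = 0
  let peak = 0
  const write = async (_id: number): Promise<void> => {
    inFlight++
    peak = Math.max(peak, inFlight)
    await new Promise<void>((resolve) => setTimeout(resolve, 5)) // simulated latency
    inFlight--
  }
  return { write, peak: () => peak }
}

// Case one style: one write at a time.
const sequentialPeak = async (n: number): Promise<number> => {
  const writer = makeWriter()
  for (let i = 0; i < n; i++) await writer.write(i)
  return writer.peak()
}

// Case two style: every write starts immediately.
const burstPeak = async (n: number): Promise<number> => {
  const writer = makeWriter()
  await Promise.all(Array.from({ length: n }, (_, i) => writer.write(i)))
  return writer.peak()
}

const main = async () => {
  console.log(`sequential peak in flight: ${await sequentialPeak(20)}`)
  console.log(`burst peak in flight: ${await burstPeak(20)}`)
}
main()
```

The sequential loop never has more than one write pending, while the burst has all of them pending at once, which is the kind of sudden demand the platform has to absorb in this case.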


It took a total of 6:03 minutes and 310 instances to process all 10000 events.

The burst of the first 2 minutes was crazy:


The workload delivery seems very linear as well:



Given high demand in a short period of time, GCP scales very aggressively. In this case the scaling was not linear, which is very interesting. As a result, it took around 6 minutes to resolve all the operations on the lowest CPU profile of the functions. I wonder how it would look with more horsepower 🤔, but that is a topic for another post ;-).

So far, I feel confident that GCP has my back if I suddenly have a massive load of events to process. Still, 10000 events is a very modest number of operations.

Case 3 - 10000 peaceful events 🐢

This case was very similar to the first one but without the expensive operation: just a single HTTP function that received 10000 consecutive events. Let's see how GCP behaves on low-CPU operations.


This is interesting: it took a total of 21 minutes to deliver all the HTTP requests and only a single instance was used. The average completion time was 1500 ms.

What I found really interesting about this is that I had been told the same instance rarely executes the same function twice. I guess there is some variance based on (but not limited to) the fact that:

  • The function was warm.
  • The type of function was HTTP.
  • The execution time, which on average was 0.8 ms.


It's clear how GCP scales serverless functions: in this case a single small instance was good enough to process all the data. This also looks like a very interesting way to allocate resources based on the type of operation each function performs.

What's next?

I have other experiments in mind that I want to try out and document:

  • What about 100000 events?
  • The same cases but with 1 GB of memory and more horsepower: what would the scaling look like?

Subscribe to my newsletter if you are down to see more interesting topics like this.
