If a collection contains documents with sequential indexed values, Cloud Firestore limits the write rate to 500 writes per second. This page describes how to shard a document field to overcome this limit. First, let's define what we mean by "sequential indexed fields" and clarify when this limit applies.
Sequential indexed fields
"Sequential indexed fields" means any collection of documents that contains a monotonically increasing or decreasing indexed field. In many cases, this means a timestamp
field, but any monotonically increasing or decreasing field value can trigger the write limit of 500 writes per second.
For example, the limit applies to a collection of user
documents with indexed field userid
if the app assigns userid
values like so:
1281, 1282, 1283, 1284, 1285, ...
On the other hand, not all timestamp
fields trigger this limit. If a timestamp
field tracks randomly distributed values, the write limit does not apply. The actual value of the field does not matter either, only that the field is monotonically increasing or decreasing. For example, both of the following sets of monotonically increasing field values trigger the write limit:
100000, 100001, 100002, 100003, ...
0, 1, 2, 3, ...
Sharding a timestamp field
Assume your app uses a monotonically increasing timestamp
field. If your app doesn't use the timestamp
field in any queries, you can remove the 500 writes per second limit by not indexing the timestamp field. If you do require a timestamp
field for your queries, you can work around the limit by using sharded timestamps:
- Add a
shard
field alongside thetimestamp
field. Use1..n
distinct values for theshard
field. This raises the write limit for the collection to500*n
, but you must aggregaten
queries. - Update your write logic to randomly assign a
shard
value to each document. - Update your queries to aggregate the sharded result sets.
- Disable single-field indexes for both the
shard
field and thetimestamp
field. Delete existing composite indexes that contain thetimestamp
field. - Create new composite indexes to support your updated queries. The order of the fields in an index matters, and the
shard
field must come before thetimestamp
field. Any indexes that include thetimestamp
field must also include theshard
field.
You should implement sharded timestamps only in use cases with sustained write rates above 500 writes per second. Otherwise, this is a pre-mature optimization. Sharding a timestamp
field removes the 500 writes per second restriction but with the trade-off of needing client-side query aggregations.
The following examples show how to shard a timestamp
field and how to query a sharded result set.
Example data model and queries
As an example, imagine an app for near real-time analysis of financial instruments like currencies, common stocks, and ETFs. This app writes documents to an instruments
collection like so:
Node.js
asyncfunctioninsertData(){constinstruments=[{symbol:'AAA',price:{currency:'USD',micros:34790000},exchange:'EXCHG1',instrumentType:'commonstock',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.010Z'))},{symbol:'BBB',price:{currency:'JPY',micros:64272000000},exchange:'EXCHG2',instrumentType:'commonstock',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.101Z'))},{symbol:'Index1 ETF',price:{currency:'USD',micros:473000000},exchange:'EXCHG1',instrumentType:'etf',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.001Z'))}];constbatch=fs.batch();for(constinstofinstruments){constref=fs.collection('instruments').doc();batch.set(ref,inst);}awaitbatch.commit();}
This app runs the following queries and orders by the timestamp
field:
Node.js
functioncreateQuery(fieldName,fieldOperator,fieldValue,limit=5){returnfs.collection('instruments').where(fieldName,fieldOperator,fieldValue).orderBy('timestamp','desc').limit(limit).get();}functionqueryCommonStock(){returncreateQuery('instrumentType','==','commonstock');}functionqueryExchange1Instruments(){returncreateQuery('exchange','==','EXCHG1');}functionqueryUSDInstruments(){returncreateQuery('price.currency','==','USD');}
insertData().then(()=>{constcommonStock=queryCommonStock().then((docs)=>{console.log('--- queryCommonStock: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});constexchange1Instruments=queryExchange1Instruments().then((docs)=>{console.log('--- queryExchange1Instruments: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});constusdInstruments=queryUSDInstruments().then((docs)=>{console.log('--- queryUSDInstruments: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});returnPromise.all([commonStock,exchange1Instruments,usdInstruments]);});
After some research, you determine that the app will receive between 1,000 and 1,500 instrument updates per second. This surpasses the 500 writes per second allowed for collections containing documents with indexed timestamp fields. To increase the write throughput, you need 3 shard values, MAX_INSTRUMENT_UPDATES/500 = 3
. This example uses the shard values x
, y
, and z
. You can also use numbers or other characters for your shard values.
Adding a shard field
Add a shard
field to your documents. Set the shard
field to values x
, y
, or z
which raises the write limit on the collection to 1,500 writes per second.
Node.js
// Define our 'K' shard valuesconstshards=['x','y','z'];// Define a function to help 'chunk' our shards for use in queries.// When using the 'in' query filter there is a max number of values that can be// included in the value. If our number of shards is higher than that limit// break down the shards into the fewest possible number of chunks.functionshardChunks(){constchunks=[];letstart=0;while(start < shards.length){constelements=Math.min(MAX_IN_VALUES,shards.length-start);constend=start+elements;chunks.push(shards.slice(start,end));start=end;}returnchunks;}// Add a convenience function to select a random shardfunctionrandomShard(){returnshards[Math.floor(Math.random()*Math.floor(shards.length))];}
asyncfunctioninsertData(){constinstruments=[{shard:randomShard(),// add the new shard field to the documentsymbol:'AAA',price:{currency:'USD',micros:34790000},exchange:'EXCHG1',instrumentType:'commonstock',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.010Z'))},{shard:randomShard(),// add the new shard field to the documentsymbol:'BBB',price:{currency:'JPY',micros:64272000000},exchange:'EXCHG2',instrumentType:'commonstock',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.101Z'))},{shard:randomShard(),// add the new shard field to the documentsymbol:'Index1 ETF',price:{currency:'USD',micros:473000000},exchange:'EXCHG1',instrumentType:'etf',timestamp:Timestamp.fromMillis(Date.parse('2019-01-01T13:45:23.001Z'))}];constbatch=fs.batch();for(constinstofinstruments){constref=fs.collection('instruments').doc();batch.set(ref,inst);}awaitbatch.commit();}
Querying the sharded timestamp
Adding a shard
field requires that you update your queries to aggregate sharded results:
Node.js
functioncreateQuery(fieldName,fieldOperator,fieldValue,limit=5){// For each shard value, map it to a new query which adds an additional// where clause specifying the shard value.returnPromise.all(shardChunks().map(shardChunk=>{returnfs.collection('instruments').where('shard','in',shardChunk)// new shard condition.where(fieldName,fieldOperator,fieldValue).orderBy('timestamp','desc').limit(limit).get();}))// Now that we have a promise of multiple possible query results, we need// to merge the results from all of the queries into a single result set..then((snapshots)=>{// Create a new container for 'all' resultsconstdocs=[];snapshots.forEach((querySnapshot)=>{querySnapshot.forEach((doc)=>{// append each document to the new all containerdocs.push(doc);});});if(snapshots.length===1){// if only a single query was returned skip manual sorting as it is// taken care of by the backend.returndocs;}else{// When multiple query results are returned we need to sort the// results after they have been concatenated.// // since we're wanting the `limit` newest values, sort the array// descending and take the first `limit` values. By returning negated// values we can easily get a descending value.docs.sort((a,b)=>{constaT=a.data().timestamp;constbT=b.data().timestamp;constsecondsDiff=aT.seconds-bT.seconds;if(secondsDiff===0){return-(aT.nanoseconds-bT.nanoseconds);}else{return-secondsDiff;}});returndocs.slice(0,limit);}});}functionqueryCommonStock(){returncreateQuery('instrumentType','==','commonstock');}functionqueryExchange1Instruments(){returncreateQuery('exchange','==','EXCHG1');}functionqueryUSDInstruments(){returncreateQuery('price.currency','==','USD');}
insertData().then(()=>{constcommonStock=queryCommonStock().then((docs)=>{console.log('--- queryCommonStock: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});constexchange1Instruments=queryExchange1Instruments().then((docs)=>{console.log('--- queryExchange1Instruments: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});constusdInstruments=queryUSDInstruments().then((docs)=>{console.log('--- queryUSDInstruments: ');docs.forEach((doc)=>{console.log(`doc = ${util.inspect(doc.data(),{depth:4})}`);});});returnPromise.all([commonStock,exchange1Instruments,usdInstruments]);});
Update index definitions
To remove the 500 writes per second constraint, delete the existing single-field and composite indexes that use the timestamp
field.
Delete composite index definitions
Firebase Console
Open the Cloud Firestore Composite Indexes page in the Firebase console.
For each index that contains the
timestamp
field, click the button and click Delete.
GCP Console
In the Google Cloud console, go to the Databases page.
Select the required database from the list of databases.
In the navigation menu, click Indexes, and then click the Composite tab.
Use the Filter field to search for index definitions that contain the
timestamp
field.For each of these indexes, click the
button and click Delete.
Firebase CLI
- If you haven't set up the Firebase CLI, follow these direction to install the CLI and run the
firebase init
command. During theinit
command, make sure to selectFirestore: Deploy rules and create indexes for Firestore
. - During setup, the Firebase CLI downloads your existing index definitions to a file named, by default,
firestore.indexes.json
. Remove any index definitions that contain the
timestamp
field, for example:{"indexes":[//Deletecompositeindexdefinitionthatcontainthetimestampfield{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"exchange","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"instrumentType","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"price.currency","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},]}
Deploy your updated index definitions:
firebase deploy --only firestore:indexes
Update Single-field index definitions
Firebase Console
Open the Cloud Firestore Single Field Indexes page in the Firebase console.
Click Add Exemption.
For Collection ID, enter
instruments
. For Field path, entertimestamp
.Under Query scope, select both Collection and Collection group.
Click Next
Toggle all the index settings to Disabled. Click Save.
Repeat the same steps for the
shard
field.
GCP Console
In the Google Cloud console, go to the Databases page.
Select the required database from the list of databases.
In the navigation menu, click Indexes, and then click the Single Field tab.
Click the Single Field tab.
Click Add Exemption.
For Collection ID, enter
instruments
. For Field path, entertimestamp
.Under Query scope, select both Collection and Collection group.
Click Next
Toggle all the index settings to Disabled. Click Save.
Repeat the same steps for the
shard
field.
Firebase CLI
Add the following to the
fieldOverrides
section of your index definitions file:{ "fieldOverrides": [ // Disable single-field indexing for the timestamp field { "collectionGroup": "instruments", "fieldPath": "timestamp", "indexes": [] }, ] }
Deploy your updated index definitions:
firebase deploy --only firestore:indexes
Create new composite indexes
After removing all the previous indexes containing the timestamp
, define the new indexes that your app requires. Any index containing the timestamp
field must also contain the shard
field. For example, to support the queries above, add the following indexes:
Collection | Fields indexed | Query scope |
---|---|---|
instruments | shard, price.currency, timestamp | Collection |
instruments | shard, exchange, timestamp | Collection |
instruments | shard, instrumentType, timestamp | Collection |
Error Messages
You can build these indexes by running the updated queries.
Each query returns an error message with a link to create the required index in the Firebase Console.
Firebase CLI
Add the following indexes to your index definition file:
{"indexes":[//Newindexesforshardedtimestamps{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"shard","order":"DESCENDING"},{"fieldPath":"exchange","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"shard","order":"DESCENDING"},{"fieldPath":"instrumentType","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},{"collectionGroup":"instruments","queryScope":"COLLECTION","fields":[{"fieldPath":"shard","order":"DESCENDING"},{"fieldPath":"price.currency","order":"ASCENDING"},{"fieldPath":"timestamp","order":"DESCENDING"}]},]}
Deploy your updated index definitions:
firebase deploy --only firestore:indexes
Understanding the write limit for sequential indexed fields
The limit on the write rate for sequential indexed fields comes from how Cloud Firestore stores index values and scales index writes. For each index write, Cloud Firestore defines a key-value entry which concatenates the document name and the value of each indexed field. Cloud Firestore organizes these index entries into groups of data called tablets. Each Cloud Firestore server holds one or more tablets. When the write load to a particular tablet becomes too high, Cloud Firestore scales horizontally by splitting the tablet into smaller tablets and spreading the new tablets across different Cloud Firestore servers.
Cloud Firestore places lexicographically close index entries on the same tablet. If the index values in a tablet are too close together, such as for timestamp fields, Cloud Firestore cannot efficiently split the tablet into smaller tablets. This creates a hot spot where a single tablet receives too much traffic, and read and write operations to the hot spot become slower.
By sharding a timestamp field, you make it possible for Cloud Firestore to efficiently split workloads across multiple tablets. Although the values of the timestamp field might remain close together, the concatenated shard and index value give Cloud Firestore enough space between index entries to split the entries among multiple tablets.
What's next
- Read the best practices for designing for scale
- For cases with high write rates to a single document, see Distrubted counters
- See the standard limits for Cloud Firestore