So I'm trying to implement a new cloud function for my application, but it requires a small adjustment to my existing data model. Each user in my database has a field that I am trying to update each week. I don't want this function to run for all users every week, as that would be an unnecessarily expensive operation. Instead, I've decided to track the last time I updated each user by storing a 'last-updated' field in their user document.
The problem is that none of my existing 400+ users have this field. So I'm looking for a way to add this field, initialized to some default time, to all existing users in the database.
I've thought about using a 'batch write' as described here: https://firebase.google.com/docs/firestore/manage-data/transactions#batched-writes
but it seems like you need to specify the ID of each document you want to update. All of my users have a UUID that was generated by Firestore, so it's not really practical for me to manually write to each user. Is there a way for me to create a new field in each document of an existing collection? Or, if not, perhaps a way for me to get a list of all document IDs so that I can iterate through it and do a really ugly batch write? I only have to do this mass update once and then never again, unless I discover a new piece of data I would like to track.
You could use a Cloud Function. For example, below is the code of a Cloud Function that you would trigger by creating a doc in a collection named batchUpdateTrigger (note that this is just one way to trigger the Cloud Function; you could very well use an HTTPS Cloud Function instead).
In this Cloud Function we take all the docs of the collection named collection and we add to each of them a new field set to the server timestamp (FieldValue.serverTimestamp()). We use Promise.all() to execute all the asynchronous update work in parallel. Don't forget to grant write access to the batchUpdateTrigger collection and to delete the Cloud Function once it has run.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

exports.batchUpdate = functions.firestore
    .document('batchUpdateTrigger/{triggerId}')
    .onCreate((snap, context) => {
        const collecRef = db.collection('collection');
        return collecRef.get()
            .then(snapshot => {
                // Firestore server timestamp (ServerValue.TIMESTAMP is for the Realtime Database)
                const ts = admin.firestore.FieldValue.serverTimestamp();
                const promises = [];
                snapshot.forEach(doc => {
                    const ref = doc.ref;
                    promises.push(
                        ref.update({
                            lastUpdate: ts
                        })
                    );
                });
                return Promise.all(promises);
            });
    });
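To trigger the function, just create any document in batchUpdateTrigger. A minimal sketch of a one-off Node script using the Admin SDK (the requestedAt payload is an arbitrary name I chose for illustration; any field, or even an empty doc, works, since the function fires on document creation):

// Hypothetical one-off script: creating this doc fires the batchUpdate function.
const admin = require('firebase-admin');
admin.initializeApp();

admin.firestore()
    .collection('batchUpdateTrigger')
    .add({ requestedAt: admin.firestore.FieldValue.serverTimestamp() })
    .then(ref => console.log('Trigger doc created: ' + ref.id));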
One problem you may encounter here is reaching the Cloud Function timeout. The default timeout is 60 seconds, but you can increase it in the Google Cloud console (https://console.cloud.google.com/functions/list?project=xxxxxxx).
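Alternatively, with first-generation Cloud Functions you can declare a longer timeout directly in code via runWith(). A minimal sketch (540 seconds is the maximum for 1st-gen functions; the memory setting is optional):

// Sketch: declare a longer timeout (and more memory) when defining the function.
exports.batchUpdate = functions
    .runWith({ timeoutSeconds: 540, memory: '1GB' })
    .firestore.document('batchUpdateTrigger/{triggerId}')
    .onCreate((snap, context) => {
        // ... same body as above ...
    });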
Another approach would be, like you said, to use a batched write.
The Cloud Function would then look as follows:
// Assumes the same imports and initialization as in the first example.
exports.batchUpdate = functions.firestore
    .document('batchUpdateTrigger/{triggerId}')
    .onCreate((snap, context) => {
        const collecRef = db.collection('collection');
        return collecRef.get()
            .then(snapshot => {
                const ts = admin.firestore.FieldValue.serverTimestamp();
                const batch = db.batch();
                snapshot.forEach(doc => {
                    batch.update(doc.ref, {
                        lastUpdate: ts
                    });
                });
                return batch.commit();
            });
    });
However, you would then need to manage, in your code, Firestore's limit of 500 operations per batch.
Below is a possible simple approach (i.e. not very sophisticated...). Since you are going to set the default values only once and you only have a few hundred docs to treat, we can consider it acceptable! The following Cloud Function treats documents in batches of 500, so you may have to manually re-trigger it until all the docs are treated.
// Assumes the same imports and initialization as in the first example.
exports.batchUpdate = functions.firestore
    .document('batchUpdateTrigger/{triggerId}')
    .onCreate((snap, context) => {
        const collecRef = db.collection('collection');
        return collecRef.get()
            .then(snapshot => {
                const docRefsArray = [];
                snapshot.forEach(doc => {
                    if (doc.data().lastUpdate == null) {
                        // We need to "treat" this doc
                        docRefsArray.push(doc.ref);
                    }
                });
                // When this logs 0 you are done, i.e. all the docs are treated
                console.log('Nbr of docs to treat: ' + docRefsArray.length);
                if (docRefsArray.length > 0) {
                    const ts = admin.firestore.FieldValue.serverTimestamp();
                    const batch = db.batch();
                    if (docRefsArray.length < 500) {
                        // We can "treat" all the documents in the QuerySnapshot
                        docRefsArray.forEach(ref => {
                            batch.update(ref, {
                                lastUpdate: ts
                            });
                        });
                    } else {
                        // We can only "treat" 500 documents in this run
                        for (let i = 0; i < 500; i++) {
                            batch.update(docRefsArray[i], {
                                lastUpdate: ts
                            });
                        }
                    }
                    return batch.commit();
                } else {
                    return null;
                }
            });
    });
The advantage of this last technique is that if you run into Cloud Function timeouts, you can reduce the batch size.
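If you would rather finish everything in a single run, a common variation (a sketch, not part of the approach above) is to split the refs into chunks of at most 500 and commit one batch per chunk in parallel; commitInChunks below is a hypothetical helper you could call in place of the single batch.commit():

// Hypothetical helper: commits several batches of <= 500 updates in parallel,
// so all docs are treated in one invocation (assumes db, docRefsArray and ts as above).
function commitInChunks(docRefsArray, ts) {
    const commits = [];
    for (let i = 0; i < docRefsArray.length; i += 500) {
        const batch = db.batch();
        docRefsArray.slice(i, i + 500).forEach(ref => {
            batch.update(ref, { lastUpdate: ts });
        });
        commits.push(batch.commit());
    }
    return Promise.all(commits);
}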