Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Azure Functions EventGrid trigger

I have implemented an EventGrid Trigger to respond to Blob Storage Events the logic of which is simplified below :

public static async void Run(
    JObject eventGridEvent,
    TraceWriter log,
    ExecutionContext context)
{
    string eventContent = ParseEvent(eventGridEvent);

    HttpClient client = GetProxyClient();
    HttpResponseMessage response = await client.GetAsync("blabla/" + eventContent);
    string responseContent = await response.Content.ReadAsStringAsync();
    log.Info("Here is the response :" + responseContent);
}

The external API does not take long to respond (1 second or less) and my configuration for the host is set to default (so an unbounded number of concurrent calls is allowed).

I am getting a lot of duplicated events in the logs when adding multiple blobs (starting at just 2 blobs) at the same time (a script is quickly uploading the blobs one by one with no wait time in between).

I feel that this might be due to the fact that I never acknowledge receiving the events and I don't know if I am supposed to do this in my code or whether the EventGrid Trigger does that automatically.

Is the logic for acknowledging the processing of an event supposed to be implemented within an EventGrid Trigger (Http 200 response) or is this handled automatically?

If not should I still be getting duplicated events? Typically, when uploading a single blob I receive the event for it 3-4 times.

The reason I ask this question is that when using a Http Trigger and returning a 400 response I also get duplicated events which makes sense since I am not acknowledging having correctly processed the event. However, when I return a 200 response then I do not receive duplicated events.

Thanks

like image 596
Thomas Pouget Avatar asked Oct 20 '25 12:10

Thomas Pouget


2 Answers

You don't need to do anything special to indicate success to Event Grid. If your function execution succeeds (does not throw exception), the trigger will respond with success status code automatically.

like image 97
Mikhail Shilkov Avatar answered Oct 23 '25 02:10

Mikhail Shilkov


You may try using an EventGrid Advanced Filter of data.api String ends with FlushWithClose. The reason my Azure Function was executing multiple times upon blob upload was because an EventGrid message was created for every AppendFile action being performed for the blob upload.

I found out (by trial and error) that Azure Data Factory uses a series of API calls to write a single blob to Blob Storage.

Ends up looking something like this:

  • CreateFilePath
  • LeaseFile
  • AppendFile
  • AppendFile
  • AppendFile (each append puts a chunk of the blob until the blob is complete)
  • FlushFile (this is the actual indication that the file has finished; hence the Advanced Filter shown above)
  • LeaseFile

Here is a sample query to view this upload flow yourself:

  • Note: You'll need the Uri of a sample file uploaded to the blob container
//==================================================//
// Author: Eric
// Created: 2021-05-26 0900 
// Query: ADF-to-Blob Storage reference flow
// Purpose: 
// To provide a reference flow of ADF-to-Blob Storage
// file uploads
//==================================================//
// Assign variables
//==================================================//
let varStart = ago(10d);
let varEnd = now();
let varStorageAccount = '<storageaccountname>';
let varStatus = 'Success';
let varSampleUri = 'https://<storageaccountname>.dfs.core.windows.net/<containername>/<parentfolder1>%2F<parentfolder2>%2F<samplefilename.extension>'
//==================================================//
// Filter table
//==================================================//
StorageBlobLogs
| where TimeGenerated between (varStart .. varEnd)
  and AccountName == varStorageAccount
  and StatusText == varStatus
  and split(Uri, '?')[0] == varSampleUri
//==================================================//
// Group and parse results
//==================================================//
| summarize 
  count() by OperationName,
  CorrelationId,
  TimeGenerated,
  UserAgent = tostring(split(UserAgentHeader, ' ')[0]),
  RequesterAppId,
  AccountName, 
  ContainerName = tostring(split(tostring(parse_url(url_decode(Uri))['Path']), '/')[1]),
  FileName = tostring(split(tostring(parse_url(url_decode(Uri))['Path']), '/')[-1]),
  ChunkSize = format_bytes(RequestBodySize, 2, 'MB'),
  StatusCode,
  StatusText
| order by TimeGenerated asc

Its interesting to upload samples from different sources (Azure Data Factory, Azure Storage Explorer, Python/C# SDK, Azure Portal, etc.) and see the different API methods they use. In fact, you'll likely need to do this to get your logging and alerting dialed in.

Its too bad the methods aren't standardized across tools as this particular issue is a great pain to discover on your own!

Again, EventGrid Advanced Filters are your friend in this case.

like image 27
ericOnline Avatar answered Oct 23 '25 01:10

ericOnline