A Developers Blog by Anthony Johnston

Durable Functions

12 May 2024

Durable Tasks are orchestrations and activities for complex operations and can be used with Azure Functions.
Compute or gather source data, only once, in parallel and/or in sequence and over time.

You may need data from many sources: databases, APIs, even humans. Any of them may be unreliable or busy, and you don't want to ask twice.
You may have a large amount of complex and costly calculations to perform and you don't want to pay twice.
You may have lots of tasks which can be run in parallel and can scale out to save time.

This is where an "Orchestration" of "Activities" gets the job done.

Overview

I have had my first dabble in Azure Functions and dug right into the new Isolated Worker runtime and Durable Tasks.

Azure Functions are a part of the Serverless architecture everyone is talking about. You can run discrete pieces of code in the form of a function, pass some input and get some output.

Durable Tasks add to these functions by storing the output, so that if the same function is called again with the same input and the same instance id, it gets the output from a store. This allows a parent coordinating task (an Orchestration) to be stopped and restarted (more below) while still working towards the result, and you only pay for the compute once.

The latest version adds strongly typed TaskOrchestrator and TaskActivity abstract classes, along with a (preview) code generator for strongly typed methods on the Durable Task Client.

The main actors in Azure Functions and Durable Tasks (Durable Functions) are the following:

Triggers

These are as they sound: they start the operation off.

They can be http requests, messages from a queue, a timer etc.

Orchestrators

The coordinating task, this calls out to other functions (Activities) that do the work.

You can organise them to call activities in parallel or in sequence.

Activities

These do the work and cost the cash in the main.

They will only be called once on success, but can be retried on failure to add resilience to the orchestration.
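
To illustrate what "retried on failure" means, here is a minimal, self-contained sketch. RetryAsync is a hypothetical helper, not part of the Durable Task API, and it uses a fixed delay for simplicity. In the framework itself, retries are configured through options passed when calling the activity and are handled for you.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (NOT the Durable Task API): try the work up to maxAttempts times,
// waiting a fixed interval between attempts.
async Task<T> RetryAsync<T>(Func<Task<T>> work, int maxAttempts, TimeSpan retryInterval)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await work();
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            // Not the last attempt yet: wait, then go round again.
            await Task.Delay(retryInterval);
        }
        // On the last attempt the filter is false, so the exception propagates to the caller.
    }
}

// A made-up unreliable "activity": fails on the first two calls, succeeds on the third.
var calls = 0;
var welcome = await RetryAsync(() =>
{
    calls++;
    if (calls < 3) throw new InvalidOperationException("source busy");
    return Task.FromResult("Welcome One");
}, maxAttempts: 5, retryInterval: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{welcome} after {calls} calls"); // Welcome One after 3 calls
```

The successful result is returned once, and the failures before it stay hidden from the caller unless the attempts run out.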

Where's The Fn Code?

You can find the code of my investigation of Azure Functions and Durable Tasks on my sandbox-functions-investigation GitHub repository. I will be using examples as we go.

A Trigger Schedules An Orchestration

You can schedule an orchestration using the DurableTaskClient.ScheduleNewOrchestrationInstanceAsync method.

var instanceId = $"ID_{Guid.NewGuid():N}";
var startOptions = new StartOrchestrationOptions(
    InstanceId: instanceId,                         // optional
    StartAt: DateTimeOffset.Now.AddMinutes(1)       // optional, start in the future
    );

await client
    .ScheduleNewOrchestrationInstanceAsync(         // returns instanceId
        nameof(WelcomeOrchestrator),
        options: startOptions                       // named, so it is not bound to the input parameter
        );

Notice the await on the call to schedule. It schedules the orchestrator rather than running it: the task returns when the job to run the orchestrator has been scheduled. You will not get the result of the orchestrator here. If you want a result outside of the orchestrator itself, you will need a strategy for checking its state.

If you supply an instanceId, not a new random one as I have here, you will only ever get one instance scheduled. Even if that instance has finished, it will not be scheduled again.

There are other methods for suspending, resuming, terminating and purging orchestrations, but I won't cover them here; they are pretty self-explanatory.

Checking the State

Built-in State Endpoints

One way to do this is to return a result which gives details that a caller can use to check the state by calling an endpoint. There is a helper method for this on the client DurableTaskClient.CreateCheckStatusResponseAsync.

return await client
  .CreateCheckStatusResponseAsync(req, instanceId);

The body of that response has the following shape:

{
  "id": string,
  "statusQueryGetUri": string,
  "sendEventPostUri": string,
  "terminatePostUri": string,
  "purgeHistoryDeleteUri": string
}
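
Putting the scheduling call and the status response together, an HTTP trigger might look roughly like this. It is a sketch under assumptions: the class and function names (WelcomeStarter, StartWelcome) and the authorization level are made up for illustration, and it assumes the isolated worker packages for HTTP triggers and Durable Tasks are referenced.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask.Client;

// Hypothetical trigger class; the names here are illustrative, not from the repository.
public static class WelcomeStarter
{
    [Function(nameof(StartWelcome))]
    public static async Task<HttpResponseData> StartWelcome(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req,
        [DurableClient] DurableTaskClient client)
    {
        // Schedule the orchestrator by name. The await completes when the
        // orchestration is scheduled, not when it has finished.
        var instanceId = await client
            .ScheduleNewOrchestrationInstanceAsync("WelcomeOrchestrator");

        // Hand the caller the id and management URLs so it can check the state later.
        return await client
            .CreateCheckStatusResponseAsync(req, instanceId);
    }
}
```

The caller can then poll the statusQueryGetUri from the response until the orchestration reports a final state.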

Poll The Function

Another way would be to call your trigger function with an instance id. That way you can check the state, schedule the orchestration if needed, and return the state as the result of your trigger.

var meta = await client
  .GetInstanceAsync(instanceId);
switch (meta.RuntimeStatus)
{
  ...
}

For this you could use a queue, like Service Bus, configured to redeliver a message until it's marked as complete. You can tie this in with your Orchestrator state to perform actions on completion or failure.

See the sequence diagram below for a visual representation of this.

The Running Orchestrator

So you have scheduled an Orchestrator, and it has started, this is where Durable Tasks really kicks in.

An Orchestrator is designed to handle Activities: call them in sequence or in parallel and deal with the results. It will only ever run an activity to success once, but an activity can be retried on failure.

It is important to know that the orchestrator stops at the first await on an activity, called using CallActivityAsync, that does not yet have a stored result.

It will then start again from the beginning, so anything you have there needs to be idempotent, i.e. it can be run multiple times and still give the same result.

CallActivityAsync is a wrapper through which the Durable Tasks framework checks for the result of a previous run. If there is one it returns it, otherwise it runs the Activity, stores the result and restarts the Orchestrator. So if you have anything that you want to be sure is called only once, put it in an Activity.
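
The stop-and-replay behaviour can be modelled in a few lines. The following is a toy simulation, not the Durable Task framework: history, RunActivity, CallActivity and RunOrchestratorOnce are made-up stand-ins. The real framework persists its history in storage and suspends at the await rather than returning null.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for the durable history: results of completed activities, in call order.
var history = new List<string>();
var activityRuns = 0; // how many times the costly work actually ran
var passes = 0;       // how many times the orchestrator body started from the top

// The "activity": the costly work we only want to pay for once per call.
string RunActivity(string input)
{
    activityRuns++;
    return $"Welcome {input}";
}

// One pass over the orchestrator body. Returns null when it had to stop and wait.
string? RunOrchestratorOnce()
{
    passes++;
    var callIndex = 0;

    // Stand-in for CallActivityAsync: replay a stored result, or run the activity,
    // store its result and signal (with null) that the orchestrator must restart.
    string? CallActivity(string input)
    {
        if (callIndex < history.Count) return history[callIndex++];
        history.Add(RunActivity(input));
        return null;
    }

    var one = CallActivity("One");
    if (one is null) return null;
    var two = CallActivity("Two");
    if (two is null) return null;
    return $"{one} {two}";
}

// The "framework": keep restarting the orchestrator until it runs to completion.
string? result = null;
while (result is null) result = RunOrchestratorOnce();

Console.WriteLine(result);                                           // Welcome One Welcome Two
Console.WriteLine($"{passes} passes, {activityRuns} activity runs"); // 3 passes, 2 activity runs
```

The orchestrator body starts three times, yet each activity does its work only once. Anything placed directly in the body, outside an activity, would have run on every pass.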

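using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.DurableTask;

[DurableTask(nameof(WelcomeOrchestrator))]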
public class WelcomeOrchestrator : 
  TaskOrchestrator<string, string>
{
    public override async Task<string> RunAsync(
        TaskOrchestrationContext context,
        string input
        )
    {
        var welcomes = new List<string>();

        // set up retry options
        var options = new TaskOptions(
            new TaskRetryOptions(
                new RetryPolicy(5, TimeSpan.FromSeconds(1)))
            );

        // await the first activity, using the strongly typed call
        // produced by the (preview) code generator
        welcomes.Add(
          await context.CallWelcomeUnreliableActivityAsync("One", options)
        );

        // await the next two activities in parallel
        var tasks = new Task<string>[] {
            context.CallWelcomeUnreliableActivityAsync("Two", options),
            context.CallWelcomeUnreliableActivityAsync("Three", options)
        };

        await Task.WhenAll(tasks);
        welcomes.AddRange(tasks.Select(t => t.Result));

        // return the result
        return string.Join(" ", welcomes);
    }
}
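
The activity being called is not shown above. As a rough idea of its shape, a class-based activity could look like the sketch below. The body is an assumption for illustration (a random failure standing in for an unreliable source), not the code from the repository, and it assumes the Durable Task packages are referenced.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.DurableTask;

// Illustrative activity: takes a string, returns a welcome, and sometimes fails.
[DurableTask(nameof(WelcomeUnreliableActivity))]
public class WelcomeUnreliableActivity : TaskActivity<string, string>
{
    public override Task<string> RunAsync(TaskActivityContext context, string input)
    {
        // Assumption: simulate an unreliable source by failing roughly half the time.
        if (Random.Shared.Next(2) == 0)
        {
            throw new InvalidOperationException($"Source for '{input}' is busy");
        }

        return Task.FromResult($"Welcome {input}");
    }
}
```

With the retry options set in the orchestrator, a thrown exception here leads to another attempt rather than failing the orchestration straight away.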

This provides a way to handle complex, long-running, sequenced and/or parallel activities in a durable manner. And remember, you only pay while the code is running, not for the duration of the orchestration.

Conclusion

This is a great way of handling complex and long-running operations in Azure. It's flexible and easy to understand once you know how the scheduling and the stopping and restarting work.

But, there are some difficulties in testing operations that are so very async.
And, a runtime that is on Node, crashing and locking resources.
And, I wish MS would make it easier to mock their code.

All in all, though, a thumbs up from me. 👍

AI Image. a cat in a top hat conducting an orchestra of cats
