Unlocking the Power of Concurrency and Parallelism in Modern .NET Applications


What is Concurrency?

Concurrency, a vital concept in modern programming, refers to a program's ability to make progress on multiple independent operations during overlapping time periods. Because the operations are independent, the order in which they execute doesn't affect the final result. They can start, run, and complete in an interleaved manner without needing to run at the exact same moment.

It's crucial to note that concurrency and parallelism aren't synonymous, though concurrency serves as a means to achieve parallelism. Parallelism means executing two or more tasks at the same instant.

Even a single-core CPU, which can't run multiple tasks simultaneously, can offer a form of virtual parallelism by switching rapidly between tasks (time slicing), which creates the illusion of parallel processing. In contrast, a multi-core CPU can achieve genuine parallelism when the program is written to use the multiple cores effectively.
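
To make the distinction concrete, here is a minimal sketch of two operations whose lifetimes overlap without either needing its own core. The class name InterleavingDemo and the delay values are assumptions chosen for illustration, and Task.Delay stands in for any operation that spends its time waiting.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class InterleavingDemo
{
    // Hypothetical helper: logs when an operation starts and finishes.
    // Task.Delay simulates waiting (for example on a network call).
    private static async Task RunStepAsync(string name, int delayMs, List<string> log)
    {
        lock (log) { log.Add($"{name} started"); }
        await Task.Delay(delayMs);
        lock (log) { log.Add($"{name} finished"); }
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        // Both operations are started before either one is awaited,
        // so their lifetimes overlap.
        Task a = RunStepAsync("A", 300, log);
        Task b = RunStepAsync("B", 100, log);

        await Task.WhenAll(a, b);
        return log;
    }

    public static async Task Main()
    {
        foreach (string entry in await RunAsync())
        {
            Console.WriteLine(entry);
        }
    }
}
```

A starts first, but B, which waits for less time, is expected to finish first. The two operations are interleaved rather than run one after the other.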

Concurrency and parallelism are powerful techniques that empower developers to create scalable and efficient applications, making the most of today's hardware and resources. By employing appropriate tools and techniques, it's possible to greatly enhance the responsiveness and performance of these applications.

Process and Thread

Now that we understand what concurrency and parallelism mean, let's delve into the realm of concurrent programming, where processes and threads are the key players responsible for carrying out tasks.

Every application starts with one or more processes, and every process has at least one thread. Any running instance of a program is essentially a process. Processes operate in isolation from one another, each with its own dedicated memory space. Within a process, multiple threads can exist; they share the same memory space and can access memory locations used by other threads in the same process. Threads are the fundamental units to which the operating system allocates processor time, so each thread effectively acts as a virtual processor.
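
Because threads share their process's memory, several threads can update the same variable. The following is a minimal sketch of that idea; the class name SharedMemoryDemo and the counts are assumptions for illustration. Interlocked.Increment is used because an unsynchronized counter++ from several threads could lose updates.

```csharp
using System;
using System.Threading;

public static class SharedMemoryDemo
{
    // A single field visible to every thread in the process.
    private static int counter;

    public static int Run(int threadCount, int incrementsPerThread)
    {
        counter = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                {
                    // Atomic increment of the shared field.
                    Interlocked.Increment(ref counter);
                }
            });
            threads[i].Start();
        }

        // Wait for every thread to finish before reading the result.
        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 100_000)); // prints 400000
    }
}
```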

Asynchronous Programming

The concept of asynchronous programming often comes up when considering concurrency in an application, and almost every modern programming language supports it. Asynchronous programming is a way of writing code so that a long-running operation continues in the background while control returns quickly to the caller. For work that is I/O-bound or CPU-bound, we can invoke such methods asynchronously and carry on with independent operations until the result is required. This improves the responsiveness of the application and often its throughput.

To make a method asynchronous, we mark it with the ‘async’ keyword. This lets us use another keyword, ‘await’, inside the method to wait for time-consuming calls asynchronously, without blocking the calling thread.

Let’s consider a simple example:

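using System;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq; // Json.NET package, provides JObject

public class NasaService
{
    // A single HttpClient instance is reused for all requests.
    private static readonly HttpClient client = new HttpClient();
    const string API_KEY = "DEMO_KEY"; // NASA's shared demo key; replace with your API key

    public async Task<string> GetApodAsync()
    {
        string apiUrl = $"https://api.nasa.gov/planetary/apod?api_key={API_KEY}";

        try
        {
            // Start the request; the call returns a Task without waiting for the response.
            Task<string> getResult = client.GetStringAsync(apiUrl);

            // <<Perform independent tasks if any>>

            // Suspend this method (without blocking the thread) until the response arrives.
            string json = await getResult;
            JObject apod = JObject.Parse(json);
            return $"title: {apod["title"]}\nhdurl: {apod["hdurl"]}\nexplanation: {apod["explanation"]}";
        }
        catch (Exception ex)
        {
            return $"Error: {ex.Message}";
        }
    }
}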

In the above example, when execution reaches the await keyword and the request has not yet completed, control returns to the caller. This means that the thread is now free and can handle other tasks or requests. Once the awaited operation completes, execution resumes and the remaining code is processed. How execution resumes and which thread runs it may vary with the type of application (WinForms, Console, Web, etc.).

Does the application use a separate thread to execute the awaited task? The simple answer is no. Although asynchronous code can involve additional threads (for example, when work is explicitly offloaded with Task.Run), a true async I/O operation does not need a thread while it is in flight. Here, GetStringAsync sends an HTTP GET request to an external resource, which makes it an I/O-bound operation. When execution reaches the await and the request is still pending, the current thread is released back to the thread pool (or to the synchronization context), where it can serve another task or request. When the result is available, a notification is issued so that a thread can continue the remaining execution; it is not necessarily the same thread that started the method.

At a deeper level, the I/O operation is handed off to the operating system. On Windows, network I/O uses IOCP (I/O Completion Ports); other operating systems offer mechanisms such as epoll and kqueue. When the operation completes, the OS notifies .NET through the completion port, which is integrated with the thread pool, and an I/O thread from the pool is used briefly to signal the completion. Hence, an async operation completes without creating any additional threads.
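
One way to see that pending asynchronous operations do not each occupy a thread is to start many of them at once. The sketch below is an illustration under stated assumptions: FakeIoAsync is a hypothetical stand-in for an I/O-bound call, and it uses the timer-based Task.Delay rather than real network I/O, so it shows the principle, not the IOCP machinery itself.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class PendingOperationsDemo
{
    // Hypothetical stand-in for an I/O-bound call: no thread is blocked while it waits.
    private static async Task<int> FakeIoAsync(int id)
    {
        await Task.Delay(200);
        return id;
    }

    public static async Task<(int Completed, TimeSpan Elapsed)> RunAsync(int operationCount)
    {
        var stopwatch = Stopwatch.StartNew();

        // Start all operations first, then await them together.
        Task<int>[] pending = Enumerable.Range(0, operationCount)
            .Select(id => FakeIoAsync(id))
            .ToArray();
        int[] results = await Task.WhenAll(pending);

        stopwatch.Stop();
        return (results.Length, stopwatch.Elapsed);
    }

    public static async Task Main()
    {
        var (completed, elapsed) = await RunAsync(1000);
        Console.WriteLine($"{completed} operations finished in {elapsed.TotalMilliseconds:F0} ms");
    }
}
```

If each of the 1000 operations blocked a thread for its full 200 ms and they ran one after another, the total would be around 200 seconds. Because the waits overlap and hold no threads, the elapsed time should be much closer to a single delay.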

Parallelism with TPL

Asynchronous programming is about non-blocking execution and making the best use of the current thread. But in some scenarios we need more computational resources working in parallel, rather than merely waiting for operations to complete. This is where TPL (Task Parallel Library) comes into play, allowing us to use all the available processors efficiently via multithreading. The TPL is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces that makes parallel programming easier. It provides a higher-level abstraction over multithreading and parallelism, allowing developers to manage parallel tasks efficiently with minimal effort.

Let’s consider a scenario where we are processing a large volume of documents. The processing steps comprise Spelling Check, Plagiarism Detection, Formatting, and Summary Generation. Each step is computationally expensive but independent of the others. In an efficient application, instead of processing each document sequentially, we can parallelize the work and speed up the overall workflow.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main(string[] args)
    {
        List<string> documents = new List<string> { "Doc1", "Doc2", "Doc3", "Doc4", "Doc5" };
        Console.WriteLine("Document Processing Started...");

        // Process the documents in parallel; the call returns when all are done.
        Parallel.ForEach(documents, document =>
        {
            ProcessDocument(document);
        });

        Console.WriteLine("\nAll documents processed successfully!");
    }

    static void ProcessDocument(string document)
    {
        // Run the four independent steps in parallel and wait for all of them.
        Parallel.Invoke(
            () => CheckSpelling(document),
            () => DetectPlagiarism(document),
            () => AdjustFormatting(document),
            () => GenerateSummary(document)
        );
        Console.WriteLine($"{document} processing completed.\n");
    }

    // Thread.Sleep stands in for CPU-intensive work in each step.
    static void CheckSpelling(string doc)
    {
        Thread.Sleep(1000);
        Console.WriteLine($"{doc}: Spelling & Grammar Checked.");
    }

    static void DetectPlagiarism(string doc)
    {
        Thread.Sleep(1500);
        Console.WriteLine($"{doc}: Plagiarism Detection Completed.");
    }

    static void AdjustFormatting(string doc)
    {
        Thread.Sleep(1200);
        Console.WriteLine($"{doc}: Formatting Adjusted.");
    }

    static void GenerateSummary(string doc)
    {
        Thread.Sleep(800);
        Console.WriteLine($"{doc}: Summary Generated.");
    }
}

In the above example, Parallel.ForEach processes multiple documents concurrently, and within each document Parallel.Invoke runs the four processing steps in parallel, resulting in faster processing. Both calls block until all of their work has finished, so the final message is printed only after every document is done.

At first glance, this approach might seem to create a large number of threads. In practice, TPL schedules the work on thread-pool threads and abstracts away the complexities of work partitioning, thread scheduling, state management, and other low-level details, so the developer doesn't have to manage them directly.
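
As an illustration of how much the library handles, the sketch below sums squares in parallel with Parallel.For, using its overload with per-worker local state and a ParallelOptions cap on the degree of parallelism. The class name ParallelSumDemo and the workload are assumptions for illustration; the partitioning of the range across workers is left entirely to TPL.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    public static long SumOfSquares(int count, int maxDegreeOfParallelism)
    {
        long total = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        Parallel.For(
            0, count, options,
            // Each worker starts with its own subtotal, so the loop body needs no locking.
            () => 0L,
            // Loop body: add the square of i to this worker's subtotal.
            (i, loopState, subtotal) => subtotal + (long)i * i,
            // Runs once per worker at the end: merge the subtotal into the shared total.
            subtotal => Interlocked.Add(ref total, subtotal));

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000, Environment.ProcessorCount)); // prints 332833500
    }
}
```

Keeping a subtotal per worker and merging once at the end avoids contention on the shared variable, which is one of the state-management details TPL makes easy to express.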

Final Thoughts

There are certain things to consider before using async/await or TPL. As a starting point, we can ask these questions.

  • Is there any I/O bound operation? Yes => async/await
  • Is there any CPU bound operation?
    • Are you worried about the responsiveness of the application? Yes => async/await
    • Is the task appropriate for Parallelism (e.g. multiple independent tasks)? Yes => TPL
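
The questions above can be sketched in code roughly as follows. This is an illustration only: the class name ChoosingDemo, the method names, and the prime-counting workload are hypothetical, Task.Delay stands in for real I/O, and awaiting Task.Run is one common way (assumed here) to keep a caller responsive during CPU-bound work.

```csharp
using System;
using System.Threading.Tasks;

public static class ChoosingDemo
{
    // I/O-bound: await the operation directly (Task.Delay simulates the I/O).
    public static async Task<string> LoadAsync()
    {
        await Task.Delay(100);
        return "payload";
    }

    // A CPU-bound computation: counts the primes less than or equal to limit.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n <= limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; (long)d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // CPU-bound, but the caller must stay responsive: offload to the thread pool and await.
    public static Task<int> CountPrimesInBackgroundAsync(int limit)
    {
        return Task.Run(() => CountPrimes(limit));
    }

    // Several independent CPU-bound work items: use TPL to run them in parallel.
    public static int[] CountPrimesForMany(int[] limits)
    {
        var results = new int[limits.Length];
        Parallel.For(0, limits.Length, i => results[i] = CountPrimes(limits[i]));
        return results;
    }

    public static async Task Main()
    {
        Console.WriteLine(await LoadAsync());
        Console.WriteLine(await CountPrimesInBackgroundAsync(1000));
        Console.WriteLine(string.Join(", ", CountPrimesForMany(new[] { 10, 100, 1000 })));
    }
}
```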

The important thing is to analyze how the code actually executes and determine whether asynchronous or parallel execution is really needed. Because a lot goes on under the hood, the overhead of these techniques can sometimes outweigh the gains.
