Parallel.ForEach vs. Parallel.ForEachAsync in C#
I needed to improve an application that processes and transforms multiple records received from an API. Among a few options, I tried parallel tasks to convert the information and perform various network calls. A colleague suggested using Parallel.ForEach; however, I wasn’t convinced it was a better fit than Parallel.ForEachAsync for I/O-bound calls, so I decided to do more research.
Parallel.ForEach
It allows for parallel iteration over a collection, leveraging multiple threads to perform operations concurrently.
Advantages:
- Parallel.ForEach is straightforward to implement and requires minimal setup.
- It can speed up CPU-bound operations by using multiple CPU cores.
Drawbacks:
- Parallel.ForEach is a blocking call. Do not run it in the UI thread.
- As mentioned in the documentation, calling non-thread-safe instance methods or writing to shared state from a parallel loop can lead to data corruption.
- It may not be very efficient for I/O operations, because each thread stays blocked while it waits for the call to complete.
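To make the points above concrete, here is a minimal sketch of Parallel.ForEach collecting results in a thread-safe way. The `Transform` and `TransformAll` names are hypothetical stand-ins for the real record processing, and a `ConcurrentBag<int>` is just one of several ways to avoid writing to a shared, non-thread-safe collection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical CPU-bound transformation standing in for real record processing.
    public static int Transform(int record) => record * record;

    public static List<int> TransformAll(IEnumerable<int> records)
    {
        // ConcurrentBag can be added to from several threads at once;
        // a plain List<int> cannot.
        var results = new ConcurrentBag<int>();

        // Blocks the calling thread until every item has been processed.
        Parallel.ForEach(records, record => results.Add(Transform(record)));

        // Completion order is not guaranteed, so sort before returning.
        return results.OrderBy(x => x).ToList();
    }

    public static void Main()
    {
        var squares = TransformAll(Enumerable.Range(1, 5));
        Console.WriteLine(string.Join(", ", squares)); // prints "1, 4, 9, 16, 25"
    }
}
```

Because the call blocks until the loop finishes, the results are ready to read on the next line, which is also why it should not run on a UI thread.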
Parallel.ForEachAsync
Introduced in .NET 6, it offers asynchronous parallel iteration over a collection.
Advantages:
- Parallel.ForEachAsync returns a Task that can be awaited, so the caller is not blocked, and threads are freed for other work while the loop body awaits I/O operations.
Drawbacks:
- There might be a slight overhead due to the asynchronous state machine.
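A minimal sketch of the asynchronous variant could look like the following. `FetchDetailsAsync` is a hypothetical stand-in for a network call, with `Task.Delay` simulating the latency so the example runs without any external API.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachAsyncSketch
{
    // Hypothetical stand-in for a network call; Task.Delay simulates the latency.
    public static async Task<string> FetchDetailsAsync(int id, CancellationToken ct)
    {
        await Task.Delay(50, ct);
        return $"record-{id}";
    }

    public static async Task<List<string>> FetchAllAsync(IEnumerable<int> ids)
    {
        var results = new ConcurrentBag<string>();

        // The body is an async lambda; while it awaits, no thread is held.
        await Parallel.ForEachAsync(ids, async (id, ct) =>
        {
            results.Add(await FetchDetailsAsync(id, ct));
        });

        // Completion order is not guaranteed, so sort before returning.
        return results.OrderBy(x => x).ToList();
    }

    public static async Task Main()
    {
        var details = await FetchAllAsync(Enumerable.Range(1, 3));
        Console.WriteLine(string.Join(", ", details)); // prints "record-1, record-2, record-3"
    }
}
```

The cancellation token passed to the body comes from the loop itself, so forwarding it to the awaited call lets the remaining work stop early if one iteration fails.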
Enough with the theory. Let’s compare the two methods with a simple API call that retrieves a list of breweries. The program then obtains detailed information for the first 40 businesses, randomly assigning a number of brew varieties and a random number of items sold per variety.
The average execution time for the synchronous operation (Parallel.ForEach) is 340ms.
The average execution time for the asynchronous operation (Parallel.ForEachAsync) is 270ms, slightly faster than the sync method. The difference stays this small in scenarios with little overhead, such as quick network operations or few API calls. It is even possible for Parallel.ForEach to run faster than its async counterpart.
Let’s use another API that will allow us to add a delay for each call. In this case, 5 seconds per call:
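When no limit is specified, Parallel.ForEachAsync defaults its degree of parallelism to Environment.ProcessorCount, so only that many calls are in flight at a time and the remaining ones wait their turn, which causes a considerable delay when each call is slow. Parallel.ForEach has no such default cap, and the thread pool can add threads while existing ones sit blocked. In this case, the average response times were approximately 20 seconds for the async method and 12 seconds for Parallel.ForEach. However, this changes if we set the maximum number of concurrent tasks explicitly: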
Under this scenario, Parallel.ForEachAsync performs faster than Parallel.ForEach, with an average difference of 10 seconds between the two methods, depending on the maximum number of concurrent tasks set.
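The maximum number of concurrent tasks is set through `ParallelOptions.MaxDegreeOfParallelism`. The sketch below is only an illustration under stated assumptions: the calls are simulated with `Task.Delay`, the `RunAsync` helper is hypothetical, and the limit of 10 is an arbitrary value rather than the one used in the measurements above. It reports the highest number of calls that were in flight at once, which makes the effect of the setting visible.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledForEachAsyncSketch
{
    // Runs simulated calls and returns the highest number in flight at the same time.
    public static async Task<int> RunAsync(int callCount, int maxConcurrency, int delayMs)
    {
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };

        await Parallel.ForEachAsync(Enumerable.Range(0, callCount), options, async (item, ct) =>
        {
            // Count this call as started and remember the highest value seen.
            lock (gate)
            {
                inFlight++;
                peak = Math.Max(peak, inFlight);
            }

            // Simulated network latency (an assumption; no real API is called).
            await Task.Delay(delayMs, ct);

            lock (gate)
            {
                inFlight--;
            }
        });

        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(callCount: 40, maxConcurrency: 10, delayMs: 50);
        Console.WriteLine($"Peak concurrent calls: {peak}");
    }
}
```

The reported peak should never exceed the configured limit. Trying a few different values against the real API is a reasonable way to find the point where extra concurrency stops helping.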
It may pay to use an async method in scenarios with higher latency or a higher number of calls, as long as we don’t create significant overhead or other issues such as resource contention.