One of the most commonly used patterns in software development is caching. It’s a simple but very effective concept: reuse the results of an operation. When we perform a heavy operation, we save the result in a cache container. The next time we need that result, we pull it from the cache container instead of performing the heavy operation again.

For example, getting a person’s avatar might require a trip to the database. Instead of making that trip every time, we save the avatar in the cache and pull it from memory whenever we need it.

Caching works great for data that changes infrequently or, even better, never changes. Data that changes constantly, like the current machine’s time, shouldn’t be cached, or you will get wrong results.

In-process Cache, Persistent in-process Cache, and Distributed Cache

There are 3 types of caches:

  • In-process (in-memory) Cache is used when you want to implement a cache in a single process. When the process dies, the cache dies with it. If you’re running the same process on several servers, you will have a separate cache for each server.
  • Persistent in-process Cache is when you back up your cache outside of process memory. It might be in a file or in a database. This is more difficult to implement, but if your process is restarted, the cache is not lost. It is best used when getting the cached item is expensive and your process tends to restart a lot.
  • Distributed Cache is when you want a shared cache for several machines, usually several servers. A distributed cache is stored in an external service. This means that if one server saves a cache item, other servers can use it as well. Services like Redis are great for this.

We’re going to talk just about in-process cache.

Naive Implementation

Let’s create a very simple cache implementation in C#:
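
Here is a minimal sketch of what such a naive cache might look like. It assumes a generic class named NaiveCache (the name used later in this post) that wraps a plain Dictionary; the GetOrCreate method name and its signature are illustrative choices.

```csharp
using System;
using System.Collections.Generic;

// Naive in-process cache: not thread-safe, and items are never removed.
public class NaiveCache<TItem>
{
    private readonly Dictionary<object, TItem> _cache = new Dictionary<object, TItem>();

    // Returns the cached item for the key, creating and storing it on a miss.
    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        if (!_cache.TryGetValue(key, out TItem item))
        {
            // Cache miss: perform the heavy operation once and remember the result.
            item = createItem();
            _cache[key] = item;
        }
        return item;
    }
}
```

With a NaiveCache<byte[]> instance, a call such as GetOrCreate(userId, () => LoadAvatar(userId)) would only invoke the delegate on the first request for that user. LoadAvatar here is a hypothetical stand-in for the database call.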


This simple code solves a crucial problem. To get a user’s avatar, only the first request will actually perform a trip to the database. The avatar data (byte[]) is then saved in process memory. All following requests for the avatar will be pulled from memory, saving time and resources.

But, as with most things in programming, nothing is that simple. The above solution is not good for a number of reasons. For one thing, this implementation is not thread-safe. Exceptions can occur when it is used from multiple threads. Besides that, cached items will stay in memory forever, which is actually very bad.

Here’s why we should be removing items from Cache:

  1. Cache can take up a lot of memory, eventually leading to out-of-memory exceptions and crashes.
  2. High memory consumption can lead to GC Pressure (aka Memory Pressure). In this state, the garbage collector works more than it should, hurting performance.
  3. Cache might need to be refreshed if the data changes. Our caching infrastructure should support that ability.

To handle these problems, cache frameworks have Eviction policies (aka Removal policies). These are rules to have items removed from cache according to some logic. Common eviction policies are:

  • Absolute Expiration policy will remove an item from cache after a fixed amount of time, no matter what.
  • Sliding Expiration policy will remove an item from cache if it wasn’t accessed in a fixed amount of time. So if I set the expiration to 1 minute, the item will stay in cache as long as I use it every 30 seconds. Once I don’t use it for longer than a minute, the item is evicted.
  • Size Limit policy will limit the cache memory size.
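
To make the difference between the two expiration policies concrete, here is a small sketch of the bookkeeping behind them. ExpiringEntry is a hypothetical helper type written for illustration, not part of any library, and the current time is passed in explicitly to keep the logic easy to follow.

```csharp
using System;

// Hypothetical helper for illustration only; not a library type.
public class ExpiringEntry
{
    private readonly DateTime _createdUtc;
    private DateTime _lastAccessUtc;
    private readonly TimeSpan? _absolute;
    private readonly TimeSpan? _sliding;

    public ExpiringEntry(DateTime nowUtc, TimeSpan? absolute, TimeSpan? sliding)
    {
        _createdUtc = nowUtc;
        _lastAccessUtc = nowUtc;
        _absolute = absolute;
        _sliding = sliding;
    }

    // Records an access, which restarts the sliding window only.
    public void Touch(DateTime nowUtc) => _lastAccessUtc = nowUtc;

    public bool IsExpired(DateTime nowUtc)
    {
        // Absolute: measured from creation, regardless of how often the item is used.
        bool absoluteExpired = _absolute.HasValue && nowUtc - _createdUtc >= _absolute.Value;
        // Sliding: measured from the most recent access.
        bool slidingExpired = _sliding.HasValue && nowUtc - _lastAccessUtc >= _sliding.Value;
        return absoluteExpired || slidingExpired;
    }
}
```

An entry with a 1 minute sliding expiration that is touched every 30 seconds never expires on the sliding rule, but an absolute expiration would still remove it once its fixed lifetime is over.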

Now that we know what we need, let’s continue on to better solutions.

Better Solutions

To my great dismay as a blogger, Microsoft already created a wonderful cache implementation. This deprived me of the pleasure of creating a similar implementation myself, but at least I have less work writing this blog post.

I’ll show you Microsoft’s solution, how to effectively use it, and then how to improve it in some scenarios.

System.Runtime.Caching/MemoryCache vs Microsoft.Extensions.Caching.Memory

Microsoft has 2 solutions, in 2 different NuGet packages, for caching. Both are great. As per Microsoft’s recommendation, prefer using Microsoft.Extensions.Caching.Memory because it integrates better with ASP.NET Core. It can be easily injected into ASP.NET Core’s dependency injection mechanism.

Here’s a basic example with Microsoft.Extensions.Caching.Memory:
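
A minimal sketch along those lines, assuming the Microsoft.Extensions.Caching.Memory NuGet package is referenced. The wrapper class name SimpleMemoryCache is an illustrative choice; MemoryCache, MemoryCacheOptions, TryGetValue, and Set come from the package.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Thin wrapper around MemoryCache with the same shape as the naive cache.
public class SimpleMemoryCache<TItem>
{
    private readonly MemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        // Look for the key in the cache first.
        if (!_cache.TryGetValue(key, out TItem cacheEntry))
        {
            // Key not in cache, so create the item and save it.
            cacheEntry = createItem();
            _cache.Set(key, cacheEntry);
        }
        return cacheEntry;
    }
}
```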


This is very similar to my own NaiveCache, so what changed? Well, for one thing, this is a thread-safe implementation. You can safely call this from multiple threads at once.

The second thing is that MemoryCache allows for all the eviction policies we talked about before. Here’s an example:

IMemoryCache with eviction policies:
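
The following sketch shows one way these policies could be wired up, again assuming the Microsoft.Extensions.Caching.Memory package. The class name MemoryCacheWithPolicy is an illustrative choice; the option values match the ones analyzed in the list that follows.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public class MemoryCacheWithPolicy<TItem>
{
    // SizeLimit is unitless; each entry declares its own size below.
    private readonly MemoryCache _cache = new MemoryCache(new MemoryCacheOptions
    {
        SizeLimit = 1024
    });

    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        if (!_cache.TryGetValue(key, out TItem cacheEntry))
        {
            cacheEntry = createItem();

            var cacheEntryOptions = new MemoryCacheEntryOptions()
                // Each entry counts as 1 toward the SizeLimit of 1024.
                .SetSize(1)
                // Priority decides which entries are removed first under size pressure.
                .SetPriority(CacheItemPriority.High)
                // Evict if not accessed for 2 seconds.
                .SetSlidingExpiration(TimeSpan.FromSeconds(2))
                // Evict after 10 seconds no matter what.
                .SetAbsoluteExpiration(TimeSpan.FromSeconds(10));

            _cache.Set(key, cacheEntry, cacheEntryOptions);
        }
        return cacheEntry;
    }
}
```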

Let’s analyze the new additions:

  1. SizeLimit was added in MemoryCacheOptions. This adds a size-based policy to our cache container. Size doesn’t have a unit. Instead, we need to set the size amount on each cache entry. In this case, we set the amount to 1 each time with SetSize(1). This means that the cache is limited to 1024 items.
  2. When we reach the size limit, which cache item should be removed? You can actually set priority with .SetPriority(CacheItemPriority.High). The levels are Low, Normal, High, and NeverRemove.
  3. SetSlidingExpiration(TimeSpan.FromSeconds(2)) was added, which sets sliding expiration to 2 seconds. That means if an item was not accessed in over 2 seconds it will be removed.
  4. SetAbsoluteExpiration(TimeSpan.FromSeconds(10)) was added, which sets absolute expiration to 10 seconds. This means that the item will be evicted within 10 seconds if it wasn’t already.

In addition to the options in the example, you can also set a RegisterPostEvictionCallback delegate, which will be called when an item is evicted.
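
As a rough sketch, registering such a callback could look like this. The EvictionCallbackExample class and CreateOptions method are hypothetical names for illustration, and writing to the console is just a placeholder for whatever you want to do on eviction.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class EvictionCallbackExample
{
    // Builds entry options that log every eviction of the entry they are attached to.
    public static MemoryCacheEntryOptions CreateOptions()
    {
        return new MemoryCacheEntryOptions()
            .SetAbsoluteExpiration(TimeSpan.FromSeconds(10))
            .RegisterPostEvictionCallback((key, value, reason, state) =>
            {
                // reason is an EvictionReason such as Expired, Removed, or Capacity.
                Console.WriteLine($"Evicted '{key}' because: {reason}");
            });
    }
}
```

The returned options can be passed as the third argument to Set when storing an item.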

That’s a pretty comprehensive feature set. It makes you wonder if there’s even anything else to add. There are actually a couple of things.

Problems and Missing features

There are a couple of important missing pieces in this implementation.

  1. While you can set the size limit, the cache doesn’t actually monitor GC pressure. If we did monitor it, we could tighten policies when the pressure is high and loosen them when the pressure is low.
  2. When requesting the same item with multiple threads at the same time, the requests don’t wait for the first one to finish. The item will be created multiple times. For example, let’s say we are caching the Avatar, and getting an avatar from the database takes 10 seconds. If we request an avatar 2 seconds after the first request, it will check if the avatar is cached (it isn’t yet), and start another trip to the database.

As for the first problem of GC pressure: it’s possible to monitor GC pressure with several techniques and heuristics. This blog post is not about that, but you can read my article Find, Fix, and Avoid Memory Leaks in C# .NET: 8 Best Practices to learn some helpful methods.

The second problem is easier to solve. In fact, here’s an implementation of MemoryCache that solves it entirely:
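
A sketch of such a cache, assuming the Microsoft.Extensions.Caching.Memory package and using the names this post refers to (WaitToFinishMemoryCache, _cache, _locks, cacheEntry). The GetOrCreate signature taking an async factory is an assumption.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public class WaitToFinishMemoryCache<TItem>
{
    private readonly MemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

    // One semaphore per key, so creating one item does not block other keys.
    private readonly ConcurrentDictionary<object, SemaphoreSlim> _locks =
        new ConcurrentDictionary<object, SemaphoreSlim>();

    public async Task<TItem> GetOrCreate(object key, Func<Task<TItem>> createItem)
    {
        TItem cacheEntry;

        // First check, outside the lock: the fast path for items already cached.
        if (!_cache.TryGetValue(key, out cacheEntry))
        {
            SemaphoreSlim mylock = _locks.GetOrAdd(key, k => new SemaphoreSlim(1, 1));

            await mylock.WaitAsync();
            try
            {
                // Second check, inside the lock: another caller may have
                // created the item while we were waiting.
                if (!_cache.TryGetValue(key, out cacheEntry))
                {
                    cacheEntry = await createItem();
                    _cache.Set(key, cacheEntry);
                }
            }
            finally
            {
                mylock.Release();
            }
        }
        return cacheEntry;
    }
}
```

Note that this sketch never removes semaphores from _locks, so the dictionary grows with the number of distinct keys.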


With this, when trying to get an item, if the same item is in the middle of being created by another thread, you will wait for the other to finish first. Then, you will get the already cached item created by the other thread.

Explanation of the code

This implementation locks the creation of an item. The lock is specific to the key. For example, if we’re waiting to get Alex’s Avatar, we can still get cached values of John or Sarah on another thread.

The dictionary _locks stores all the locks. Regular locks don’t work with async/await, so we need to use SemaphoreSlim.

There are 2 checks to see if the value is already cached: if (!_cache.TryGetValue(key, out cacheEntry)). The one inside the lock is the one that ensures there’s a single creation. The one outside of the lock is an optimization that lets already cached items skip the lock entirely.

When to use WaitToFinishMemoryCache

This implementation obviously has some overhead. Let’s consider when it’s even necessary.

Use WaitToFinishMemoryCache when:

  • Creating an item has some sort of cost, and you want to minimize creations as much as possible.
  • The creation time of an item is very long.
  • The creation of an item has to be done exactly once per key.

Don’t use WaitToFinishMemoryCache when:

  • There’s no danger of multiple threads accessing the same cache item.
  • You don’t mind creating the item more than once. For example, if one extra trip to the database won’t change much.


Caching is a very powerful pattern. It’s also dangerous and has its own complexities. Cache too much and you can cause GC pressure. Cache too little and you can cause performance issues. Then, there’s distributed caching, which is a whole new world to explore. That’s software development for you, always something new to learn.

I hope you enjoyed this post. If you’re interested in memory management, my next article is going to be about the dangers of GC pressure and techniques to prevent it, so keep following. Happy coding.
