Entity Framework Core and CosmosDb Provider

I've been wanting to play around with the Cosmos DB provider for a while, so when EF Core 3 was released I decided it was time to take a look.

I use Azure Cosmos DB for a few projects and have always found the SDK clunky, although the new version 3 SDK looks much better.  I also use Entity Framework Core in many of my projects and really like its simplicity, so I'm excited and hopeful about the new provider.

The Cosmos DB provider should help remove a lot of the boilerplate code we have to write when using the SDK.  It lets us use Entity Framework, and all the good things it provides, with Cosmos DB for persistence.

For this post I'm assuming you have some knowledge of Cosmos DB and Entity Framework.  If not, the Microsoft docs are a great introduction to both:
Entity Framework Core Docs
Cosmos DB Docs

So let's get started and have a play around with it...

Cosmos DB

I'm going to use a Cosmos instance that I've already created in Azure.  If you don't have an Azure account you can create one for free to give Cosmos a try, or alternatively you can install and use the Cosmos DB Emulator.

NOTE: The emulator currently only works on Windows

Project setup

You can view this post's sample in the following GitHub repo:
CraigMellon/efcore-cosmosdb

You will need to download and install the .NET Core 3.0 SDK, which can be found here: https://dotnet.microsoft.com/download/dotnet-core/3.0

Let's create a new empty console app using the dotnet CLI and add the Entity Framework Core and Cosmos provider NuGet packages.

dotnet new console

Install the Microsoft.EntityFrameworkCore.Cosmos NuGet package

dotnet add package Microsoft.EntityFrameworkCore.Cosmos

Adding the Cosmos NuGet package will also add the Entity Framework Core package as a dependency.

Let's get started

For the first experiment we will use a simple Job model with an embedded Address type:

    public class Job
    {
        public Guid Id { get; set; }
        public Address Address { get; set; }
    }

    public class Address
    {
        public string Line1 { get; set; }
        public string Line2 { get; set; }
        public string Line3 { get; set; }
        public string Town { get; set; }
        public string PostCode { get; set; }
    }

We need to create a DbContext and configure it to use the Cosmos provider.  Add a new JobContext file with the following code:

    using Microsoft.EntityFrameworkCore;

    public class JobContext : DbContext
    {
        public DbSet<Job> Jobs { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            // Account endpoint, account key, database name
            optionsBuilder.UseCosmos(
                "*YOUR-COSMOSDB-ENDPOINT*",
                "*YOUR-KEY*",
                "EFCoreTest"
            );
        }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            // Embed the Address inside the Job document
            modelBuilder.Entity<Job>().OwnsOne(j => j.Address);
        }
    }

For simplicity we have hard-coded the endpoint and key into the DbContext.  I wouldn't recommend doing this in a real application.
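
One way to avoid hard-coding these values is to read them from environment variables.  The following is a minimal sketch of that idea, not the only option: the CosmosSettings helper and the variable names COSMOS_ENDPOINT and COSMOS_KEY are made up for this example, and Job is the model class defined earlier.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical helper: reads the connection details from environment variables.
// The variable names COSMOS_ENDPOINT and COSMOS_KEY are assumptions for this sketch.
public static class CosmosSettings
{
    public static string Endpoint => Require("COSMOS_ENDPOINT");
    public static string Key => Require("COSMOS_KEY");

    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            // Fail early with a clear message instead of a confusing connection error
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }
        return value;
    }
}

public class JobContext : DbContext
{
    public DbSet<Job> Jobs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Same UseCosmos overload as before, with the values supplied from outside the code
        optionsBuilder.UseCosmos(CosmosSettings.Endpoint, CosmosSettings.Key, "EFCoreTest");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Job>().OwnsOne(j => j.Address);
    }
}
```

With this in place the key never needs to be committed to source control; you set the two variables in your shell or hosting environment before running the app.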

We are overriding the OnModelCreating method to configure the Address as an owned entity of Job.  This ensures that the Address entity is embedded in the Job document when it is saved.

Now that we have the DbContext and models set up, let's update Program.cs to create a Job and save it into our Cosmos database using EF Core:

        Console.WriteLine("Welcome to the EFCore Cosmos DB Provider...");

        var job = new Job
        {
            Id = Guid.NewGuid(),
            Address = new Address
            {
                Line1 = "21 Some Street",
                Line2 = "Somewhere",
                Town = "Birmingham",
                PostCode = "B90 4SS",
            }
        };

        using (var context = new JobContext())
        {
            context.Database.EnsureCreated();

            context.Add(job);

            context.SaveChanges();
        }

        using (var context = new JobContext())
        {
            // Print the values read back from Cosmos, not the in-memory instance we created
            var loadedJob = context.Jobs.First();
            Console.WriteLine($"Job created and retrieved with address: {loadedJob.Address.Line1}, {loadedJob.Address.PostCode}");
        }

The call to context.Database.EnsureCreated() will create the database and collection in your Cosmos account if they don't already exist.

If we run this code it will add a document into our Cosmos DB, and if we take a look in the Azure Portal we can see that a new collection named JobContext has been created and a new document has been added.

efcore cosmos JobContext default collection name

By default the Cosmos provider will use the name of your context as the collection name.  This can be overridden in the OnModelCreating method of your DbContext.

Let's set the default name of the collection to Jobs, re-run our app and take a look at our database again in the portal.  Add the following line to the top of the OnModelCreating method in our JobContext.cs:

modelBuilder.HasDefaultContainer("Jobs");
efcore cosmos job collection name set with HasDefaultContainer
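
For reference, here is a sketch of how the whole context might look with that line in place.  The commented-out ToContainer call is an extra not used in this post: as far as I know the provider offers it for mapping a single entity type to its own collection, so treat that line as an assumption to check against the provider docs.  Job is the model class defined earlier.

```csharp
using Microsoft.EntityFrameworkCore;

public class JobContext : DbContext
{
    public DbSet<Job> Jobs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseCosmos("*YOUR-COSMOSDB-ENDPOINT*", "*YOUR-KEY*", "EFCoreTest");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Every entity type goes into the "Jobs" collection unless told otherwise
        modelBuilder.HasDefaultContainer("Jobs");

        // Assumed alternative: map one entity type to a specific collection
        // modelBuilder.Entity<Job>().ToContainer("Jobs");

        modelBuilder.Entity<Job>().OwnsOne(j => j.Address);
    }
}
```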

Embedded collections

As well as embedding single entities within your document, you can also embed collections, which are stored as an array in the document.  To test this out we will add a new Contact entity and add a collection of contacts to our Job.

    public class Job
    {
        public Guid Id { get; set; }
        public Address Address { get; set; }

        public List<Contact> Contacts { get; set; }
    }

    public class Contact
    {
        public string Title { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string TelephoneNumber { get; set; }
    }

Also update the JobContext OnModelCreating method to configure the Contacts on Job as an owned collection using the OwnsMany method:

        modelBuilder.Entity<Job>().OwnsMany(j => j.Contacts);

Now update Program.cs to add a list of contacts to our job.

We now have multiple documents in our collection, so the call to context.Jobs.First() is no longer guaranteed to retrieve our new job.  Instead we will update it to fetch the Job with the Id we just created:

Console.WriteLine("Welcome to the EFCore Cosmos DB Provider...");

var job = new Job
{
    Id = Guid.NewGuid(),
    Address = new Address
    {
        Line1 = "21 Some Street",
        Line2 = "Somewhere",
        Town = "Birmingham",
        PostCode = "B90 4SS",
    },
    Contacts = new List<Contact>()
    {
        new Contact { Title = "Mr", FirstName = "Craig", LastName = "Mellon", TelephoneNumber = "34441234" },
        new Contact { Title = "Mrs", FirstName = "Cara", LastName = "Mellon", TelephoneNumber = "53665554" }
    }
};

using (var context = new JobContext())
{
    context.Database.EnsureCreated();

    context.Add(job);

    context.SaveChanges();
}

using (var context = new JobContext())
{
    // Fetch the job we just saved by its Id and print the values read back from Cosmos
    var loadedJob = context.Jobs.First(x => x.Id == job.Id);
    Console.WriteLine($"Job created and retrieved with address: {loadedJob.Address.Line1}, {loadedJob.Address.PostCode}");
    Console.WriteLine($"  Contacts ({loadedJob.Contacts.Count})");
    loadedJob.Contacts.ForEach(x =>
    {
        Console.WriteLine($"    Name: {x.FirstName} {x.LastName}");
    });
}

If we now have a look in the portal we can see the Contacts have been embedded into our Job as an array.

efcore cosmos contact embedded in job document

Multiple models in one collection

The Cosmos provider supports storing multiple entity types in the same collection.  To do this EF Core adds a field named Discriminator to our documents. If we take a look at the last document in the database we can see the objects each have a Discriminator field.

efcore cosmos model discriminator highlighted

Also notice the id field, which is unique for each item in the Cosmos collection/partition.  EF Core generates the value by concatenating the discriminator and the primary key values, using '|' as a delimiter.
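
To make that shape concrete, here is a tiny illustration of the concatenation.  This is my own sketch and not the provider's code; CosmosIdSketch and ComposeId are made-up names.

```csharp
using System;

public static class CosmosIdSketch
{
    // Illustrative only: mirrors the "Discriminator|key" shape of the generated id.
    // This is not how the provider is implemented internally.
    public static string ComposeId(string discriminator, Guid key)
    {
        return string.Join("|", discriminator, key);
    }
}
```

For example, CosmosIdSketch.ComposeId("Job", Guid.Empty) returns "Job|00000000-0000-0000-0000-000000000000".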

Linked Entities/Documents

As mentioned above, EF Core can store multiple entities within the same collection.  To see how this works let's create a new entity called Resource and add it as a linked entity to our Job.

First create a new model for our Resource and add a new property to our Job named AssignedResource with a type of Resource.  We will also add AssignedResourceId, which gives us access to the Id of the resource linked to the Job.  If we don't add this property, EF Core will automatically add it under the covers, but we won't have easy access to it.

public class Job
{
    public Guid Id { get; set; }
    public Address Address { get; set; }

    public List<Contact> Contacts { get; set; }

    public Guid AssignedResourceId { get; set; }
    public Resource AssignedResource { get; set; }
}

public class Resource
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string TelephoneNumber { get; set; }
}

Now we need to update our JobContext to configure the Resource as a linked entity using the HasOne method on the model builder.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    ...

    modelBuilder.Entity<Job>().HasOne(j => j.AssignedResource);
}

Update Program.cs to create a resource along with the job:

Console.WriteLine("Welcome to the EFCore Cosmos DB Provider...");

var job = new Job
{
    Id = Guid.NewGuid(),
    Address = new Address
    {
        Line1 = "21 Some Street",
        Line2 = "Somewhere",
        Town = "Birmingham",
        PostCode = "B90 4SS",
    },
    Contacts = new List<Contact>()
    {
        new Contact { Title = "Mr", FirstName = "Craig", LastName = "Mellon", TelephoneNumber = "34441234" },
        new Contact { Title = "Mrs", FirstName = "Cara", LastName = "Mellon", TelephoneNumber = "53665554" }
    },
    AssignedResource = new Resource
    {
        Id = Guid.NewGuid(),
        Title = "Mr",
        FirstName = "Bob",
        LastName = "Builder",
        TelephoneNumber = "0800 1234567"
    }
};

using (var context = new JobContext())
{
    context.Database.EnsureCreated();

    context.Add(job);

    context.SaveChanges();
}

If we run this code and take a look in the portal we can see that two documents have been created, one for our Job and one for our Resource.  Also notice that EF Core has set the AssignedResourceId field in our Job to the id of the new Resource document:

efcore cosmos job with separate resource document

Loading linked entities

Normally, to fetch our Resource along with the Job we would use the Include() method when fetching the Job.  In the relational world EF Core would translate this to a join on the backend and return the Job with the Resource populated.  If we try to do this with the Cosmos provider, EF Core throws an exception informing us that joins are not yet supported.

The current version of the Cosmos provider does not support joining entities, so we can't retrieve a Job with its Resource in one go.  This feature is on the roadmap and hopefully it will drop into the 3.1 release.
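
If you want to see the failure for yourself, the attempt looks something like the sketch below.  IncludeExperiment and TryInclude are made-up names, JobContext and Job are the classes from this post, and I catch Exception broadly because I haven't pinned down the exact exception type here.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public static class IncludeExperiment
{
    // Tries to eager-load the linked Resource together with the Job.
    public static void TryInclude(Guid jobId)
    {
        using (var context = new JobContext())
        {
            try
            {
                var loadedJob = context.Jobs
                    .Include(j => j.AssignedResource)
                    .First(j => j.Id == jobId);

                Console.WriteLine($"Loaded resource: {loadedJob.AssignedResource?.FirstName}");
            }
            catch (Exception ex)
            {
                // With the provider version used in this post we expect to end up here
                Console.WriteLine($"Include failed: {ex.GetType().Name}: {ex.Message}");
            }
        }
    }
}
```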

So how can we get the Resource populated on the Job when fetching from the database?

We could load the Job first and then load the Resource afterwards, as we have the AssignedResourceId on the Job.  Let's try that out and see how it would work.

First we need to add a DbSet for the Resource to our DbContext:

    public DbSet<Resource> Resources { get; set; }

Next let's update the loading portion of our code in Program.cs to fetch the resource and assign it to the job, then finally let's output the resource name to the console:

using (var context = new JobContext())
{
    var loadedJob = context.Jobs.First(x => x.Id == job.Id);

    // now load the resource and assign it to the Job
    var resource = context.Resources.First(x => x.Id == loadedJob.AssignedResourceId);
    loadedJob.AssignedResource = resource;

    // Print the values read back from Cosmos, not the in-memory instance we created
    Console.WriteLine($"Job created and retrieved with address: {loadedJob.Address.Line1}, {loadedJob.Address.PostCode}");
    Console.WriteLine($"  Contacts ({loadedJob.Contacts.Count})");
    loadedJob.Contacts.ForEach(x =>
    {
        Console.WriteLine($"    Name: {x.FirstName} {x.LastName}");
    });

    Console.WriteLine($"  Assigned Resource: {loadedJob.AssignedResource?.FirstName} {loadedJob.AssignedResource?.LastName}");
}

If your code ran successfully you should see output similar to the following:

Job created and retrieved with address: 21 Some Street, B90 4SS
  Contacts (2)
    Name: Craig Mellon
    Name: Cara Mellon
  Assigned Resource: Bob Builder

Although this works, it's not the cleanest code.  Hopefully we will be able to include linked resources in the next version of the Cosmos provider.
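
Until then, one way to keep the calling code tidy is to wrap the two queries in a small helper.  This is only a sketch: JobQueries and LoadJobWithResource are names I've made up, and it assumes the JobContext now exposes the Resources DbSet added above.

```csharp
using System;
using System.Linq;

public static class JobQueries
{
    // Loads a Job and then its linked Resource with a second query.
    // Returns null when no Job with the given id exists.
    public static Job LoadJobWithResource(JobContext context, Guid jobId)
    {
        var job = context.Jobs.FirstOrDefault(j => j.Id == jobId);
        if (job == null)
        {
            return null;
        }

        // Second round trip: fetch the Resource the Job points at
        job.AssignedResource = context.Resources
            .FirstOrDefault(r => r.Id == job.AssignedResourceId);

        return job;
    }
}
```

The loading code in Program.cs then shrinks to a single call such as JobQueries.LoadJobWithResource(context, job.Id), at the cost of still making two round trips to Cosmos.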

Loading multiple linked entities

Another scenario I thought about is when we have multiple jobs all linked to the same Resource.  How could we load them efficiently?

When we fetch an entity with EF Core it begins tracking that entity in the DbContext.  If we know the Resource beforehand, maybe we could pre-load it into the DbContext, and then when we load a Job that is linked to the Resource, hopefully EF Core will automatically point AssignedResource at the entity already in the DbContext.

To test this option I will perform the following steps:

  • Create a Resource entity
  • Create 2 Job entities
  • Assign the Resource to the 2 jobs
  • Save to the database
  • Create a new DbContext instance (this will ensure we have a clean DbContext with no tracked entities)
  • Load the Resource
  • Load the 2 Job entities
  • Check to see if the AssignedResource has automatically been set by EF Core using the pre-fetched Resource

The great news is that this seems to work.  If the linked entity is already being tracked by EF Core, then when you fetch the entities that link to it, EF Core will automatically wire up the references for you:

var resourceId = Guid.NewGuid();

var resource = new Resource
{
    Id = resourceId,
    Title = "Mr",
    FirstName = "Bob",
    LastName = "Builder",
    TelephoneNumber = "0800 1234567"
};

var job1 = new Job
{
    Id = Guid.NewGuid(),
    Address = new Address
    {
        Line1 = "Job 1 Address"
    },
    AssignedResource = resource
};

var job2 = new Job
{
    Id = Guid.NewGuid(),
    Address = new Address
    {
        Line1 = "Job 2 Address"
    },
    AssignedResource = resource
};

using (var context = new JobContext())
{
    context.Database.EnsureCreated();

    context.Add(job1);
    context.Add(job2);

    context.SaveChanges();
}

using (var context = new JobContext())
{
    // Pre-load the resource so it is tracked by this DbContext before the jobs are loaded
    var loadedResource = context.Resources.First(x => x.Id == resourceId);

    // Load all jobs with the same assigned resource id
    var jobs = context.Jobs.Where(x => x.AssignedResourceId == resourceId).ToList();

    jobs.ForEach(job =>
    {
        Console.WriteLine($"Job: {job.Id} - Resource: {job.AssignedResource?.FirstName} {job.AssignedResource?.LastName}");
    });
}
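
If you would rather check this in code than by reading the console output, a reference comparison does the job.  The sketch below uses made-up names (FixupCheck, AllJobsShareTrackedResource) and assumes the JobContext, Job and Resource classes from this post.

```csharp
using System;
using System.Linq;

public static class FixupCheck
{
    // Returns true when every loaded Job points at the very same Resource
    // instance that was pre-loaded into the DbContext.
    public static bool AllJobsShareTrackedResource(JobContext context, Guid resourceId)
    {
        // Pre-load the resource so the context is tracking it
        var loadedResource = context.Resources.First(r => r.Id == resourceId);

        var jobs = context.Jobs
            .Where(j => j.AssignedResourceId == resourceId)
            .ToList();

        // Guard against an empty result, where All() would trivially return true
        return jobs.Count > 0
            && jobs.All(j => ReferenceEquals(j.AssignedResource, loadedResource));
    }
}
```

Call it with a fresh JobContext after the save step; a result of true means EF Core reused the tracked Resource rather than leaving AssignedResource null.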

Summary

Overall I'm liking the new Cosmos provider for EF Core.  It significantly simplifies creating and querying your documents in Cosmos.

Granted, the scenarios above are simple, and I still have many more advanced scenarios I'd like to try (maybe in another post).

The current limitations of the Cosmos provider are documented here:
Cosmos Provider limitations
Cosmos provider feature backlog