I'm very puzzled by a specific part of the repository pattern that seems simple but turns out to be tricky.
I took this great explanation of the topic by Mosh Hamedani; it's a C# implementation of the repository pattern with Entity Framework. https://www.youtube.com/watch?v=rtXpYpZdOzM&feature=youtu.be The key point is: a repository should return domain objects. Not IQueryable and not other types of objects. Decoupling, fair enough.
An example of a method of a repository that Mosh proposes in that video is this:

    public IEnumerable<Course> GetCoursesWithAuthors(int pageIndex, int pageSize)
    {
        return PlutoContext.Courses
            .Include(c => c.Author)
            .OrderBy(c => c.Name)
            .Skip((pageIndex - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }

So we can see that there is complete separation between repository and the upper levels; they don't have to know about the Context (provided that Repository gets its own copy at the construction). The upper levels can query the repository as if it was an in-memory collection: it returns plain IEnumerables.
Everything seems to be decoupled... but one thing. The Domain Object itself, in this case: Course.
The Course is the domain object and is also the object used by the repository as EF Entity.
Actually, the repository pattern definition doesn't mention how the repository should build its entities, so one can say that it's fair for the repository to 'reuse' the domain object; we can also say that this is just another way of implementing it, and any working implementation is OK since it's encapsulated. That is, the repository's client doesn't know the details of how a domain object is built. Nevertheless, the choice of using the very same class and instance as both domain object and entity has non-obvious consequences. I can see the following:

  1. this way you can leverage the EF ability to track changes on entities to detect changes to domain objects. That would be quite challenging to implement by yourself.
  2. the fact that the 'Course' class is both an entity and a domain object is something that you have to be aware of when you implement additional methods, let's say validation code or code related to the domain logic. This seems to me a violation of SRP, which practically means that if the aims of the "two classes merged into one" happen to diverge, you've got a problem.
  3. this approach saves you from copying values from DAO to domain object and vice versa

The important point here is the first; if it's true, using entities as domain objects is more a necessity than a choice. So even if you wanted to, implementing a domain object 'Course' and a data access object (or entity) 'CourseDAO' would be quite challenging. And this seems to pose an obstacle to the achievement of a full decoupling. At the end of the day and after all this work, full decoupling seems always to be a layer away!
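
To make the trade-off concrete, here is a minimal sketch of what the "two classes" alternative might look like. The names CourseDAO, CourseMapper, Rename and RowVersion are assumptions made up for illustration; none of them come from the video.

```csharp
using System;

// Domain object: holds behaviour and invariants, knows nothing about persistence.
public sealed class Course
{
    public Course(int id, string name, string authorName)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("A course needs a name.", nameof(name));
        Id = id;
        Name = name;
        AuthorName = authorName;
    }

    public int Id { get; }
    public string Name { get; private set; }
    public string AuthorName { get; }

    public void Rename(string newName)
    {
        if (string.IsNullOrWhiteSpace(newName))
            throw new ArgumentException("A course needs a name.", nameof(newName));
        Name = newName;
    }
}

// Hypothetical persistence shape: the class an ORM would map to a table.
public sealed class CourseDAO
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string AuthorName { get; set; } = "";
    public byte[] RowVersion { get; set; } = Array.Empty<byte>(); // persistence-only concern
}

public static class CourseMapper
{
    public static Course ToDomain(CourseDAO dao) =>
        new Course(dao.Id, dao.Name, dao.AuthorName);

    // The domain object is not tracked by the ORM, so its changes only reach
    // the DAO through this explicit copy (points 1 and 3 above).
    public static void CopyTo(Course course, CourseDAO dao)
    {
        dao.Name = course.Name;
        dao.AuthorName = course.AuthorName;
    }
}
```

In this sketch, renaming a Course leaves the CourseDAO untouched until CourseMapper.CopyTo is called, which is exactly the bookkeeping that sharing one class between domain and EF avoids.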

These are my conclusions after having read different tutorials (especially Mosh's), so I'm not yet sure whether I got the right idea, or whether I instead spotted a 'gray area' in the way the repository pattern is implemented in those examples.

  • The most significant problem you are going to encounter with Entity Framework if you intend to take the hard stance and provide separate objects in another layer is that you are going to lose change tracking. This has an impact on performance and greatly increases the difficulty and complexity of using Entity Framework as a Data Access Layer.
    Commented Jul 8, 2020 at 0:13
  • @RobertHarvey: You can actually get around the loss of change tracking if you use interfaces instead of objects, and both the domain object and entity implement the interface. E.g. if I pass a Domain.Foo : IFoo to my DAL (as an IFoo) it gets converted to a DAL.Foo : IFoo, but my DAL returns DAL.Foo (as an IFoo). If I then take something from the DAL and then hand it back to the DAL, it's actually still using the same tracked DAL.Foo object. This does come with some additional mapping logic and requires some extra effort, but it works.
    – Flater
    Commented Jul 8, 2020 at 7:43

2 Answers

This pattern is fine as a repository, because the whole point is that the data layer is separated, so you can use EF or whatever you like internally.

However, in order to keep EF attributes out of your business class, you'll have to write a mapping layer for any complexities of your objects, such as keys and relationships that don't follow the default conventions.

Once you do this, you might as well not use EF at all, as you have thrown out most of its extra features whilst still retaining its... idiosyncrasies.

Instead, use SqlClient and SQL, and map data rows to your business objects. Slightly more typing, but you'll never have to google why EF isn't working as expected ever again.
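
A rough sketch of the row-mapping half of that idea follows. It assumes the query result has already been loaded into a DataTable (for example by a data adapter), and the column names "Id", "Name" and "AuthorName" as well as the CourseRowMapper class are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Plain business object with no persistence attributes on it.
public sealed class Course
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string AuthorName { get; set; } = "";
}

public static class CourseRowMapper
{
    // Maps a single row to a business object.
    // The column names are assumptions for this sketch.
    public static Course FromRow(DataRow row) => new Course
    {
        Id = (int)row["Id"],
        Name = (string)row["Name"],
        AuthorName = (string)row["AuthorName"],
    };

    // Maps every row of the table, preserving row order.
    public static List<Course> FromTable(DataTable table) =>
        table.Rows.Cast<DataRow>().Select(FromRow).ToList();
}
```

The mapping is explicit and boring, which is the point: any mismatch between a column and a property shows up in one obvious place.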

    Repositories and EF

    This is a topic that a lot can be said about, but I want to keep it short here because I want to focus on better solutions.

    The short answer is that the repository pattern doesn't mesh well with Entity Framework, because EF itself is already an implementation of the repository pattern. The DbContext is a unit of work. The DbSet<T> is a repository.

    Trying to abstract the same abstraction leads to reinventing the wheel a second time, and if you want to use all of EF's features, it requires you to build that second wheel so that it can wrap and handle every EF feature, which is sheer madness.

    To be fair, past versions of EF were lacking some features that meant you couldn't quite separate your database from your domain, but EF Core ticks all necessary boxes to provide a separation.

    I strongly suggest listening to Jason Taylor's explanation; he expresses the underlying reasoning better than I am able to. The explanation takes just over 3 minutes, and I've bookmarked the video at the right starting point.

    If you prefer a written explanation, this article sums it up well.

    Anecdotally, I have met and worked with several developers (who I think of as good developers) who are entrenched in their "EF must be abstracted" position. It's not hard to find blog posts that still strongly advocate for it. I think this is an outdated argument that applied to past versions of EF, but is no longer relevant.

    My general take on it is that "it's called Entity Framework, not Entity Library", i.e. it's not worth the effort to fully abstract it. If you want to know more, I wrote a much longer answer on this in the past, specifically the "But Entity Framework is a framework!" part of the answer.

    I want to use the rest of this answer to explore better ways of dealing with this data/domain segregation.


    From experience, not everyone intuitively agrees on the direction in which an arrow should be drawn. In my answer, I use A -> B to mean that B has a dependency on A.


    Dependencies can be inverted

    You point out several issues, all of which suggest that you are thinking of your dependency graph as a linear sequence:

     Web <- Domain <- Data
     <-------------------------- APPLICATION FLOW

    Generally speaking, the order of the dependencies follows the flow of the application. The web controller speaks to the domain and the domain speaks to the DAL, so this is a linear sequence, right?

    While that seems to be unavoidable based on which layer depends on which, the dependency between Domain and Data isn't as strictly defined as it would first seem. To cut a long story short, this is dependency inversion, but I'll use an example here to elaborate.

    Forget the example and think of two classes: A and MoreThanA. As the name suggests, MoreThanA contains everything A contains and then some more. This suggests the following dependency graph:

    MoreThanA <- A 

    Whether this is a layer dependency or class dependency is irrelevant for the point I'm trying to make. The focus is on the direction of the arrow.

    But wait... DAL entities are usually more complex than the domain objects they represent, because there may be persistence-specific properties that are being stored which the domain doesn't care about and doesn't work with, since the domain shouldn't be tightly coupled to the persistence implementation logic.

    So that suggests the following class dependency graph:

    MyDomainObject -> MyEntity 

    Which in turn suggests the following layer dependency graph:

    Domain -> Data 

    Notice that the arrow is different from the initial example that lies at the base of your question. This is called an inverted dependency, because the arrow is reversed compared to the flow of the algorithm:

     Web <- Domain -> Data
     <-------------------------- APPLICATION FLOW

    When you look at Clean Architecture diagrams, the DAL is often put on the same level as the top-level consumer (e.g. Web project) since both individually depend on the domain.

    [Image: Clean Architecture diagram]

    Note:

    • In this answer, the domain and application layers have been lumped into one big "domain" blob because their distinction isn't the topic at hand.
    • Persistence is the DAL. This is often just lumped into the infrastructure category in other diagrams.
    • In other similar diagrams, "entities" are part of the domain, but beware that in those cases, "entities" means "domain objects".

    Implementing an inverted dependency

    So how do we implement this, on a technical level? There are some key points that are different to how you may have been implementing your original "linear sequence" dependency graph. When using an inverted dependency:

    • The domain defines the interface (e.g. Domain.IFooRepository)
    • The domain's logic (e.g. FooManager) has an injected IFooRepository dependency which it operates on.
    • The DAL implements the interface in e.g. Data.FooRepository. It has to fulfill the contract that the domain demands, which means that the domain forces the DAL to provide all features that the domain uses when interacting with its own Domain.IFooRepository.
    • The top-level project uses a DI container which registers <Domain.IFooRepository, Data.FooRepository>.

    Those are the basic steps to starting with inverted dependencies. For more detailed information, you might want to look up a tutorial online.
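
    As a rough sketch of those steps in a single file: the Foo class, the Rename method and the dictionary-backed store are stand-ins invented for this example, and the two namespaces stand in for what would be two separate projects.

```csharp
#nullable enable
using System.Collections.Generic;

namespace Domain
{
    public sealed class Foo
    {
        public Foo(int id, string name) { Id = id; Name = name; }
        public int Id { get; }
        public string Name { get; }
    }

    // The domain owns the contract it needs from persistence.
    public interface IFooRepository
    {
        Foo? GetById(int id);
        void Save(Foo foo);
    }

    // Domain logic only ever sees the interface.
    public sealed class FooManager
    {
        private readonly IFooRepository _repository;

        public FooManager(IFooRepository repository) => _repository = repository;

        public Foo Rename(int id, string newName)
        {
            var existing = _repository.GetById(id)
                ?? throw new KeyNotFoundException($"No Foo with id {id}.");
            var renamed = new Foo(existing.Id, newName);
            _repository.Save(renamed);
            return renamed;
        }
    }
}

namespace Data
{
    // The DAL references the domain, not the other way around.
    // A dictionary stands in for the real database access here.
    public sealed class FooRepository : Domain.IFooRepository
    {
        private readonly Dictionary<int, Domain.Foo> _store = new Dictionary<int, Domain.Foo>();

        public Domain.Foo? GetById(int id) =>
            _store.TryGetValue(id, out var foo) ? foo : null;

        public void Save(Domain.Foo foo) => _store[foo.Id] = foo;
    }
}
```

    The top-level project would then wire the two together, either by hand with new Domain.FooManager(new Data.FooRepository()) or through whatever registration call its DI container offers. Note that nothing in the Domain namespace mentions Data.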

    Also note that while I use "repository" and "manager" classes, it's highly contested whether you should use the repository pattern in conjunction with EF. Especially when your dependency is inverted, there's actually little reason to do so. Me calling it a "repository" was only done to keep the explanation simple by using recognisable class names.


    Direct feedback

    1. This way you can leverage the EF ability to track changes on entities to detect changes to domain objects. That would be quite challenging to implement by yourself.

    Even when not using inverted dependencies, you can actually get around the loss of change tracking if you use interfaces instead of objects, and both the domain object and entity implement the same interface.

    E.g. if I pass a Domain.Foo : IFoo to my DAL (as an IFoo) it gets converted to a DAL.Foo : IFoo, but my DAL returns DAL.Foo (as an IFoo) to the domain as the result of its queries.

    If I then take something from the DAL and then hand it back to the DAL, it's actually still using the same tracked DAL.Foo object.

    This does come with some additional mapping logic and requires some extra effort, but it works.
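
    A minimal sketch of that idea, without EF itself: the FooStore class and its dictionary are hypothetical stand-ins for a DAL whose entities would be tracked by the ORM.

```csharp
using System.Collections.Generic;

// Shared contract, visible to both layers.
public interface IFoo
{
    int Id { get; }
    string Name { get; set; }
}

namespace Domain
{
    public sealed class Foo : IFoo
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
    }
}

namespace DAL
{
    // Stand-in for the entity class that the ORM would track.
    public sealed class Foo : IFoo
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
    }

    // Hypothetical DAL entry point; the dictionary plays the role of the tracked set.
    public sealed class FooStore
    {
        private readonly Dictionary<int, Foo> _tracked = new Dictionary<int, Foo>();

        // Queries hand out the tracked entity, typed as the interface.
        public IFoo Get(int id) => _tracked[id];

        public IFoo Save(IFoo foo)
        {
            // Something the DAL handed out earlier: still the same tracked instance.
            if (foo is Foo entity)
            {
                _tracked[entity.Id] = entity;
                return entity;
            }

            // A foreign implementation (e.g. Domain.Foo): convert it once.
            var converted = new Foo { Id = foo.Id, Name = foo.Name };
            _tracked[converted.Id] = converted;
            return converted;
        }
    }
}
```

    The conversion branch is the "additional mapping logic" mentioned above; everything that originated in the DAL skips it and keeps its identity.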

    2. The fact that the 'Course' class is both an entity and a domain object is something that you have to be aware of when you implement additional methods, let's say validation code or code related to the domain logic. This seems to me a violation of SRP, which practically means that if the aims of the "two classes merged into one" happen to diverge, you've got a problem.

    Different layers can define their own validation rules.

    If you wish to use the same class in both layers (both as the domain object and the entity), then that object inherently needs to pass both kinds of validations (each executed at an appropriate time in their own layer).

    That's an inevitable consequence of using the same object in two different layers which both demand validation. The alternative is using a separate DTO for each layer, which brings me to your next point:

    3. This approach saves from copying values from dao to domain object and vice versa

    Tools like AutoMapper are a huge help here. You still need to create the two separate classes, but most properties can be automatically mapped using convention-based rules (e.g. properties of the same name and type get picked up automatically).
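
    To illustrate what "convention-based" means, here is a tiny hand-rolled mapper built on reflection. This is not AutoMapper's API, just a sketch of the same-name-and-type rule; ConventionMapper, CourseDAO and RowVersion are names made up for this example.

```csharp
using System.Linq;
using System.Reflection;

public static class ConventionMapper
{
    // Copies every readable source property onto a publicly writable target
    // property with the same name and the same type. Everything else is skipped.
    public static TTarget Map<TSource, TTarget>(TSource source)
        where TSource : class
        where TTarget : class, new()
    {
        var target = new TTarget();
        var targetProps = typeof(TTarget)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetSetMethod() != null)
            .ToDictionary(p => p.Name);

        foreach (var sourceProp in typeof(TSource).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (!sourceProp.CanRead || sourceProp.GetIndexParameters().Length > 0) continue;

            if (targetProps.TryGetValue(sourceProp.Name, out var targetProp)
                && targetProp.PropertyType == sourceProp.PropertyType)
            {
                targetProp.SetValue(target, sourceProp.GetValue(source));
            }
        }
        return target;
    }
}

// Hypothetical entity with one persistence-only property.
public sealed class CourseDAO
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public byte[] RowVersion { get; set; } = new byte[0];
}

// Hypothetical domain object: no RowVersion, so that property is simply ignored.
public sealed class Course
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}
```

    Calling ConventionMapper.Map<CourseDAO, Course>(dao) copies Id and Name and leaves RowVersion behind. A real mapping library adds configuration for the cases where names or types differ, which this sketch deliberately omits.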

    • "but I want to keep it short".....
      – Ewan
      Commented Jul 8, 2020 at 15:51
    • @Ewan: I did keep the repositories+EF part short, so I could move on to discussing alternative solutions. The link to my earlier answer contains an elaborate analysis of the issues encountered when implementing the repository pattern on top of EF. This answer kept it much shorter on that topic, wouldn't you say?
      – Flater
      Commented Jul 9, 2020 at 12:55
