1

We have a very messy data repository component, with dozens of methods to interface a DbContext (entire database) in Entity Framework.

It was (and is) coded in a way that adds a new repo method for each table combination a query needs to run on/return.

For example:

  1. GetEmployees(), returning something like DbContext.Employees.Where(...)
  2. GetEmployeesWithDepartments, returning DbContext.Employees.Include(x => x.Deparments).Where(...)
  3. GetEmployeesWithDeparmentsWithManagers, returning DbContext.Employees.Include(x => x.Departments).Include(x => x.Managers).Where(...)

something like that.

I am now working on refactoring that, and proposed to unify all such repo methods - into one around the same main table. In the example above, those 3 methods would give place to a single one looking like

GetEmployees(bool include includeDepartments = false, bool include includeManager = false) { var query = dbContext .Employees(); if (includeDepartments) query.Include(x => x.Departments); if (includeManagers) query.Include(x => x.Managers); //apply query predicate and return... } 

Someone said this conflicts with S in SOLID. I don't see how (the responsibility is one - querying a specific db main table - with the convenience of allowing to return additional stuff). The proposed above has the obvious advantage that, when refactoring in the future - we will only need to touch a single place.

Am I wrong in principle ? Are there other motifs my suggestion is not a good idea ?

Looks like MSDN does encourage the behavior I am proposing, in their Repository best practices (see Implement a Generic Repository and a Unit of Work Class).

5
  • Did they say how it conflicts with the Single Responsibility Principle?CommentedFeb 7, 2023 at 14:11
  • Not yet, I am equipping myself for a meeting on that :). I confronted this, however, like exposed above
    – Veverke
    CommentedFeb 7, 2023 at 14:12
  • Without understanding how they think it violates SRP, there is no way you can defend your position. Frankly, there is no way for them to defend their position either.CommentedFeb 7, 2023 at 14:23
  • As a start, see The Single Responsibility Principle by the person who invented the term.CommentedFeb 7, 2023 at 14:26
  • The purpose of the question was to make sure I am not missing other aspects that make this idea not a good one. I understand you are one that do not see such.
    – Veverke
    CommentedFeb 7, 2023 at 14:26

3 Answers 3

1

I am now working on refactoring that, and proposed to unify all such repo methods

I agree with the feedback you received that this isn't what I'd do. However, ...

Someone said this conflicts with S in SOLID.

I disagree. It violates the O.

The proposed above has the obvious advantage that, when refactoring in the future - we will only need to touch a single place.

Needing to touch the same place for future extensions is precisely what you don't want, as per the OCP. If you add a new table, it forces you to have to change this method to add a parameter (and thus break the contract) for the existing method.

More importantly, and this is not a letter of SOLID but in my opinion an even more pertinent pain point: whatever code handles the result that is returned from this method, it has to run a gauntlet of potential null reference exception for each and any dependent dataset that you did not include in your query.

What I think is happening here is that you're falling into the honeytrap that catches so many (driven and dedicated) developers: eagerly wanting to maximize reusability.

Taking a bit of a detour here, what is your response when you see this?

enter image description here

There's two ways to respond to this:

  • Wow, that's an amazing achievement to have these cars interleave without collisions. I want to be able to do that.
  • Holy hell that's a huge accident waiting to happen. I'm not going near this intersection.

I suspect a lot of developers are drawn to the former, because we are drawn to the elegance of a well designed machine. To some degree, that is good. This is exactly the kind of attitude that built the kinds of complex manufacturing machines we have nowadays.
However, there's a point of overdoing it, where we are drawn like moths to a flame to a complex machine, when it would be significantly easier to create several simple machines and avoid any collisions or ambiguities.

In my opinion, this is one of those cases. The complexity and risk of merging these repository methods seems to outweigh the benefit of doing so. I suspect your interest in building this is biasing your decision on whether it is the right thing to do.

If you stick with the repository pattern, separate methods seems better. Note also that there's nothing wrong with having these public methods depend on some shared private logic for the parts that are in fact reusable across these public methods (e.g. applying the same employee filter in all cases).

I'm personally a fan of the query object approach, where each public repo method is split off into a class of its own, effectively creating "one method repos". This prevents the needless categorization exercise of "what repo should this query live in?", which is IMHO a futile exercise. Instead, I think it's more productive to focus on each query on its own, so that you can build each individual query to fit with its own individual specifications.

Whether you use repositories or query objects is not the main focus here, but I do suggest that you keep the queries separate, whether as public repo methods or individual query objects.

    0

    Not sure which letter it comes under but yes, having parameters that change the function of a method is considered bad practice.

    Generally speaking I blame your problem on entity framework. Essentially you are forced to add child objects to your models, but optionally populating them is a bad idea. They should always be populated.

    In order to make this work in a sensible way you will have to look at your models and decide on your "aggregate roots" ie where you want child objects and where you are going to just include the ids and make a separate call.

    eg, you could have

    Employee { Name DepartmentId ManagerId } 

    and make a separate endpoint:

    GetAllDepartments() 

    Now the calling code can get the employees and be sure that all the data is always included. If they want to show the department data they can make a second call and join the data together on their side with map reduce.

    If they want the list of departments for a drop down selection, the method is already there etc.

    8
    • Possibly, and this would work for a small number, but I'm wary of "client side joins" since the performance can be much worse.
      – pjc50
      CommentedFeb 7, 2023 at 16:17
    • it can also be much better. the trick is to choose carefully what you include. big objects == pain
      – Ewan
      CommentedFeb 7, 2023 at 16:27
    • I don't think Entity Framework specifically is to blame here. Any model that accounts for database relations has to list all relationships somehow, and obviously forcing a connected graph to be downloaded entirely at all times is not a great idea. The issue here is more that the EF entities are being returned publicly to the consumer, which shouldn't happen. E.g. the GetEmployeesWithDepartments method should return a DTO which contains the department as a nested value but not any of the unloaded ones such as the employee's manager. That is unrelated to EF.
      – Flater
      CommentedFeb 7, 2023 at 23:06
    • @Flater: wait, we do return DTOs (well we do not, and this is wrong, but we are returning the POCO object that maps the database table. And it includes whichever nested collections we include when querying).
      – Veverke
      CommentedFeb 8, 2023 at 8:23
    • yeah I disagree with @Flater although you could wrap the entitles it doesn't really make any difference. You just end up with EmployeeLite which is just as bad
      – Ewan
      CommentedFeb 8, 2023 at 15:36
    0

    The SRP is usually concerned with the responsibility of classes.

    When I got this right, your repository originally was a class with multiple public methods, and you just changed the API, so it is now a class with one method and some boolean parameters instead. This did not change any responsibility of the class. Hence, if the SRP wasn't violated beforehand, it cannot be afterwards - your refactoring did not change any responsibility of the repo.

    Moreover, having one repo bundling all kind of queries for a single entity is pretty standard design and unlikely to violate the SRP in the first place. If it is a good idea to allow partial loading of entities or not (which can be sometimes justified for performance reasons), or to have public methods with boolean flags in a repo's API (which may be a code smell, but nothing more) is a completely different question, unrelated to yours.

    2
    • The first paragraph is more or less what I had in mind, but @Flatter at the bottom points out different pertinent problems.
      – Veverke
      CommentedFeb 8, 2023 at 8:01
    • The second paragraph is great, it strikes right at the heart of why this isn't SRP related. I could not verbalize why it wasn't SRP in my answer but you did a great job in explaining it.
      – Flater
      CommentedFeb 8, 2023 at 22:06

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.