Aggregate with a huge list of value objects

Question

I'm currently reading "Implementing Domain-Driven Design" while going through the code samples, but I'm having trouble modeling aggregates that store a huge list of value objects. For example, here's one of the subdomains dealt with in the book (adapted for Rust):

    struct Person {
        id: PersonId,
    }

    struct GroupMember {
        person_id: PersonId,
        joined: DateTime<Utc>,
    }

    struct Group {
        id: GroupId,
        members: HashSet<GroupMember>,
    }

Person and Group here are aggregates, and GroupMember is a value object inside the Group aggregate.

What troubles me is that members here can represent a really big list, so loading them into memory every time can really hurt performance.

I tried to look for prior art on this problem, but I can't find any information on it. I'm not very familiar with Java, but it looks like a lot of DDD examples use some kind of lazy-loading mechanism provided by a Java ORM. I'm using a raw SQL library, so that's not an option for me.

The way I deal with it right now is that I'm extending the Group aggregate's repository to include methods for fetching the GroupMember value object separately:

    trait GroupRepository {
        fn group_member_of_group_id(group_id: GroupId, limit: u32) -> Vec<GroupMember>;
        fn add_group_member(group_id: GroupId, member: GroupMember);
    }

The limit is there so that you can fetch them gradually, without having to load tens of thousands of them straight into memory.
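As an illustration of fetching gradually: a limit on its own cannot say where the next batch starts, so some cursor or offset is needed. The sketch below is one possible shape, not the asker's code. It assumes a hypothetical `after` cursor parameter, replaces `DateTime<Utc>` with a plain Unix timestamp, and uses an in-memory `BTreeMap` as a stand-in for a table ordered by person id.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

// Hypothetical simplified types; the question uses PersonId and DateTime<Utc>.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct PersonId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct GroupMember {
    person_id: PersonId,
    joined_unix_secs: i64,
}

// In-memory stand-in for the members of one group, ordered by person id.
struct InMemoryMembers {
    rows: BTreeMap<PersonId, GroupMember>,
}

impl InMemoryMembers {
    // Returns up to `limit` members whose id is strictly greater than `after`.
    // With raw SQL this would roughly correspond to:
    //   SELECT ... WHERE group_id = ? AND person_id > ?
    //   ORDER BY person_id LIMIT ?
    fn page(&self, after: Option<PersonId>, limit: usize) -> Vec<GroupMember> {
        let lower: Bound<PersonId> = match after {
            Some(id) => Bound::Excluded(id),
            None => Bound::Unbounded,
        };
        let upper: Bound<PersonId> = Bound::Unbounded;
        self.rows
            .range((lower, upper))
            .take(limit)
            .map(|(_, member)| *member)
            .collect()
    }
}
```

The caller passes the last `person_id` of one page as `after` for the next page and stops when a page comes back shorter than `limit`.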

Is this a valid approach? Has a concept like this been explored before in DDD? Some advice would be appreciated.

Keep in mind that aggregates exist to enforce invariants. Say there's a business rule that a group can have no more than 10k members: for that you don't need a list of members, only a count of them. (Comment by Rik D, Jan 16 at 11:15)
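A minimal sketch of the idea in this comment, assuming the 10k cap is the only invariant involved. All names here (`MAX_MEMBERS`, `GroupError`, `register_new_member`) are hypothetical and not from the book or the question.

```rust
// Example cap taken from the comment; a real rule would come from the domain.
const MAX_MEMBERS: u32 = 10_000;

#[derive(Debug, PartialEq, Eq)]
enum GroupError {
    GroupFull,
}

// The aggregate carries only a count, not the member list.
struct Group {
    id: u64,
    member_count: u32,
}

impl Group {
    // Enforces the cap using the count alone. The member row itself would be
    // written separately by the repository, ideally in the same transaction.
    fn register_new_member(&mut self) -> Result<(), GroupError> {
        if self.member_count >= MAX_MEMBERS {
            return Err(GroupError::GroupFull);
        }
        self.member_count += 1;
        Ok(())
    }
}
```

Loading this aggregate costs one row regardless of how many members the group has.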

Ewan · Accepted Answer · 2025-01-16 13:32:26Z

I think this is fine. Obviously it goes against the DDD aggregate principle, but you can still enforce the aggregate rules at the repo level, as it has access to the database. Plus, there is some number of members beyond which it will simply be impossible to have them all in memory at the same time.

You have some alternatives.

1. Put the database connection in the Group aggregate. This will allow lazy loading of members and functions like IsMember(personid).
   The downside is that your model is no longer "pure": you can get into trouble with your dependency references, and you now have networking constraints on where the object can be used.
2. Change the aggregate. Pick something with a smaller list.
   This is only going to work when the problem is your design rather than the inherent business case, though. You have already reduced the group members to an id and a date rather than the full person.
3. Abandon the aggregate in favour of a Domain Service.
   Say, for example, you have a TelephoneDirectory. You can't pass the whole thing around; it's just that big. But you can expose a TelephoneDirectoryService with various AddEntry(), SearchFor(), etc. methods.
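The domain service alternative could look something like the sketch below, translated to the group example. The trait name, method names, and the in-memory implementation are all hypothetical; a real implementation would issue SQL queries instead of using a `HashMap`, and ids and timestamps are simplified to plain integers.

```rust
use std::collections::HashMap;

// Hypothetical simplified ids; the question uses dedicated newtypes.
type GroupId = u64;
type PersonId = u64;

// The service exposes operations instead of handing out the whole member list.
trait GroupMembershipService {
    // Returns false if the person was already a member of the group.
    fn add_member(&mut self, group: GroupId, person: PersonId, joined_unix_secs: i64) -> bool;
    fn is_member(&self, group: GroupId, person: PersonId) -> bool;
}

// In-memory stand-in for a database-backed implementation.
struct InMemoryMembership {
    joined: HashMap<(GroupId, PersonId), i64>,
}

impl GroupMembershipService for InMemoryMembership {
    fn add_member(&mut self, group: GroupId, person: PersonId, joined_unix_secs: i64) -> bool {
        if self.joined.contains_key(&(group, person)) {
            return false;
        }
        self.joined.insert((group, person), joined_unix_secs);
        true
    }

    fn is_member(&self, group: GroupId, person: PersonId) -> bool {
        self.joined.contains_key(&(group, person))
    }
}
```

Callers depend only on the trait, so the domain code never needs the full membership in memory.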

VoiceOfUnreason · Accepted Answer · 2025-01-16 13:12:59Z

What troubles me is that members here can represent a really big list, so loading them into memory every time can really hurt performance.

One possibility is that "group membership" is a collection of information with its own distinct lifecycle that should be treated as an aggregate in and of itself.

The question to examine would be what, if any, constraints you have that require looking at the entire list of members.
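If no rule needs the whole list, one way to read this suggestion is that each membership becomes a small aggregate of its own, loaded and saved one at a time. The following is only a sketch under that assumption. The `MembershipStatus` lifecycle and the `join`/`leave` methods are hypothetical illustrations, and ids and timestamps are simplified to integers.

```rust
// Identity of a single membership: one person in one group.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct MembershipId {
    group_id: u64,
    person_id: u64,
}

// Hypothetical lifecycle states, for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum MembershipStatus {
    Active,
    Left,
}

// Each membership is its own unit of change with its own lifecycle.
struct GroupMembership {
    id: MembershipId,
    joined_unix_secs: i64,
    status: MembershipStatus,
}

impl GroupMembership {
    fn join(group_id: u64, person_id: u64, now_unix_secs: i64) -> Self {
        GroupMembership {
            id: MembershipId { group_id, person_id },
            joined_unix_secs: now_unix_secs,
            status: MembershipStatus::Active,
        }
    }

    // A rule that only needs this one membership, not the whole list.
    fn leave(&mut self) -> Result<(), &'static str> {
        if self.status == MembershipStatus::Left {
            return Err("membership has already ended");
        }
        self.status = MembershipStatus::Left;
        Ok(())
    }
}
```

A repository for this aggregate would load a single row by `MembershipId`, so the size of the group no longer affects the cost of a change.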

The original definition of aggregate (within the context of domain driven design):

An aggregate is a cluster of associated objects that we treat as a unit for the purpose of data changes.

Therefore, lazy loading an aggregate is a little bit weird, insofar as you are considering a data change but not loading the entire unit?

Large complicated aggregates aren't necessarily wrong, but very often indicate that analysis of the problem prior to coding was at best superficial. Often, the analysis was focused on the nouns in the domain, where a more thorough analysis would have been paying attention to the processes through which the information in the system changes.

Note: lazy loading for query support? Knock yourself out; read patterns and write patterns are sufficiently different that we should not be handicapping the performance of our reads by unnecessarily coupling them to the designs of our writes.

gnasher729 · Accepted Answer · 2025-01-18 11:47:23Z

A common way to implement this is for a GroupMember instance to appear only once in persistent storage and at most once as an actual object in memory; the rest of the time it is identified by an id.

So loading a group with a million members only requires loading a million integers which are some time later turned into PersonIds, and a million numbers which are turned some time later into a DateTime object.

The struct GroupMember is an implementation detail that shouldn’t be exposed.
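One possible reading of this answer, sketched with hypothetical names: the group keeps the raw numbers exactly as loaded and converts them into richer types only when a caller asks. `from_rows`, `member_ids`, and `joined_at` are illustrative, and the join time is kept as Unix seconds rather than a `DateTime` to keep the sketch dependency-free.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PersonId(pub u64);

pub struct Group {
    // Raw (person id, joined-at Unix seconds) pairs as loaded from storage.
    // No GroupMember type is exposed to callers.
    raw_members: Vec<(u64, i64)>,
}

impl Group {
    pub fn from_rows(rows: Vec<(u64, i64)>) -> Self {
        Group { raw_members: rows }
    }

    pub fn member_count(&self) -> usize {
        self.raw_members.len()
    }

    // Wraps raw ids in PersonId only as the caller iterates.
    pub fn member_ids(&self) -> impl Iterator<Item = PersonId> + '_ {
        self.raw_members.iter().map(|&(id, _)| PersonId(id))
    }

    // Linear scan for simplicity; a sorted Vec or a map would be faster.
    pub fn joined_at(&self, person: PersonId) -> Option<i64> {
        self.raw_members
            .iter()
            .find(|&&(id, _)| id == person.0)
            .map(|&(_, joined)| joined)
    }
}
```

Because the pair layout is private, it can later be swapped for a different representation without touching callers.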
