Java instanceof and Clean architecture

Question

It seems to me that there is a conflict between clean architecture and the recommendation not to use instanceof. Consider the following code:

class ParentEntity { }

class AEntity extends ParentEntity { List<Integer> entityIds; }

class SaveUseCase { IEntityRepository entityRepository; void save(List<ParentEntity> list) { entityRepository.save(list); } }

class EntityRepository implements IEntityRepository { void save(List<ParentEntity> list) { list.forEach(e -> { if (e instanceOf AEntity) validate((AEntity) e) // Do save e in the database ... } } void validate(AEntity a) { List<ParentEntity> list = a.getEntityIds().stream().map(id -> get(id)) // Do some checking based on the value of list ... } ParentEntity get(int id) { // Do get ParentEntity with identifier id from the database ... } }

The code has a usecase which calls the save method in the repository. The save method first checks the object only if the object is of type AEntity and then saves the object.

The problem is the use of instanceof in the save method of EntityRepository. If we want to prevent using instanceof, one solution is to make validate a method of ParentEntity and do the validation inside AEntity by overriding it. However, according to the clean architecture we have separated the entities and repositories, so inside entities we do not have access to get method of the repository, which is required for being able to do the validation.

The workaround to this is to put a reference to IEntityRepository (or at least something like GetUseCase) inside the entity so it can do the validation itself. But, this doesn't seem a very good idea to me, especially if we assume that validation is a logic of the repository and is there only to check, e.g., what other layers give to it as parameters are valid.

So, using clean architecture biases us to using instanceof and using it is not bad in scenarios like the one I mentioned. Am I right or am I misunderstanding something?

Update: I quote some sentences from here, that I think are related to my point of view:

Some forms of validation are more efficient at the database layer, especially when referential integrity checks are needed (e.g. to ensure that a state code is in the list of 50 valid states).
Some forms of validation must occur in the context of a database transaction due to concurrency concerns, e.g. reserving a unique user name has to be atomic so some other user doesn't grab it while you are processing.
I have seen some developers try to codify all the validation rules in the business layer, and then have the other layers call it to extract the business rules and reconstruct the validation at a different layer. In theory this would be great because you end up with a single source of truth. But I have never, ever seen this approach do anything other than needlessly complicate the solution, and it often ends very badly.

What is the intended scope of EntityRepository? Is it intended to be repo for all kind of derivations of ParentEntity, knowing the whole inheritance hierarchy, or is it a repo intended just for dealing with AEntity objects? Or are you trying to implement it in a generic fashion for ParentEntitys, just in terms of virtual functions of this class? — Doc Brown, CommentedAug 2, 2021 at 12:44
I think you are missing something, but I don't think anyone of us can tell you exactly what, because your question is expressed in terms of the mechanics, but doesn't explain what the business problem is. 1/2 — Filip Milovanović, CommentedAug 2, 2021 at 20:39
But here are some general things you could look into, things that could be "code smells". SaveUseCase is doesn't look like a use case - it's more like a generic save service. It doesn't do anything that can be understood as meaningful on its own, it's just a data sink (single void method). Then, all it appears to do is entityRepository.save(list), so, why have the SaveUseCase class at all? Also, it's unclear why your AEntity extends ParentEntity. What is the design purpose of ParentEntity? Furthermore entities in CA are not database entities. 2/2 — Filip Milovanović, CommentedAug 2, 2021 at 20:39
@RobertBräutigam "while CA tells you explicitly that you have to start with these technicalities" - no it does not, you just read it that way. It describes a generalized way of dividing up the system; it only really asks you to separate policies by how hi-level they are. It doesn't actually prescribe the number of layers (you can have more or less), doesn't actually require you to have controllers/presenters, the "input/output data" could be data structures or parameter lists, the "interactor" could be one class or 3 classes or just a function, and "entities" are just a role. — Filip Milovanović, CommentedAug 3, 2021 at 19:01
@FilipMilovanović "It describes a generalized way of dividing up the system". This is exactly what I mean. It is not based on requirements, it is generic, i.e. technical. It defines "high" and "low" level based on whether it has anything to do with UI for example. This is also technical. A non-technical (i.e. business-relevant) division would say the "Amount" is higher level than "Balance". That is business-relevant. I'm not making a value judgement, it's just an observation. — Robert Bräutigam, CommentedAug 4, 2021 at 7:32

amon · Accepted Answer · 2021-08-03 11:01:01Z

If you want to add a method to a class hierarchy without actually adding the method, consider the Visitor Pattern. You could create a validation visitor, and let each entity select the appropriate method of the visitor.

First, your ParentEntity class hierarchy would need a bit of boilerplate to support visitors:

interface EntityVisitor<T> { T visitA(AEntity a); T visitB(BEntity b); } class ParentEntity { <T> T accept(EntityVisitor<T> v); } class EntityA extends ParenEntity { ... @Override <T> T accept(EntityVisitor<T> v) { return v.visitA(this); } }

Next, we can implement and use a visitor that performs validation.

class Validation implements EntityVisitor<Void> { EntityRepository repository; ... @Override Void visitA(AEntity a) { ... } @Override Void visitB(BEntity b) { ... } } class EntityRepository ... { void save(List<ParentEntity> list) { list.ForEach(e -> { e.accept(new Validation(this)); ... }); } }

The validation visitor can have access to both the entity and the repository (in order to make further queries), and will therefore be able to perform the full validation.

Using such a pattern has advantages and disadvantages compared to an instanceof check and compared to moving the validation logic into the entities.

An instanceof is a much simpler solution, especially if you only have very few entity types. However, this could silently fail if you add a new entity type. In contrast, the visitor pattern will fail to compile until the accept() method is implemented in the new entity. This safety can be valuable.
While this pattern ends up having the same behaviour as adding a validate() method to the entities, an important difference is where that behaviour is located and how our dependency graph looks. With a validate() method, we would have a dependency from the entities to the repository, and would have referential integrity checks intermixed with actual business logic. This defeats the point of an Onion Architecture. The visitor pattern lets us break this dependency and lets us keep the validation logic separate from other business logic. The cost of this clearer design structure is extra boilerplate in the form of the EntityVisitor interface and the accept() method that must be added to all entities in the relevant class hierarchy.

Whether these trade-offs are worth it is your call. You know your codebase best, and you have the best idea how it might evolve.

However, performing validation based on the result of multiple queries can lead to data integrity problems. The repository should either make sure to use database transactions (and offer an API that clearly communicates when modifications have been committed), or the relevant integrity checks should be done within the database, e.g. using constraints in an SQL database. In some cases, the validation checks can also be expressed as part of an insert or update query.

Thank you. But, isn't such a solution more complex, unreadable and unmaintainable than the version with only an instanceof? Especially, if we assume the validation is something related to repository and its validation logic and not to the domain itself. Isn't the point of not using instanceof to improve readability and maintainability? — Shayan, CommentedAug 2, 2021 at 12:40
@Shayan If you just have one instanceof, I'd probably agree with you – the occasional “unclean” code is much better than an overcomplicated workaround. But the tricky aspect of an instanceof is that if you ever add another ParentEntity subclass, the instanceof check might no longer be correct. In contrast, the visitor pattern will prevent the code from compiling until the accept() method has been implemented. That safety can be valuable. Only you know what's best for your particular codebase – either alternative can be fine. — amon, CommentedAug 2, 2021 at 13:43
Good points. But is putting a void accept(EntityVisitor v); there which may be used for different purposes in different parts of the code base a good practice? From a logical perspective, in our domain, for example, does an Apple has an accept behavior? So, is it a good practice to put an accept for its corresponding class? — Shayan, CommentedAug 3, 2021 at 5:23
@Shayan The accept/Visitor boilerplate is not really entity behaviour, it is just boilerplate to make the entity's behaviour extensible. It is agnostic to what actual behaviour you'll route through it. In some languages this pattern is unnecessary, e.g. in Java 15 we could just use Sealed Classes in order to do instanceof safely. Should an Apple entity have an accept() method? Only if its part of the ParentEntity hierarchy that you want to make visitable. — amon, CommentedAug 3, 2021 at 11:06
Shayan: remember not to trust anyone who calls himself “uncle”. In other words, don’t take all this “clean” nonsense as gospel. Use your own brain. And if you use a modern language, like Swift, lots of his “clean” code looks very clumsy and outdated. — gnasher729, CommentedAug 3, 2021 at 21:29

fnmps · Accepted Answer · 2021-08-02 18:11:27Z

I know what I am about to answer is not exactly good practice but if you want to avoid instanceof's and have a generic way to call the respective method for that subclass you could use reflection:

Method m = EntityRepository.class.getMethod("validate", e.getClass()); m.invoke(this, e);

Of course, this will have a negative effect on performance and in some ways maintainability (with the only upside being less code).

Regarding the performance overhead, you can somewhat mitigate it by loading all the methods at startup:

Map<Class<?>, Method> methods = new HashMap<>(); Reflections reflections = new Reflections(ParentEntity.class, new SubTypesScanner()); Set<Class<? extends Animal>> subclasses = reflections.getSubTypesOf(ParentEntity .class); for (Class<?> c : subclasses) { methods.put(c, Solution.class.getMethod("makeTalk",c)); }

Where Reflections comes from the Reflections library

And then just call the method using:

methods.get(e.getClass()).invoke(this, e)

Thank you. But, doesn't this solution make the code more unreadable? — Shayan, CommentedAug 3, 2021 at 5:16
That depends how you abstract it. The advantage of using reflection for this is that you do not need to implement extra code for each new subclass that appears. You basically, encapsulate this logic in a function (e.g., callValidate), write a clear JavaDoc, create a thorough automated unit test, and basically "forget" that it even exists, since you should not have to change it when a new subclass appears. — fnmps, CommentedAug 3, 2021 at 9:07
And I'm not saying this is the best solution, but I find it is an alternative that has some advantages that you can consider. — fnmps, CommentedAug 3, 2021 at 9:08

Doc Brown · Accepted Answer · 2021-08-03 06:36:33Z

If EntityRepository already knows about the whole inheritance hierarchy and is already implemented in a non-generic fashion (as you wrote in a comment), I see absolutely no point in avoiding instanceOf. In case EntityRepository contains different, individual validations methods validate(AEntity a), validate(BEntity b) and validate(CEntity c), you have to write a new validation method either when a new child class DEntity arrives.

Of course, one could try here to create a generic validate(ParentEntity p) method, applicable to all kind of child classes. But how this methods should look like is pretty hard to tell by seeing only one example for a child class of ParentEntity. I would usually look at least at three different child classes, look which parts the of validation can ge generalized, which parts are different, and refactor the different parts out into the child classes (reachable through virtual methods).

avpaderno · Accepted Answer · 2023-01-11 08:58:15Z

An instanceof solution has its own downsides, detailed among others on Why not use instanceof operator in OOP design? too, with regards to SOLD principles¹, namely the O and L principles, thus its loftier readability affecting its maintainability that could be improved refactoring the validation concern to its own class and changing the constructors' pseudo code so it registers validators for classes to validate leveraging the validators' dynamic selection without using instanceof operator.

public class EntityRepository { private ValidatorRepository validationRepository; public EntityRepository() { registerValidators(); } private void registerValidators() { Function<Entity, Boolean> validator = (entity) -> { List<Entity> list = entity.getEntityIds().stream().map(id -> get(id)).collect(Collectors.toList()); boolean validationResult = false; // Do some checking based on the value of list // ... return validationResult; }; validationRepository.registerValidator(Entity.class, validator); } public void save(List<? extends Entity> entities) { entities.stream() .filter(entity -> validationRepository.getValidator(entity.getClass()) .orElseGet(() -> (e) -> true) .apply(entity) ) .forEach(entity -> { // Do save entity in the database // ... }); } private Entity get(int id) { // Do get ParentEntity with identifier id from the database return null; } } public class ValidatorRepository { private Map<Class<? extends Entity>, Function<Entity, Boolean>> validators; public ValidatorRepository() { this.validators = new HashMap<Class<? extends Entity>, Function<Entity, Boolean>>(); } public void registerValidator(Class<? extends Entity> classToValidate, Function<Entity, Boolean> validator) { validators.put(classToValidate, validator); } public Optional<Function<Entity, Boolean>> getValidator(Class classToValidate) { return Optional.ofNullable(validators.get(classToValidate)); } }

(1) I intentionally omitted. One might argue that the Interface Segregation falls keener under the definition of a rule than the definition of a principle, hence SOLD instead of SOLID principles.

JimmyJames supports Canada · Accepted Answer · 2021-08-02 21:06:19Z

I was hoping to get some clarification from my comment on your question but here's a few options you might what to consider. I think the most straightforward and least disruptive (to your design) option is to simply to add validate to ParentEntity like this:

class ParentEntity { public void validate(EntityRepository repo) { return; } }

I'm going to assume there's something else going on in ParentEntity that you haven't shown us. Otherwise I would question why it's a class and not an interface.

This might violate one of Uncle Bob's rules (I don't know and I don't really much care) by exposing the repository to the entity instances. Regardless of that, I don't love it. It seems a little loose but on the other hand, it adds a lot of flexibility.

If that's a concern for you, you could simply pass the related entities to the ParentEntity. In order to make that work without an instanceof you could do something like this:

class ParentEntity { public List<Integer> getEntityIds() { return Collections.emptyList(); } public void validate(List<ParentEntity> associated) { return; } }

And override getEntityIds() in AEntity. This is maybe a little better but still seems a little off to me. What I would probably do is a deeper redesign. You don't show where you decide to create an AEntity instead of a ParentEntity but wherever that happens, you would (naïvely) instantiate AEntity with references to the related entities. I say naïvely here because there's an immediate problem. If you can't instantiate a AEntity without it's dependencies, you can easily run into a cycle if an AEntity depends on other AEntity instances. Even if there never cycles, it can be a little complicated to instantiate them in the right order. I can think of few ways out of that mess.

Lazy loaded entities. Each entity is initialized with it's id only. When details are required, then they are retrieved.
Instantiate entities with a list of function references, each of which retrieves the actual entity when it's needed during validation.
Instantiate the AEntity with a list of strategy pattern objects. This is similar to 1, except the entities are not lazy-loaded necessarily. The strategy object will get them after the entities are created.

If you are interested in any of these three options, let me know and I can elaborate but I'm not going to spend time on that unless you are expressly interested.

Could you please elaborate your third solution? (Strategy pattern). — Shayan, CommentedAug 11, 2021 at 5:34

Joop Eggen · Accepted Answer · 2021-08-03 11:44:55Z

There is a hierarchical entity class hierarchy; ParentEntity best named (Base/Abstract)Entity or such.

The repository class manages many entity classes. It probably would be better to have <E extends ParentEntity> because the API usages now causes a lot of casts at usage points: Stream maps, instanceof, the new instanceof (java 14):

 if (e instanceOf AEntity ae) { validate(ae); }

Such a moloch being handling such a diversity is not good code either: from the outside there is no strictness: is entity FooEntity validating or not? You would need to read maybe hundreds of lines.

When validation is in general use for entities, but you do not want an overridable validate in the entity, you should use a strategy pattern, to map the entity class to validation handlers.

Thomas Junk · Accepted Answer · 2021-12-24 11:06:12Z

From my perspective you have a design problem here.

The problem you are facing could easily be circumvented if you take validation, which is from my POV a separate concern out of the repository layer - esp. out of the save() method which now does two things -namely validation and persistence- instead of one.

save() should just save. Whether or not the objects need to validate before should be handled earlier. There should be an implicit contract that ensures: everything which enters the persistence layer, should be dealt with as is, i.e. every object is in a good / valid state.

Where you deal with validation is up to you. For example you could use a Builder which ensures only valid instances are created.

By calling the validation logic before the actual object is created, you can be guaranteed that every Book instance created by the Builder has a valid state.

And your problem vanishes.

lennon310 · Accepted Answer · 2023-01-13 19:59:30Z

While there are multiple solutions addressing various programming paradigms - e.g. functional programming, OOP, and procedural programming - to avoid usage of instanceof operator the reasons to consider before implementing a solution should be about the concern to be implemented that for the validation concern are whether it should be compulsory - are there entities eligible for skipping validation? - and whether it should be a concern on its own - could the validation be mingled with the persistence concern? - or not.

Following solution is a controversial implementation of strategy design pattern for optional validation concern. Controversial in the sense that enforces the way of usage, for this specific example the right usage being call supports method then call validate method only if supports method returns true.

public class EntityRepository { private ValidatorRepository validationRepository; public EntityRepository() { registerValidators(); } public <T extends Entity> void save(List<T> entities) { for (T entity : entities) { Validator<T> validator = validationRepository.getValidator(entity.getClass()); if ( validator == null || validator.validate(entity) ) { // Do save entity in the database } } } private void registerValidators() { Validator<AEntity> validator = new Validator<AEntity>() { @Override public boolean supports(Class classToValidate) { return AEntity.class.getName().equals(classToValidate.getName()); } @Override public boolean validate(AEntity toValidate) { List<AEntity> entities = new ArrayList<AEntity>(); for (int id : toValidate.getEntityIds()) { entities.add(get(id)); } boolean validationResult = false; // Do some checking based on the value of list return validationResult; } }; validationRepository.registerValidator(validator); } private AEntity get(int id) { // Do get ParentEntity with identifier id from the database return null; } }

A solution that introduces following validation specific class and interface.

public class ValidatorRepository { private Set<Validator> validators; public boolean registerValidator(Validator validator) { return validators.add(validator); } public Validator getValidator(Class forClass) { Validator toReturn = null; for ( Validator validator : this.validators) { if ( validator.supports(forClass)) { toReturn = validator; break; } } return toReturn; } } public interface Validator<T> { boolean supports(Class<T> classToValidate); boolean validate(T toValidate); }

Ewan · Accepted Answer · 2023-01-14 12:08:26Z

You have conflated two separate paradigms.

Object Oriented Programming.
Principle. Methods go on the object with the data, Don't use reflection or instanceof to work out the type because we have provided inheritance and overridden methods to have various different functions run on different objects.
Its normal to have a list of objects and not know the type of each, call object.validate() on each and the object will run the correct function.
Clean Architecture
Principle. Methods go on use case objects, not the 'entity' so that we can better separate concerns. Don't have entities inherit from each other or a base class because we always need to know what type of entity we are dealing with in order to run the correct usecase.
Its not normal to have a list of entities of different types. They don't share a base class. If for some reason you do, its fine to use instance of, because you don't want to put the method on the entity and so cant use overrides.

The contradiction isnt between "dont use instanceof" and "clean architecture". its between "clean architecture" and "OOP"

Stack Exchange Network

Java instanceof and Clean architecture

9 Answers 9

From my perspective you have a design problem here.

Linked

Hot Network Questions

Java instanceof and Clean architecture

9 Answers 9

From my perspective you have a design problem here.

Linked

Related

Hot Network Questions