Where to validate user input - Constructor, Validation object, or Factory method?

Question

Let's say your application takes in a user request and you don't trust the front-end validation (because you never trust the front-end validation). In your controller or other handler, you want to convert this input into a domain object, since you'll need to pass it around a bit. In this scenario, what's the best practice for defensively validating the data before creating the object.

I have three proposals:

Pass the data to the object constructor and let it perform any necessary integrity checks, throwing an exception if any issues are found.

The main issue I have with this is the lack of detail returned to the caller explaining what happened. If I validate it externally in some kind of Validator class, I could construct a ValidationResult object which exposes IsValid and Errors list that can be used for a better client experience.

Use the Validator class I described above. If Validator.IsValid is/returns True, you can pass the same parameters to the constructor to create the object.

The issue with this one is that you can't guarantee that the Validator will always be used. If the constructor is called with invalid data, this could lead to crashes or subtly invalid states.

The last option is to only expose object creation through a factory method, using the Validator object (most likely passed in as a parameter).

The issues with this option are sort of language dependent. In a strongly typed language, this would appear to have the same issues as #1 since your method would be expected to return an object of that type. You could return null and then have a GetErrors method, but that feels messy to me. In a loosely-typed language, you could return either the object or the ValidationResult. That's less messy, but still leads to a lot of conditional logic to use it.

For context, I currently use approach #2, but my applications are small, one-person projects. I've been programming for a while now, but my professional life has been in QA for some time now and I'm trying to refresh myself now by working as cleanly as possible.

Could you clarify the architectural model of your system? and the kind of validation you're doing (user input validation? object invariants and consistency or business rules)? And perhaps as well if your domain objects are anaemic or plain? — Christophe, CommentedJun 12, 2020 at 6:38
I think this is a pretty noob question, but unfortunately the system won't let me delete it now. As a generic example, think of a banking app that has multiple places it would take in a card number, account number, PIN, amount. If I have a transfers controller with three actions to send to email/external account/internal account, I want to validate the user input before sending it to the models responsible for setting up the transfer so I don't have conflicting validations between them. This leads to the three possible scenarios above, each seeming to have issues. — anon, CommentedJun 12, 2020 at 6:56
I'm not even sure the question makes sense anymore. Usually when it takes this much explaining, it means I've "architecture astronauted" myself well past an obvious answer. I always get like this on personal projects, because I get obsessed with doing things The Right Way at all costs. — anon, CommentedJun 12, 2020 at 6:58
On contrary, I find this question very interesting! I even upvoted (there will always be some folks who downvote questions, don’t worry about that). — Christophe, CommentedJun 12, 2020 at 8:13
To come back to the bank example: credit card number verification is one kind (input control of the number format, not trusting UI).The. he k if the number is valid (checksum control) is another one. The check if credit card is stolen is another kind of verification (business logic). As it’s validation for different purposes, the answer may be different. Anyway, does this help: stackoverflow.com/q/57603422/3723423 ? — Christophe, CommentedJun 12, 2020 at 8:28

Community · Accepted Answer · 2020-06-16 10:01:49Z

You hit on good points but you sometimes miss a second option. The gist of my response to your question is that it can be done, it just requires effort to implement exactly what you want.

In your controller or other handler

You didn't explicitly claim otherwise, but I do want to point out here that different kinds of validation exist and belong in different locations.

For example, a controller should validate parseability (e.g. can "2020-06-12" be parsed back to a valid date?) whereas your business layer should validate the business needs (e.g. is 2020-06-12 in the allowed period for this user?)

Pass the data to the object constructor and let it perform any necessary integrity checks, throwing an exception if any issues are found.
The main issue I have with this is the lack of detail returned to the caller explaining what happened.

As much as "flow by exception" should generally be avoided, exceptions certainly don't lack detail. If you take this route, then your exception should be some kind of validation exception type which extends the exception class with all of the information you need to know about the validation failure.

The issue with this one is that you can't guarantee that the Validator will always be used.

You can, but it requires more effort. You could create a result class (e.g. ValidatedResult<T>) which effectively wraps a single value (i.e. the T). If you make sure that only the validator can instantiate this class (using nested classes or access modifiers), then you can guarantee that any ValidatedResult<T> object has been processed by a validator.

This makes sense in cases where each T has one type of validation, because otherwise you still can't be sure if your T was validated using the specific validation you're expecting it to be.

To further solve that issue of having multiple kinds of validation for a type, you can start extending these result types to explicitly specify which validation they belong to (e.g. ContainsNoProfanityValidationResult : ValidationResult<string>).

As you can see, this starts requiring more and more effort to implement, but it does give you stricter control and more solid guarantees, which you're specifically looking for.

However, I somewhat disagree about the necessity of doing it so strictly though. There's a difference between guarding against malevolent attacks and guarding against developer forgetfulness. I presume only the latter is really applicable here, and this should generally be caught with unit tests as validation failures lead to changes in public behavior (i.e. refusing to perform a requested action), which tests can and should catch.

In a loosely-typed language, you could return either the object or the ValidationResult. That's less messy

I disagree. If you look at the big picture, loose typing is messier than strong typing. However, strict typing requires a bit more pedantry to satisfy the compiler. That's not messy, it just takes a bit of effort - but it pays back dividends in any sufficiently large codebase.

I would say that any codebase in which you're worried about forgetting to use a validation (enough so that you want a preventative architecture) is more than big enough for strong typing to pay those dividends in the long run.

you could return either the object or the ValidationResult.

This circles back to my earlier point, "validation result" should include both the successes and the failures! Always return a validation result, and then inspect it to see if it contains a success.

Semantics matter here. Boiled down to its core essence, validation does not need to give you back the value you put in (since you already knew it), it just needs to tell you if it passes the validation or not. For a basic validation algorithm, there's no need to return the object you already passed into it.

However, if you do take the time and effort to encapsulate your values in a validation result (presumably with an additional boolean to confirm that it's indeed a success), then you can both:

Rest easy knowing that this value was already successfully validated and certified, allowing your domain to essentially require parameter values that were already validated
Reusably log validation failures and report the actual value that was being used.

Given the concerns that you listed in the question, using a validation result is a win-win here.

This is a fantastic answer! If I could try to paraphrase - a) for things like user input where you want to validate multiple things at once, its fine to wrap that all into a validator class that returns a complex result, b) it's generally fine to assume your team is taking steps to validate inputs before passing them to domain objects, provided you have good tests, so doing something like result = ObjectValidator.Validate(); if (result.isValid) { obj = new Object(...); SaveToDb(obj); } and c) if the data is super critical, at that point you can start extending to more complex solutions. — anon, CommentedJun 12, 2020 at 9:11
You're also right that I did sort of miss the forest for the trees with regard to exceptions. I was looking at just the ex.Message being a string and failed to consider that I could have something like ComplexValidationException extends Exception { public List<string> validationErrors — anon, CommentedJun 12, 2020 at 9:14
@AgentConundrum: (for your first comment) Roughly, yes. In general, tailor your implementation to be as simple as possible but with the necessary complexity. Developer error/forgetfulness is easily caught by a unit test. Malevolent attacks requires a much more rigid defense (which requires much more effort as there can be no gaps) but you should really evaluate if you really need this and if the effort is worth the payoff. — Flater, CommentedJun 12, 2020 at 9:14
@AgentConundrum: (for your second comment) Exactly. As a rule of thumb, whenever I choose to throw exceptions, I always force myself to consider a custom exception type (or any pre-existing subclass) specifically to focus my attention to what I want the exception to do for me. It helps as a sanity check on whether you should be using exceptions here or not. There's an argument to be made for strict validation failures but there are also valid counterarguments. I can't definitively tell you yes or no on whether to use exceptions here, but I generally advise result objects first. — Flater, CommentedJun 12, 2020 at 9:16
That’s an excellent answer that addresses in a very comprehensive manner all the concerns related to the question! — Christophe, CommentedJun 12, 2020 at 9:46

candied_orange · Accepted Answer · 2020-06-12 05:33:48Z

You validate to preserve invariants. Invariants exist for different reasons. Validation should live close to the reason for the invariant.

For example, if an object has integer division in it's behavior then zero is usually excluded from the divisor. Validation code to protect from that should live in the same class as the code that makes it necessary.

However, if such invariants do not emerge from the objects behavior, but rather emerge from a chosen use of the object then it's more appropriate to validate in the particular factory that supports this use.

I'm not sure I follow. I looked it up and "invariant" is something that must hold true for the life of an object (except as its being modified). That's what I'm trying to do, but I think where I'm having issues is how "close to the invariant" is sufficient? If the class cannot ever be invalid, then the constructor would need to throw the error. At least in the language of my project (PHP) it doesn't seem easy to send a list of errors as part of the exception payload, hence the question about appropriately DRYing my validation checks. Did I misunderstand your answer? — anon, CommentedJun 12, 2020 at 6:10
If the object is immutable holding "for the life of" is not an issue. What I'm distinguishing between is where the need for the restriction comes from. People use signed integers for positive only values all the time by controlling what they can be set to. This is the same trick. If external use drives the restriction, not internal behavior, then validating in the factory associated with that use makes a lot of sense. — candied_orange, CommentedJun 12, 2020 at 6:14

wasatz · Accepted Answer · 2020-06-12 05:33:34Z

You are not mentioning which language you are using, so this answer may be more or less useful depending on what language you are actually using. But even in languages that does not have support for this, you can add it yourself by using a library or implementing a "poor mans" version.

I believe that the best option is actually to use #3 (factory method) returning a result that is a discriminated union. This gives you the best of both worlds. If you havent heard of discriminated unions before, here is an explanation of that concept from F#: https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/discriminated-unions

This means that you could have a usage pattern like this

type CreationFault = { // whatever information you want about faults } type CreationResult<T> = | Created of T | Failed of CreationFault list let createWithFactory (fruitName : string) : CreationResult<Fruit> = // return a creationresult either containing a fruit or a list of errors // depending on if it works or not let maybeAFruit = createWithFactory "banana" // Now maybeAFruit is of the type CreationResult<Fruit>, so you cant actually touch // the fruit unless you check which union case you got first match maybeAFruit with | Created(fruit) -> // Do something with your fruit | Failed(errors) -> // Do something with your errors

This gives you the best of both worlds and leverages the type system so that you cant accidentially use an unvalidated object or "miss" checking the validation results. Similar patterns can be done even if the language doesnt have native support for Discriminated unions. C# does not have this construct in the language but you can for example use the OneOf library to achieve the same thing (examples in the github readme) (https://github.com/mcintyre321/OneOf).

I intentionally didn't mention a language as I hoped this to be a general purpose/best practices question, In my case, my project is in PHP as it's the language I know best right now. There, I do have the option to return whatever type I like, though I wouldn't be able to type hint it. I could try to make #3 cleaner by having the factories return a ValidationObject from #2 which includes a third member/getter for the validated object (i.e. ValObj.IsValid, ValObj.GetErrors, ValObj.GetValidatedObject). Admittedly, I'm just theory-crafting right now. I haven't got myself into trouble in code yet. — anon, CommentedJun 12, 2020 at 5:52
@AgentConundrum There seems to be an RFC for union types in PHP wiki.php.net/rfc/union_types_v2 that is hopefully part of PHP8 if I read this correctly? That seems like exactly the language feature that you could leverage to solve this cleanly. But if I read this correctly then PHP8 is slated for release in november so right this day you would probably have to implement a "poor mans version" of this yourself if you wanna go this direction. — wasatz, CommentedJun 12, 2020 at 5:58
Interesting. I hadn't seen that. I suppose if it comes to it, I may implement the version I mentioned in my previous comment. It seemed to me that this must be a solved problem I wasn't aware of, which I suppose you say it is, just not in all languages. I'm probably being too particular. "The perfect is the enemy of the good" after all. — anon, CommentedJun 12, 2020 at 6:06
Isn’t this question more related to the architectural pattern than to the language? — Christophe, CommentedJun 12, 2020 at 6:17
@Christophe As mentioned even if your language doesnt have this feature you can implement this pattern as a library. However that isnt as nice as having it as a language feature ofc. — wasatz, CommentedJun 12, 2020 at 6:24

Stack Exchange Network

Where to validate user input - Constructor, Validation object, or Factory method?

3 Answers 3

Linked

Hot Network Questions

Where to validate user input - Constructor, Validation object, or Factory method?

3 Answers 3

Linked

Related

Hot Network Questions