Generally, services have a troubleshooting layer, which is typically not depicted on architecture diagrams.
When the application layer catches the exception from the persistence layer, a unique random Troubleshooting ID should be generated and associated with this catch. This ID should be written to the troubleshooting log.
When using an end-to-end troubleshooting solution such as OpenTelemetry, the Troubleshooting ID is the Root Span ID corresponding to the failed API request.
Generally, a different exception should be returned to the web adapter. That said, it is also a pragmatic choice to reuse an existing exception type, as long as custom property fields can be inserted.
As @Flater explains, it is okay for the web adapter exception to link to (nest) the internal persistence exception, since there is no risk of leakage.
However, the exception for the web adapter has to include extra information, as I explain below.
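To make the catch-and-rethrow step concrete, here is a minimal sketch of what the application layer could do. All names (`PersistenceError`, `ApplicationError`, `save_order`, the `repository` object) are hypothetical and not taken from your code base; the ID is a random UUID, which you would swap for your tracing system's ID if you have one.

```python
import logging
import uuid
from datetime import datetime, timezone

logger = logging.getLogger("troubleshooting")


class PersistenceError(Exception):
    """Hypothetical exception raised by the persistence layer."""


class ApplicationError(Exception):
    """Hypothetical exception handed to the web adapter."""

    def __init__(self, message, troubleshooting_id, occurred_at):
        super().__init__(message)
        # Custom property fields the web adapter can read.
        self.troubleshooting_id = troubleshooting_id
        self.occurred_at = occurred_at


def save_order(repository, order):
    """Hypothetical application-layer operation."""
    try:
        repository.save(order)
    except PersistenceError as exc:
        # Generate the ID at the catch site and write it to the log
        # together with the full internal exception.
        troubleshooting_id = uuid.uuid4().hex
        occurred_at = datetime.now(timezone.utc)
        logger.error(
            "Persistence failure [troubleshooting_id=%s]",
            troubleshooting_id,
            exc_info=exc,
        )
        # "from exc" nests the persistence exception as __cause__.
        raise ApplicationError(
            "Could not save the order", troubleshooting_id, occurred_at
        ) from exc
```

The web adapter then reads `troubleshooting_id` and `occurred_at` from the exception, and never needs to serialize `__cause__`.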
When thinking about what to return, start with (i.e. ask for) a requirement analysis of the web presentation layer, which is not depicted on your architecture diagram. Your web adapter should provide the bare minimum of information that the web presentation layer needs to do its part.
If there is no human end-user (it's all machine-to-machine), the OpenTelemetry Root Span ID is the only thing you should mention in your HTTP 500. (If the service is launched in debug mode by the developer, they would not need to go into OTel - they can easily check the inner exception.)
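For the machine-to-machine case, a sketch of such a minimal response could look like the following. The function name and the `troubleshootingId` field name are my own assumptions; use whatever convention your API already follows.

```python
import json


def minimal_500_response(troubleshooting_id):
    """Build a (status, headers, body) triple for an HTTP 500.

    The body carries only the ID, with no exception details.
    """
    body = json.dumps({"troubleshootingId": troubleshooting_id})
    headers = {"Content-Type": "application/json"}
    return 500, headers, body
```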
Here is a hypothetical requirement analysis for error handling on behalf of their web presentation layer, assuming that their end-users are non-technical humans.
- (always) Troubleshooting ID and the exact timestamp
- (optional) Any reasons for the failure that can be understood by the end-user. Do not include technical information. If there is no user-understandable reason, just omit it.
- (optional) Any failed validations or data-related warning signs observed from any layer. This can provide hints to the end-user so that they can correct the problem on their end. If there's nothing they can correct on their side, omit it.
- (optional) If the persistence failure implies that some user changes have been lost or reverted, it is important to notify the user, and it is also important for the web presentation layer to refresh its display to accurately reflect the state of the data in the system.
- (only if the web interface has a "developer mode") A link that opens up a new window (to a backend log aggregator) where an authenticated developer can browse the relevant error messages associated with the Troubleshooting ID. Generally, such developer-oriented information shall never be sent through the web presentation layer.
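As a rough illustration, the list above could translate into a payload like the one sketched here. The class and field names are hypothetical, and I left out the developer-mode link since it depends entirely on your log aggregator.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorPayload:
    """Hypothetical error payload for the web presentation layer."""

    troubleshooting_id: str                 # always present
    timestamp: str                          # always present (ISO 8601 string)
    user_reason: Optional[str] = None       # only if the end-user can understand it
    correctable_hints: List[str] = field(default_factory=list)
    changes_lost: bool = False              # tells the UI to notify and refresh

    def to_dict(self):
        """Serialize, omitting optional fields that carry no information."""
        data = {
            "troubleshooting_id": self.troubleshooting_id,
            "timestamp": self.timestamp,
        }
        if self.user_reason:
            data["user_reason"] = self.user_reason
        if self.correctable_hints:
            data["correctable_hints"] = list(self.correctable_hints)
        if self.changes_lost:
            data["changes_lost"] = True
        return data
```

Omitting empty fields, rather than sending nulls, keeps the contract with the presentation layer simple: if a key is present, there is something to show.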
Therefore:
- Do include a troubleshooting ID and timestamp, for technical support ("correlation") purposes.
- Ask the web presentation layer team how they would like to handle such situations. Then, design yours to satisfy their minimum requirement.
- It is up to you to decide whether to use English string literals for user-oriented error guidance messages.
  - Doing so can increase the maintenance effort because you'll be asked to update the strings ("exception messages") more often than you'd like.
  - Not doing so requires more cooperation between you and the web presentation layer team; on the other hand it provides UX with the flexibility they need for user-oriented language use.
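If you go with the second option, one way to do it is to send stable error codes and let the presentation layer own the wording. This is only a sketch; the codes, the message table and `render_message` are made up for illustration.

```python
from enum import Enum


class UserErrorCode(Enum):
    """Hypothetical stable codes sent by the web adapter."""

    CHANGES_NOT_SAVED = "changes_not_saved"
    DUPLICATE_NAME = "duplicate_name"


# This table would live in the web presentation layer, owned by UX.
MESSAGES_EN = {
    UserErrorCode.CHANGES_NOT_SAVED.value: "Your changes could not be saved. Please try again.",
    UserErrorCode.DUPLICATE_NAME.value: "That name is already in use.",
}


def render_message(code, fallback="Something went wrong."):
    """Look up the user-facing text for a code; unknown codes get a generic message."""
    return MESSAGES_EN.get(code, fallback)
```

The backend only has to keep the codes stable, while UX can reword or translate the messages without asking you for a release.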
[Diagram labels: "HTTP/1.1 500 Internal Server Error", "Exception", "?"]