Architecture for storing generic data

Question

(I am facing this issue with code written in Swift, but would appreciate any high-level pseudocode solution, just so that I may wrap my head around the architecture)

I need to find an architecture that would allow me to store generic data in instances of a specific type.

Consider the following example:

    struct Project: Codable {
        let id: String
        let title: String
        // ... other metadata
    }

    protocol Publisher {
        associatedtype Configuration: Codable
        func publish(project: Project, using configuration: Configuration)
    }

    class MyPublisher: Publisher {
        struct Configuration: Codable {
            let destination: String
            // ... other configuration data
        }

        func publish(project: Project, using configuration: Configuration) {}
    }

How could I design the Project type to store any arbitrary Configuration types?

I have the following requirements:

Project is Codable, so any data it stores should also conform to Codable
Project does not care about the actual data stored, as it is only destined for Publisher types.
There may be zero to many publisher configurations stored

I have found the following subpar solutions:

Storing static references to existing Publisher types

    struct Project: Codable {
        // ... metadata
        var myPublisherConfiguration: MyPublisher.Configuration?
        // potentially 5 other such variables for other available publishers
    }

I want to avoid doing this because this strongly couples the Project type to any and all Publisher types (that, and I cannot do this as it stands as Publishers exist in different packages, can be added externally and should not require changes to the Project type every time).

Adding a layer between `Project` and `Publisher`

Another solution would be to create a Manager type that would oversee linking Project and Publisher. Something like:

    struct PublisherManager: Codable {
        var publisherTypes: [any Publisher.Type] = []
        var storedConfigurations: [String: [any Publisher.Configuration]] = [:]

        mutating func registerPublisher(_ publisherType: any Publisher.Type) {
            // do some more validation, but for the example just add
            publisherTypes.append(publisherType)
        }

        func configuration<T: Publisher>(for project: Project, publisherType: T.Type) -> T.Configuration? {
            guard let configurations = storedConfigurations[project.id] else { return nil }
            return configurations.compactMap { config in config as? T.Configuration }.first
        }
    }

This is just a gist, and it probably doesn't even compile because of the use of any Publisher.Configuration, which the compiler doesn't know how to encode or decode. But there are issues with this solution anyway, even if I got it to work:

It splits up where data is stored, which requires manual work to keep everything in sync (think deleting a project, it should also delete all stored configurations for that project)

What is the best approach?

I need to find a solution that is simple, which allows me to persist generic data related to objects, without those objects needing to manage and/or care about the data persisted.

Is this even a good approach? Are there better architectural paradigms I'm overlooking to achieve something similar?

Any help would be greatly appreciated 🙏

What prevents you from using a JSON or XML file? Would a simple Dictionary or Associative Array in your programming language of choice suffice? - Robert Harvey, commented Mar 29, 2023 at 12:30
I'm considering resorting to this more and more; it's just that I would have liked not to store raw encoded data, as that makes debugging slightly more difficult. - Skwiggs, commented Mar 29, 2023 at 12:44
Will the configurations always be a set of key-value properties, or could they have more complex structures? - JimmyJames supports Canada, commented Mar 29, 2023 at 17:38

DavidT · Accepted Answer · 2023-03-30 00:14:08Z

I have no Swift experience, so I can only answer the "high-level" part of your question.

You seem to be asking two questions. How to manage:

An arbitrary set of configuration parameters?
Generic data in general?

1 - Configuration Parameters

I would say, don't overthink it: most of the config parameters for any set of similar objects (publishers) are likely to be similar.

Hence you can probably create a PublisherConfig object that contains the standard stuff all publishers need and simply add a Map<String, String> for anything extra a particular publisher needs.

If you need to add additional validation you can add a metadata object to each Publisher, which provides validation rules for the extra parameters they require.

Introducing some structure (ProjectConfig, PublisherConfig, etc.) is a trade-off, so that you don't just end up with "Configuration" being a thin wrapper around a Map<String, String>.
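
As a rough illustration of this idea in Swift (the language of the question), here is a minimal sketch. The names `PublisherConfig`, `publisherID`, `extras`, `ExtraParameterRule`, and `missingExtras` are all hypothetical, and `destination` is borrowed from the question's example as a stand-in for "the standard stuff".

```swift
import Foundation

// Hypothetical shared configuration: fields every publisher needs,
// plus a free-form string map for publisher-specific extras.
struct PublisherConfig: Codable, Equatable {
    let publisherID: String             // assumed way to tell publishers apart
    var destination: String             // example of a "standard" field
    var extras: [String: String] = [:]  // anything extra a publisher needs
}

// Hypothetical validation metadata a publisher could expose for its extras.
struct ExtraParameterRule {
    let key: String
    let isRequired: Bool
}

// Returns the keys of required extras that are missing from a configuration.
func missingExtras(in config: PublisherConfig, rules: [ExtraParameterRule]) -> [String] {
    rules.filter { $0.isRequired && config.extras[$0.key] == nil }.map(\.key)
}

// Project stays Codable and never references a concrete publisher type.
struct Project: Codable {
    let id: String
    let title: String
    var publisherConfigs: [PublisherConfig] = []
}
```

Because the configurations live inside `Project`, deleting a project removes them too, which sidesteps the syncing problem described in the question.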

2 - Generic Data in General

Most data stores have a way to store a generic blob of data; for example, Postgres has the JSON and JSONB types.

For application tiers that are just "passing the data through" without manipulating it, you have several choices:

Leave it as a blob.
Serialize it to a string (JSON being a good example of this).
Convert it to some combination of Lists and Maps along with basic types (strings, ints, etc.).

The last case is usually best if you need to include the data in a larger structure you are serializing for a network call - otherwise you tend to end up with "double escaping" - consider what happens if you try to include a full JSON document as a string value inside another JSON document.
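
The "Lists and Maps along with basic types" option could look roughly like this in Swift. This is only a sketch under assumptions: `JSONValue`, `publisherConfigurations`, and the two helper functions are hypothetical names, and keying the configurations by a publisher identifier string is an assumed convention.

```swift
import Foundation

// Hypothetical JSON-like value built only from lists, maps, and basic types.
enum JSONValue: Codable, Equatable {
    case null
    case bool(Bool)
    case number(Double)
    case string(String)
    case array([JSONValue])
    case object([String: JSONValue])

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if container.decodeNil() {
            self = .null
        } else if let value = try? container.decode(Bool.self) {
            self = .bool(value)
        } else if let value = try? container.decode(Double.self) {
            self = .number(value)
        } else if let value = try? container.decode(String.self) {
            self = .string(value)
        } else if let value = try? container.decode([JSONValue].self) {
            self = .array(value)
        } else if let value = try? container.decode([String: JSONValue].self) {
            self = .object(value)
        } else {
            throw DecodingError.dataCorruptedError(in: container, debugDescription: "Unsupported JSON value")
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .null: try container.encodeNil()
        case .bool(let value): try container.encode(value)
        case .number(let value): try container.encode(value)
        case .string(let value): try container.encode(value)
        case .array(let value): try container.encode(value)
        case .object(let value): try container.encode(value)
        }
    }
}

struct Project: Codable {
    let id: String
    let title: String
    // Opaque to Project; keyed by a publisher identifier (assumed convention).
    var publisherConfigurations: [String: JSONValue] = [:]
}

// Converts a publisher's typed configuration to JSONValue via JSON data.
func jsonValue<T: Encodable>(from value: T) throws -> JSONValue {
    try JSONDecoder().decode(JSONValue.self, from: JSONEncoder().encode(value))
}

// Converts a stored JSONValue back into the publisher's typed configuration.
func decode<T: Decodable>(_ type: T.Type, from value: JSONValue) throws -> T {
    try JSONDecoder().decode(type, from: JSONEncoder().encode(value))
}
```

Since the configuration is nested as real structure rather than as an encoded string, the serialized project stays readable when debugging and avoids the double-escaping problem.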

Unless you are using the generic data to render a generic UI (where any arbitrary structure can be rendered) it is likely that some part of your code will either need to convert the data into a structured form or at least understand that the data has some structure to it. This creates additional complications for you to deal with:

Deserialization

You should assume that the data you are getting is garbage:

It could be from an earlier (or later) version of the application which uses a different structure.
It could be from a different version that had a bug in it.
Another application (or data migration) may have messed with the data.

Since the lower tiers / data store are making no guarantees for you - it becomes your responsibility to think about all the ways the data could be mangled.
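
In Swift terms, that might mean the publisher never force-decodes its stored configuration. The sketch below is one possible shape, not a prescribed one: `MyConfiguration`, its `version` field, `ConfigurationLoadResult`, and `loadConfiguration` are hypothetical, and the specific checks are only examples.

```swift
import Foundation

// Hypothetical publisher configuration carrying a schema version.
struct MyConfiguration: Codable, Equatable {
    static let currentVersion = 2
    var version: Int
    var destination: String
}

// Explicit outcomes, so callers must handle bad data instead of crashing.
enum ConfigurationLoadResult: Equatable {
    case loaded(MyConfiguration)
    case unsupportedVersion(Int)
    case malformed
}

// Treats the stored bytes as untrusted input.
func loadConfiguration(from data: Data) -> ConfigurationLoadResult {
    // Wrong shape, wrong types, or not JSON at all.
    guard let config = try? JSONDecoder().decode(MyConfiguration.self, from: data) else {
        return .malformed
    }
    // Written by a later version of the application.
    guard config.version <= MyConfiguration.currentVersion else {
        return .unsupportedVersion(config.version)
    }
    // Decodes fine but is semantically invalid.
    guard !config.destination.isEmpty else {
        return .malformed
    }
    return .loaded(config)
}
```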

Round Tripping

If you:

Load the data
Manipulate or present it in some way
Write the data back to the data store

It's possible that your code (or some earlier/later version) is not the only application using the data, so you need to decide whether it is acceptable to throw away any structure that the current version of your code doesn't understand.

Alternatively, you will need to load the data, update the original structure based on the user's actions, then write back the updated form, so that you continue to preserve any structure you don't understand.
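
A small Swift sketch of that second approach, assuming the blob is a JSON object: it edits one key in the original structure instead of decoding into a typed struct and re-encoding, which would silently drop unknown keys. The function name, the error type, and the `destination` key are hypothetical.

```swift
import Foundation

enum RoundTripError: Error {
    case notAJSONObject
}

// Updates only the "destination" key and writes everything else back as it was
// found, including keys this version of the code does not know about.
func updatingDestination(in data: Data, to newDestination: String) throws -> Data {
    guard var object = try JSONSerialization.jsonObject(with: data) as? [String: Any] else {
        throw RoundTripError.notAJSONObject
    }
    object["destination"] = newDestination
    return try JSONSerialization.data(withJSONObject: object, options: [.sortedKeys])
}
```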

Ref Integrity / Cascading Deletes

It is difficult to enforce any kind of referential integrity in generic data. This is true whether the reference points to:

Another part of the same blob.
Another blob.
Structured data somewhere else in your system.

This is because the lower tiers (and ultimately the data store) do not understand the data - hence can't help with any guarantees.

That said, if you store two "blobs" in different database tables that have referential integrity between them (for example, a foreign key with cascading deletes), you can safely assume that if a record in the parent table is deleted, all the child records have been deleted too.

What a super thorough answer! Thanks a lot for the write-up, it helps me paint a better picture of how to achieve this and what pitfalls I may run into 💪 - Skwiggs, commented Mar 30, 2023 at 12:33
