8

Background

This question stems from an architectural discussion we had at work.

For the current project we are using Python and an object-oriented database. There are input services that we consume, and certain interfaces through which we are expected to provide output.

Someone on our team started designing a data model, and my question was: why?

A data-only object in Python can just as well be represented by a dict (there are a fair number of questions about that on this site). We don't have a relational database with columns that we need to map objects to, either. At the same time, a data model has a cost: it always needs to be maintained. When interfaces change, it (IMHO) simply becomes another dependency that needs to be satisfied, but unlike in statically typed languages it does not actually enforce anything.

Question:

The way it seems to me is that when you are in an environment where everything is dynamic, it makes sense for the interfaces to define your data model implicitly rather than maintain some sort of classes that define a model. Are there ever any good reasons to the contrary?

EDIT

In the comments and answers people seem to be focusing on two areas: database mapping and representation, and validation of the data model.

I apologize for not being more explicit about the DB. In our environment we have an object-oriented DB that stores blobs in a file-system-like representation. There is no mapping to SQL and no ORM of any sort to speak of. I do understand the argument, though. For example, Django's models require subclassing for the ORM to work. In this case, making model classes makes perfect sense because the DB is effectively a statically typed data store and not 'dynamic'. But that is not the scenario in my question, because such a setup is not a purely dynamically typed environment.

Regarding validators: yes. One will need to create validators that check that fields are present and of the right type. In statically typed languages the model does this implicitly (in C++ or Java you can't put a 'std::string' where a User object is expected). But in Python, a model class does not enforce anything. If I am building validators that check for the presence of attributes anyway, does this not make classes, dicts, generators etc. functionally interchangeable? And if so, should the solution not be the one with the least code? Validators in general seem to me to be an argument in favour of not building data model classes rather than the contrary. Am I wrong in this?
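A minimal sketch of the validator argument, to make it concrete. The names `validate`, `USER_SCHEMA` and `User` are hypothetical and not taken from the project described above; the point is only that one check can treat a dict and a class instance the same way, and that the class by itself does not reject a wrongly typed value.

```python
from dataclasses import dataclass

# Hypothetical schema: field name -> expected type.
USER_SCHEMA = {"name": str, "age": int}


def validate(record, schema):
    """Return a list of problems; an empty list means the record is valid."""
    # Accept either a mapping or a plain object by reading its attributes.
    data = record if isinstance(record, dict) else vars(record)
    problems = []
    for field, expected_type in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


@dataclass
class User:
    name: str
    age: int


as_dict = {"name": "alice", "age": 30}
as_object = User(name="alice", age=30)
# The annotations are not enforced at runtime, so this is accepted silently.
bad = User(name="alice", age="thirty")
```

Under these assumptions the validator, not the class, is what catches `bad`; whether the class still earns its keep for other reasons is the open question.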

9
  • 4
    "when you are in an environment where everything is dynamic, it makes sense for the interfaces to define your data model implicitly" -- Traditionally a data model is the way you give your data some "shape," so that it is relatively straightforward to work with. If your data really has no shape at all, and merely consists of key/value pairs, you might be right. If you're building interfaces that specify this shape, then in a sense, you are building a data model, aren't you?
    Commented Feb 28, 2013 at 19:54
  • @RobertHarvey Suppose it does have shape. In some leaky way you can always represent a real-life concept with a class. But my question is: so what? Other than being able to smugly say that our design is OO, do we actually gain anything in my scenario? I just can't shake the feeling that we gain nothing and we're just doing this because that's how we've always done it when we programmed in Java... I hope that you guys on this site can either reaffirm my gut feeling or give me a good argument to the contrary.
    – MrFox
    Commented Feb 28, 2013 at 20:05
  • What do you gain? You gain the ability to query your data, for one thing. To ask it questions. That's very hard to do with a big, amorphous pile of key/value pairs. By giving your data some shape, it becomes meaningful to those who wish to use it. You gotta do that one way or another; you can do it by creating classes, defining interfaces, designing a SQL database, or hacking your key/value pairs into something useful to your customer, but one way or another you're still working with a data model of some sort.
    Commented Feb 28, 2013 at 20:12
  • 2
    It's not just readability. Try doing a join on a collection of key/value pairs, and see how far you get. For a detailed example of what I mean when I refer to data that has no shape, see here: simple-talk.com/content/article.aspx?article=292
    Commented Feb 28, 2013 at 20:20
  • 1
    @MrFox If you are using a storage mechanism which provides very little to no model / constraint checking, that's all the more reason to have an object model which enforces correctness. Also - "[having a] class isn't going to prevent some guy from assigning a string to a currency 'field'"; well, you certainly could do validations or even type checking in a setter for the field, or am I missing something? In this case, you'd still only get an error at runtime, but it'd still prevent the data from making it to your DB, and causing crashes 6 months later.
    – Daniel B
    Commented Mar 1, 2013 at 6:04
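
The setter-based check suggested in the last comment could look roughly like the sketch below. The `Invoice` class and its `amount` field are hypothetical, and `Decimal` is only one possible choice for a currency type; the error still surfaces at runtime, but at the moment of assignment rather than after the value has reached the DB.

```python
from decimal import Decimal


class Invoice:
    """Hypothetical model whose 'amount' field rejects non-currency values."""

    def __init__(self, amount):
        self.amount = amount  # goes through the setter below

    @property
    def amount(self):
        return self._amount

    @amount.setter
    def amount(self, value):
        # Type check at assignment time, so bad data fails here
        # instead of being stored and causing trouble later.
        if not isinstance(value, Decimal):
            raise TypeError(f"amount must be Decimal, got {type(value).__name__}")
        self._amount = value
```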

2 Answers

6

Even in a dynamic language, the principles of Domain-Driven Design can still apply. If all you're doing is passing around dictionaries, you have an anemic model, where your objects are pure data and other objects operate on them.

Taking the time to create a rich object model and embedding logic into the model provides an opportunity for your model to be more expressive and representative of the domain you're modeling.
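
A rough illustration of the difference, assuming a made-up shipping example (the `Shipment` class, its fields and the 30 kg limit are all hypothetical). In the rich version the rules travel with the data; in the dict version each caller has to know the key names and the rule.

```python
class Shipment:
    """Hypothetical domain object: data plus the rules that govern it."""

    def __init__(self, weight_kg, delivered=False):
        self.weight_kg = weight_kg
        self.delivered = delivered

    def is_oversized(self, limit_kg=30):
        return self.weight_kg > limit_kg

    def mark_delivered(self):
        # The invariant is enforced in one place, next to the data.
        if self.delivered:
            raise ValueError("shipment already delivered")
        self.delivered = True


# Anemic equivalent: the same rule lives outside the data, and every
# caller must know the key name and where this helper is defined.
def is_oversized(shipment_dict, limit_kg=30):
    return shipment_dict["weight_kg"] > limit_kg
```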

5
  • I read the article. In the fourth paragraph the author notes "In essence the problem with anemic domain models is that they incur all of the costs of a domain model, without yielding any of the benefits. The primary cost is the awkwardness of mapping to a database, which typically results in a whole layer of O/R mapping." Funnily enough, the benefit/cost analysis is exactly my reservation about implementing it. With our NoSQL DB, I don't have a problem with mapping. At the same time the author does not mention any real benefits outside of "It will satisfy OO purists".
    – MrFox
    Commented Mar 1, 2013 at 15:38
  • 2
    I appreciate the answer, but I'm looking for something more definitive and not dogmatic (as I find the linked article to be). Sometimes functional programming fits better than OO. I reject that we have to be OO for the sake of OO. Please consider your own argument in the second paragraph (I think it's more on the money?): you're saying that modeling the data will allow it to be more expressive. So does that again come down to readability? What's the point of it being expressive, and how is it worth it considering the cost of maintaining the data model in the described environment?
    – MrFox
    Commented Mar 1, 2013 at 15:45
  • 1
    @MrFox: the domain knowledge relating to a model must be represented somewhere. If it's not in the domain models, then it must be in higher-level objects. In such designs, the domain knowledge tends to be duplicated in multiple higher-level objects.
    Commented Mar 1, 2013 at 16:22
  • @MrFox I didn't put the link in there; someone else did. It's not a dogmatic approach... believe me, I railed against some of the ideas at first. But once I learned and embraced it, I was able to take my knowledge and apply the techniques to lead the design and development of the main software system for a logistics company, a domain I had no prior experience in. (By happenstance, Evans's book discusses shipping, but not nearly as in depth as what we were doing.)
    Commented Mar 1, 2013 at 17:00
  • By the way, functional programming and OO programming are more alike than you realize. If your language of choice supports one over the other, take advantage of that paradigm to the fullest. c2.com/cgi/wiki?ClosuresAndObjectsAreEquivalent
    Commented Mar 1, 2013 at 17:04

2

When I worked against Mongo we got dictionaries back. In some parts of the code we converted them to actual objects. In other places we kept them as dictionaries. In most cases, the objects were just for convenience so we could say customer.name instead of customer["name"]. In these cases we used a dotted dictionary (a class overriding __getattr__).
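
A dotted dictionary of that kind can be sketched in a few lines. This is a minimal version written for illustration, not the code we actually used, and the name `DottedDict` is made up.

```python
class DottedDict(dict):
    """dict subclass that also allows attribute-style reads."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so dict methods such as .keys() keep working.
        try:
            return self[name]
        except KeyError:
            # Raise AttributeError so hasattr() and getattr() behave normally.
            raise AttributeError(name) from None


customer = DottedDict({"name": "Ada", "city": "London"})
print(customer.name)
print(customer["name"])
```

Note that this sketch only covers reads on the top level; nested dicts are returned as plain dicts, and a key that collides with a dict method name (such as "keys") is only reachable with the bracket syntax.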

There were cases when the domain model "viewed" the data differently. In these cases we created objects with the structure we wanted and mapped our data objects to them. It can make a huge difference in readability to do this mapping once up-front rather than scattered all throughout your code.
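
As a sketch of what "mapping once up-front" can mean, assume a stored document with separate name fields and a nested address (the document layout, `CustomerView` and `to_customer_view` are all hypothetical). Everything that knows the stored layout sits in one function, and the rest of the code only sees the domain shape.

```python
from dataclasses import dataclass


@dataclass
class CustomerView:
    """Hypothetical shape the domain code wants to work with."""
    full_name: str
    city: str


def to_customer_view(doc):
    # All knowledge of the stored layout lives in this one function.
    return CustomerView(
        full_name=f"{doc['first']} {doc['last']}",
        city=doc["address"]["city"],
    )


raw = {"first": "Ada", "last": "Lovelace", "address": {"city": "London"}}
view = to_customer_view(raw)
```

If the stored layout changes, only `to_customer_view` has to change, rather than every place that would otherwise index into the raw dict.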

Another place where an actual object was useful was when validating schema. There are some Mongo wrappers out there that will allow you to define checks when grabbing and storing documents. This is useful for keeping your "schema" consistent in the face of user error. They can provide default values for missing attributes, convert dates into strings and things of that nature. There are similar libraries for working with JSON returned from a web application.
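
To show the kind of check meant here without relying on any particular wrapper's API, the sketch below uses plain Python. The `SCHEMA` table, the `REQUIRED` marker and the `normalize` function are assumptions made for illustration; real libraries define their schemas differently.

```python
from datetime import date

# Hypothetical schema: field -> (expected type, default or REQUIRED).
REQUIRED = object()
SCHEMA = {
    "name": (str, REQUIRED),
    "active": (bool, True),
    "joined": (str, REQUIRED),
}


def normalize(doc):
    """Return a cleaned copy of doc, or raise ValueError if it breaks the schema."""
    cleaned = dict(doc)
    # Convert date objects to ISO strings before type checking.
    if isinstance(cleaned.get("joined"), date):
        cleaned["joined"] = cleaned["joined"].isoformat()
    for field, (expected_type, default) in SCHEMA.items():
        if field not in cleaned:
            if default is REQUIRED:
                raise ValueError(f"missing required field: {field}")
            cleaned[field] = default  # fill in the default for a missing attribute
        elif not isinstance(cleaned[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    return cleaned
```

Running every document through one function like this on the way in and out is what keeps the "schema" consistent, whichever representation the rest of the code prefers.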

2
  • Thanks for sharing the experience. Regarding validators: my understanding is that you have to have separate validator classes or objects (functions?) anyway, because you can't rely on typing. Since you're building something that checks that all the attributes you require are present and of a specific type, doesn't the actual data representation (class, dict, generator) become completely irrelevant?
    – MrFox
    Commented Mar 1, 2013 at 15:50
  • Here's an example: MongoKit. You define your class in terms of an expected structure. It deals with wrapping the actual dict internally. There are a ton of document/JSON validators on GitHub to check out.
    Commented Mar 1, 2013 at 16:22
