Finding a way to simplify complex queries on legacy application

Question

I am working with an existing application built on Rails 3.1/MySql with much of the work taking place in a JavaScript interface, although the actual platforms are not tremendously relevant here, except in that they give context.

The application is powerful, handles a reasonable amount of data and works well. As the number of customers using it and the complexity of the projects they create increases, however, we are starting to run into a few performance problems. As far as I can tell, the source of these problems is that the data represents a tree and it is very hard for ActiveRecord to deterministically know what data it should be retrieving.

My model has many relationships like this:

Project has_many Nodes has_many GlobalConditions Node has_one Parent has_many Nodes has_many WeightingFactors through NodeFactors has_many Tags through NodeTags GlobalCondition has_many Nodes ( referenced by Id, rather than replicating tree ) WeightingFactor has_many Nodes through NodeFactors Tag has_many Nodes through NodeTags

enter image description here

The whole system has something in the region of thirty types which optionally hang off one or many nodes in the tree.

My question is: What can I do to retrieve and construct this data faster?

Having worked a lot with .Net, if I was in a similar situation there, I would look at building up a Stored Procedure to pull everything out of the database in one go but I would prefer to keep my logic in the application and from what I can tell it would be hard to take the queried data and build ActiveRecord objects from it without losing their integrity, which would cause more problems than it solves.

It has also occurred to me that I could bunch the data up and send some of it across asynchronously, which would not improve performance but would improve the user perception of performance. However if sections of the data appeared after page load that could also be quite confusing.

I am wondering whether it would be a useful strategy to make everything aware of it's parent project, so that one could pull all the records for that project and then build up the relationships later, but given the ubiquity of complex trees in day to day programming life I wouldn't be surprised if there were some better design patterns or standard approaches to this type of situation that I am not well versed in.

The last Rails 3.1 release was March 18, 2013. The app was built on that, and it's already legacy? — l0b0, CommentedAug 20, 2014 at 12:31
You might want to change the title then, either to "[expletive] application" or the more descriptive "untested application". — l0b0, CommentedAug 20, 2014 at 12:46
@Eric & l0b0: that is just Michael Feather's definition (which is not bad, but not the only popular one). Other people call everything "legacy" which has gone under maintenance. — Doc Brown, CommentedAug 20, 2014 at 14:05
This code has reasonable test coverage here. I am using "legacy" to mean "has been running for a few years, features and components bolted on over time, written by different people most of whom are not me, still in constant use, not candidate for rewriting on a different platform or changing core libraries." Maybe "Legacy" isn't the term for that? Possibly "Production"? — glenatron, CommentedAug 20, 2014 at 14:12

Doc Brown · Accepted Answer · 2014-08-22 11:15:01Z

whether it would be a useful strategy to make everything aware of it's parent project

That is a standard strategy. It is called "denormalization". Assumed

your "project elements" never change the project they belong to
your project's primary ID key will never change
the additional memory needed for the IDs is no problem
it actually solves your performance problems

you should apply it.

Another approach is trying to add some views to your database, encapsulating some complex JOIN operations (I assume your database is properly indexed so the JOIN itself won't slow down things more than necessary). I never used Rails by mayself, but AFAIK ActiveRecord will work on views. So for example by joining 2 tables of a parent-child relationship together, instead of having a first SELECT SQL to determine a set of parent objects, and a list of SELECT SQLs, each one for the set of child objects of a specific parent ID, you pull all your data from the joined view with just one select SQL. Of course, this will work best when you need the data in joined form and don't want to write changes back to your DB system.

Thank you, this is useful- the system I am working with is theoretically well designed but becoming impractical in real-world operation, so I think denormalisation may well be helpful. I will look into views- updates come in through a separate queue already, so read-only behaviours would not be a problem. — glenatron, CommentedAug 22, 2014 at 11:22

Stack Exchange Network

Finding a way to simplify complex queries on legacy application

1 Answer 1

Hot Network Questions

Finding a way to simplify complex queries on legacy application

1 Answer 1

Related

Hot Network Questions