3
\$\begingroup\$

I want to create a struct-of-arrays (SOA) database for better cache use but also have a separate interface that gives encapsulation. I am close to the design I think but would like more experienced coding eyes to have a look to critique my design. Here is what I have so far:

struct OrbitalDataBase { OrbitalDataBase(std::shared_ptr<FEMDVR> a_radial_grid, std::shared_ptr<AngularGrid> a_angular_grid) : nbas(a_radial_grid->getNbas()), number_of_lm_pairs_per_orbital(), m_orbital_coefs() { m_orbitals_coefs.reserve(a_radial_grid->getNbas() * a_angular_grid->getNumChannels()); } inline void StoreData(const uint32_t &a_ith_orbital, std::shared_ptr<MOPartialWaveRepresentation> a_MO) { this->number_of_lm_pairs_per_orbital.push_back(a_MO->getNumChannels()); this->m_orbital_coefs.reserve(a_ith_orbital * nbas * a_MO->getNumChannels()); for (uint32_t i = 0; i < nbas * a_MO->getNumChannels(); ++i) { this->m_orbital_coefs.push_back(a_MO->getPartialWaveRep(i)); } } /* private: */ uint32_t nbas; std::vector<uint32_t> number_of_lm_pairs_per_orbital; std::vector<std::complex<double>> m_orbital_coefs; }; struct OrbitalAccessor { OrbitalAccessor(const OrbitalDataBase &a_orbital_data) : orbital_data(a_orbital_data){}; inline std::complex<double> coef(uint32_t ith_orbital, uint32_t i) const { return this->orbital_data.m_orbital_coefs [ith_orbital * this->orbital_data.nbas * this->orbital_data.number_of_lm_pairs_per_orbital[ith_orbital] + i]; } private: const OrbitalDataBase orbital_data; }; 

How this would work is construct the database with the number of orbitals and then in a for-loop, save all the ith_orbital's coefficients in one large array. Something like this

uint32_t number_of_orbitals = 5; OrbitalDataBase orb_data_base(number_of_orbitals); for (uint32_t ith_orbital = 0; ith_orbital < number_of_orbitals < ++ith_orbital) { Orbital(); orb_data_base.StoreData(ith_orbital, femdvr_grid, orbital); } 

The Orbital() function just creates the orbital I want to store. Then the OrbitalAccessor should be able to access these large arrays index with stride ith_orbital. I think I should pass the accessor the created database I want to access in a constructor like

 int which_orbital = 1, int index 4; // example looks at 2nd orbital's 5th coefficient OrbitalAccessor orb_accessor(orb_data_base); orb_accessor.coef(which_orbital, index); 

I tried to have the ObitalDataBase member data private but I get

'm_orbitals_coefs' is a private member of 'OrbitalDataBase'

so I commented the private: keyword so it will compile. I think they should be private for encapsulation so I'd prefer that if I am correct.

I think a better way would be to have OrbitalAccessor derive the OrbitalDataBase rather than pass it in to a constructor.

UPDATE: I changed the code a little so now the m_orbital_coef has a base vector capacity and resizes if it becomes larger because a_MO->getNumChannels() can be larger than a_angular_grid->getNumChannels().

\$\endgroup\$
4
  • 1
    \$\begingroup\$if it doesn't compile you have to fix it prior to posting here\$\endgroup\$CommentedNov 22, 2020 at 20:36
  • \$\begingroup\$With SOA, do you mean "Service-oriented architecture"? Please expand acronyms you use at least once so people know what exactly you are talking about.\$\endgroup\$CommentedNov 22, 2020 at 23:37
  • 1
    \$\begingroup\$Please do not update the code in your question after receiving answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.\$\endgroup\$
    – Mast
    CommentedNov 23, 2020 at 19:56
  • \$\begingroup\$@Mast Ah shoot sorry, thank you for the guidance!\$\endgroup\$
    – zstreet
    CommentedNov 23, 2020 at 20:08

1 Answer 1

2
\$\begingroup\$

There is a lot missing from this question, but there is enough code to be able to tell that a lot that could be improved. Here are some ideas for improving your code.

Use all of the required #includes

All of the #includes for this code are missing in the description. I believe the required list is this:

#include <complex> #include <cstdint> #include <memory> #include <vector> 

Provide complete code to reviewers

This is not so much a change to the code as a change in how you present it to other people. Without the full context of the code and an example of how to use it, it takes more effort for other people to understand your code. This affects not only code reviews, but also maintenance of the code in the future, by you or by others. One good way to address that is by the use of comments. Another good technique is to include test code showing how your code is intended to be used. There is a bit of that in your question, so we'll start from there.

Start with the desired interface

It's usually very useful to start by imagining how we'd like to use the code before implementing the interface. It appears that there exists some external source of orbital data (I'm guessing TLE) which this code is supposed to read and parse (not shown) and then provide storage and access methods. First, rather than two separate classes, OrbitalDataBase and OrbitalAccessor, it seems clear to me that there should rather be a single OrbitalDataBase and that the accessor should be a member function. That leads to the first simplification:

std::complex<double> coef(uint32_t ith_orbital, uint32_t i) const { return m_orbital_coefs[ith_orbital * nbas * number_of_lm_pairs_per_orbital[ith_orbital] + i]; } 

Don't write this->

Within member functions this-> is redundant. It adds visual clutter and does not usually aid in understanding.

Use more whitespace for clarity

It's very difficult to parse the OrbitalDataBase constructor because it's all run together with very little whitespace. I'd modify it to look more like this:

struct OrbitalDataBase { OrbitalDataBase(std::shared_ptr<FEMDVR> a_radial_grid, std::shared_ptr<AngularGrid> a_angular_grid) : nbas(a_radial_grid->getNbas()) , number_of_lm_pairs_per_orbital() , m_orbital_coefs() { m_orbital_coefs.reserve(a_radial_grid->getNbas() * a_angular_grid->getNumChannels()); } 

Don't explicitly invoke implied constructors

The constructor, reformatted as above, reveals that the default and implied constructors for number_of_lm_pairs_per_orbital and m_orbital_coefs are explicitly called. This is both unnecessary and unhelpful. Simply omit those two lines.

Prefer const references to smart pointers

If you are passing in data, such as the FEMDVR and AngularGrid in this code, that are used but not altered, this implies const. Further, if we are not sharing ownership, which appears to be the case here, these should be references rather than std::shared_ptr.

OrbitalDataBase(const FEMDVR& a_radial_grid, const AngularGrid& a_angular_grid) : nbas(a_radial_grid.getNbas()) { m_orbital_coefs.reserve(nbas * a_angular_grid.getNumChannels()); } 

Understand modern use of inline

Based on how you're using it, it appears that you are abusing the inline keyword to get an effect that the compiler is already going to provide without it. That is the inline specifier for functions used to be a request to the compiler to optimize by putting the code inline rather than creating an explicit function call. However, this was only ever a suggestion, and the changing semantics of inline for C++17 and beyond are likely to lead to trouble unless very carefully used. The reason is that, starting with C++17, it tells the compiler that multiple defifinitions are allowed, but if they differ in any way, invoking the function is undefined behavior. For these reasons, you're better off omitting inline in most cases.

Rethink your data structures

Right now getting a value back uses this rather complex form:

std::complex<double> coef(uint32_t ith_orbital, uint32_t i) const { return m_orbital_coefs[ith_orbital * nbas * number_of_lm_pairs_per_orbital[ith_orbital] + i]; } 

This is very cache unfriendly because it has to look up one value from number_of_lm_pairs_per_orbital and then use that to index into another vector. It may make things more clear and give better performance if instead you defined a struct or class that encapsulated both. The obvious would be this:

struct Orbital { std::vector<std::complex<double>> m_orbital_coefs; }; 

Then OrbitalDataBase could use std::vector<Orbital> as its only data member, since the number of coefficients is m_orbital_coefs.size() and nbas is now m_orbital_coefs.size().

Even simpler might be this:

using Orbital = std::vector<std::complex<double>>; 
\$\endgroup\$
1
  • \$\begingroup\$I am confused with your suggestion for restructuring the data, could you explain a little more? The cache performance is from saving all the orbitals coefficients in one vector strided by nbas * number_of_lm_pairs_per_orbital[ith_orbital], so all of the elements for each orbital are contiguous in memory.\$\endgroup\$
    – zstreet
    CommentedNov 23, 2020 at 17:11

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.