1
\$\begingroup\$

I'm looking for a fast way to get the subset of 2 arrays containing the same kind of objects using PHP functions.

I'm looking for the red part:

[Image: sketch of the two arrays with the wanted part marked in red]

    class Order
    {
        private $_orderId;
        private $_someOtherAttributes;
    }

I'm currently using array_uintersect() to get the intersection and array_udiff() to get rid of the intersection.

    $array_1 = array($o1, $o2, $o3, /* ... */ $on);
    $array_2 = array($o1, $o2, $o3, /* ... */ $on);

    function order_hash($object)
    {
        return sprintf('%d', $object->getOrderId());
    }

    function compare_objects($a, $b)
    {
        return strcmp(order_hash($a), order_hash($b));
    }

    $intersection = array_uintersect($array_1, $array_2, 'compare_objects');
    $subset = array_udiff($array_2, $intersection, 'compare_objects');

Is there a faster / more efficient way? I need to work with a lot of objects (more than 400 in each array), and I need to do so very often.

I'm currently using PHP Version 7.1.4.


My changes:

I changed the Order class to:

    class Order
    {
        public $_orderId;
        private $_someOtherAttributes;
    }

and compare_objects() to:

    function compare_objects($a, $b)
    {
        if ($a->_orderId > $b->_orderId) {
            return 1;
        } else if ($a->_orderId < $b->_orderId) {
            return -1;
        }
        return 0;
    }

and using array_udiff($array_1, $array_2, 'compare_objects') instead of array_udiff($array_2, array_udiff($array_1, $array_2, 'compare_objects'), 'compare_objects')

I'm aware that public attributes aren't the best practice, but changing it this way I gained 300% more speed.

\$\endgroup\$
6
  • 2
    \$\begingroup\$Can you show more context? On the surface, and from the usage example, it would seem that the two functions shown are standalone functions, but then when you look in compare_objects() you see object context ($this) being used. This doesn't make sense. If order_hash() and compare_objects() are methods of the Order class, why not show the code for the whole class for review? Right now it is hard to understand why you even put the Order class code in here.\$\endgroup\$Commented Jun 16, 2017 at 13:38
  • \$\begingroup\$The $this in compare_objects() is my mistake: I forgot to remove it in my urge to simplify my question. As soon as I'm able to work again I'll provide more code and more context. This problem itself is part of a much bigger one that uses a lot more subsets and intersections.\$\endgroup\$
    – Maui
    Commented Jun 16, 2017 at 13:54
  • \$\begingroup\$From your sketch it looks like it would be enough to do $subset = array_udiff($array_2, $array_1, 'compare_objects');. Any object not in $array_2 will just be skipped. However, this question looks to be off-topic here, since we want the real code and not just some dumbed-down minimal version (in contrast to Stack Overflow).\$\endgroup\$
    – Graipher
    Commented Jun 16, 2017 at 16:03
  • \$\begingroup\$I already described the whole problem in one post and was told that I should close it and reopen it at [StackOverflow], which is the place I was sent here from. @Graipher I'll rewrite that post and open it again, as this question is pretty much answered. (stackoverflow.com)\$\endgroup\$
    – Maui
    Commented Jun 16, 2017 at 17:12
  • \$\begingroup\$Don't change your code for your question. If you are trying to get a code review, you should show us your actual code.\$\endgroup\$Commented Jun 16, 2017 at 20:26

1 Answer

1
\$\begingroup\$

As far as these things go, I think your performance is going to be close to ideal. There is maybe a little room for improvement, namely:

I'm pretty sure all you need is array_udiff. My suspicion (and some quick tests) suggest that this:

$subset = array_diff( $array_2, array_intersect( $array_1, $array_2 ) ); 

Is exactly equivalent to this:

$subset = array_diff( $array_2, $array_1 ); 

(Some details excluded for brevity.) So save yourself an expensive function call.

My only other suggestion would be a more "global" change that may or may not be helpful. In your example it looks like you don't care about the keys. In that case, use them. When you build your array of objects, use order_hash( $obj ) as the key. Then you can do array_diff on the array_keys, without having to worry about the compare_objects function, i.e.:

    $array1 = [
        order_hash( $o1 ) => $o1,
        order_hash( $o2 ) => $o2,
    ];
    $array2 = [ /* ... same thing ... */ ];

    // Compare the keys of both arrays; each array_keys() call takes one array.
    $keys_to_check = array_diff( array_keys( $array2 ), array_keys( $array1 ) );

    foreach ( $keys_to_check as $key ) {
        do_something( $array2[$key] );
    }

It is a different way of organizing your data structures, and it takes a little more awareness about how your objects are stored. However, if these arrays of objects are being passed back and forth frequently, and if you are frequently doing these kinds of comparisons, it can save you a lot of time (I think) because you only have to do your hash calculation once per object, instead of doing it once per object per intersect.
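A minimal, self-contained sketch of this keyed layout follows. The Order class with a constructor and getOrderId(), and the helper index_by_order_id(), are assumptions made for illustration and are not taken from the question's real code. It also uses the built-in array_diff_key() rather than array_diff on array_keys, which compares keys directly and returns the objects themselves.

```php
<?php
// Hypothetical stand-in for the Order class from the question.
final class Order
{
    private $orderId;

    public function __construct(int $orderId)
    {
        $this->orderId = $orderId;
    }

    public function getOrderId(): int
    {
        return $this->orderId;
    }
}

// Hypothetical helper: index a list of orders by order ID,
// so the ID is read once per object when the array is built.
function index_by_order_id(array $orders): array
{
    $indexed = [];
    foreach ($orders as $order) {
        $indexed[$order->getOrderId()] = $order;
    }
    return $indexed;
}

$array1 = index_by_order_id([new Order(1), new Order(2), new Order(3)]);
$array2 = index_by_order_id([new Order(2), new Order(3), new Order(4), new Order(5)]);

// Orders that are in $array2 but not in $array1, compared by key only.
$subset = array_diff_key($array2, $array1);

$subsetIds = array_keys($subset); // [4, 5]
```

Because the comparison works on the keys, no user-defined comparison callback is needed here.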

I frequently organize my structures this way (and do lots of diffs on them), and I find it easier to follow. Basically, the thinking changes from:

Use this function to compare these arrays and see which ones are different

to

Check the keys of this array to see which ones are different

Personally I prefer the latter.
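Going back to the first suggestion (dropping the intersection step), the equivalence can be checked with a small sketch. The SimpleOrder class, the comparator compare_by_order_id() and the helper order_ids() are hypothetical names made up for this example; the comparator uses the spaceship operator, which is available from PHP 7 on.

```php
<?php
// Hypothetical minimal class; only a public ID is modelled.
final class SimpleOrder
{
    public $_orderId;

    public function __construct(int $orderId)
    {
        $this->_orderId = $orderId;
    }
}

// Comparator using the spaceship operator: returns -1, 0 or 1.
function compare_by_order_id(SimpleOrder $a, SimpleOrder $b): int
{
    return $a->_orderId <=> $b->_orderId;
}

// Helper for readable results: extract the IDs in ascending order.
function order_ids(array $orders): array
{
    $ids = array_map(function (SimpleOrder $o) {
        return $o->_orderId;
    }, $orders);
    sort($ids);
    return $ids;
}

$array_1 = [new SimpleOrder(1), new SimpleOrder(2), new SimpleOrder(3)];
$array_2 = [new SimpleOrder(2), new SimpleOrder(3), new SimpleOrder(4), new SimpleOrder(5)];

// Two-step version: intersect first, then remove the intersection from $array_2.
$intersection = array_uintersect($array_1, $array_2, 'compare_by_order_id');
$twoStep = array_udiff($array_2, $intersection, 'compare_by_order_id');

// One-step version: remove everything that is in $array_1 from $array_2.
$oneStep = array_udiff($array_2, $array_1, 'compare_by_order_id');

$twoStepIds = order_ids($twoStep); // [4, 5]
$oneStepIds = order_ids($oneStep); // [4, 5]
```

Both versions leave the same orders, so the intersection call only adds work.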

\$\endgroup\$
1
  • \$\begingroup\$This is a way I haven't thought of organizing my data. The use of array_diff( $array_2, $array_1 ) instead of array_diff( $array_2, array_intersect( $array_1, $array_2 ) ); is the right way.\$\endgroup\$
    – Maui
    Commented Jun 16, 2017 at 17:13
