34
\$\begingroup\$

This question is always on my mind. I usually alter the array while looping through its keys, based on my gut feelings:

foreach (array_keys($array) as $key) { $array[$key] = perform_changes_on($array[$key]); } 

I'm aware of other methods, though. The array item can be passed by reference on the foreach call:

foreach ($array as &$item) { $item = perform_changes_on($item); } 

Finally, the array can be modified directly during the loop:

foreach ($array as $key => $value) { $array[$key] = perform_changes_on($value); } 

What are the performance and security implications of each approach? Is there a recommended approach?

UPDATE: What I'm actually worried about is that the two last approaches modify the array while I'm looping it.

\$\endgroup\$
2

5 Answers 5

21
\$\begingroup\$

If you are worried about modifying the array while looping, don't because foreach copies the array. Try

$array = array('foo'); foreach ($array as $k => &$item) { $array[] = 'bar'; } var_dump($array); 

And see it terminates just fine. foreach ($array as $k => &$v) is a shorthand for foreach (array_keys($array) as $k) $v = &$array[$k] so while there still is a copy of the array (that's why I used &$item in my example so you can see, if you modify the array then it'll be modified in the reference!

$array = array('foo', 'bar'); foreach ($array as $k => $item) { echo "$item\n"; if (!$k) { $array[1] = 'baz'; } } $array = array('foo', 'bar'); foreach ($array as $k => &$item) { echo "$item\n"; if (!$k) { $array[1] = 'baz'; } } 

the first dump foo and bar, the second foo and baz.

\$\endgroup\$
10
  • \$\begingroup\$"foreach ($array as $k => $v) is a shorthand for foreach (array_keys($array) as $k) $v = &$array[$k]" On a very picky note, that's not a shorthand as much as a difference in calling a function and a language construct (though for practical reasons, I suppose it's a shorthand). Also, it would be shorthand for a copy, not a reference. (Also, PHP is copy-on-write, so you can typically ignore extra copies.)\$\endgroup\$
    – Corbin
    CommentedJun 20, 2012 at 1:46
  • \$\begingroup\$Obviously that wanted to be foreach ($array as $k => &$v) edited and fixed.\$\endgroup\$
    – chx
    CommentedJun 20, 2012 at 5:46
  • \$\begingroup\$Would like to point out that if(!$k) is TRUE for 0, otherwise this statement is always false. So essentially you are overwriting the second array element before ever seeing it.\$\endgroup\$
    – mseancole
    CommentedJun 20, 2012 at 13:31
  • \$\begingroup\$I do that and by doing that, I am demonstrating that the first foreach works from a copy and the second doesn't.\$\endgroup\$
    – chx
    CommentedJun 20, 2012 at 16:39
  • \$\begingroup\$@chx You might want to run your first snippet then. $item is a copy of a value from $array. $array is never copied (though all elements of $array will be copied if the foreach loop completes). The first snippet will dump foo and baz just like the second one.\$\endgroup\$
    – Corbin
    CommentedJun 22, 2012 at 3:33
10
\$\begingroup\$

This sounds like a place for array_map().

$array = array_map('perform_changes_on', $array); 
\$\endgroup\$
4
  • \$\begingroup\$nice, but sometimes i won't change every item in the array. I just tried to make the example simpler...\$\endgroup\$CommentedJun 18, 2012 at 17:55
  • 1
    \$\begingroup\$perfom_changes_on doesn't have to change every element. Note that there are other array functions, some more powerful than others. array_reduce can help a lot, for example.\$\endgroup\$CommentedJun 19, 2012 at 11:46
  • \$\begingroup\$Doesn't this involve a function call for every element? May not be the best performance wise.\$\endgroup\$
    – Tom
    CommentedDec 8, 2014 at 18:31
  • \$\begingroup\$@Tom: I wouldn't say the overhead of function call is notable. Maintainability is way more important than very small performance gains (that I'm not even sure if they even do exist). But if you think that writing 100,000 lines program without functions is a best way of programming, by all means do it. Just don't be surprised when nobody wants to maintain the code, and you won't understand what you wrote a day later.\$\endgroup\$
    – null
    CommentedDec 8, 2014 at 20:34
3
\$\begingroup\$

When using foreach($array as &$item) never ever forget the unset($item); after the foreach or you will get into serious trouble trying to use $item later. It should be habitual to avoid this trap.

In general you should avoid foreach ...& and do array_walk($array, function (&$item) {... so that the reference is strictly confined inside the closure.

\$\endgroup\$
1
  • \$\begingroup\$thanks for the answer, I'm using array_keys to avoid the trap, but do I need to use it? I mean, foreach ($array as $key => $item) should let me change $array[$key] value, but who knows how php works internally... that's the question, actually.\$\endgroup\$CommentedJun 19, 2012 at 17:02
2
\$\begingroup\$

I can't really help you with the performance bit, only tell you to wrap them in microtime() tags and see which one performs better for you. Systems are slightly different. The one suggestion I can give you is to remove array_keys() from your code.

!!UPDATE!!

If you were following Corbin and my argument below, then I finally have an answer for you. I was getting for and foreach confused. For and while loops do call any functions passed in as arguments on every iteration, foreach does not. Which is why its better to call functions such as strlen() and count(), just to give a couple of examples, outside of a for or while loop. The overhead we were experiencing was not from foreach but from array_keys(). When array_keys() is called it must generate a new array, which is why it is almost twice as slow. So it is best to drop the array_keys() function all-together and just iterate over the array and retrieve the key value pair. Sorry for any confusion this may have caused.

Sources:

!!END OF UPDATE!!

To the best of my knowledge there is no security risk with any of those implementations. You are iterating a construct that already exists. Any security issues would have happened before this point. Except of course if you are using user supplied data, such as GET and POST. These you should sanitize and validate before using, which is something you could do with one of those foreach loops. Or you could also check out filter_input_array() and its cousins.

I know I personally would not use the second implementation due to the lack of legibility. At least with the first and third implementations you can visually see that a value is being changed. The second is not readily obvious. However, it is most likely the more efficient. I have used both the first and third myself, but more often use the third. Think it has to do with what mood I'm in. Hope this helps, I'm interested to hear what others might have to say :)

\$\endgroup\$
14
  • \$\begingroup\$"Arguments given to foreach are declared on each iteration, so you are technically calling array_keys() every time foreach is looped." Nope, it will only call array_keys once in this context. (Perhaps before PHP5 it had this behavior, but I do not think so as this would very easily end up in infinite loops.)\$\endgroup\$
    – Corbin
    CommentedJun 18, 2012 at 17:11
  • \$\begingroup\$@Corbin: Nope, still an issue. Don't believe me, run some tests on it and you'll see for yourself. PHP creates an iterator object for the first parameter each time it loops so that it can perform each() on each iteration of it. It is recreated, but the pointer stays the same. That's why if you changed the value of the array by reference it would be immediately available upon the next iteration :)\$\endgroup\$
    – mseancole
    CommentedJun 18, 2012 at 18:09
  • \$\begingroup\$Can you provide an example? Just tried to get it to call array_keys (well, f) twice, and could never get it to.\$\endgroup\$
    – Corbin
    CommentedJun 18, 2012 at 18:20
  • \$\begingroup\$Potentially relevant: stackoverflow.com/questions/4043569/…\$\endgroup\$
    – Corbin
    CommentedJun 18, 2012 at 18:35
  • \$\begingroup\$@Corbin: I'm honestly not sure then. I've seen someone else claiming the exact opposite, and my speed tests seem to prove it. The foreach with the "recreated" array always seems to take longer. Ran a test of over a hundred iterations and all seem conclusive that something extra is being performed in the background. I'm still looking around for that post, hopefully I can find it. I need to start keeping a repository of these posts...\$\endgroup\$
    – mseancole
    CommentedJun 18, 2012 at 19:17
0
\$\begingroup\$

I'd go with your second approach:

foreach ($array as &$item) { $item = perform_changes_on($item); } 

Or even better:

function perform_changes_on(&$item) { // ... } foreach ($array as &$item) { perform_changes_on($item); } // ... 

Working on each item (by reference) seems the safest one, since you're not accessing the array's structure, but its single elements instead (which, I take, is what you want to do).

\$\endgroup\$

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.