5
\$\begingroup\$

Heap's algorithm is an algorithm to generate all permutations of a given array. It

... generates each permutation from the previous one by interchanging a single pair of elements; the other n−2 elements are not disturbed.

Here is my attempt to implement this algorithm in Swift 4 as a Sequence, so that all permutations of an array can be enumerated with for .. in loops and related techniques:

/* * A sequence of all permutations of a given array. Based on the * non-recursive version of Heap's algorithm as described in * https://en.wikipedia.org/wiki/Heap%27s_algorithm */ struct HeapPermutationSequence<Element>: Sequence, IteratorProtocol { private var current: [Element] private var c: [Int] private var i = 0 private var firstIteration = true init(elements: [Element]) { self.current = elements self.c = Array(repeating: 0, count: elements.count) } mutating func next() -> [Element]? { if firstIteration { firstIteration = false } else { while i < c.count { if c[i] < i { if i % 2 == 0 { current.swapAt(0, i) } else { current.swapAt(c[i], i) } c[i] += 1 i = 0 break } else { c[i] = 0 i += 1 } } } return i < c.count ? current : nil } } 

Example usage:

for p in HeapPermutationSequence(elements: [1, 2, 3]) { print(p, terminator: ", ") } print() // Output: // [1, 2, 3], [2, 1, 3], [3, 1, 2], [1, 3, 2], [2, 3, 1], [3, 2, 1], 

Several techniques from @Hamish's answer Sequence-based enumeration of permutations in lexicographic order could be applied here as well to improve the performance:

  • Avoid the cost of copy-on-write by not mutating the current array after it is returned as the next element.
  • Have a single exit point in the next() method.

Benchmark:

let N = 10 var count = 0 let start = Date() for _ in HeapPermutationSequence(elements: Array(1...N)) { count += 1 } let end = Date() print(count, end.timeIntervalSince(start)) 

which takes about 0.03 seconds (on a 1.2 GHz Intel Core m5 MacBook, compiled in Release mode).

All feedback is welcome, such as (but of course not restricted to):

  • Performance improvements.
  • Better variable names (c and i are simply copied from the Wikipedia pseudo-code).
  • Simplifying the code, or making it better readable/understandable.
\$\endgroup\$

    2 Answers 2

    3
    \$\begingroup\$

    I've been able to slash the execution time by almost half on my machine by declaring current and c as ContiguousArray:

    struct HeapPermutationSequence<Element>: Sequence, IteratorProtocol { private var current: ContiguousArray<Element> private var c: ContiguousArray<Int> private var i = 0 private var firstIteration = true init(elements: [Element]) { self.current = ContiguousArray(elements) self.c = ContiguousArray<Int>(repeating: 0, count: elements.count) } mutating func next() -> ContiguousArray<Element>? { ... } } 

    Benchmarks

    Using the same benchmarking code, compiled with -O in the terminal, on a 2.7 GHz i7 MacBook Pro :

    • Before: fluctuates between 0.126s and 0.139s
    • After: 0.07s with inferior fluctuations, thanks to current being a ContiguousArray, since such a type has predictable performance.

    I've got to 0.022s by wrapping the benchmarking code in a do statement or any new scope (Makes the code 3x faster):

    do { let N = 10 var count = 0 let start = mach_absolute_time() for _ in HeapPermutationSequence(elements: Array(1...N)) { count += 1 } let end = mach_absolute_time() print(count, Double(end - start)/Double(1e9)) } 

    🤯 Putting the original code inside a single iteration for loop for _ in 0..<1 {...} brings its execution time down to 0.020 too. Putting it inside a do statement, repeat while false, closure, while true { ... break}..., takes it back to 0.13s 👎🏻.

    N.B: The above result has only been observed on a mac (4 physical cores, 4 logical ones). On an iPhone and an iPad (both 2 physical cores), using a ContiguousArray makes the code a full second faster (2.~s vs 3.~s).


    We can gain 1 to 2ms by rearranging the conditions:

    private var notFirstIteration: Bool = false //... if notFirstIteration { //... } else { notFirstIteration = true } 

    This would eliminate The (N! - 1) unnecessary conditional jumps in the assembly code.


    As suggested by Mr. Martin: Not discarding manually the output of HeapPermutationSequence by using this code:

    for x in HeapPermutationSequence(elements: Array(1...N)) { count += x.count } 

    Gives a +/-2ms fluctuation in both solutions.

    \$\endgroup\$
    6
    • 1
      \$\begingroup\$That is interesting. My original code, now compiled with Xcode 10.1, runs in approx. 0.16 seconds, that is much slower than what I measured when I wrote the question. With your modification, it reduces to 0.03 seconds. Compiled with Xcode 9.2 both your and my code take about 0.025 seconds. Now the funny part: If I change the benchmark loop to for x in HeapPermutationSequence(elements: Array(1...N)) { count += x.count } (so that the return value is not ignored), then both your and my code run in 0.025 seconds, both with Xcode 9.2 and Xcode 10.1.\$\endgroup\$
      – Martin R
      CommentedNov 8, 2018 at 18:39
    • 1
      \$\begingroup\$I have the feeling that for some reason, the performance of Array degraded in Xcode 10.1. According to the docs, Array and ContiguousArray should have similar efficiency if the element type is a struct.\$\endgroup\$
      – Martin R
      CommentedNov 8, 2018 at 18:39
    • \$\begingroup\$@MartinR I am using Xcode 10.1, Swift 4.2, and without ignoring the result (using x), with the original code, I get between 0.128s and 0.132s (no apparent improvement). With ContiguousArray I get between 0.072s and 0.074s. Ignoring the result gives me between 0.073s and 0.075s. So I do observe a slight improvement when the result is not discarded manually before entering the loop, and letting it be released automatically.\$\endgroup\$
      – ielyamani
      CommentedNov 8, 2018 at 19:23
    • 1
      \$\begingroup\$This explains why putting the code into a do-block (or for-loop) can affect the perfomance. It also seems that some performance differences depend on whether the compiler inlines the code or not. To get rid of these effects, I have put the HeapPermutationSequence code into a separate module, and now the results are 1.5 sec (my code) vs 0.5 sec (your code) on a 3.5 GHz Intel Core i5 iMac, and independent of whether the return value is ignored or not. – (Conclusion: benchmarking is difficult!)\$\endgroup\$
      – Martin R
      CommentedNov 11, 2018 at 13:24
    • \$\begingroup\$(cont.) So ContiguousArray is indeed faster than Array, even for simple value types. It would be interesting to why that is, but that is a different question.\$\endgroup\$
      – Martin R
      CommentedNov 11, 2018 at 13:25
    1
    \$\begingroup\$

    HeapPermutationSequence computes all permutations of a given array. A common Swift idiom is not to operate on concrete types, but on protocols. In this case, we need a mutable collection type which allows efficient random access to its elements. Therefore it makes sense to implement

    struct HeapPermutationSequence<C>: Sequence, IteratorProtocol where C: RandomAccessCollection & MutableCollection { init(elements: C) { ... } mutating func next() -> C? { ... } } 

    instead, which is then applicable to more types. A simple example is an array slice:

    let slice = [1, 2, 3, 4, 5].prefix(3) for p in HeapPermutationSequence(elements: slice) { print(p) } 

    There is not much that needs to be changed, only the index calculations become a bit verbose:

    struct HeapPermutationSequence<C>: Sequence, IteratorProtocol where C: RandomAccessCollection & MutableCollection { private var current: C private var c: [Int] private var i = 0 private var firstIteration = true init(elements: C) { self.current = elements self.c = Array(repeating: 0, count: elements.count) } mutating func next() -> C? { if firstIteration { firstIteration = false } else { while i < c.count { if c[i] < i { if i % 2 == 0 { current.swapAt(current.startIndex, current.index(current.startIndex, offsetBy: i)) } else { current.swapAt(current.index(current.startIndex, offsetBy: c[i]), current.index(current.startIndex, offsetBy: i)) } c[i] += 1 i = 0 break } else { c[i] = 0 i += 1 } } } return i < c.count ? current : nil } } 

    In my tests, the benchmark still runs in the same time with this implementation.

    \$\endgroup\$
    0

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.