
Input

Two integers: n, the number of workers, and m, the number of jobs to be scheduled, followed by m lines each giving the time t taken by a particular job to complete.

Workers (or threads) take jobs in the order they are given in the input. If a thread is free, it immediately takes the next job from the list. Once a thread has started processing a job, it doesn't interrupt or stop until it finishes processing that job. If several threads try to take jobs from the list simultaneously, the thread with the smaller index takes the job. For each job you know exactly how long it will take any thread to process it, and this time is the same for all threads. You need to determine, for each job, which thread will process it and when it will start processing.

Output

m lines; the i-th line contains two space-separated integers: the 0-based index of the thread that will process the i-th job, and its start time.

Constraints

0 ≤ n ≤ 10⁵ (number of workers)

0 ≤ m ≤ 10⁵ (number of jobs)

0 ≤ t ≤ 10⁹ (time taken for each job)

My Approach

I have used a heap as a priority queue of the threads to determine the next worker. The workers are prioritized first by their next free time and then by their index. To represent these two quantities I've used std::pair from the C++ STL.

Here is my code:

    #include <algorithm>
    #include <iostream>
    #include <vector>

    using std::cin;
    using std::cout;
    using std::make_pair;
    using std::pair;
    using std::swap;
    using std::vector;

    class JobQueue {
     private:
      int num_workers_;
      vector<int> jobs_;
      vector<int> assigned_workers_;
      vector<long long> start_times_;

      void WriteResponse() const {
        for (int i = 0; i < jobs_.size(); ++i) {
          cout << assigned_workers_[i] << " " << start_times_[i] << "\n";
        }
      }

      void ReadData() {
        int m;
        cin >> num_workers_ >> m;
        jobs_.resize(m);
        for (int i = 0; i < m; ++i) cin >> jobs_[i];
      }

      void heapify(vector<pair<int, long long>> &heap, int k) {
        int smallest = k;
        int left = (2 * k) + 1;
        int right = (2 * k) + 2;
        if (left < heap.size()) {
          if (heap[left].second < heap[smallest].second)
            smallest = left;
          else if (heap[left].second == heap[smallest].second) {
            if (heap[left].first < heap[smallest].first) smallest = left;
          }
        }
        if (right < heap.size()) {
          if (heap[right].second < heap[smallest].second)
            smallest = right;
          else if (heap[right].second == heap[smallest].second) {
            if (heap[right].first < heap[smallest].first) smallest = right;
          }
        }
        if (smallest != k) {
          swap(heap[k], heap[smallest]);
          heapify(heap, smallest);
        }
      }

      void changePriority(vector<pair<int, long long>> &heap, int incr) {
        heap[0].second += incr;
        heapify(heap, 0);
      }

      void AssignJobs() {
        assigned_workers_.resize(jobs_.size());
        start_times_.resize(jobs_.size());
        vector<pair<int, long long>> next_free_time(num_workers_);
        for (int i = 0; i < num_workers_; i++) {
          next_free_time[i].first = i;
          next_free_time[i].second = 0;
        }
        for (int i = 0; i < jobs_.size(); ++i) {
          pair<int, long long> next_worker = next_free_time[0];
          assigned_workers_[i] = next_worker.first;
          start_times_[i] = next_worker.second;
          changePriority(next_free_time, jobs_[i]);
        }
      }

     public:
      void Solve() {
        ReadData();
        AssignJobs();
        WriteResponse();
      }
    };

    int main() {
      std::ios_base::sync_with_stdio(false);
      JobQueue job_queue;
      job_queue.Solve();
      return 0;
    }

Complexity

If J is the number of jobs and T the number of threads, the complexity here (I might be wrong, please correct me if I am) should be O(J log T).

Question

The code exceeds the time limit in certain cases; how can I optimize it? (Time limit: 1 sec)


    2 Answers


    Know your standard library! In this case, there's a heap container provided for us, called std::priority_queue. If we use it, we can remove all of the heap management code we've reinvented, and focus on the algorithm itself. Note that we will need to pop() and push_back() to reposition a thread when we assign work; this operation remains O(log T), and is likely much better than re-heapifying the entire heap at each iteration.

    Remember that std::pair already has a < operator that does exactly what we want, if we order it so that next free time is the first element and thread index the second.
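    As a concrete sketch of these two points (the function and type names here are mine, not from the original code), the whole assignment loop collapses to a `std::priority_queue` of time-first pairs:

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// (next free time, 0-based worker index): with the time first, std::pair's
// built-in operator< already breaks ties on the smaller index for us.
using Worker = std::pair<long long, int>;

// Returns one (worker index, start time) pair per job duration.
std::vector<std::pair<int, long long>> assign_jobs(int num_workers,
                                                   const std::vector<int>& jobs) {
    // std::greater turns the default max-heap into the min-heap we need.
    std::priority_queue<Worker, std::vector<Worker>, std::greater<Worker>> heap;
    for (int i = 0; i < num_workers; ++i) heap.push({0LL, i});

    std::vector<std::pair<int, long long>> result;
    result.reserve(jobs.size());
    for (int t : jobs) {
        Worker w = heap.top();
        heap.pop();                              // O(log T)
        result.push_back({w.second, w.first});   // this worker starts now
        heap.push({w.first + t, w.second});      // busy until start + t, O(log T)
    }
    return result;
}
```

    With two workers and job times {1, 2, 3, 4, 5} this yields (0, 0), (1, 0), (0, 1), (1, 2), (0, 4): ties on free time go to the smaller index, exactly as the problem statement requires.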

    The split between main() and JobQueue seems a bit stilted to me. I'd expect main() to read the initial inputs and use them to create an appropriately-dimensioned JobQueue, then feed inputs to it, and finally inspect each start time and print it. At present, JobQueue has two responsibilities mixed together: it both calculates and performs I/O, and the two are hard to separate (e.g. for unit testing).

    I don't think there's any need to store the job values or the start times - we can simply return the start time from the insert_job() method and print it immediately.

    To store job durations, long long could be overkill: for a range of 0 to 10⁹ we need only 30 bits, so uint_fast32_t would do. For the number of workers, std::size_t is probably most appropriate, as we'll use that for the capacity of our heap.

    Start times are a different matter: they can reach 10¹⁴ (10⁵ jobs of up to 10⁹ each), so 32 bits won't hold them, yet a full long long may still be more than we need. We could possibly get away with maintaining a single "epoch" value and storing only the last few dozen bits in a uint_fast32_t in the pair, using one or more high-order bits to mark whether a value is relative to the current epoch or to the previous one. Remember that with each individual job limited to 10⁹ units, there are only two valid epochs to consider at any given time.
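    A minimal sketch of the epoch idea (this is my own simplification: an explicit shared 64-bit base plus 32-bit offsets, rather than bit-twiddling epoch flags inside the pair):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shared "epoch" base plus per-worker 32-bit offsets. This relies on the
// scheduler's access pattern: only the *least-loaded* worker ever advances,
// so all offsets stay within ~10^9 of each other and a rebase always
// brings them back near zero before uint32_t can overflow.
struct CompressedTimes {
    long long base = 0;            // shared epoch, grows monotonically
    std::vector<uint32_t> offset;  // per-worker (time - base)

    explicit CompressedTimes(std::size_t n) : offset(n, 0) {}

    long long absolute(std::size_t i) const { return base + offset[i]; }

    void advance(std::size_t i, uint32_t dt) {
        offset[i] += dt;  // each job fits in 32 bits (t <= 1e9)
        // Rebase once any offset risks approaching uint32_t overflow.
        if (offset[i] > (uint32_t{1} << 31)) rebase();
    }

    void rebase() {
        uint32_t lo = *std::min_element(offset.begin(), offset.end());
        for (auto& o : offset) o -= lo;  // shift everyone down...
        base += lo;                      // ...and fold the shift into the base
    }
};
```

    Whether this beats a plain long long in practice would need measuring; it trades arithmetic simplicity for a smaller pair.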


      Sadly, I have to disagree with some parts of Toby Speight's answer.

      Heap performance

      Using a std::priority_queue (or falling back on std::make_heap) sounds like a good idea at first, until you look at what heapify actually does.

      • std::priority_queue::pop basically has to move the last element into the first place and then bubble it down, which likely takes \$\mathcal{O}(\log T)\$ bubble-down operations (since that element was on the lowest level before, it will likely return there).

      • std::priority_queue::push (not push_back) then inserts the changed element at the last position of the heap and bubbles it up until it reaches its final position. How far it rises depends on the actual time values in the heap; call this \$R\$ bubble-up operations.

      • heapify instead changes the value of the front element directly, and then bubbles it down. Using the values from above, this will take \$\mathcal{O}(\log T) - R\$ bubble down operations.

      Unless the newly inserted value ends up on the lowest level of the heap (i.e. \$R = 0\$), heapify will require fewer operations (\$\mathcal{O}(\log T) - R\$) than pop + push (\$\mathcal{O}(\log T) + R\$).
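      The in-place root update being defended here can be sketched as follows (names are mine; I've used the time-first pair ordering suggested in the other answer so that pair's operator< does the comparison — std::priority_queue exposes no such operation, so this needs a plain vector as a binary min-heap):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Worker = std::pair<long long, int>;  // (next free time, index)

// Restore the min-heap property below position k with a single
// pass down the tree — the only work heapify needs after a root update.
void sift_down(std::vector<Worker>& heap, std::size_t k) {
    const std::size_t n = heap.size();
    for (;;) {
        std::size_t smallest = k, l = 2 * k + 1, r = 2 * k + 2;
        if (l < n && heap[l] < heap[smallest]) smallest = l;
        if (r < n && heap[r] < heap[smallest]) smallest = r;
        if (smallest == k) return;
        std::swap(heap[k], heap[smallest]);
        k = smallest;
    }
}

// Assign the next job (duration t) to the root worker and update in place:
// no pop-then-push, hence no bubble-up phase at all.
std::pair<int, long long> take_job(std::vector<Worker>& heap, int t) {
    auto [free_at, idx] = heap.front();  // copy out the root
    heap.front().first += t;             // increase the root's key...
    sift_down(heap, 0);                  // ...and fix the heap in one descent
    return {idx, free_at};
}
```

      For two workers starting idle, {{0, 0}, {0, 1}}, successive calls with job times 1, 2, 3, 4, 5 return (0, 0), (1, 0), (0, 1), (1, 2), (0, 4) — the same schedule, with one sift per job instead of two.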

      Time compression

      The suggested time-compression format, while likely workable, seems a bit overkill. Unless testing shows it's measurably faster in the intended use case, I'd stay with the more readable current version.

      • @TobySpeight: Usually, I would have written a comment about this instead of an answer, but I couldn't fit my arguments into one. I fully agree with the other parts of your answer not mentioned here, though.
        – hoffmale, Commented Dec 4, 2018 at 1:21
