Simplified question: I have a dataset of how well certain agents perform on certain tasks, and based on this I've trained a model that can make predictions for newly incoming tasks. I'd like to make an algorithm that distributes tasks amongst agents to maximise performance.
One way would be to make a prediction for every scenario, i.e. every possible assignment of agents to the new set of tasks (note that one agent can work on more than one task at a time), then sum the predicted performance for each scenario and pick the maximum.
This, however, would be very expensive: if each task goes to exactly one agent, then with m agents and n tasks there are m^n possible assignments to iterate through before finding the maximum.
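To make the cost concrete, here is a minimal sketch of that exhaustive search. It assumes the predictions are already stored in a matrix `pred[a][t]` (higher is better) and that each task goes to exactly one agent. The function name `brute_force_assign` and the optional per-agent `capacity` limit are hypothetical and only there for illustration.

```python
from itertools import product


def brute_force_assign(pred, capacity=None):
    """Try every assignment of tasks to agents and keep the best one.

    pred[a][t] is the predicted performance of agent a on task t.
    capacity (hypothetical) caps how many tasks one agent may take.
    """
    n_agents = len(pred)
    n_tasks = len(pred[0])
    best_total, best_assignment = float("-inf"), None

    # candidate[t] is the index of the agent that receives task t,
    # so there are n_agents ** n_tasks candidates in total.
    for candidate in product(range(n_agents), repeat=n_tasks):
        if capacity is not None and any(
            candidate.count(a) > capacity for a in range(n_agents)
        ):
            continue  # some agent is overloaded, skip this scenario
        total = sum(pred[a][t] for t, a in enumerate(candidate))
        if total > best_total:
            best_total, best_assignment = total, candidate

    return best_assignment, best_total


# Made-up predictions for 2 agents (rows) and 3 tasks (columns).
pred = [
    [0.9, 0.2, 0.5],
    [0.4, 0.8, 0.6],
]
assignment, total = brute_force_assign(pred)
```

Even this toy case enumerates 2^3 = 8 scenarios. With 10 agents and 20 tasks it would already be 10^20, which is why I am looking for something smarter.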
Is there a better way to optimise (maximise performance) without having to iterate through all possible solutions? How could I do that?
Original question:
Let's take a look at the following setup:
- There are a number of "agents"
- Agents work on "jobs"
- Each agent might be better at certain jobs, but the jobs are assigned at random rather than based on performance (good performance = the least amount of time spent working)
I'd like to know if there's any ML algorithm that can learn how to assign jobs to agents so as to minimise time spent working (maximise performance).
I know that with methods like regression or classification, or even deep learning, you can teach the algorithm to predict a target variable. In this case, however, I do not want to predict a target variable; I want to increase performance (minimise time spent working). Is there an algorithm/method that could "learn" from past performance reviews and assign new jobs to agents to maximise performance?
Edit:
A bit more formalised:
I train an SVM (or a regressor, or a neural net) that can predict how well a certain agent A1 performs on a certain (type of) task T1 (call this performance P1). So when a new task comes in, I'll be able to predict P1 based on A1 and T1. BUT! And here comes the question. Now that I can predict P1 based on A1 and T1, how do I use this knowledge to actually assign ALL the tasks T1...Tn amongst all the agents A1...Am so as to maximise the total performance, i.e. the sum of the predicted performances of the chosen agent/task pairs: sum(P1, P2, ..., Pi)?
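To make the setup concrete, here is a small sketch of what I have so far: a table of predicted performances for every agent/task pair, and the objective evaluated for one candidate assignment. `predict_performance` is a hypothetical stand-in for the trained model, and the numbers in it are made up.

```python
def predict_performance(agent, task):
    # Hypothetical stand-in for the trained model (SVM, regressor, neural net).
    # The values below are made up purely for illustration.
    skill = {
        "A1": {"T1": 0.9, "T2": 0.3},
        "A2": {"T1": 0.5, "T2": 0.7},
    }
    return skill[agent][task]


def score_matrix(agents, tasks):
    # One prediction per (agent, task) pair: m * n model calls in total.
    return {(a, t): predict_performance(a, t) for a in agents for t in tasks}


def total_performance(assignment, scores):
    # assignment maps each task to the agent chosen for it;
    # the objective is the sum of the predicted performances of those pairs.
    return sum(scores[(agent, task)] for task, agent in assignment.items())


agents = ["A1", "A2"]
tasks = ["T1", "T2"]
scores = score_matrix(agents, tasks)

candidate = {"T1": "A1", "T2": "A2"}
total = total_performance(candidate, scores)
```

Building `scores` is cheap (m * n predictions). The open part is how to choose the `assignment` that maximises `total_performance` without scoring every possible one.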