I am working on a project for a non-profit organization where they are trying to help students with special needs to match to different project topics. Each student will have four preferences and a set of supervisors will have their list of preferences on the topics they supervise.
The solution I look for is an algorithm which can find an optimum solution to match students to project topics and supervisors.
I have done extensive reading on SPA, HR and other Greedy Algorithms and even tried a flavour of Genetic algorithm. So far I have nothing but stress.
Here is the flow of the program.
P1, P2, P3, P4, P5 ...... Pn ... SP1, SP2, SP3 .... SPn
In the above list, P1 ... Pn are existing topics and SP1...SPn are suggested topics.
Let's say after this round, we have a list of supervisors with the following preference.
supervisor | Topics of Interest | No. Of Groups
L1 | P1, P3, P4 | 2
L2 | P5, P2, P9 | 1
L3 | P1, P3, P4 | 1
L4 | P1, P3, P4 | 4
L5 | SP1, P3, P8 | 3
L6 | P32, P3, P40 | 3
After the above round, we know that there are only supervisors who can Supervise students on the following topics.
P1, P2, P3, P4, P8, P9, P32, P40, SP1
student | Pref1 | Pref 2 | Pref 3 | Pref 4 |
S1 | P4 | P1 | SP1 | P5 |
S2 | P1 | P9 | SP1 | P5 |
S3 | P3 | P1 | P2 | P5 |
S4 | P4 | P1 | P40 | P5 |
S5 | P4 | P32 | SP1 | P5 |
...
Sn | P9 | P1 | SP1 | P5 |
Now, once the students pick the preference, We will then decide a number MAX_GROUP_SIZE and we will run our algorithm to group these students into partitions where we will
a. Group students with similar interests into same group ( eg. We add Students who picked P1 as their pref1 and fill in the rest with pref2, pref3, pref4 when they don't have groups for their first choices).
b. Assign a supervisor to a group where he has shown interest in the project ( Ideally, every students first preferences or best-matched project)
c. We need to make sure that we don't overload the supervisor, if a supervisor has shown interest in P1, P2, P3 and mentioned that he can only supervise 2 projects, then we should only add him to 2 projects.
So far, I have been trying the above-explained algorithms and I still don't think I have a justified solution for the students. I prefer a solution which is more biased towards the students since they have special needs. If anyone can point me in the right direction or can provide me with a well-defined algorithm or an implementation I would not only appreciate the effort but I would buy you a coffee as well.
Speaking as someone who does this kind of thing for a living, the core of this problem is pretty similar to a standard problem called "capacitated facility location", which at the scales I imagine you're dealing with can be handled cleanly by integer programming. I can vouch for the free Google OR-Tools (disclaimer: yep, that's my employer; nope, not speaking for them), but you have several other free and paid options (SCIP, lpsolve, Gurobi, CPLEX).
Integer programming is pretty nice: you declare some variables, write some constraints and an objective on those variables, push a button and get a(n often optimal) solution.
Here we're going to have the following binary variables:
For each pair of (student i, potential project j for student i), we have a 0-1 variable Assign[i,j] that is 1 if that student does that project and 0 otherwise.
For each pair of (advisor k, potential project j for advisor k), we have a 0-1 variable Avail[k,j] that is 1 if that advisor does that project and 0 otherwise.
The objective is
minimize sum_{i,j} PreferenceValue[i,j] Assign[i,j],
where PreferenceValue[i,j] has lower values to indicate a student's more preferred projects. You could use 1,2,3,4 for example for first, second, third, fourth choice; or bias toward first choices with 1,2,2,2; or bias toward fairness with 1,4,9,16. Lots to play with, have fun. As requested, this objective does not care what we make the advisors do.
The constraints are
for each student i, sum_j Assign[i,j] = 1,
i.e., each student is assigned exactly one project;
for each advisor k, sum_j Avail[k,j] ≤ MaxGroups[k],
i.e., no advisor has more work than they want;
for each student i and project j, Assign[i,j] ≤ sum_k Avail[k,j],
i.e., each student can only be assigned to a project if it is available;
for each project j, sum_i Assign[i,j] ≤ MaxGroupSize,
i.e., each group has at most MaxGroupSize students.
OR-Tools doesn't let you write "for each" and "sum" like that, so you'll have to write a short program to expand them. Read the OR-Tools documentation.
Hopefully this is enough of a start that when you build it and it inevitably disappoints your expectations, you can figure out how to add more constraints to prevent solutions that you don't want. Good luck!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With