Optimisation algorithms for parallel machine scheduling problems with setup times

Parallel machine scheduling is a problem of high practical relevance for the manufacturing industry. In this paper, we address a variant in which an unweighted combination of earliness, tardiness and setup times aggregated in a single objective function is minimised. We compare an Evolutionary Algorithm (EA) approach with a variant of local search implementing a probabilistic Best Response Dynamic algorithm (p-BRD) inspired by game theoretic considerations. Our p-BRD algorithm achieved promising results outperforming the EA on a series of test sets.


INTRODUCTION
The parallel machine scheduling problem, in which jobs have to be assigned to machines at particular times, is one of the oldest problems of operations research and has high practical relevance for the manufacturing industry. Minimising the time it takes for a product to pass through manufacturing not only affects the revenue of the company but also improves customer satisfaction [8]. The objective of this well-known problem is to assign jobs to machines at particular times such that the resulting schedule completes all jobs in full and, if possible, on time or with minimal deviation from their due dates.
In this work, we address a variant of parallel machine scheduling in which the goal is to find an optimal machine schedule minimising an unweighted combination of earliness, tardiness and setup times that are aggregated in a single objective function (AOF). Tardiness is obviously a crucial aspect of planning as it leads to delayed delivery intrinsic in the plan; earliness is also a cause of customer dissatisfaction, especially in times of just-in-time production. Additionally, setup times indicate considerable effort in setting up machines, which is not only time consuming but also likely to cause considerable costs. We intentionally left out weighting factors in that sum, considering the criteria to be of equal importance. This can be changed according to specific requirements.
GECCO '21, July 10-14, 2021, Lille, France. © 2021 Copyright held by the owner/author(s).
In the literature, production planning problems are categorised by multiple criteria [1] such as the number of involved machines, equality of machines, non-preemptive jobs or the relevance of setup times. Most research in this area focuses on the minimisation of the makespan, which is the maximum completion time of all jobs [7,9]. However, tardiness and earliness optimisation are covered by some papers as well [2,3,10,18]. Setup times are often neglected in academic research [1], but the sequence of jobs may have a huge impact on how a production schedule is developed [18]. This is why this work deals with a non-identical parallel machine environment with sequence-dependent setup times, earliness and tardiness.
As the described problem is known to be NP-hard [16], we face a typical case for applying meta-heuristics such as Evolutionary Algorithms (EAs). This has successfully been done in the past [18]. We revisit this technique and discuss an EA implementation minimising the AOF for our problem setting. Local search algorithms have also been applied and have provided good results in solving scheduling problems (e.g. [17]). We extend this strand with a new variant inspired by game theory that includes a stochastic element. We call this approach p-Best Response Dynamic (p-BRD).
We compare both algorithms in terms of solution quality and runtime by applying them to a series of test sets.

P-BRD
Best Response Dynamics is an iterative algorithm for searching Nash equilibria in n-player non-cooperative games. Agents successively change their strategy to be a best response to the strategies of at least a subset of the other agents [15]. A combination of strategies for which none of the agents can be better off by a unilateral change of its strategy is called a Nash equilibrium [13].
Potential games are a subset of strategic games for which the effect of a change of strategy can be expressed by a single global function [12]. For potential games, best response dynamics always converges to a Nash equilibrium [15].
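For reference, one common way to formalise this property is the exact potential condition. The symbols below (utility u_i of agent i, potential function \Phi, strategy s_i of agent i and strategies s_{-i} of all other agents) are notation chosen for this note and are not taken from the paper.

```latex
% Exact potential game: a unilateral deviation of agent i from s_i to s_i'
% changes the utility of i by exactly the change of the global potential.
\[
  u_i(s_i', s_{-i}) - u_i(s_i, s_{-i})
  = \Phi(s_i', s_{-i}) - \Phi(s_i, s_{-i})
  \qquad \text{for all agents } i \text{ and all } s_i,\, s_i',\, s_{-i}.
\]
```

Under this condition every improving unilateral move strictly changes \Phi in the same direction, so in a finite game a sequence of such moves cannot cycle and must end in a Nash equilibrium.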
We view the machine schedule optimisation problem as a game called the scheduling game. Payoff is interpreted as costs and agents therefore seek to minimise their utility. Jobs are represented as participating agents in the scheduling game and each agent tries to get an optimal position on a machine. The position directly contributes to the value of AOF. This function represents the utility for each agent and therefore our game is obviously a potential game.
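To make the utility concrete, the following minimal sketch computes an unweighted sum of earliness, tardiness and setup time for a given schedule. The class `ScheduledJob` and its field names are hypothetical and only serve as illustration; the paper does not specify its data structures.

```python
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    """Hypothetical record of one job after it has been placed on a machine."""
    start: float   # start of production
    end: float     # completion time
    due: float     # due date
    setup: float   # setup time incurred before this job on its machine


def aof(jobs):
    """Unweighted sum of earliness, tardiness and setup time over all jobs."""
    total = 0.0
    for job in jobs:
        earliness = max(0.0, job.due - job.end)
        tardiness = max(0.0, job.end - job.due)
        total += earliness + tardiness + job.setup
    return total


# Toy schedule: the first job finishes 2 time units early, the second 1 late,
# and each of them needs 1 time unit of setup.
schedule = [
    ScheduledJob(start=0, end=4, due=6, setup=1),
    ScheduledJob(start=5, end=9, due=8, setup=1),
]
print(aof(schedule))
```

Weighting factors could be added as multipliers on the three terms if the criteria are not considered equally important.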
Agents interact in the search for an optimal solution by mutually changing their positions. The procedure terminates in a Nash equilibrium, which embodies the final production plan that is returned by the algorithm. By using a stochastic element that accepts intermediate results even when this is counter-indicated by the objective function, the search process can be considered to consist of an exploration and an exploitation phase.
Input to BRD is a set of jobs to be scheduled together with the available machines and a setup matrix. Each job agent i has full access to information about the material it produces, its due date and the machines M_i it can be executed on, together with the production time for each machine and their current start and end time. In an initialisation phase, job agents are first sorted in ascending order by due date and maximum production time and are then assigned to an initial position on a suitable machine.
Through this initialisation phase, doubly linked lists l_M of job agents are constructed for each machine M in the set of machines. At first, artificial INIT agents are placed on the machines and become the heads of the lists. They do not participate in the scheduling game; their end defines the first availability of the machine and thus the start of the planning horizon. The start and end dates of the jobs placed in a list define the time spans in which the machine is available. For each job agent and each machine it can be processed on, gaps in the linked list that are sufficiently large for production and setup time are located. The job agent chooses the gap in which it can be placed as close to its due date as possible. That is, each job j seeks the predecessor k for which pred(k, j), defined as the value |c_j − due_j| when j is inserted as successor of k, is minimal while respecting setup times and shifting j as close to its due date as possible.
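The gap search described above can be sketched for a single machine as follows. This is a simplified illustration under several assumptions that are not taken from the paper: the machine is represented as a sorted list of occupied (start, end) intervals instead of a doubly linked list, the setup time of the new job is a constant, the setup time of the successor is ignored, and the function name `best_gap` is hypothetical.

```python
import math


def best_gap(busy, production, setup, due):
    """Return (deviation, predecessor_index, completion) for the best gap.

    busy: occupied (start, end) intervals on one machine, sorted by start;
          busy[0] plays the role of the INIT agent.
    Returns None if the job does not fit into any gap.
    """
    best = None
    for k, (_, pred_end) in enumerate(busy):
        # The gap behind the last job on the machine is open-ended.
        next_start = busy[k + 1][0] if k + 1 < len(busy) else math.inf
        earliest_completion = pred_end + setup + production
        if earliest_completion > next_start:
            continue  # gap too small for setup plus production
        # Shift the job as close to its due date as the gap allows.
        completion = min(max(due, earliest_completion), next_start)
        deviation = abs(completion - due)
        if best is None or deviation < best[0]:
            best = (deviation, k, completion)
    return best


# INIT agent ends at time 0, one job already occupies [10, 14].
machine = [(0, 0), (10, 14)]
print(best_gap(machine, production=3, setup=1, due=8))
```

For several machines, the same search would be repeated per compatible machine and the overall minimum taken.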
p-BRD now starts with the same list of jobs as the initialisation phase, with each job having an initial position on a machine. In each round, each job agent a has a set of options to change its strategy, i.e. its position in the schedule. It can either swap its position with another agent a' if the machines match (i.e. they are compatible, see Algorithm 1) and production times allow the change without interfering with the start and end dates of other agents or, in case a' cannot run on the machine a is assigned to, change its position by moving behind a' if a' has no successor. A swap is performed if it improves the AOF. However, rejecting options too early entails the risk of getting stuck in a local minimum. Therefore, non-improving swaps that increase the AOF are accepted with probability p. This probability decays by multiplying it with a damping factor d in each iteration, thus successively moving the process from exploration to exploitation. p-BRD stops as soon as no more swaps occur and the game has reached a Nash equilibrium. Details of handling setup times determined by the products of the jobs are straightforward and left out for the sake of brevity. The best solution found during exploration is always preserved, so in case an inferior solution is accepted due to p, the best solution found is still accessible and will be returned at termination.
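To illustrate how quickly the damping moves the search from exploration to exploitation, the sketch below counts the rounds until the acceptance probability drops below a threshold. The starting value, damping factor and threshold are hypothetical and are not the tuned parameters of the paper.

```python
def rounds_until_below(p0, d, threshold):
    """Number of rounds after which p0 * d**rounds falls below threshold."""
    rounds = 0
    p = p0
    while p >= threshold:
        p *= d          # same update as in each p-BRD round
        rounds += 1
    return rounds


# Hypothetical values: start at 0.5, damp by 0.9 per round.
print(rounds_until_below(p0=0.5, d=0.9, threshold=0.01))
```

A damping factor close to 1 keeps the exploration phase long, while a small starting probability makes the search behave almost like a plain best response dynamic from the first round.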

Algorithm 1: p-BRD
Data: list of job agents, each with an initial position on a machine; probability p; damping factor d
Result: seq, the optimised assignment of jobs
repeat
    for each job agent j do
        for each job agent k ≠ j do
            if machines of j and k are compatible then
                if (swap(j, k) improves AOF) or (rand < p) then
                    swap(j, k);
                    break;
                end
            else if j can run on machine of k and k has no successor then
                if moving j after k improves AOF then
                    move j after k;
                end
            end
        end
    end
    p = p · d;
until no more swaps;

In this implementation, agents take the first opportunity that improves AOF without checking further alternatives. Figure 1 illustrates the mechanism of swapping. Let 3 be the agent currently in action. Agent 3 could swap with agent 2 (on the same machine) or, if the machines allow for it, with agent 1. An alternative option is placing it behind agent 1. Note that INITAgent1 and INITAgent2 are only available as predecessors if they are alone on their respective machine.

EVOLUTIONARY ALGORITHM
A vanilla Evolutionary Algorithm using the typical operators to evolve the population (e.g. tournament selection, single-point crossover, swap mutation and tournament replacement) is compared against p-BRD. The fitness of an individual is determined by its phenotype, i.e. the AOF of the production schedule that is generated by the decoder. The decoding of the EA is adapted to the given problem.
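The named operators can be sketched generically as follows. The genome encoding of the paper is not specified, so the functions below work on plain lists and are an illustration only; in particular, single-point crossover on a permutation can produce duplicates, which a decoder or a repair step would have to handle. All function names are hypothetical.

```python
import random


def tournament_select(population, fitness, k, rng):
    """Pick k random individuals and return the one with the lowest fitness."""
    contenders = rng.sample(population, k)
    return min(contenders, key=fitness)


def single_point_crossover(parent_a, parent_b, rng):
    """Child takes the head of parent_a and the tail of parent_b."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def swap_mutation(genome, rng):
    """Return a copy of the genome with two randomly chosen genes exchanged."""
    child = list(genome)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


rng = random.Random(7)
population = [[3, 1, 2, 0], [0, 1, 2, 3], [2, 3, 0, 1]]
parent = tournament_select(population, fitness=sum, k=2, rng=rng)
print(swap_mutation(parent, rng))
```

Fitness is minimised here because the AOF is a cost; `sum` merely stands in for the decoder-based fitness in this toy call.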

DATA SETS
For the first test case, we create three data sets to imitate real-world production setups. Our method for creating test cases is derived from the approaches proposed by Bagheri and Zandieh (2011) [4], Vilcot and Billaut (2008) [19] and Demirkol et al. (1998) [6]. The structure of a test set can be controlled by a number of input parameters that generate a diversity analogous to problems encountered in real-world data.
We generate three test sets representing different scenarios. Data set U contains a considerable number of jobs but only a few machines. In addition, the jobs are due early, provoking a high delay and a high importance of setup times. In contrast, data set S contains jobs that vary strongly in their due dates and aims for an optimal sequence according to the deviation of the jobs from their due dates. Data set L focuses on the scalability of the algorithms in terms of runtime using a large number of jobs. The data sets can be obtained at URL.
For the second test case, the published Oliver 30 benchmark for the travelling salesman problem (TSP) [14] is reformulated as a scheduling problem for which solutions can be assessed by AOF. The TSP is defined for 30 locations, with the distance for the shortest round trip (i.e. a tour visiting each location once, starting and ending at the same location) being 419 [11].
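One way to picture such a reformulation is to treat each location as a job and each distance as a sequence-dependent setup time. The sketch below contrasts the setup time accumulated along a job sequence with the length of the closed round trip. The 3-location distance matrix is a hypothetical toy example and not Oliver 30 data, and the handling of production times and due dates in the actual reformulation is not covered here.

```python
def path_setup_time(order, dist):
    """Sum of setup times (distances) between consecutive jobs in the sequence."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def tour_length(order, dist):
    """Round trip: sequence cost plus the way back from the last to the first job."""
    return path_setup_time(order, dist) + dist[order[-1]][order[0]]


# Hypothetical symmetric distance matrix for three locations.
dist = [
    [0, 2, 9],
    [2, 0, 4],
    [9, 4, 0],
]
order = [0, 1, 2]
print(path_setup_time(order, dist), tour_length(order, dist))
```

The two values differ by the closing edge, so a sequence that is good with respect to the open path is not necessarily a good round trip.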

RESULTS
Evolutionary Algorithms and p-BRD depend on parameter settings that have a strong influence on the performance. Therefore, we applied Tree-structured Parzen Estimators [5], which are a form of Bayesian optimisation, to fine-tune the parameter settings. The obtained parameters can be found at URL.
The p-BRD algorithm as such produced poorer results on the O30 data set (see Figure 2) using AOF. This is explained by the fact that the algorithm aims to minimise AOF instead of the tour length, which includes not only the setup times in the sequence of jobs but also the setup time between the last and the first job. However, the algorithm can easily be adapted to this effect by adjusting the objective function. The object-oriented implementation allows for this with a simple subclass. With the modified objective function, p-BRD nearly always reached a tour length close to 424, with a mean value of 424.39 and a standard deviation of 0.39. The results of the Evolutionary Algorithm scatter rather widely between 423 and 565, and their mean is significantly larger than that of p-BRD (see Table 1).
For data set L (see Figure 3), p-BRD performs better than the EA with a smaller standard deviation (see Table 1). Potentially, the Evolutionary Algorithm has not yet fully converged after the defined number of generations for the large scheduling problem. Increasing the maximum number of iterations might improve its performance but would further deteriorate runtime.
On data set U, the differences are clearly less pronounced in terms of the mean but show a distinction in terms of standard deviation in favour of p-BRD. For data set S, the distribution of the p-BRD results has a low standard deviation due to the very low probability of accepting temporarily worse solutions that was found during parameter optimisation (p = 0.0012). The standard deviation of the Evolutionary Algorithm is larger, but the algorithm often finds a better result than p-BRD. The observed behaviour may be explained by the fact that the currently implemented initialisation strategy for p-BRD (see Section 2) is well suited to solve the problem for the spread data set, so that exploration of the search space is less necessary.
In conclusion, p-BRD produces better results regarding AOF on two of the data sets and comparable ones on the third, while being significantly faster in each case.

CONCLUSION AND FUTURE WORK
As there are no publicly accessible benchmark tests available for the presented machine scheduling problem, we created new test data sets for our comparison. In the experiments, p-BRD showed good results with better runtime behaviour than the Evolutionary Algorithm. Therefore, p-BRD can be considered an algorithm feasible for real-world application. The vanilla EA should be replaced by a state-of-the-art algorithm to further validate this impression. Additional research on p-BRD may investigate more elaborate swapping strategies and analyse their effect on result and runtime. Applicability to further problems will be in focus, as well as an extension of the heuristics that closes gaps in the timeline of the production schedule.