IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 4, August 2025, pp. 2826-2838
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i4.pp2826-2838

Leveraging machine learning for column generation in the dial-a-ride problem with driver preferences

Sana Ouasaid, Mohammed Saddoune
Machine Intelligence Laboratory, Department of Computer Science, Faculty of Sciences and Technologies, University of Hassan II Casablanca, Casablanca, Morocco

Article history: Received Jun 18, 2024; Revised Mar 24, 2025; Accepted Jun 8, 2025

Keywords: Binary classification; Clustering-based initialization; Column generation algorithm; Dial-a-ride problem; Pricing subproblem

ABSTRACT
The dial-a-ride problem (DARP) is a significant challenge in door-to-door transportation, requiring the development of feasible schedules for transportation requests while respecting various constraints. This paper addresses a variant of DARP with time windows and drivers' preferences (DARPDP). We introduce a solution methodology integrating machine learning (ML) into a column generation (CG) algorithm framework. The problem is reformulated into a master problem and a pricing subproblem. Initially, a clustering-based approach generates the initial columns, followed by a customized ML-based heuristic to solve each pricing subproblem. Experimental results demonstrate the efficiency of our approach: it reduces the number of newly generated columns by up to 25%, accelerating the convergence of the CG algorithm. Furthermore, it achieves a solution cost gap of only 1.08% compared to the best-known solution for large instances, while significantly reducing computation time.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Sana Ouasaid
Machine Intelligence Laboratory, Department of Computer Science, Faculty of Sciences and Technologies, University of Hassan II Casablanca
Casablanca, Morocco
Email: sana.ouasaid1-etu@etu.univh2c.ma

1. INTRODUCTION
The dial-a-ride problem (DARP) is a well-known and thoroughly explored area of study that focuses on designing efficient routing schedules for a fleet of vehicles to fulfill transportation requests, each involving a pickup and a delivery point [1]. Optimizing DARP requires balancing cost-effectiveness with high service quality. It considers factors such as customer ride times and deviations from desired departure or arrival times. This challenge becomes even more complex when incorporating driver preferences, as in the DARP with time windows and drivers' preferences (DARPDP) [2], [3]. In DARPDP, drivers aim to maximize served requests while also considering preferred destinations and arrival times, adding another layer of complexity to the optimization process. Existing approaches for DARPDP, such as those based on iterated local search metaheuristics, have shown promise but struggle with large-scale instances [3]. This scalability issue motivates the need for more efficient solution methodologies, particularly those well suited to large-scale problems. One such method is column generation (CG), an iterative optimization technique that starts with a restricted set of columns representing a feasible solution. The master problem is solved using this initial subset, producing a current solution and dual variable values (shadow prices).
These dual variables guide the pricing subproblem, which identifies new columns with negative reduced costs. If such columns are found, they are added to the master problem, and the process is repeated until no further improvement is possible.
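For intuition, the following minimal Python sketch illustrates this generic CG loop; the helper callables (solve_restricted_master, solve_pricing) are hypothetical placeholders, not the implementation evaluated in this paper.

```python
# Schematic column generation loop (illustrative only; the two helper
# functions are hypothetical placeholders, not the paper's implementation).

def column_generation(initial_columns, solve_restricted_master, solve_pricing):
    columns = list(initial_columns)                         # restricted column set
    while True:
        solution, duals = solve_restricted_master(columns)  # primal + shadow prices
        new_columns = solve_pricing(duals)                  # negative-reduced-cost columns
        if not new_columns:                                 # no improving column: stop
            return solution
        columns.extend(new_columns)                         # enrich the master and repeat
```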
CG has proven effective for addressing DARP and similar transportation challenges. Garaix et al. [4] optimized passenger occupancy in low-demand scenarios through efficient continuous relaxation of a set-partitioning model. Hybrid approaches further enhance CG, such as Parragh and Schmid [5], who combined it with variable and large neighborhood search to improve solutions. Rahmani et al. [6] used dynamic programming within a branch-and-price framework to handle ride-time constraints in the pricing problem. Beyond DARP, CG has been applied to other transportation contexts, including bus scheduling [7], ride-sharing [8], and selective DARP with Benders decomposition [9]. While CG is effective, integrating machine learning (ML) into DARP solutions offers new possibilities. For example, Markov decision processes were used to optimize dynamic DARP, improving customer insertion evaluation [10]. Random forests (RF) were utilized for operator selection in dynamic E-ADARP with a two-phase metaheuristic [11]. Thompson sampling effectively modeled dual values of requests, yielding faster solutions for large DARP instances [12]. An adaptive logit model significantly sped up traditional parallel insertion heuristics for generating feasible solutions [13].
Building on these efforts, integrating ML with CG enhances efficiency by guiding key decisions like pricing and column selection [14]. For example, deep learning has been used to predict flight connection probabilities [15] and promising aircrew pairings [16], while support vector machines (SVM) efficiently generated high-quality columns for graph coloring [17]. This integration of ML into CG opens promising avenues for improving large-scale DARP solutions. Recent work highlights its potential, particularly for intelligent column selection. Morabit et al. [18] used a graph neural network for column selection in vehicle and crew scheduling and vehicle routing, reducing computation time by up to 30%. They later extended this to arc selection, further refining the process [19]. Chi et al. [20] proposed a Q-learning-based single-column selection policy with a graph neural network, outperforming greedy heuristics. Gerbaux et al. [21] developed a CG heuristic for multi-depot electric vehicle scheduling using graph neural networks and transfer learning for larger instances.
Addressing the DARP using ML methods is a relatively recent area of research and remains underexplored. While significant efforts have focused on integrating ML into CG algorithms, particularly for problems like crew scheduling and vehicle routing, the application of such techniques to the DARP has been limited. Existing studies in this domain primarily enhance the pricing subproblem by predicting promising variables, such as routes or crew assignments, to improve optimization. However, for the DARPDP, a variant introduced in prior work and initially tackled using a metaheuristic, the main challenge extends beyond the pricing subproblem to include identifying the most relevant requests to be served. This aspect is critical and requires targeted strategies. Our approach builds upon this foundation by integrating ML into the request selection process. This guides the CG algorithm, offering a novel perspective for addressing the DARPDP.
This paper addresses the scalability gap in existing DARPDP solutions by introducing a novel approach that combines ML and CG. This integration offers a significant advancement over existing methods by leveraging ML's predictive capabilities to enhance the efficiency of CG. Our approach reformulates the DARPDP into a master problem and a pricing subproblem within the CG framework. The master problem selects optimal routes (columns), while the pricing subproblem generates promising new routes. We enhance the CG process with two key innovations: a clustering-based strategy for generating the initial columns, which effectively structures the search space, and a customized ML-based heuristic for solving the pricing subproblem. This ML heuristic learns from past instances and adapts to the evolving solution, guiding the generation of high-quality columns and accelerating convergence. This targeted approach allows us to tackle large-scale DARPDP instances more efficiently than existing methods, providing valuable insights into the interplay between ML and optimization algorithms for complex routing problems.
The remainder of this paper is organized as follows: section 2 details our proposed methodology, including the integration of ML into the request selection process for CG. In section 3, we present the experimental results and evaluate the performance of our approach. Finally, section 4 concludes the paper and discusses potential future research directions.

2. METHOD
Although traditional CG works well for solving complex problems, early attempts to apply it revealed limitations in exploring the entire range of potential solutions. Frequently, the algorithm became stuck in local minima, preventing the subproblem from generating new columns. Our proposed framework improves the traditional CG process by integrating a learning mechanism. This mechanism improves both the quality of generated columns and the algorithm's convergence speed.
As outlined in Algorithm 1, the proposed ML-CG algorithm starts by training a binary classification model on optimally solved problem instances (line 1). To solve a new instance, it first extracts problem-specific features (line 2). Then, it generates an initial set of columns using a two-phase heuristic: clustering and the randomized nearest insertion heuristic (RNI) (line 3). These columns are used as the initial variables for solving the restricted master problem (RMP). The main part of the algorithm is an iterative loop (lines 4 to 11) that continues until the stopping condition is met. In each iteration, the RMP is solved using the columns in the pool up to that point (line 5). This solution provides the values for the dual variables of the constraints. Then, statistical features are computed based on the solutions found up to the current iteration (line 7), and these features are combined with the original problem-specific features and the dual information to create a comprehensive set of features (line 8). The output of the model guides a diversification heuristic to generate new columns with negative reduced costs, thus further reducing the objective function (line 9). These new columns are added to the pool of variables for the RMP. Once the algorithm finishes iterating, it returns the final solution by solving the RMP on the set of all columns generated during the execution of the algorithm (line 12).

Algorithm 1 Column generation algorithm (ML-CG)
Require: P: problem instance, T: maximum time limit
Ensure: s: solution
1: M <- TrainBinaryClassificationModel()
2: f1 <- ExtractProbFeatures(P)
3: C <- ClusteringBasedRNH()
4: while cpu <= T do
5:   s <- SolveMP(C)
6:   D <- SolveRLP(s)
7:   f2 <- ExtractStatsFeatures(s)
8:   features <- Combine(f1, f2, D)
9:   N <- GenerateNewColumns(P, features, M)
10:  C <- C ∪ N
11: end while
12: return s

The ML-CG algorithm is built upon three fundamental components that work together to iteratively improve the solution. First, it begins with the construction of an initial pool of columns that serve as a starting point for the optimization. Then, through the formulation of the RMP and its subproblems, the algorithm repeatedly generates new columns with negative reduced costs to enhance the RMP and guide the solution process.

2.1. Generation of the initial set of columns
To generate initial columns, we propose a two-stage heuristic. In the first stage, requests are separated into clusters, and each cluster is assigned to a vehicle. We use three clustering approaches to diversify the initial columns: K-means, K-medoids, and random clustering. Then, for each cluster, a route is constructed using the randomized nearest insertion heuristic (RNI) [22]. The method is further detailed in Algorithm 2.
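As an illustration of the first stage, the sketch below clusters pickup coordinates with scikit-learn's KMeans and assigns one cluster per vehicle; using the pickup x/y coordinates as the clustering input is our assumption for illustration, not necessarily the paper's exact feature choice.

```python
# Illustrative first stage: cluster requests and assign one cluster per vehicle.
# Clustering on pickup coordinates is an assumption made for this sketch.
import numpy as np
from sklearn.cluster import KMeans

def cluster_requests(pickup_xy: np.ndarray, n_vehicles: int, seed: int = 0):
    """Return a list of request-index arrays, one cluster per vehicle."""
    labels = KMeans(n_clusters=n_vehicles, random_state=seed, n_init=10).fit_predict(pickup_xy)
    return [np.flatnonzero(labels == k) for k in range(n_vehicles)]

# Example: 12 random requests split among 3 vehicles.
rng = np.random.default_rng(42)
clusters = cluster_requests(rng.uniform(0, 10, size=(12, 2)), n_vehicles=3)
```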
Algorithm 2 outlines the process of generating initial columns based on a given problem instance and clustering methods. For each method, requests are clustered, and for each cluster, a route is built using the RNI heuristic. Starting with a route from the vehicle's origin to its destination, at each iteration, RNI ranks the requests in increasing order of d, defined as the sum of four distances: the distance between the two requests' pickup points, the distance between their delivery points, and the cross-distances between the delivery point of the first request and the pickup point of the second request, and vice versa. RNI inserts the ℓth element of the ordered set (ℓ represents the randomized factor in the heuristic) into the route, and the remaining requests are sequentially inserted at the best feasible position, i.e., the one that minimizes the increase in the route's cost. The final route is converted into a column and added to the ColumnsPool.

Algorithm 2 Clustering-based RNH
Require: P: problem instance, CM: clustering methods
Ensure: I: initial columns pool
1: ColumnsPool <- ∅
2: for M in CM do
3:   Cluster requests (P.requests) into K groups, each associated with a vehicle in P.vehicles
4:   for k in {1, ..., K} do
5:     requests <- C_k.requests
6:     Rank the requests in C_k in increasing order of d
7:     Select the ℓth request as the seed request (ℓ is a randomized factor) and insert it into route_k
8:     Delete the seed request from requests
9:     for request in requests do
10:      Insert request into route_k using the least-cost insertion method
11:    end for
12:    ColumnsPool <- ColumnsPool ∪ {Convert route_k to column}
13:  end for
14: end for
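To make d and the insertion step concrete, here is a minimal sketch assuming Euclidean distances and a route represented as a list of nodes; feasibility checks (time windows, capacity, ride times) are omitted and would be required in the real heuristic.

```python
# Minimal sketch of the RNI building blocks (Euclidean distances assumed;
# feasibility checks such as time windows and capacity are omitted).
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def d(req1, req2):
    """Sum of the four inter-request distances used to rank requests."""
    return (dist(req1["pickup"], req2["pickup"])
            + dist(req1["delivery"], req2["delivery"])
            + dist(req1["delivery"], req2["pickup"])
            + dist(req2["delivery"], req1["pickup"]))

def insertion_cost(route, pos_p, pos_d, req):
    """Cost increase of inserting the pickup at pos_p and the delivery at pos_d."""
    tmp = route[:pos_p] + [req["pickup"]] + route[pos_p:]
    tmp = tmp[:pos_d] + [req["delivery"]] + tmp[pos_d:]
    length = lambda r: sum(dist(r[i], r[i + 1]) for i in range(len(r) - 1))
    return length(tmp) - length(route)

def best_insertion(route, req):
    """Least-cost position pair (pickup before delivery, endpoints fixed)."""
    candidates = ((p, q) for p in range(1, len(route))
                  for q in range(p + 1, len(route) + 1))
    return min(candidates, key=lambda pq: insertion_cost(route, pq[0], pq[1], req))
```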
2.2. Restricted master problem and pricing subproblem
To implement CG, the original mixed-integer programming (MIP) model outlined in [3] needs to be reformulated as an RMP. This RMP considers a subset of columns for the solution. Subsequently, one or more subproblems are solved using the current dual solution of the RMP to identify new, advantageous columns to add to the RMP to enhance the objective value.
Let Ω be the set of all possible routes. A feasible route is a Hamiltonian path that starts at the driver's origin node, ends at the driver's destination node, and visits only a subset of nodes corresponding to the requests. Define Ω_k as the subset of Ω containing routes assignable to a specific vehicle k ∈ K. Use the variable y_{r_k} to indicate the selection of a potential route r_k, setting y_{r_k} to 1 when selecting r_k and to 0 otherwise. Represent the cost of a generated route r_k as c_{r_k}. Additionally, set the constant a_{i,r_k} to 1 if route r_k serves request i. Using the LP relaxation of the Dantzig-Wolfe reformulation, we can describe the DARPDP with the following model:

\min \sum_{k \in K} \sum_{r_k \in \Omega_k} c_{r_k} y_{r_k}   (1)

Subject to the constraints:

\sum_{k \in K} \sum_{r_k \in \Omega_k} a_{i,r_k} y_{r_k} \le 1 \quad \forall i \in P \quad (\pi_i \ge 0)   (2)

\sum_{r_k \in \Omega_k} y_{r_k} \le 1 \quad \forall k \in K \quad (\mu_k \le 0)   (3)

\sum_{i \in P} \sum_{k \in K} \sum_{r_k \in \Omega_k} a_{i,r_k} y_{r_k} \ge \varepsilon \quad (\gamma \ge 0)   (4)

In this master problem, the objective minimizes the total routing cost of the vehicle routes used; the first constraint ensures that each request is covered by at most one vehicle route, and the second constraint ensures that at most one route is selected per vehicle. Since the rejection of a request is allowed, the third constraint restricts the number of rejected requests. The dual variables associated with each constraint are specified in parentheses next to the constraint in the model. Let π_i be the nonnegative dual variable associated with the visit of request i (constraint (2)), let μ_k be the nonpositive dual variable associated with the fleet-size constraint (constraint (3)), and let γ be the dual variable associated with the constraint restricting the number of rejected requests. We derive the RMP by replacing Ω_k in the master problem with a subset Ω'_k for which the problem is primal feasible.
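As a sketch of how this RMP can be assembled over a current column pool, the snippet below uses the open-source PuLP modeler (the paper itself relies on a commercial solver such as CPLEX); the column data structure and the dual extraction are simplifying assumptions for illustration.

```python
# Sketch of the restricted master problem (LP relaxation) using PuLP.
# The paper solves it with CPLEX; PuLP/CBC is used here for illustration.
import pulp

def solve_rmp(columns, requests, vehicles, eps):
    """columns: list of dicts {'vehicle': k, 'cost': c, 'serves': set of request ids}."""
    prob = pulp.LpProblem("RMP", pulp.LpMinimize)
    y = [pulp.LpVariable(f"y_{j}", lowBound=0) for j in range(len(columns))]  # relaxed y
    prob += pulp.lpSum(col["cost"] * y[j] for j, col in enumerate(columns))   # objective (1)
    for i in requests:                                                        # constraints (2)
        prob += pulp.lpSum(y[j] for j, c in enumerate(columns) if i in c["serves"]) <= 1, f"cov_{i}"
    for k in vehicles:                                                        # constraints (3)
        prob += pulp.lpSum(y[j] for j, c in enumerate(columns) if c["vehicle"] == k) <= 1, f"veh_{k}"
    prob += pulp.lpSum(len(c["serves"]) * y[j]                                # constraint (4)
                       for j, c in enumerate(columns)) >= eps, "rej"
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    duals = {name: cons.pi for name, cons in prob.constraints.items()}        # shadow prices
    return [v.value() for v in y], duals
```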
The subproblem, corresponding to a column in the RMP, determines the route, schedule, and passenger assignments for a single vehicle. It retains the same variables as the original problem formulation [3], excluding the k index, as it now pertains to a specific vehicle. The objective function minimizes the reduced cost of the route, considering the route cost and the dual prices from the RMP:

\min \left\{ c_{r_k} - \sum_{i \in P} a_{i,r_k} \pi_i - \mu_k - \gamma \sum_{i \in P} a_{i,r_k} < 0 : r_k \in \Omega_k \right\}   (5)

The cost of a route, c_{r_k}, aligns with the objective function of the original formulation but is specific to the individual route r_k. The subproblem inherits the constraints from the original formulation. These ensure that the generated route meets all feasibility requirements, including vehicle capacity, time windows, and passenger service guarantees.

2.3. Generating new columns
In the problem addressed, rejecting a small portion of the requests is allowed if it benefits the overall objective value. This means that the optimal solution does not necessarily serve all requests, which makes it challenging to identify the most profitable combination of requests to serve. Diversifying the search in the CG algorithm is therefore necessary, but it should be done intelligently to avoid slowing down the process. To address this, we train a model on problem-specific features, historical solutions, dual values, and statistical measures to predict request profitability. Using the predicted probabilities, we rank requests from most to least profitable; this ranking guides a diversification heuristic called the learning-based diversification heuristic (LBDH) to solve each subproblem. In the following subsections, we first discuss the key aspects of training the binary classification model, including reformulating the request selection problem as a binary classification task, selecting and extracting relevant features, choosing suitable classification algorithms, optimizing hyperparameters, and evaluating the model's performance using specific metrics. Next, we describe the diversification process, where the ranked list of requests intelligently guides the diversification heuristic to explore the most promising combinations of requests.

2.3.1. Training a classification model for request profitability
This section focuses on training a binary classification model to predict the profitability of requests, a crucial step within our CG algorithm. We approach this as a standard classification problem. Each training example is represented as a tuple (f_r ∈ R^n, c_r ∈ {0, 1}). Here, f_r denotes the feature vector, encompassing n distinct characteristics of the request. The variable c_r acts as the binary class label, taking the value 1 if the request is part of the optimal solution and 0 otherwise. By learning from these examples, the model aims to accurately classify new requests, guiding the CG process effectively.

a. Collecting data
Initially, we aimed to construct a training set using small problem instances solved to optimality and then apply this knowledge to tackle larger ones. However, the number of examples proved insufficient, and the efficiency of ML algorithms relies on a large dataset. To address this, we extended the dataset by generating additional small instances in the same manner as in [3] and solving them optimally using a commercial solver. Ultimately, we constructed 1,000 examples in total.
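A minimal sketch of assembling such (f_r, c_r) training pairs from optimally solved instances is shown below; the instance/request attribute names are placeholders standing in for the features described in the next subsection.

```python
# Sketch: build (feature_vector, label) pairs from optimally solved instances.
# `instance.requests` and `optimal_served` are hypothetical structures.
import numpy as np

def build_training_set(solved_instances):
    X, labels = [], []
    for instance, optimal_served in solved_instances:  # optimal_served: ids in the optimum
        for req in instance.requests:
            f_r = np.array([req.degree,                # requests servable jointly
                            req.ride_time,             # pickup-to-drop-off duration
                            req.pickup_corr,           # correlation with other pickups
                            req.delivery_corr,         # correlation with other deliveries
                            req.avg_dist_to_vehicles])
            X.append(f_r)
            labels.append(1 if req.id in optimal_served else 0)  # label c_r
    return np.vstack(X), np.array(labels)
```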
Given the significant imbalance in our dataset, where the majority class ('1') covers almost 80% of the sample, undersampling techniques are applied to address this issue. These techniques involve removing instances of the majority class from the training dataset to improve the balance between classes. This adjustment ensures that the majority and minority classes are on a similar scale, enhancing the learning algorithm's sensitivity to the minority class. In this study, we addressed class imbalance in each classification model by combining KNNOrder and the synthetic minority over-sampling technique (SMOTE). KNNOrder, a KNN-based undersampling technique introduced by [23], reduces the imbalance by selectively removing examples from the majority class. We then apply SMOTE, a widely used oversampling method that generates synthetic examples for the minority class [24], to further balance the dataset. We demonstrated the effectiveness of KNNOrder + SMOTE by comparing it with various other methods for dealing with unbalanced data: no sampling (serving as a baseline), random undersampling (randomly removing majority class examples), random oversampling (randomly duplicating minority class examples), SMOTE (generating synthetic examples for the minority class through interpolation), and finally KNNOrder alone (targeted removal of majority class examples near the decision boundary).
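For illustration, here is a sketch of the resampling pipeline using imbalanced-learn's SMOTE. Since KNNOrder is the custom method from [23], it is stood in for by a simplified KNN-based removal of majority examples whose neighborhoods are dominated by the minority class; this stand-in is an assumption, not the published algorithm.

```python
# Sketch: KNN-based undersampling followed by SMOTE oversampling.
# The undersampling step is a simplified stand-in for KNNOrder [23].
import numpy as np
from sklearn.neighbors import NearestNeighbors
from imblearn.over_sampling import SMOTE

def knn_undersample(X, y, majority_label=1, k=5, keep_ratio=0.8):
    """Drop the majority examples with the most minority-class neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    maj = np.flatnonzero(y == majority_label)
    # Count minority neighbors per majority example (column 0 is the point itself).
    minority_neighbors = np.array([(y[idx[i, 1:]] != majority_label).sum() for i in maj])
    order = np.argsort(minority_neighbors)                 # borderline examples last
    keep = maj[order[: int(len(maj) * keep_ratio)]]
    keep = np.concatenate([keep, np.flatnonzero(y != majority_label)])
    return X[keep], y[keep]

X_res, y_res = knn_undersample(X_train, y_train)           # X_train/y_train assumed given
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_res, y_res)
```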
Table 1 presents the results obtained in terms of accuracy, F1-score, precision, and recall for each method.

Table 1. Comparison of sampling methods
Method                  Accuracy  F1-score  Precision  Recall
No sampling             0.80      0.87      0.91       0.83
Random undersampling    0.85      0.90      0.88       0.92
Random oversampling     0.79      0.86      0.91       0.82
KNNOrder                0.83      0.89      0.88       0.91
SMOTE                   0.82      0.88      0.90       0.87
KNNOrder + SMOTE        0.86      0.91      0.88       0.94

The results demonstrate that the KNNOrder + SMOTE combination achieves strong performance, with an F1-score of 0.91 and a recall of 0.94, surpassing the other tested techniques on these metrics. The high recall suggests that this approach is more effective at identifying instances of the minority class. However, it is important to note that these results are specific to the dataset used in this study. Further investigation with other datasets is necessary to confirm the robustness and generalizability of the KNNOrder + SMOTE method.

b. Feature extraction
i) Problem-specific features: the extracted features describe each request r ∈ R in the context of the DARPDP problem, namely:
- Request degree: the number of requests that can be served simultaneously with this request without violating any constraints.
- Request ride time: the duration between a request's pickup and drop-off.
- Correlation between request pickup and other pickups: the average Pearson correlation coefficient between the pickup node of a given request (defined by its x-location, y-location, earliest time, and latest time) and the pickup nodes of all other requests within the problem instance.
- Correlation between request delivery and other deliveries: similar to the pickup correlation, this metric assesses the relationship between the delivery node of a given request and the delivery nodes of all other requests within the problem instance.
- Dual information: indicates how much the overall solution cost would change in response to a marginal relaxation of the coverage constraint for a specific request.
- Request distance to vehicles: the average distance of each request from the set of all vehicles.
ii) Statistical-based features: in addition to the problem-specific features, two statistical measures are introduced, inspired by [25]. These measures are calculated from a set of up to N feasible solutions, selected from those obtained in previous iterations. Here, x_i^r is a binary variable indicating whether request r is included in solution i (x_i^r = 1) or not (x_i^r = 0).
- Correlation-based measure: this measure evaluates the linear relationship between the presence of a request and the quality of feasible solutions. The Pearson correlation coefficient for request r is calculated as follows:

\mathrm{Corr}(r) = \frac{\sum_{i \in N} (x_i^r - \bar{x}_r)(\mathrm{obj}_i - \overline{\mathrm{obj}})}{\sqrt{\sum_{i \in N} (x_i^r - \bar{x}_r)^2}\,\sqrt{\sum_{i \in N} (\mathrm{obj}_i - \overline{\mathrm{obj}})^2}}   (6)

where obj_i is the objective function value associated with solution i, \bar{x}_r is the average of x_i^r across all solutions, and \overline{obj} is the average of the objective function values. This measure identifies requests that contribute to higher-quality solutions. For instance, a correlation close to 1 indicates that the presence of the request is often associated with high-quality solutions, while a correlation close to -1 indicates the opposite.
- Ranking-based measure: this measure is based on the position of feasible solutions in a ranking determined by their quality (objective function value). For each solution i, its rank is denoted rank_i, where a lower rank corresponds to a higher-quality solution. The score assigned to request r is given by:

\mathrm{RankScore}(r) = \sum_{i \in N} \frac{x_i^r}{\mathrm{rank}_i}   (7)

This measure favors requests that appear in higher-ranked solutions by assigning a higher score to those present in high-quality solutions. Requests with higher scores are thus highlighted for their potential contribution to optimal solutions.
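Both measures are straightforward to compute from the solution pool. The sketch below assumes a 0/1 inclusion matrix x of shape (num_solutions, num_requests) and reads (7) as the division by rank_i reconstructed above.

```python
# Sketch: the two statistical features computed from a pool of N feasible solutions.
# x[i, r] = 1 if request r is in solution i; obj[i] = objective value of solution i.
import numpy as np

def corr_measure(x: np.ndarray, obj: np.ndarray) -> np.ndarray:
    """Pearson correlation (6) between request presence and solution quality."""
    xc = x - x.mean(axis=0)                       # center each request column
    oc = obj - obj.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0)) * np.sqrt((oc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Requests present in all or no solutions get a correlation of 0.
        return np.where(denom > 0, xc.T @ oc / denom, 0.0)

def rank_score(x: np.ndarray, obj: np.ndarray) -> np.ndarray:
    """Ranking-based measure (7); rank 1 is the best (lowest) objective."""
    ranks = np.empty(len(obj))
    ranks[np.argsort(obj)] = np.arange(1, len(obj) + 1)
    return (x / ranks[:, None]).sum(axis=0)
```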
These statistical measures are combined with the problem-specific features of the DARPDP. This results in a detailed representation of the requests, enabling ML models to more effectively identify the best solutions to the problem.

c. Selection of the best classification model
We trained four state-of-the-art supervised ML algorithms to choose the appropriate classification algorithm. The selected classification algorithms include the Gaussian naïve Bayes (NB) classifier, an extension of the NB algorithm that relies on posterior probability estimation and assumes that the features used to characterize instances are conditionally independent. The SVM determines the best hyperplane-based decision boundaries that segregate the n-dimensional space into classes, allowing for the easy categorization of new data points. RF is an ensemble-based method: several decision trees are employed with a random sample of features to improve predictive accuracy and manage complex relationships within the data. Finally, a multi-layer perceptron (MLP) is a feed-forward artificial neural network consisting of interconnected nodes arranged into multiple layers (input, hidden, and output). For a comprehensive survey of supervised classification techniques, see [26].
Model optimization consists of finding the hyperparameters that minimize or maximize a scoring function for a specific task. Each model has its own hyperparameters, each with a set of possible values. This research employs the grid search technique to uncover the optimum values of the hyperparameters. Grid search accepts the hyperparameter names (e.g., the learning rate in MLP or the kernel in SVM) and a vector of possible values for each. The procedure then goes through all the combinations and returns a fine-tuned model using the best combination of values. Even though grid search can require more resources and time than other optimization methods, it works well for our problem since the datasets are not enormous. Table 2 shows both the hyperparameter configuration of each algorithm used by grid search and the best values of the parameters.

Table 2. Hyperparameter configurations
Algorithm  Hyperparameter values                                     Selected hyperparameters
SVM        'C': [0.001, 0.01, 0.1, 1, 10, 100]                       C=10
           'kernel': ['linear', 'poly', 'rbf', 'sigmoid']            kernel='linear'
           'gamma': ['scale', 'auto', 0.001, 0.01, 0.1, 1, 10]       gamma='scale'
MLP        'activation': ['logistic', 'tanh', 'relu']                activation='tanh'
           'alpha': [0.0001, 0.001, 0.01, 0.1]                       alpha=0.0001
           'learning_rate_init': [0.001, 0.01, 0.1, 0.5]             learning_rate_init=0.5
           'hidden_layer_sizes': [(15, 10, 5), (50, 25, 10),         hidden_layer_sizes=
             ((nbFeatures + nbClasses)/2,)]                            ((nbFeatures + nbClasses)/2,)
NB         'priors': [None, [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]      priors=None
           'var_smoothing': [1e-9, 1e-8, 1e-7, 1e-6, 1e-5]           var_smoothing=1e-09
RF         'n_estimators': [10, 20, 30, 50]                          n_estimators=50
           'max_depth': [20, 30, None]                               max_depth=20
           'min_samples_split': [2, 5, 10]                           min_samples_split=2
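As a sketch, the SVM grid from Table 2 can be tuned with scikit-learn's GridSearchCV; the F1 scoring choice and the 5-fold cross-validation are assumptions made for illustration.

```python
# Sketch: grid search over the SVM hyperparameters listed in Table 2.
# Scoring metric and fold count are illustrative assumptions.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    "C": [0.001, 0.01, 0.1, 1, 10, 100],
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "gamma": ["scale", "auto", 0.001, 0.01, 0.1, 1, 10],
}
search = GridSearchCV(SVC(), param_grid, scoring="f1", cv=5, n_jobs=-1)
search.fit(X_bal, y_bal)      # balanced training set from the resampling step
print(search.best_params_)    # best combination found over the grid
```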
We evaluated the performance of the classification models over five runs for each of the SVM, RF, MLP, and NB algorithms. The metrics used for this evaluation are accuracy (Acc), precision (P), recall (R), F1-score (F1), and time (T) measured in seconds. The detailed results for each run are presented in Tables 3 and 4, allowing for an analysis of the variability in model performance. Figure 1 illustrates, through box plots, the variations in key performance metrics and computational time for each model over five runs.
The MLP model stands out for its high performance and robustness, achieving a maximum accuracy of 98.05% and stable median values for precision (80.7%), recall (78.1%), and F1-score (79.4%). Despite its slightly longer computational time (median of 3.2 seconds), its low variability ensures consistent performance, making it a reliable option for applications requiring stability. The NB model offers a good balance, with a maximum accuracy of 94.71% and a short computational time (1.1 seconds), though its variability in recall and F1-score may reduce stability. The RF model demonstrates moderate performance (median accuracy of 88% and computational time of 2.3 seconds) but suffers from high variability across all metrics, limiting its reliability in certain applications. In comparison, although the SVM model is fast, the MLP model emerges as the most relevant choice due to its balance between accuracy and computational cost.
Table 3. Performance of the SVM and RF algorithms over five runs
          SVM                                        RF
Run   Acc (%)  P (%)  R (%)  F1 (%)  T (s)      Acc (%)  P (%)  R (%)  F1 (%)  T (s)
1     78.2     81.2   78.1   79.6    0.14       90.4     92.1   89.4   90.7    0.35
2     82.3     83.5   81.9   82.7    0.31       84.8     88.4   84.1   86.2    0.41
3     68.7     70.1   66.8   68.4    0.17       72.1     74.8   71.5   73.1    0.20
4     88.6     90.2   87.9   89.0    0.24       89.1     91.7   88.3   90.0    0.34
5     70.9     73.2   70.4   71.8    0.18       69.4     72.9   68.5   70.6    0.28

Table 4. Performance of the MLP and NB algorithms over five runs
          MLP                                        NB
Run   Acc (%)  P (%)  R (%)  F1 (%)  T (s)      Acc (%)  P (%)  R (%)  F1 (%)  T (s)
1     96.4     95.6   94.8   95.2    0.27       94.7     93.8   92.4   93.1    0.15
2     89.4     92.1   90.4   91.2    0.31       87.2     90.5   88.7   89.6    0.19
3     79.2     82.6   80.1   81.3    0.11       79.4     81.4   78.6   79.9    0.30
4     98.1     96.9   96.4   96.7    0.16       94.3     93.3   91.9   92.6    0.20
5     78.0     80.7   78.1   79.4    0.22       61.7     64.1   61.8   62.9    0.25

Figure 1. Comparative evaluation of four classification models: boxplot analysis

2.3.2. Learning-based diversification heuristic
At each iteration of the CG process, after solving the RMP and training the supervised classification model, we use the model to predict the profitability of each request. Based on these predictions, we construct three distinct subsets of requests from the current RMP solution in order to diversify the solutions explored for each subproblem:
- Reduced subset (R): created by removing a percentage p of the least profitable requests, i.e., those with the lowest predicted probabilities of being included in the optimal solution according to the classification model. This reduction focuses the exploration on potentially more relevant subsets of requests.
- Expanded subset (E): formed by adding a percentage p of new requests deemed most profitable by the classification model, selected from those not currently served by the route (column). This expansion enables the exploration of more complex solutions, integrating requests that could enhance the overall solution.
- Diversified subset (M): generated by swapping a percentage p of the initially served requests with an equal number of new requests. The removed requests are selected from the least profitable ones, while the added requests are chosen from the most profitable according to the model's predictions. This exchange introduces additional diversity by altering the composition of routes, allowing for the examination of potentially more effective alternative solutions.
After defining these subsets, we convert each into a graph. In these graphs, the nodes represent the requests and the vehicle, while the edges carry weights based on the reduced costs from the RMP. Using the CPLEX solver, we then solve these graphs to identify candidate columns with negative reduced costs.
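A minimal sketch of constructing the three subsets from the predicted probabilities is given below; the rounding and tie-breaking choices are our assumptions, not a specification from the paper.

```python
# Sketch: build the reduced (R), expanded (E), and diversified (M) request
# subsets from the model's predicted profitability. Rounding is an assumption.
def build_subsets(served, unserved, proba, p):
    """served/unserved: request-id lists; proba: id -> predicted profitability."""
    n = max(1, round(p * len(served)))
    worst = set(sorted(served, key=lambda r: proba[r])[:n])        # least profitable
    best_new = sorted(unserved, key=lambda r: proba[r], reverse=True)[:n]

    reduced = [r for r in served if r not in worst]                # R: drop worst n
    expanded = served + best_new                                   # E: add best n new
    diversified = reduced + best_new                               # M: swap n for n
    return reduced, expanded, diversified
```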
3. RESULTS AND DISCUSSION
We designed a two-stage experimental setup to evaluate the effectiveness of the ML-CG algorithm. First, we assessed the benefits of integrating ML into CG by comparing ML-CG's performance against a standard CG approach. In the second stage, we compared the proposed ML-CG algorithm against the enhanced iterated local search (E-ILS) algorithm described in [3] and the CG algorithm from [5], referred to as constrained genetic programming (CGP). Both sets of tests utilized the same randomly generated data described in [3]. We developed the algorithms using Python 3.5 and executed them on a personal computer with an Intel Core i7-6700 processor operating at 2.6 GHz and 32 GB of RAM.

3.1. The effectiveness of incorporating machine learning techniques into CG
Initially, we discuss the number of columns handled by CG, examining scenarios with and without learning to solve the pricing problem. Table 5 displays the average number of columns generated by the CG algorithm, comparing results with and without the learning-based enhancement, across different numbers of requests. One can observe that even though we diversified the search and increased the number of subproblems solved in each iteration, the number of columns generated by CG with learning was consistently lower than that generated without learning. For example, in instances with 200 requests, the number of columns generated by ML-CG averaged 638.04, compared to 906.53 using traditional CG, an 18% reduction. This reduction suggests that ML-CG enables a more efficient search process by focusing on fewer, more relevant columns, while many of the columns generated by the standard CG approach may have minimal or insignificant contributions to the RMP.
This observation is further supported by examining the correlation between the convergence of the objective value and the CPU time. As shown in Figure 2 for selected instances, especially those with more requests (50, 100, 200), given that the proposed algorithm performs poorly compared to the commercial solver or other metaheuristics on small instances, CG with learning demonstrates faster convergence, highlighting the significance of efficient columns in boosting the search procedure's effectiveness. Therefore, learning can yield significant benefits for solving large-scale instances by efficiently generating columns without excessively increasing the number of variables in the RMP.

Table 5. Number of columns generated by the CG algorithm with and without the learning-based approach
Nb requests   Without learning   With learning
10            4927.49            3738.27
15            3845.37            3039.24
20            3887.67            2926.61
30            2771.32            1993.27
50            1522.45            1102.60
100           1456.92             974.49
200            906.53             638.04
Figure 2. Evolution of objective function values over CPU time for selected experiments (inst_b50_6, inst_a50_5, inst_b100_11, inst_a100_12, inst_b200_23, inst_a200_22)

3.2. Comparison with state-of-the-art algorithms
Tables 6 and 7 present the best objective values and average runtimes achieved by the three algorithms under comparison for the a-type and b-type instances, respectively. Due to the computational complexity of determining optimal solutions for all instances, we omit the gap to optimality. However, for instances with up to 20 requests, previous analyses using the CPLEX solver have confirmed optimality. To compare the performance of these algorithms, we compute the relative gap for each algorithm, calculated as the percentage difference between an algorithm's solution objective value and the best solution found by any of the three algorithms for a given instance. The relative gap Gap_i for algorithm i is defined as:

\mathrm{Gap}_i = \frac{\mathrm{Obj}_i - \mathrm{Obj}_{best}}{\mathrm{Obj}_{best}} \times 100\%

where Obj_i is the solution objective value of algorithm i and Obj_best is the best solution objective value found by any of the three algorithms.
We evaluated the performance of each algorithm under the following configurations. All three algorithms used a maximum time limit as the stopping criterion. For instances with up to 15, 50, and 100/200 requests, the time limits were set to 300, 900, and 1200 seconds, respectively. The E-ILS algorithm retained its configuration as detailed in [3], while the CGP algorithm utilized the following parameters: N_interval = 2, N_iterations = 200, and N_LNS = 50. To fully understand these parameters and the CGP algorithm, consult the relevant paper [5]. Our proposed method's configuration involved dynamically adjusting the percentage p of requests considered for change within the LBDH: the percentage started at 20% and increased by 10% with each iteration, resetting to 20% after reaching 60%.
Tables 6 and 7 demonstrate that the proposed ML-CG algorithm deviates by at most 2.51% (3.86%) from the best solution on small-scale a-type (b-type) instances. However, as problem complexity increases, ML-CG outperforms the state-of-the-art algorithms CGP and E-ILS, demonstrating its practicality and high overall effectiveness, as evidenced by its average gap of 1.08% (1.65%). In contrast, the CGP algorithm exhibits moderate performance with an average gap of 17.52% (18.76%), while the E-ILS algorithm, with the highest average gap of 31.58% (35.27%), is the least effective in approximating optimal solutions. The incorporation of learning mechanisms into ML-CG enhances solution accuracy while maintaining comparable computational efficiency; in particular, its performance on larger instances highlights its potential for addressing the DARPDP.