## What's the better what to add constrains in OptaPlanner - optaplanner

### Crossover function for genetic

```I am writing a Time table generator in java, using AI approaches to satisfy the hard constraints and help find an optimal solution. So far I have implemented and Iterative construction (a most-constrained first heuristic) and Simulated Annealing, and I'm in the process of implementing a genetic algorithm.
Some info on the problem, and how I represent it then :
I have a set of events, rooms , features (that events require and rooms satisfy), students and slots
The problem consists in assigning to each event a slot and a room, such that no student is required to attend two events in one slot, all the rooms assigned fulfill the necessary requirements.
I have a grading function that for each set if assignments grades the soft constraint violations, thus the point is to minimize this.
The way I am implementing the GA is I start with a population generated by the iterative construction (which can leave events unassigned) and then do the normal steps: evaluate, select, cross, mutate and keep the best. Rinse and repeat.
My problem is that my solution appears to improve too little. No matter what I do, the populations tends to a random fitness and is stuck there. Note that this fitness always differ, but nevertheless a lower limit will appear.
I suspect that the problem is in my crossover function, and here is the logic behind it:
Two assignments are randomly chosen to be crossed. Lets call them assignments A and B. For all of B's events do the following procedure (the order B's events are selected is random):
Get the corresponding event in A and compare the assignment. 3 different situations might happen.
If only one of them is unassigned and if it is possible to replicate
the other assignment on the child, this assignment is chosen.
If both of them are assigned, but only one of them creates no
conflicts when assigning to the child, that one is chosen.
If both of them are assigned and none create conflict, on of
them is randomly chosen.
In any other case, the event is left unassigned.
This creates a child with some of the parent's assignments, some of the mother's, so it seems to me it is a valid function. Moreover, it does not break any hard constraints.
As for mutation, I am using the neighboring function of my SA to give me another assignment based on on of the children, and then replacing that child.
So again. With this setup, initial population of 100, the GA runs and always tends to stabilize at some random (high) fitness value. Can someone give me a pointer as to what could I possibly be doing wrong?
Thanks
Edit: Formatting and clear some things
```
```I think GA only makes sense if part of the solution (part of the vector) has a significance as a stand alone part of the solution, so that the crossover function integrates valid parts of a solution between two solution vectors. Much like a certain part of a DNA sequence controls or affects a specific aspect of the individual - eye color is one gene for example. In this problem however the different parts of the solution vector affect each other making the crossover almost meaningless. This results (my guess) in the algorithm converging on a single solution rather quickly with the different crossovers and mutations having only a negative affect on the fitness.
I dont believe GA is the right tool for this problem.
```
```If you could please provide the original problem statement, I will be able to give you a better solution. Here is my answer for the present moment.
A genetic algorithm is not the best tool to satisfy hard constraints. This is an assigment problem that can be solved using integer program, a special case of a linear program.
Linear programs allow users to minimize or maximize some goal modeled by an objective function (grading function). The objective function is defined by the sum of individual decisions (or decision variables) and the value or contribution to the objective function. Linear programs allow for your decision variables to be decimal values, but integer programs force the decision variables to be integer values.
So, what are your decisions? Your decisions are to assign students to slots. And these slots have features which events require and rooms satisfy.
In your case, you want to maximize the number of students that are assigned to a slot.
You also have constraints. In your case, a student may only attend at most one event.
The website below provides a good tutorial on how to model integer programs.
http://people.brunel.ac.uk/~mastjjb/jeb/or/moreip.html
For a java specific implementation, use the link below.
http://javailp.sourceforge.net/
SolverFactory factory = new SolverFactoryLpSolve(); // use lp_solve
factory.setParameter(Solver.VERBOSE, 0);
factory.setParameter(Solver.TIMEOUT, 100); // set timeout to 100 seconds
/**
* Constructing a Problem:
* Maximize: 143x+60y
* Subject to:
* 120x+210y <= 15000
* 110x+30y <= 4000
* x+y <= 75
*
* With x,y being integers
*
*/
Problem problem = new Problem();
Linear linear = new Linear();
problem.setObjective(linear, OptType.MAX);
linear = new Linear();
linear = new Linear();
linear = new Linear();
problem.setVarType("x", Integer.class);
problem.setVarType("y", Integer.class);
Solver solver = factory.get(); // you should use this solver only once for one problem
Result result = solver.solve(problem);
System.out.println(result);
/**
* Extend the problem with x <= 16 and solve it again
*/
problem.setVarUpperBound("x", 16);
solver = factory.get();
result = solver.solve(problem);
System.out.println(result);
// Results in the following output:
// Objective: 6266.0 {y=52, x=22}
// Objective: 5828.0 {y=59, x=16}
```
```I would start by measuring what's going on directly. For example, what fraction of the assignments are falling under your "any other case" catch-all and therefore doing nothing?
Also, while we can't really tell from the information given, it doesn't seem any of your moves can do a "swap", which may be a problem. If a schedule is tightly constrained, then once you find something feasible, it's likely that you won't be able to just move a class from room A to room B, as room B will be in use. You'd need to consider ways of moving a class from A to B along with moving a class from B to A.
You can also sometimes improve things by allowing constraints to be violated. Instead of forbidding crossover from ever violating a constraint, you can allow it, but penalize the fitness in proportion to the "badness" of the violation.
Finally, it's possible that your other operators are the problem as well. If your selection and replacement operators are too aggressive, you can converge very quickly to something that's only slightly better than where you started. Once you converge, it's very difficult for mutations alone to kick you back out into a productive search.
```
```I think there is nothing wrong with GA for this problem, some people just hate Genetic Algorithms no matter what.
Here is what I would check:
First you mention that your GA stabilizes at a random "High" fitness value, but isn't this a good thing? Does "high" fitness correspond to good or bad in your case? It is possible you are favoring "High" fitness in one part of your code and "Low" fitness in another thus causing the seemingly random result.
I think you want to be a bit more careful about the logic behind your crossover operation. Basically there are many situations for all 3 cases where making any of those choices would not cause an increase in fitness at all of the crossed-over individual, but you are still using a "resource" (an assignment that could potentially be used for another class/student/etc.) I realize that a GA traditionally will make assignments via crossover that cause worse behavior, but you are already performing a bit of computation in the crossover phase anyway, why not choose one that actually will improve fitness or maybe don't cross at all?
Optional Comment to Consider : Although your iterative construction approach is quite interesting, this may cause you to have an overly complex Gene representation that could be causing problems with your crossover. Is it possible to model a single individual solution as an array (or 2D array) of bits or integers? Even if the array turns out to be very long, it may be worth it use a more simple crossover procedure. I recommend Googling "ga gene representation time tabling" you may find an approach that you like more and can more easily scale to many individuals (100 is a rather small population size for a GA, but I understand you are still testing, also how many generations?).
One final note, I am not sure what language you are working in but if it is Java and you don't NEED to code the GA by hand I would recommend taking a look at ECJ. Maybe even if you have to code by hand, it could help you develop your representation or breeding pipeline.
```
```Newcomers to GA can make any of a number of standard mistakes:
In general, when doing crossover, make sure that the child has some chance of inheriting that which made the parent or parents winner(s) in the first place. In other words, choose a genome representation where the "gene" fragments of the genome have meaningful mappings to the problem statement. A common mistake is to encode everything as a bitvector and then, in crossover, to split the bitvector at random places, splitting up the good thing the bitvector represented and thereby destroying the thing that made the individual float to the top as a good candidate. A vector of (limited) integers is likely to be a better choice, where integers can be replaced by mutation but not by crossover. Not preserving something (doesn't have to be 100%, but it has to be some aspect) of what made parents winners means you are essentially doing random search, which will perform no better than linear search.
In general, use much less mutation than you might think. Mutation is there mainly to keep some diversity in the population. If your initial population doesn't contain anything with a fractional advantage, then your population is too small for the problem at hand and a high mutation rate will, in general, not help.
In this specific case, your crossover function is too complicated. Do not ever put constraints aimed at keeping all solutions valid into the crossover. Instead the crossover function should be free to generate invalid solutions and it is the job of the goal function to somewhat (not totally) penalize the invalid solutions. If your GA works, then the final answers will not contain any invalid assignments, provided 100% valid assignments are at all possible. Insisting on validity in the crossover prevents valid solutions from taking shortcuts through invalid solutions to other and better valid solutions.
I would recommend anyone who thinks they have written a poorly performing GA to conduct the following test: Run the GA a few times, and note the number of generations it took to reach an acceptable result. Then replace the winner selection step and goal function (whatever you use - tournament, ranking, etc) with a random choice, and run it again. If you still converge roughly at the same speed as with the real evaluator/goal function then you didn't actually have a functioning GA. Many people who say GAs don't work have made some mistake in their code which means the GA converges as slowly as random search which is enough to turn anyone off from the technique.```

### How to design acceptance probability function for simulated annealing with multiple distinct costs?

```I am using simulated annealing to solve an NP-complete resource scheduling problem. For each candidate ordering of the tasks I compute several different costs (or energy values). Some examples are (though the specifics are probably irrelevant to the question):
global_finish_time: The total number of days that the schedule spans.
split_cost: The number of days by which each task is delayed due to interruptions by other tasks (this is meant to discourage interruption of a task once it has started).
deadline_cost: The sum of the squared number of days by which each missed deadline is overdue.
The traditional acceptance probability function looks like this (in Python):
def acceptance_probability(old_cost, new_cost, temperature):
if new_cost < old_cost:
return 1.0
else:
return math.exp((old_cost - new_cost) / temperature)
So far I have combined my first two costs into one by simply adding them, so that I can feed the result into acceptance_probability. But what I would really want is for deadline_cost to always take precedence over global_finish_time, and for global_finish_time to take precedence over split_cost.
So my question to Stack Overflow is: how can I design an acceptance probability function that takes multiple energies into account but always considers the first energy to be more important than the second energy, and so on? In other words, I would like to pass in old_cost and new_cost as tuples of several costs and return a sensible value .
Edit: After a few days of experimenting with the proposed solutions I have concluded that the only way that works well enough for me is Mike Dunlavey's suggestion, even though this creates many other difficulties with cost components that have different units. I am practically forced to compare apples with oranges.
So, I put some effort into "normalizing" the values. First, deadline_cost is a sum of squares, so it grows exponentially while the other components grow linearly. To address this I use the square root to get a similar growth rate. Second, I developed a function that computes a linear combination of the costs, but auto-adjusts the coefficients according to the highest cost component seen so far.
For example, if the tuple of highest costs is (A, B, C) and the input cost vector is (x, y, z), the linear combination is BCx + Cy + z. That way, no matter how high z gets it will never be more important than an x value of 1.
This creates "jaggies" in the cost function as new maximum costs are discovered. For example, if C goes up then BCx and Cy will both be higher for a given (x, y, z) input and so will differences between costs. A higher cost difference means that the acceptance probability will drop, as if the temperature was suddenly lowered an extra step. In practice though this is not a problem because the maximum costs are updated only a few times in the beginning and do not change later. I believe this could even be theoretically proven to converge to a correct result since we know that the cost will converge toward a lower value.
One thing that still has me somewhat confused is what happens when the maximum costs are 1.0 and lower, say 0.5. With a maximum vector of (0.5, 0.5, 0.5) this would give the linear combination 0.5*0.5*x + 0.5*y + z, i.e. the order of precedence is suddenly reversed. I suppose the best way to deal with it is to use the maximum vector to scale all values to given ranges, so that the coefficients can always be the same (say, 100x + 10y + z). But I haven't tried that yet.
```
```mbeckish is right.
Could you make a linear combination of the different energies, and adjust the coefficients?
Possibly log-transforming them in and out?
I've done some MCMC using Metropolis-Hastings. In that case I'm defining the (non-normalized) log-likelihood of a particular state (given its priors), and I find that a way to clarify my thinking about what I want.
```
```I would take a hint from multi-objective evolutionary algorithm (MOEA) and have it transition if all of the objectives simultaneously pass with the acceptance_probability function you gave. This will have the effect of exploring the Pareto front much like the standard simulated annealing explores plateaus of same-energy solutions.
However, this does give up on the idea of having the first one take priority.
You will probably have to tweak your parameters, such as giving it a higher initial temperature.
```
```I would consider something along the lines of:
return (calculate probability)
else if (new global finish time > old global finish time)
return (calculate probability)
else if (new split cost > old split cost)
return (calculate probability)
else
return (1.0)
Of course each of the three places you calculate the probability could use a different function.
```
```It depends on what you mean by "takes precedence".
For example, what if the deadline_cost goes down by 0.001, but the global_finish_time cost goes up by 10000? Do you return 1.0, because the deadline_cost decreased, and that takes precedence over anything else?
This seems like it is a judgment call that only you can make, unless you can provide enough background information on the project so that others can suggest their own informed judgment call.```

### Score calculation performance using shadow variables

```From a score calculation perspective, is it correct to assume that shadow variable helps to arrive at the solution faster than without using shadow variable.
Making use of shadow variable allows VariableListner to reset the values of dependent entities closer to the final/optimal value.
```
```It depends. Both shadow variables and score calculation use delta's to create incremental calculation. Those delta's are the key to scaling out and getting a high score calculation speed per second (see last INFO logging line and the benchmark report). Whatever you do, keep an eye on that value for at least datasets of different sizes.
In theory, it shouldn't matter much if there is a shadow variable to simplify the score rules, or if there is no shadow variable and the score rules are more complex (and might use insertLogicals etc).
In practice, it often does matter: for vehicle routing, IIRC I 've seen that the shadow variable arrivalTime noticeably improved performance and scalability.
My advice is to use a shadow variable when it makes sense to have that on the domain. For example the arrivalTime etc. But use a simple calculated getter (without loops) when that suffices: for example departureTime (= arrivalTime + duration). And use the score rules for the rest.
In the end, it's a design choice: do the score rules need to figure out the departureTime or arrivalTime themselves or can we abstract away over that - by putting them in the model - and make the rules more natural to read?```

### Optimizing Parameters using AI technique

```I know that my question is general, but I'm new to AI area.
I have an experiment with some parameters (almost 6 parameters). Each one of them is independent one, and I want to find the optimal solution for maximum or minimum the output function. However, if I want to do it in traditional programming technique it will take much time since i will use six nested loops.
I just want to know which AI technique to use for this problem? Genetic Algorithm? Neural Network? Machine learning?
Update
Actually, the problem could have more than one evaluation function.
It will have one function that we should minimize it (Cost)
and another function the we want to maximize it (Capacity)
Maybe another functions can be added.
Example:
Construction a glass window can be done in a million ways. However, we want the strongest window with lowest cost. There are many parameters that affect the pressure capacity of the window such as the strength of the glass, Height and Width, slope of the window.
Obviously, if we go to extreme cases (Largest strength glass, with smallest width and height, and zero slope) the window will be extremely strong. However, the cost for that will be very high.
I want to study the interaction between the parameters in specific range.
```
```Without knowing much about the specific problem it sounds like Genetic Algorithms would be ideal. They've been used a lot for parameter optimisation and have often given good results. Personally, I've used them to narrow parameter ranges for edge detection techniques with about 15 variables and they did a decent job.
Having multiple evaluation functions needn't be a problem if you code this into the Genetic Algorithm's fitness function. I'd look up multi objective optimisation with genetic algorithms.
I'd start here: Multi-Objective optimization using genetic algorithms: A tutorial
```
```First of all if you have multiple competing targets the problem is confused.
You have to find a single value that you want to maximize... for example:
value = strength - k*cost
or
value = strength / (k1 + k2*cost)
In both for a fixed strength the lower cost wins and for a fixed cost the higher strength wins but you have a formula to be able to decide if a given solution is better or worse than another. If you don't do this how can you decide if a solution is better than another that is cheaper but weaker?
In some cases a correctly defined value requires a more complex function... for example for strength the value could increase up to a certain point (i.e. having a result stronger than a prescribed amount is just pointless) or a cost could have a cap (because higher than a certain amount a solution is not interesting because it would place the final price out of the market).
Once you find the criteria if the parameters are independent a very simple approach that in my experience is still decent is:
pick a random solution by choosing n random values, one for each parameter within the allowed boundaries
compute target value for this starting point
pick a random number 1 <= k <= n and for each of k parameters randomly chosen from the n compute a random signed increment and change the parameter by that amount.
compute the new target value from the translated solution
if the new value is better keep the new position, otherwise revert to the original one.
repeat from 3 until you run out of time.
Depending on the target function there are random distributions that work better than others, also may be that for different parameters the optimal choice is different.
```
```Some time ago I wrote a C++ code for solving optimization problems using Genetic Algorithms. Here it is: http://create-technology.blogspot.ro/2015/03/a-genetic-algorithm-for-solving.html
It should be very easy to follow.```

### Cplex/OPL local search

```I have a model implemented in OPL. I want to use this model to implement a local search in java. I want to initialize solutions with some heuristics and give these initial solutions to cplex find a better solution based on the model, but also I want to limit the search to a specific neighborhood. Any idea about how to do it?
Also, how can I limit the range of all variables? And what's the best: implement these heuristics and local search in own opl or in java or even C++?
```
```Just to add some related observations:
Re Ram's point 3: We have had a lot of success with approach b. In particular it is simple to add constraints to fix the some of the variables to values from a known solution, and then re-solve for the rest of the variables in the problem. More generally, you can add constraints to limit the values to be similar to a previous solution, like:
var >= previousValue - 1
var <= previousValue + 2
This is no use for binary variables of course, but for general integer or continuous variables can work well. This approach can be generalised for collections of variables:
sum(i in indexSet) var[i] >= (sum(i in indexSet) value[i])) - 2
sum(i in indexSet) var[i] <= (sum(i in indexSet) value[i])) + 2
This can work well for sets of binary variables. For an array of 100 binary variables of which maybe 10 had the value 1, we would be looking for a solution where at least 8 have the value 1, but not more than 12. Another variant is to limit something like the Hamming distance (assume that the vars are all binary here):
dvar int changed[indexSet] in 0..1;
forall(i in indexSet)
if (previousValue[i] <= 0.5)
changed[i] == (var[i] >= 0.5) // was zero before
else
changed[i] == (var[i] <= 0.5) // was one before
sum(i in indexSet) changed[i] <= 2;
Here we would be saying that out of an array of e.g. 100 binary variables, only a maximum of two would be allowed to have a different value from the previous solution.
Of course you can combine these ideas. For example, add simple constraints to fix a large part of the problem to previous values, while leaving some other variables to be re-solved, and then add constraints on some of the remaining free variables to limit the new solution to be near to the previous one. You will notice of course that these schemes get more complex to implement and maintain as we try to be more clever.
To make the local search work well you will need to think carefully about how you construct your local neighbourhoods - too small and there will be too little opportunity to make the improvements you seek, while if they are too large they take too long to solve, so you don't get to make so many improvement steps.
A related point is that each neighbourhood needs to be reasonably internally connected. We have done some experiments where we fixed the values of maybe 99% of the variables in a model and solved for the remaining 1%. When the 1% was clustered together in the model (e.g. all the allocation variables for a subset of resources) we got good results, while in comparison we got nowhere by just choosing 1% of the variables at random from anywhere in the model.
An often overlooked idea is to invert these same limits on the model, as a way of forcing some changes into the solution to achieve a degree of diversification. So you could add a constraint to force a specific value to be different from a previous solution, or ensure that at least two out of an array of 100 binary variables have a different value from the previous solution. We have used this approach to get a sort-of tabu search with a hybrid matheuristic model.
Finally, we have mainly done this in C++ and C#, but it would work perfectly well from Java. Not tried it much from OPL, but it should be fine too. The key for us was being able to traverse the problem structure and use problem knowledge to choose the sets of variables we freeze or relax - we just found that easier and faster to code in a language like C#, but then the modelling stuff is more difficult to write and maintain. We are maybe a bit "old-school" and like to have detailed fine-grained control of what we are doing, and find we need to create many more arrays and index sets in OPL to achieve what we want, while we can achieve the same effect with more intelligent loops etc without creating so many data structures in a language like C#.
```
```Those are several questions. So here are some pointers and suggestions:
In Cplex, you give your model an initial solution with the use of IloOplCplexVectors()
Here's a good example in IBM's documentation of how to alter CPLEX's solution.
Within OPL, you can do the same. You basically set a series of values for your variables, and hand those over to CPLEX. (See this example.)
Limiting the search to a specific neighborhood: There is no easy way to respond without knowing the details. But there are two ways that people do this:
a. change the objective to favor that 'neighborhood' and make other areas unattractive.
b. Add constraints that weed out other neighborhoods from the search space.
Regarding limiting the range of variables in OPL, you can do it directly:
dvar int supply in minQty..maxQty;
Or for a whole array of decision variables, you can do something along the lines of:
range CreditsAllowed = 3..12;
dvar int credits[student] in CreditsAllowed;
Hope this helps you move forward.```