Normalizing Constraints For A Multi-Objective Fitness Function

Most optimization problems involving genetic algorithms contain multiple constraints. A useful way of handling them is to multiply each constraint by a factor that scales it to a predefined range, so that every constraint contributes equally to the final fitness value.

Example:

//calculate fitness
m_dFitness = distFit + 400*rotFit + 4*fitAirTime;

The reference value in the example above is 400: distFit has a maximum of 400, so rotFit, which has a maximum of 1, is multiplied by 400, and fitAirTime, which has a maximum of 100, is multiplied by 4 (400/100). Each scaled term can therefore reach the same maximum of 400.
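If each term's maximum is known, the multipliers can be computed rather than hard-coded. A minimal C# sketch, using the maxima from the example above (the variable names and raw values are illustrative):

// Minimal sketch: derive each scale factor from the term's known maximum so
// that every scaled term can reach the same maximum as the reference term.
const double maxDistFit = 400.0, maxRotFit = 1.0, maxFitAirTime = 100.0;
double distFit = 250.0, rotFit = 0.8, fitAirTime = 60.0; // illustrative raw scores

double rotFactor = maxDistFit / maxRotFit;         // 400
double airTimeFactor = maxDistFit / maxFitAirTime; // 4

double fitness = distFit + rotFactor * rotFit + airTimeFactor * fitAirTime;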

If you want to normalize each constraint to a different range, you can use the formula y = r1 + (x - A) * (r2 - r1) / (B - A), where [A, B] is the original range of x and [r1, r2] is the range that x should be normalized to.

Example:

//calculate fitness: each constraint is normalized to the range [1, 10]
var distFit    = 1 + (dist - 0)    * (10 - 1) / (50 - 0); // distance, original range [0, 50]
var rotFit     = 1 + (rot - 0)     * (10 - 1) / (90 - 0); // rotation, original range [0, 90]
var fitAirTime = 1 + (airTime - 0) * (10 - 1) / (30 - 0); // air time, original range [0, 30]

m_dFitness = distFit + rotFit + fitAirTime;

This approach is more flexible, since it does not rely on any one constraint's maximum to derive the multiplication factors.
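To avoid repeating the arithmetic, the mapping can be wrapped in a small helper. A minimal C# sketch (the method name is illustrative):

// Minimal sketch: map a value x from its original range [a, b] to [r1, r2].
static double Normalize(double x, double a, double b, double r1, double r2)
{
    return r1 + (x - a) * (r2 - r1) / (b - a);
}

// Example usage, matching the ranges above:
// var distFit = Normalize(dist, 0, 50, 1, 10);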


Predicting The Lottery With MATLAB® Neural Network

DISCLAIMER: This post does not in any way prove or disprove the validity of using neural networks to predict the lottery. It is purely for the purpose of demonstrating certain capabilities available in MATLAB®. The results and conclusions are my opinion and may or may not constitute applicable techniques for predicting the popular Jamaican Lottery Cash Pot game.

Background:

Supreme Ventures Jamaica Limited has a lottery game called Cash Pot (CP). The game is based on 36 balls being loaded into a chamber and one ball being selected at random from the grouping. The game is run four (4) times each day, seven (7) days per week.

Anecdotal Heuristics:

While doing a little tongue-in-cheek research at my favorite barbershop, I stumbled upon some heuristics employed by most patrons who play the CP game. One involved writing down the day, time, and winning number for each day's lottery. After building up a sufficient dataset, they could then query a particular day and time and, with some simple arithmetic, tally the most likely number to be played on that day and time. I was informed that this proved to be a very effective way of telling which number was to be played next. Another popular heuristic involved pre-assigned symbols associated with each of the thirty-six (36) numbers. Then, based on dreams, aka "rakes", numbers would be chosen that matched the symbols seen in the "rake". These two methods were the favorites amongst the players of Cash Pot.

Procedure for predicting Cash Pot with MATLAB ANN:

  1. Get the dataset from the Supreme Ventures Jamaica website. [It contains all winning numbers with their date and time.]
  2. We will need to do some twiddling with the file in order to get it into a format that MATLAB can use; to do that, remove all headings/sub-headings and labels.
  3. Next remove the DRAW# and MARK columns since we will not be using those in our analysis.
  4. In column D, use the =WEEKDAY() formula to get the day number from the corresponding date; repeat for all rows.
  5. Use find and replace to replace MORNING with 1, MIDDAY with 2, DRIVETIME with 3 and EVENING with 4. [Save the file]
  6. Using the MATLAB folder explorer, navigate to the file, then double-click it to run the import tool.
  7. Select columns B and D, then hit the import button; this should import only columns B and D. Rename the imported matrix to cpInputs.
  8. Select column C and hit the import button; this should import column C only. Rename the imported matrix to cpTargets.
  9. Because MATLAB expects neural network (NN) features to be arranged as rows, transpose the two matrices using
  10. cpInputs = cpInputs';
    cpTargets = cpTargets';
    
  11. In the MATLAB command window type nntool.
  12. Import cpInputs and cpTargets into the NN data manager.
  13. Hit the new button on the Neural Network Data Manager and change the default name to cpNN.
  14. Set Input data to cpInputs, Target data to cpTargets.
  15. Hit the create button to create the NN.
  16. Note: The newly created NN has two inputs, the first being the day of the week on which the CP draw is scheduled to be played and the second being the time of day at which it is scheduled to be played. It also has a hidden layer of 10 neurons with associated biases, and an output layer with 1 neuron and its associated bias. The output is a scalar double that represents the predicted winning number.

  17. Let’s go ahead and train this network. On the train tab of the Network: cpNN dialog, select cpInputs for Inputs and cpTargets for Targets; then press the Train Network button to start the network training.
  18. Results of training.
  19. After training the network to the desired tolerance, go back to the Neural Network/Data Manager dialog box and hit the export button; select cpNN from the list, then hit the export button again.
  20. Go back to the MATLAB command window and type
  21. cpNN([2;3]) % [day;time]
    
  22. The resulting value will be the NN's best guess at the winning Cash Pot number for a Tuesday at DRIVETIME. (A script version of these steps is sketched below.)
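For readers who prefer the command line, the interactive steps above can be condensed into a short script. This is only a minimal sketch, assuming cpInputs (2×N matrix of day and time codes) and cpTargets (1×N matrix of winning numbers) already exist in the workspace as described in steps 7–10; feedforwardnet and train are standard Neural Network Toolbox functions that create and train the same kind of two-layer feed-forward network.

% Minimal sketch of the nntool steps above, run from the command window.
% Assumes cpInputs and cpTargets already exist with features as rows.
cpNN = feedforwardnet(10);               % one hidden layer with 10 neurons
cpNN = train(cpNN, cpInputs, cpTargets); % train with the default settings
prediction = cpNN([2; 3])                % best guess for [day; time] = [2; 3]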

Conclusions:

My initial analysis of the NN's results was inconclusive; perhaps the parameters of the NN could be adjusted and the results compared against actual winning numbers. However, even after doing so, one may find that the outputs are still random and contain no discernible patterns, which should be the case for a supposedly random game of chance.

Simple Genetic Algorithm To Evolve A String of Integers

Genetic Algorithms are a means of optimization copied from the natural world. According to the theory of evolution, nature has a way of selecting the best (fittest) individuals to mate (crossover) and reproduce, thus carrying on the best features of each selected individual. Even though traits from both parents are carried over into children, there is still an element of randomness involved (mutation) that gives offspring the ability to explore the fitness landscape and adapt to their environment. Genetic Algorithms try to mimic this behavior with three common operators: (1) selection, (2) crossover, (3) mutation.

Selection:
The selection operator determines how the individuals of a population are chosen to mate. A popular strategy is elitism, where the fittest individuals are carried straight over into the next generation, and this is the method we will use in our genetic algorithm implementation; a minimal sketch follows.
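This sketch mirrors the elitism step of the Epoch function shown later in this post; the sequence type, its fitness field, and the threshold parameter are assumptions taken from that code.

// Minimal C# sketch of elitist selection: keep every individual whose fitness
// meets the survivor threshold (requires System.Linq and System.Collections.Generic).
static List<sequence> SelectElites(List<sequence> population, int survivorThreshold)
{
    return population.Where(i => i.fitness >= survivorThreshold).ToList();
}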

Crossover:
The crossover operator determines how the parents are recombined to form offspring. We will be using single-point crossover in this implementation; a minimal sketch follows.
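The real CrossOver routine lives in the linked GitHub listing; this is only a sketch of single-point crossover, assuming the sequence chromosome (a List<string> of genes) used later in this post, with the cut point chosen at random.

// Minimal C# sketch of single-point crossover: the child takes the genes before
// a random cut point from one parent and the genes after it from the other.
// rng would be a field of the GA class (requires System.Linq).
static readonly Random rng = new Random();

static sequence SinglePointCrossover(sequence mum, sequence dad)
{
    int cut = rng.Next(1, mum.buffer.Count); // cut point inside the chromosome
    var childGenes = mum.buffer.Take(cut).Concat(dad.buffer.Skip(cut)).ToList();
    return new sequence { buffer = childGenes };
}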

Mutation:
Mutation inserts randomness into the genotype of each offspring, giving it the ability to diversify away from the features of its parents; a minimal sketch follows.
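One simple scheme, assumed here purely for illustration (the actual Mutate routine is in the GitHub listing and operates on a whole collection), is to occasionally swap two randomly chosen genes:

// Minimal C# sketch of mutation: with probability mutationRate, swap two
// randomly chosen genes of the chromosome. The swap scheme is illustrative only;
// rng is the shared Random instance from the crossover sketch above.
static void SwapMutate(sequence seq, double mutationRate)
{
    if (rng.NextDouble() >= mutationRate) return; // leave most offspring untouched

    int a = rng.Next(seq.buffer.Count);
    int b = rng.Next(seq.buffer.Count);
    (seq.buffer[a], seq.buffer[b]) = (seq.buffer[b], seq.buffer[a]); // swap the genes
}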

Note
Implementing genetic algorithms can be seen as somewhat of an art, because almost all of the code is boilerplate except for the chromosome representation and the way the fitness of each individual is calculated. These two factors usually have the most impact on the accuracy and speed of the genetic algorithm and are the most difficult to get right. They, along with the mutation rate, crossover rate, and selection method, have to be tinkered with until a viable configuration is reached.

This implementation will evolve the number sequence 123456789 in that specific order; a sketch of the chromosome and a helper for generating random candidates follows.
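The exact definitions are in the GitHub listing; the sketch below is an assumption based on how sequence and GenerateSequence are used in the functions that follow.

// Minimal C# sketch of the chromosome: a list of single-digit genes plus a
// fitness score. GenerateSequence builds a random candidate by shuffling the
// digits 1-9 (the shuffle-based generator is an assumption).
class sequence
{
    public List<string> buffer;
    public int fitness;
}

static sequence GenerateSequence()
{
    var digits = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9" };
    for (int i = digits.Count - 1; i > 0; i--) // Fisher-Yates shuffle
    {
        int j = rng.Next(i + 1);
        (digits[i], digits[j]) = (digits[j], digits[i]);
    }
    return new sequence { buffer = digits };
}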

Interesting Functions

private static void CalculateFitness(sequence seq)
{
    // The target is the digits 1 through 9 in order.
    var ordered = new sequence { buffer = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9" } };

    // Fitness is the number of positions whose gene already matches the target,
    // so a perfect individual scores 9.
    int result = 0;
    for (int i = 0; i < 9; i++)
    {
        if (ordered.buffer[i] == seq.buffer[i]) result++;
    }
    seq.fitness = result;
}

 

private void Epoch(List<sequence> population)
{
    // Elitism: individuals at or above the survivor threshold are carried
    // over into the next generation unchanged.
    survivors.AddRange(population.Where(i => i.fitness >= survivorThreshold));

    // Mutate the reasonably fit individuals and recombine the fittest ones.
    Mutate(population.Where(i => i.fitness >= mutantThreshold));

    CrossOver(population.Where(i => i.fitness >= crossoverThreshold));

    // Rebuild the population from the survivors...
    population.Clear();
    population.AddRange(survivors);

    // ...and top it up with freshly generated random sequences.
    for (int i = 0; i < populationSize - survivors.Count; i++)
    {
        var temp = GenerateSequence();
        CalculateFitness(temp);
        population.Add(temp);
    }
    survivors.Clear();
}
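For context, here is a minimal sketch of how Epoch might be driven; the initial population setup, the populationSize field, and the stopping condition are assumptions, and the actual driver is in the GitHub listing.

// Minimal sketch of a driver loop: build an initial random population, then
// run epochs until a perfect individual (fitness 9, i.e. 1-9 in order) appears.
// Assumes this runs inside the GA class (requires System.Linq).
var population = new List<sequence>();
for (int i = 0; i < populationSize; i++)
{
    var candidate = GenerateSequence();
    CalculateFitness(candidate);
    population.Add(candidate);
}

while (population.Max(s => s.fitness) < 9)
{
    Epoch(population);
}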

The full code listing can be found on GitHub.