what is genetic algorithm

Genetic Algorithm . During the 70s, in order to abstract and explain the processes of adaptation in natural systems, as well as to design an artificial system that by means of softwares emulated with the mechanisms of natural systems, this technique was known and became popular with the name of “genetic algorithms”. From that moment on, an important group of researchers emerged that devoted themselves to what is nowadays the field of development and application of genetic algorithm.

Main elements

The main elements of a genetic algorithm are:

  • Coding scheme, that is, the way in which a possible solution to the problem is represented.
  • Evaluation function, which indicates whether an individual is able to solve the problem.
  • Three basic operators: reproduction , crossing and mutation .
  • Parameters that control the performance of the genetic algorithm: probability of crossing, probability of mutation, size of the population , number of generations, etc.

In practice, the most common coding of solutions is through binary strings, although real numbers and letters or characters have also been used to represent the chromosomes. The first of these schemes has enjoyed much popularity because it was originally proposed by John Holland , and because it is easy to implement. Simple bit operations allow crossing, mutation and other operations. Although a large number of investigations were conducted on variable length chains and other structures, the greatest number of genetic algorithms has focused on fixed-length character strings.

Steps to follow

The steps to follow to generate a genetic algorithm, according to Coello are:

  • Generate the initial population.
  • Evaluate the adaptation of all individuals in the population.
  • Create a new population by performing operations such as selection / reproduction proportional to the adaptation, crossing and mutations in the individuals in which it has just been measured.
  • Replace the old town
  • Iterate using the new population, until it converges.

This implementation, seen in the form of a pseudocode, would be:

                      generate initial population, G (0);
                       evaluate G (0);
                       t: = 0;
                       repeat
                                  t: = t + 1
                                  generate G (t) using
                                  G (t-1);
                                  evaluate G (t);
                       until finding a solution;

Each iteration of this loop is known as generation. The first of this process is a population of individuals generated at random. From this point, the genetic operators, together with the adaptation measure, act to improve the population.

Main elements and operators of an AG

Generation of the initial population

The first task of the genetic algorithm is to create an initial population of chains. There are several ways to select an initial population. Techniques vary from randomly selecting each character of a string to modifying the search result previously made by man.

The simplest way to generate the initial population is to select each character of the chain completely randomly until the entire population is completed. In case the alphabet is binary, the probability that each bit is 1 is 50%.

The composition of the initial population can dramatically affect the behavior of the genetic algorithm.

The population must be long enough to create a diverse group of individuals in order for the genetic algorithm to exploit it, but not so much that it does not dominate computer time.

Coding scheme

If a problem is feasible to be represented by a set of parameters (known as genes), these can be linked to form a chain of values ​​(chromosome), this process is called coding. In genetics, that set represented by a particular chromosome is referred to as a genotype. It contains the information necessary to build an organism known as a phenotype . Those same terms are applied in genetic algorithms.

There are several issues related to the codification of a problem that must be taken into account at the time of execution:

  • Use the smallest possible alphabet to represent the parameters, normally binary digits are used.
  • The variables that represent the parameters of the problem must be discretized to be represented with bit strings. Sufficient resolution must be used to ensure that the output has an adequate level of accuracy. It is assumed that the discretization is representative of the objective function.
  • Most of the problems treated with genetic algorithms are non-linear and there are often hidden relationships between the variables that make up the solution. This interaction is referred to as epístasis, and it is necessary to take it into account for an adequate representation of the problem.

Evaluation function

Given a chromosome, the evaluation function consists in assigning a numerical value of adaptation, which is supposed to be proportional to the utility or ability of the individual represented. Other characteristics that this function must have is to punish bad solutions and reward good ones, so that the latter are the ones that propagate more quickly.

Selection / reproduction

It is the process by which an individual (chromosome) is copied proportionally to his evaluation (force), forming an intermediate set of individuals. This intermediate set tentatively becomes a new population to which the other genetic operators will be applied. In Nature , the strength of an individual is measured by their ability to survive in a certain environment .

There are two fundamental ways to perform the selection / reproduction procedure: roulette and stochastic tournament.

Forms of work of AG

There are several forms of work of genetic algorithms, each based on a metaphor different from nature.

Generative Generation Algorithms

It is similar to the way insects reproduce. In which a generation lays eggs, it moves geographically or dies and is replaced by a new one. In such a model crosses are made in a pool of individuals, the descendants are placed in another. At the end of the reproductive phase, the previous generation is eliminated and the new generation is used. This model is also known as a canonical genetic algorithm.

Fixed-state genetic algorithms

They use the generational scheme of mammals and other long-lived animals, in which parents and their descendants coexist, which allows the children to be educated by their parents, but also in the long run to generate competition among them.

In this model, not only the two parent individuals must be selected, but also those from the previous population will be eliminated, to give space to the descendants.

Parallel genetic algorithms

Part of the biological metaphor that motivated the use of genetic search is that it is inherently parallel, since when evolving, many solutions are presented simultaneously, each presented by an individual of the population. However, it is very common in Nature that not only is there an evolving population, but several, usually geographically isolated, that originate different responses to evolutionary pressure. This brings with it models that take into account such variation and use not a population like the previous ones, but multiple concurrently.

Models of Islands

If you have a population of individuals, it is divided into subpopulations that evolve independently as a normal genetic algorithm. Sometimes migrations occur between them, which allows them to exchange genetic material.

With migration, this model can exploit differences in subpopulations. Such variation represents a source of genetic diversity. However, if a large number of individuals migrate in each generation, a global mix occurs and local differences are eliminated. If migration is not frequent, premature convergence in subpopulations is likely.

Cell model

Place each individual in a matrix, where each one can only seek to reproduce with the others around him (closer to home) by what he chooses at random or at the best adapted. The descendant will move to a nearby position.

There are no islands in this model, but there are similar potential effects. If it is assumed that the crossing is restricted to adjacent individuals, two of them separated by 20 spaces will be as separate as if they were on two islands. This form is known as distance isolation.

Crossing

The crossing is carried out on the intermediate set generated by the reproduction. First, a pair of individuals is randomly selected to be crossed. Then, with the use of probability theory, it is determined if there will be a crossing between the two selected individuals or not. If so, both individuals are aligned and a K position is chosen randomly between 1 and Lc -1 (Lc is the length of the chromosome), in which the crossing will be made. Two new individuals will be created by exchanging the information of the parents between the position K + 1 and the position Lc even (See figure 1). This is known as crossing at one point.

When 2 crossing points are used, according to Goldberg, the procedure is similar. Normally the crossing is handled within the implementation of the genetic algorithm with a percentage that indicates how often it will be carried out. This means that not all pairs of chromosomes will cross, but there will be some that will pass intact to the next generation. There is a technique, called elitism, in which the most apt individual throughout the different generations does not cross paths with anyone and remains intact until a better one emerges that displaces it.

Mutation

The mutation is applied to each descendant individually after each crossing. This alters each of the genes of the chromosome at random, with a small probability. When a binary representation is used, a bit is replaced by its complement (a zero is changed by a one and vice versa). This operator allows the introduction of the new chromosomal material in the population, just as it happens with its biological equivalents. Like the crossing, the mutation is handled as a percentage that indicates how often it will be carried out, although it differs from the first one because it occurs much more sporadically (the crossover percentage is usually more than 60% while the mutation rate is greater than 60%). it is frequent not to exceed 5%). This low percentage in the probability of mutation avoids oscillations in the average of the objective values ​​of the population.

There are two ways to implement a mutation: with fixed probability or the use of a variable probability that changes according to the characteristics of the population.

The mutation seeks to further improve the solutions created with the two previous operators. It can be said then that genetic algorithms use solutions that by their evaluation have proven to be good for combining them and producing still better ones.

Fundamental problems in the implementation of an AG

A problem with genetic algorithms, given by a poor formulation of the model, is one in which the genes of a few relatively well-adapted but not optimal individuals can quickly dominate the population. Once this happens, the model’s ability to search for better solutions is completely eliminated, while only the mutation remains as a way to find other alternatives, while the algorithm becomes a random search.

One thought on “what is genetic algorithm

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.