Got a YouTube account?

New: enable viewer-created translations and captions on your YouTube channel!

English subtitles

← Origins of Life: Astrobiology & General Theories for Life - Evolutionary Computation

Get Embed Code
2 Languages

Showing Revision 2 created 09/17/2019 by sana.ahmed.

  1. Today we're going to talk
    about how these ideas about
  2. evolution look as computations and
    how we can use computations
  3. to understand and leverage
    evolutionary ideas.
  4. Not Synced
    Let's first review the three basic elements
  5. Not Synced
    of Darwinian evolution,
  6. Not Synced
    which you already learned about.
  7. Not Synced
    Random variation of individuals.
  8. Not Synced
    Selection of some of those individuals,
  9. Not Synced
    based on differential fitness,
  10. Not Synced
    and the inheritance of those variations
  11. Not Synced
    into individuals of the next generation.
  12. Not Synced
    We also need to think a little bit about
  13. Not Synced
    how those variations are represented
  14. Not Synced
    and that really gets us into the field of genetics
  15. Not Synced
    and we don't need to know, we just
  16. Not Synced
    need to know that the information,
  17. Not Synced
    these variations, are represented as
  18. Not Synced
    discreet units, which today are called genes.
  19. Not Synced
    And we need to know that the representation
  20. Not Synced
    of the information in genes is separate
  21. Not Synced
    from how it's expressed in the phenotype.
  22. Not Synced
    So, we refer to this as the genotype versus
  23. Not Synced
    phenotype mapping.
  24. Not Synced
    And the final thing we need to know is
  25. Not Synced
    that these genes are organized in a linear
  26. Not Synced
    array, which today we call a chromosome.
  27. Not Synced
    And this understanding really started with
  28. Not Synced
    Mendel in 1865 and the culmination of it
  29. Not Synced
    was the Crick & Watson discovery of the
  30. Not Synced
    structure of DNA.
  31. Not Synced
    So, we're going to take those very simple
  32. Not Synced
    elements and simple understandings and
  33. Not Synced
    translate them into a computer algorithm.
  34. Not Synced
    And the way we're going to do that is,
  35. Not Synced
    instead of having chromosomes with genes
  36. Not Synced
    on them, we're going to have strings with bits.
  37. Not Synced
    Bits are numbers that are either
  38. Not Synced
    zero or one, they can only have two values,
  39. Not Synced
    and we're going to have those strings of bits
  40. Not Synced
    be our individuals in the population.
  41. Not Synced
    So, imagine that we start out, and this is
  42. Not Synced
    in the left-most panel of the figure.
  43. Not Synced
    Imagine that we start out with a population
  44. Not Synced
    of randomly generated individuals, or strings,
  45. Not Synced
    and here we only have three shown and they
  46. Not Synced
    each only have 5 bits, but in real genetic
  47. Not Synced
    algorithms we would have more bits and
  48. Not Synced
    larger populations.
  49. Not Synced
    We also need a way to evaluate fitness
  50. Not Synced
    and we do that by using a fitness function.
  51. Not Synced
    In our example fitness function, we will assume
  52. Not Synced
    that the values range from 0 to 1 and the
  53. Not Synced
    higher value (1) is better.
  54. Not Synced
    So, the first step is generate initial population,
  55. Not Synced
    then we need to evaluate each individual
  56. Not Synced
    in the population using our fitness function
  57. Not Synced
    and use those fitness values to select
  58. Not Synced
    the next generation.
  59. Not Synced
    And so, you see that in the center panel
  60. Not Synced
    where we have two copies of the highest
  61. Not Synced
    fitness individual, no copies of the lowest
  62. Not Synced
    fitness individual, and a single copy
  63. Not Synced
    of the average individual.
  64. Not Synced
    That's not very interesting because
  65. Not Synced
    those individuals look exactly like their parents.
  66. Not Synced
    And so, we take advantage of what are known as
  67. Not Synced
    genetic operators to introduce new variations.
  68. Not Synced
    And we do that using mutation that's shown
  69. Not Synced
    in the top individual where the first bit (a one)
  70. Not Synced
    is flipped to become a zero, and we do this
  71. Not Synced
    not always in the first position, we do it
  72. Not Synced
    randomly at different places in the string,
  73. Not Synced
    and do it randomly throughout the strings
  74. Not Synced
    of the population.
  75. Not Synced
    Then we also use a process called:
  76. Not Synced
    crossover, or recombination, where we take
  77. Not Synced
    two individuals and exchange pieces of their
  78. Not Synced
    DNA or pieces of their bit strings, and
  79. Not Synced
    you see that in the second two individuals.
  80. Not Synced
    So, now in the third panel, we have the
  81. Not Synced
    true, new generation, Generation T+1, and
  82. Not Synced
    we then need to repeat the cycle
  83. Not Synced
    and evaluate those using the fitness function,
  84. Not Synced
    do new selection, introduce new mutations
  85. Not Synced
    and crossovers, and that is how
  86. Not Synced
    the evolutionary process runs.
  87. Not Synced
    This basic strategy, and basic idea of using
  88. Not Synced
    bit strings to represent chromosomes was
  89. Not Synced
    introduced by John Holland in the early 1960s.
  90. Not Synced
    John was very interested in the
  91. Not Synced
    populations level effects and in the
  92. Not Synced
    impact of crossover.
  93. Not Synced
    Coming back to the algorithm...
  94. Not Synced
    what does it look like when we actually
  95. Not Synced
    run this algorithm for multiple generations?
  96. Not Synced
    So, I just showed you one generation in the previous slide,
  97. Not Synced
    but we typically iterate these for hundreds
  98. Not Synced
    or even thousands sometimes, of generations.
  99. Not Synced
    And the way we look at it is by plotting time
  100. Not Synced
    in terms of numbers of generations on the x-axis
  101. Not Synced
    and the fitness, typically the average fitness
  102. Not Synced
    of the population or the best fitness of
  103. Not Synced
    the population. Those are the two values
  104. Not Synced
    shown here on the y-axis.
  105. Not Synced
    And so, this is a very typical kind of
  106. Not Synced
    performance curve that we see with genetic
  107. Not Synced
    algorithms, where the fitness of the population
  108. Not Synced
    starts out very low initially, very quickly
  109. Not Synced
    climbs up and improves as the really lousy
  110. Not Synced
    individuals get deleted and the somewhat
  111. Not Synced
    better individuals get copied, so the whole
  112. Not Synced
    average fitness moves up.
  113. Not Synced
    Then there's a little bit of searching around
  114. Not Synced
    that we see, and we get another innovation.
  115. Not Synced
    But eventually, the population gets stuck
  116. Not Synced
    on what is known as a plateau.
  117. Not Synced
    This is known as punctuated equilibrium.
  118. Not Synced
    And when that happens, then the algorithm
  119. Not Synced
    is affectively having to explore the space
  120. Not Synced
    more widely to find a high fitness
  121. Not Synced
    innovation and so that can take a varying
  122. Not Synced
    amount of time, and we actually see two
  123. Not Synced
    plateaus, the first plateau we see in the figure
  124. Not Synced
    we see eventually and new innovation is found
  125. Not Synced
    and the population jumps up in fitness
  126. Not Synced
    and this is very typical of these
  127. Not Synced
    genetic algorithms.
  128. Not Synced
    Let's now talk about some applications,
  129. Not Synced
    how they're used.
  130. Not Synced
    Genetic algorithms have been used widely
  131. Not Synced
    in engineering applications and they've
  132. Not Synced
    also been used for scientific modeling.
  133. Not Synced
    First, we'll talk about the engineering
  134. Not Synced
    applications, and of these by far the
  135. Not Synced
    most common is what's known as
  136. Not Synced
    multi-parameter function optimization.
  137. Not Synced
    So, that's shown in the figure.
  138. Not Synced
    Where just for a two-dimensional function,
  139. Not Synced
    that is a function in two variables.
  140. Not Synced
    And if a function is complex a lot of times
  141. Not Synced
    we don't have analtyical methods to find mathematically
  142. Not Synced
    what the maximum value of the function is
  143. Not Synced
    and when that happens we have to resort to
  144. Not Synced
    sampling and we can think of the genetic algorithm
  145. Not Synced
    as a kind of biased sampling algorithm.
  146. Not Synced
    And the goal, of course, would be to find
  147. Not Synced
    in this multi-peaked surface points that
  148. Not Synced
    are on the highest peak up at the top.
  149. Not Synced
    Let's just go into a little bit more detail
  150. Not Synced
    about how that works.
  151. Not Synced
    Here is an example of such a function and
  152. Not Synced
    it's not trivial to analyze this mathematically,
  153. Not Synced
    but suppose we want to find the x and y values
  154. Not Synced
    for this function that produces the
  155. Not Synced
    maximum F(x,y).
  156. Not Synced
    So to do that, we take out bit strings and
  157. Not Synced
    conceptually think of the first half of the
  158. Not Synced
    string as being a representation of the
  159. Not Synced
    value of x, and the second half of the string
  160. Not Synced
    is being a representation of the value of y.
  161. Not Synced
    Now, to evaluate fitness we then need to
  162. Not Synced
    take these ones and zeros and interpret
  163. Not Synced
    them as Base 2 numbers, translating them
  164. Not Synced
    into their decimal equivalents, which is
  165. Not Synced
    shown on the figure, and then take those
  166. Not Synced
    decimal values and plug them into our
  167. Not Synced
    fitness function, and in this case we get
  168. Not Synced
    the value out 4.
  169. Not Synced
    So, we would have to do that for every
  170. Not Synced
    individual in the population.
  171. Not Synced
    I just want to say a word about where
  172. Not Synced
    these algorithms came from. I've mentioned
  173. Not Synced
    John Holland. There were several other
  174. Not Synced
    groups that were interested in similar
  175. Not Synced
    kinds of algorithms and the first three main
  176. Not Synced
    groups I've listed were all independent
  177. Not Synced
    discoveries of similar and overlapping ideas.
  178. Not Synced
    And then in the early 90s, John Koza came along
  179. Not Synced
    and really blew open the field by showing
  180. Not Synced
    how we could use genetic algorithms to
  181. Not Synced
    evolve computer programs.
  182. Not Synced
    These separate stream inventions are now
  183. Not Synced
    lumped together and the field is called
  184. Not Synced
    Evolutionary Computation.
  185. Not Synced
    And needless to say, there's been
  186. Not Synced
    a lot of recombination between these
  187. Not Synced
    different origins.