Technology Sharing

Python34 Genetic Algorithm (GA)

2024-07-12

한어Русский языкEnglishFrançaisIndonesianSanskrit日本語DeutschPortuguêsΕλληνικάespañolItalianoSuomalainenLatina

1. Development History

Genetic Algorithm (GA) is an optimization algorithm inspired by natural selection and genetics. It was first proposed by American scholar John Holland in the 1970s to study adaptability in natural systems and apply it to optimization problems in computer science.

Key development history
  • 1975: John Holland first proposed the concept of genetic algorithm in his book "Adaptation in Natural and Artificial Systems".

  • 1980s: Genetic algorithms began to be applied in fields such as function optimization and combinatorial optimization, and gradually attracted attention.

  • 1990s: With the improvement of computer computing power, genetic algorithms have been widely used in engineering, economics, biology and other fields.

  • 2000s to present: The theory and application of genetic algorithms have been further developed, and many improved algorithms have emerged, such as遗传编程(Genetic Programming)、差分进化(Differential Evolution) and so on.

2. Mathematical principles

Genetic algorithm is a random search method based on natural selection and genetics. Its basic idea is to simulate the biological evolution process.choosecrossandMutationsOperation, continuously generating a new generation of population, thus gradually approaching the optimal solution.

The basic steps of genetic algorithm
  1. initialization: Randomly generate the initial population.

  2. choose: Select better individuals according to the fitness function.

  3. cross: New individuals are produced by exchanging some genes of parent individuals.

  4. Mutations: Randomly change some of an individual's genes.

  5. Iteration: Repeat the selection, crossover, and mutation process until the termination condition is met.

Mathematical description

Assuming the population size is N and the chromosome length of each individual is L, the mathematical process of the genetic algorithm is as follows:

  1. initialization: Generate a population , where is a chromosome.

  2. Fitness calculation: Calculate the fitness value of each individual.

  3. choose: Select individuals based on fitness values. Common methods include roulette selection, tournament selection, etc.

  4. cross: Select some individuals for crossover operation to generate new individuals.

  5. Mutations: Perform mutation operations on some new individuals to generate mutant individuals.

  6. Update population: Replace old individuals with new ones to form a new generation of population.

  7. Termination condition: If the preset termination condition (such as the number of iterations or fitness threshold) is reached, the optimal solution is output.

3. Application scenarios

Genetic algorithms have been widely used in many fields due to their adaptability and robustness:

Engineering Optimization
  • Structural Optimization: Design lightweight and high-strength structure.

  • Parameter optimization: Adjust system parameters for optimal performance.

Economics and Finance
  • Portfolio Optimization: Allocate investment assets to maximize returns.

  • market prediction: Predict stock prices and market trends.

Bioinformatics
  • Gene sequence alignment: Perform DNA sequence comparison and analysis.

  • Protein structure prediction: Predict the three-dimensional structure of proteins.

Machine Learning
  • Neural network training: Optimize the weights and structure of neural networks.

  • Feature Selection: Select the features that are most useful for classification or regression.

4. Visualization of Python Examples

picture

The following code example implements a genetic algorithm that solves the traveling salesman problem.旅行商问题The Traveling Salesman Program (TSP) is a classic combinatorial optimization problem that aims to find the shortest path for a traveling salesman to visit each city in a given set of cities exactly once and eventually return to the starting city, that is, to minimize the total travel distance or cost. It is widely used in logistics, production scheduling and other fields.

We first define all provincial capital cities in China (includingTaipei, the capital of Taiwan Province of China) and their coordinates, and use哈夫曼公式Calculate the distance between two points, and then use遗传算法Create an initial population, reproduce, mutate and generate the next generation, continuously optimize the path, and finally record the distance and corresponding path of the shortest path in each generation, find the shortest path in all iterations, and visualize it on the mapAll city locationsFitness changes with the number of iterationsThe final result shows the shortest path from Chongqing through all cities and its total distance. The training was conducted twice. The first training was conducted with 2000 iterations and a running time of about 3 minutes. The second training was conducted with 20000 iterations and a running time of about 15 minutes.

To sum it up in one sentence: Starting from Chongqing, travel to all the provincial capitals in China and use genetic algorithms to find the shortest path. The following is the code implementation:

  1. import numpy as np
  2. import matplotlib.pyplot as plt
  3. from mpl_toolkits.basemap import Basemap
  4. import random
  5. from math import radians, sin, cos, sqrt, atan2
  6. # 设置随机种子以保证结果可重复
  7. random.seed(42)
  8. np.random.seed(42)
  9. # 定义城市和坐标列表,包括台北
  10. cities = {
  11.     "北京": (39.9042116.4074),
  12.     "天津": (39.3434117.3616),
  13.     "石家庄": (38.0428114.5149),
  14.     "太原": (37.8706112.5489),
  15.     "呼和浩特": (40.8426111.7511),
  16.     "沈阳": (41.8057123.4315),
  17.     "长春": (43.8171125.3235),
  18.     "哈尔滨": (45.8038126.5349),
  19.     "上海": (31.2304121.4737),
  20.     "南京": (32.0603118.7969),
  21.     "杭州": (30.2741120.1551),
  22.     "合肥": (31.8206117.2272),
  23.     "福州": (26.0745119.2965),
  24.     "南昌": (28.6820115.8579),
  25.     "济南": (36.6512117.1201),
  26.     "郑州": (34.7466113.6254),
  27.     "武汉": (30.5928114.3055),
  28.     "长沙": (28.2282112.9388),
  29.     "广州": (23.1291113.2644),
  30.     "南宁": (22.8170108.3665),
  31.     "海口": (20.0174110.3492),
  32.     "重庆": (29.5638106.5507),
  33.     "成都": (30.5728104.0668),
  34.     "贵阳": (26.6477106.6302),
  35.     "昆明": (25.0460102.7097),
  36.     "拉萨": (29.652091.1721),
  37.     "西安": (34.3416108.9398),
  38.     "兰州": (36.0611103.8343),
  39.     "西宁": (36.6171101.7782),
  40.     "银川": (38.4872106.2309),
  41.     "乌鲁木齐": (43.825687.6168),
  42.     "台北": (25.032969121.565418)
  43. }
  44. # 城市列表和坐标
  45. city_names = list(cities.keys())
  46. locations = np.array(list(cities.values()))
  47. chongqing_index = city_names.index("重庆")
  48. # 使用哈夫曼公式计算两点间的距离
  49. def haversine(lat1, lon1, lat2, lon2):
  50.     R = 6371.0  # 地球半径,单位为公里
  51.     dlat = radians(lat2 - lat1)
  52.     dlon = radians(lon1 - lon2)
  53.     a = sin(dlat / 2)**2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2)**2
  54.     c = 2 * atan2(sqrt(a), sqrt(1 - a))
  55.     distance = R * c
  56.     return distance
  57. # 计算路径总距离的函数,单位为公里
  58. def calculate_distance(path):
  59.     return sum(haversine(locations[path[i]][0], locations[path[i]][1], locations[path[i + 1]][0], locations[path[i + 1]][1]) for i in range(len(path) - 1))
  60. # 创建初始种群
  61. def create_initial_population(size, num_cities):
  62.     population = []
  63.     for _ in range(size):
  64.         individual = random.sample(range(num_cities), num_cities)
  65.         # 确保重庆为起点
  66.         individual.remove(chongqing_index)
  67.         individual.insert(0, chongqing_index)
  68.         population.append(individual)
  69.     return population
  70. # 对种群进行排名
  71. def rank_population(population):
  72.     # 按照路径总距离对种群进行排序
  73.     return sorted([(i, calculate_distance(individual)) for i, individual in enumerate(population)], key=lambda x: x[1])
  74. # 选择交配池
  75. def select_mating_pool(population, ranked_pop, elite_size):
  76.     # 选择排名前elite_size的个体作为交配池
  77.     return [population[ranked_pop[i][0]] for i in range(elite_size)]
  78. # 繁殖新个体
  79. def breed(parent1, parent2):
  80.     # 繁殖两个父母生成新个体
  81.     geneA = int(random.random() * (len(parent1) - 1)) + 1
  82.     geneB = int(random.random() * (len(parent1) - 1)) + 1
  83.     start_gene = min(geneA, geneB)
  84.     end_gene = max(geneA, geneB)
  85.     child = parent1[:start_gene] + parent2[start_gene:end_gene] + parent1[end_gene:]
  86.     child = list(dict.fromkeys(child))
  87.     missing = set(range(len(parent1))) - set(child)
  88.     for m in missing:
  89.         child.append(m)
  90.     # 确保重庆为起点
  91.     child.remove(chongqing_index)
  92.     child.insert(0, chongqing_index)
  93.     return child
  94. # 突变个体
  95. def mutate(individual, mutation_rate):
  96.     for swapped in range(1, len(individual) - 1):
  97.         if random.random() < mutation_rate:
  98.             swap_with = int(random.random() * (len(individual) - 1)) + 1
  99.             individual[swapped], individual[swap_with= individual[swap_with], individual[swapped]
  100.     return individual
  101. # 生成下一代
  102. def next_generation(current_gen, elite_size, mutation_rate):
  103.     ranked_pop = rank_population(current_gen)
  104.     mating_pool = select_mating_pool(current_gen, ranked_pop, elite_size)
  105.     children = []
  106.     length = len(mating_pool) - elite_size
  107.     pool = random.sample(mating_pool, len(mating_pool))
  108.     for i in range(elite_size):
  109.         children.append(mating_pool[i])
  110.     for i in range(length):
  111.         child = breed(pool[i], pool[len(mating_pool)-i-1])
  112.         children.append(child)
  113.     next_gen = [mutate(ind, mutation_rate) for ind in children]
  114.     return next_gen
  115. # 遗传算法主函数
  116. def genetic_algorithm(population, pop_size, elite_size, mutation_rate, generations):
  117.     pop = create_initial_population(pop_size, len(population))
  118.     progress = [(0, rank_population(pop)[0][1], pop[0])]  # 记录每代的最短距离和代数
  119.     for i in range(generations):
  120.         pop = next_generation(pop, elite_size, mutation_rate)
  121.         best_route_index = rank_population(pop)[0][0]
  122.         best_distance = rank_population(pop)[0][1]
  123.         progress.append((i + 1, best_distance, pop[best_route_index]))
  124.     best_route_index = rank_population(pop)[0][0]
  125.     best_route = pop[best_route_index]
  126.     return best_route, progress
  127. # 调整参数
  128. pop_size = 500       # 增加种群大小
  129. elite_size = 100     # 增加精英比例
  130. mutation_rate = 0.005 # 降低突变率
  131. generations = 20000   # 增加迭代次数
  132. # 运行遗传算法
  133. best_route, progress = genetic_algorithm(city_names, pop_size, elite_size, mutation_rate, generations)
  134. # 找到最短距离及其对应的代数
  135. min_distance = min(progress, key=lambda x: x[1])
  136. best_generation = min_distance[0]
  137. best_distance = min_distance[1]
  138. # 图1:地图上显示城市位置
  139. fig1, ax1 = plt.subplots(figsize=(1010))
  140. m1 = Basemap(projection='merc', llcrnrlat=18, urcrnrlat=50, llcrnrlon=80, urcrnrlon=135, resolution='l', ax=ax1)
  141. m1.drawcoastlines()
  142. m1.drawcountries()
  143. m1.fillcontinents(color='lightgray', lake_color='aqua')
  144. m1.drawmapboundary(fill_color='aqua')
  145. for i, (lat, lon) in enumerate(locations):
  146.     x, y = m1(lon, lat)
  147.     m1.plot(x, y, 'bo')
  148.     ax1.text(x, y, str(i), fontsize=8)
  149. ax1.set_title('City Locations')
  150. plt.show()
  151. # 图2:适应度随迭代次数变化
  152. fig2, ax2 = plt.subplots(figsize=(105))
  153. ax2.plot([x[1for x in progress])
  154. ax2.set_ylabel('Distance (km)')
  155. ax2.set_xlabel('Generation')
  156. ax2.set_title('Fitness over Iterations')
  157. # 在图中标注最短距离和对应代数
  158. ax2.annotate(f'Gen: {best_generation}nDist: {best_distance:.2f} km'
  159.              xy=(best_generation, best_distance), 
  160.              xytext=(best_generation, best_distance + 100),
  161.              arrowprops=dict(facecolor='red', shrink=0.05))
  162. plt.show()
  163. # 图3:显示最优路径
  164. fig3, ax3 = plt.subplots(figsize=(1010))
  165. m2 = Basemap(projection='merc', llcrnrlat=18, urcrnrlat=50, llcrnrlon=80, urcrnrlon=135, resolution='l', ax=ax3)
  166. m2.drawcoastlines()
  167. m2.drawcountries()
  168. m2.fillcontinents(color='lightgray', lake_color='aqua')
  169. m2.drawmapboundary(fill_color='aqua')
  170. # 绘制最优路径和有向箭头
  171. for i in range(len(best_route) - 1):
  172.     x1, y1 = m2(locations[best_route[i]][1], locations[best_route[i]][0])
  173.     x2, y2 = m2(locations[best_route[i + 1]][1], locations[best_route[i + 1]][0])
  174.     m2.plot([x1, x2], [y1, y2], color='g', linewidth=1, marker='o')
  175. # 只添加一个代表方向的箭头
  176. mid_index = len(best_route) // 2
  177. mid_x1, mid_y1 = m2(locations[best_route[mid_index]][1], locations[best_route[mid_index]][0])
  178. mid_x2, mid_y2 = m2(locations[best_route[mid_index + 1]][1], locations[best_route[mid_index + 1]][0])
  179. ax3.annotate('', xy=(mid_x2, mid_y2), xytext=(mid_x1, mid_y1), arrowprops=dict(facecolor='blue', shrink=0.05))
  180. # 在最优路径图上绘制城市位置
  181. for i, (lat, lon) in enumerate(locations):
  182.     x, y = m2(lon, lat)
  183.     m2.plot(x, y, 'bo')
  184.     ax3.text(x, y, str(i), fontsize=8)
  185. # 添加起点和终点标记
  186. start_x, start_y = m2(locations[best_route[0]][1], locations[best_route[0]][0])
  187. end_x, end_y = m2(locations[best_route[-1]][1], locations[best_route[-1]][0])
  188. ax3.plot(start_x, start_y, marker='^', color='red', markersize=15, label='Start (Chongqing)')  # 起点
  189. ax3.plot(end_x, end_y, marker='*', color='blue', markersize=15, label='End')   # 终点
  190. # 添加总距离的图例
  191. ax3.legend(title=f'Total Distance: {best_distance:.2f} km')
  192. ax3.set_title('Optimal Path')
  193. plt.show()

Visualization of the locations of all provincial capitals in China:

picture

The number of iterations is 2,000 (the shortest distance is 31,594 kilometers):

Find the iteration with the shortest path in the history of iterations and visualize the shortest path.

picture

picture

Case 2: The number of iterations is 20,000 (the shortest distance is 29,768 kilometers)

Find the iteration with the shortest path in the historical iteration process and visualize the shortest path of that iteration.

picture

picture

The above content is summarized from the Internet. If it is helpful, please forward it. See you next time!