
Datawhale 2024 AI Summer Camp Phase 2 - Power Demand Prediction Challenge

2024-07-12


#AIsummercamp# Datawhale #summercamp

1. About the competition

With the rapid development of the global economy and accelerating urbanization, power systems face growing challenges. Accurate forecasting of electricity demand is essential for the stable operation of the power grid, effective energy management, and the integration of renewable energy sources.

2. Competition task

Given sequence data and other information covering N days of historical electricity consumption for multiple houses, predict the electricity consumption of those houses.

2024 iFLYTEK AI Developer Competition - iFLYTEK Open Platform

3. Task 2: Advanced LightGBM, getting started with feature engineering

(1) Module import: this section lists the modules the code requires.

    import numpy as np
    import pandas as pd
    import lightgbm as lgb
    # mean_squared_error is needed for the offline score computed after training
    from sklearn.metrics import mean_squared_error, mean_squared_log_error, mean_absolute_error
    import tqdm
    import sys
    import os
    import gc
    import argparse
    import warnings
    warnings.filterwarnings('ignore')

(2) Data preparation

In the data preparation stage, we read the training and test data and do some basic exploration of it.

    train = pd.read_csv('./data/train.csv')
    test = pd.read_csv('./data/test.csv')

Brief introduction to the data: id is the house ID and dt is the day identifier. The smallest dt in the training data is 11, and different ids correspond to series of different lengths. type is the house type, since consumption differs between houses, and target is the electricity consumption to be predicted. A simple visual analysis follows to help us get a basic understanding of the data.

  • Bar chart of the mean target for each type

    import matplotlib.pyplot as plt
    # Bar chart of the mean target for each house type
    type_target_df = train.groupby('type')['target'].mean().reset_index()
    plt.figure(figsize=(8, 4))
    plt.bar(type_target_df['type'], type_target_df['target'], color=['blue', 'green'])
    plt.xlabel('Type')
    plt.ylabel('Average Target Value')
    plt.title('Bar Chart of Target by Type')
    plt.show()

  • Line chart of the target for id 00037f39cf, plotted as a series over dt

    specific_id_df = train[train['id'] == '00037f39cf']
    plt.figure(figsize=(10, 5))
    plt.plot(specific_id_df['dt'], specific_id_df['target'], marker='o', linestyle='-')
    plt.xlabel('DateTime')
    plt.ylabel('Target Value')
    plt.title("Line Chart of Target for ID '00037f39cf'")
    plt.show()
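The note above that different ids have series of different lengths can be checked with a per-id summary of dt. The following is a minimal sketch on a toy frame; the ids and values are invented for illustration, and only the column names id, dt, type and target come from the post.

```python
import pandas as pd

# Toy stand-in for train.csv; the ids and numbers are invented for illustration.
toy_train = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b"],
    "dt": [11, 12, 13, 11, 12],
    "type": [0, 0, 0, 1, 1],
    "target": [1.0, 2.0, 3.0, 10.0, 20.0],
})

# Earliest dt, latest dt, and number of recorded days for each house
series_summary = toy_train.groupby("id")["dt"].agg(["min", "max", "count"])
print(series_summary)
```

Running the same groupby on the real train frame would show how much the series length varies between houses.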

(3) Feature engineering

Here we mainly construct historical shift (lag) features and window statistics features.

  • Historical shift features: information from earlier time steps is obtained by shifting the history. As shown in the figure below, the value at time d-1 can be given to time d, and the value at time d to time d+1, which builds a shift feature with an offset of one unit.

  • Window statistics features: windows of different sizes can be constructed, and the mean, maximum, minimum, median, and variance computed over each window, which captures how the data has changed recently. As shown in the figure below, the values of the three time units before time d can be averaged and given to time d.

    # Merge the training and test data, then sort
    data = pd.concat([test, train], axis=0, ignore_index=True)
    data = data.sort_values(['id','dt'], ascending=False).reset_index(drop=True)
    # Historical shift (lag) features, computed per house
    for i in range(10,30):
        data[f'last{i}_target'] = data.groupby(['id'])['target'].shift(i)
    # Window statistics: mean of the three nearest available lags
    data['win3_mean_target'] = (data['last10_target'] + data['last11_target'] + data['last12_target']) / 3
    # Split back into training and test data
    train = data[data.target.notnull()].reset_index(drop=True)
    test = data[data.target.isnull()].reset_index(drop=True)
    # Select the input features
    train_cols = [f for f in data.columns if f not in ['id','target']]
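The window statistics described above also mention the maximum, minimum, median, and variance, while the code only builds a three-lag mean. A minimal sketch of how the other statistics could be derived from the same lag columns follows. It runs on a toy frame with lags 1 to 3 rather than the post's 10 to 29, it uses the standard deviation in place of the variance, and the extra win3_* column names are hypothetical, not taken from the post.

```python
import numpy as np
import pandas as pd

# Toy data: one house, dt 1..6. As the post's descending sort suggests,
# a smaller dt is assumed to be more recent; dt == 1 plays the role of a test row.
toy = pd.DataFrame({
    "id": ["a"] * 6,
    "dt": [1, 2, 3, 4, 5, 6],
    "target": [np.nan, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Same ordering as the post, so shift(i) pulls the value from i days earlier
toy = toy.sort_values(["id", "dt"], ascending=False).reset_index(drop=True)
lags = [1, 2, 3]  # illustrative; the post uses range(10, 30)
for i in lags:
    toy[f"last{i}_target"] = toy.groupby("id")["target"].shift(i)

# Row-wise statistics over the lag columns; skipna=False keeps a row NaN
# when any lag is missing, matching the behaviour of the summed mean.
window = toy[[f"last{i}_target" for i in lags]]
toy["win3_mean_target"] = window.mean(axis=1, skipna=False)
toy["win3_max_target"] = window.max(axis=1, skipna=False)
toy["win3_min_target"] = window.min(axis=1, skipna=False)
toy["win3_median_target"] = window.median(axis=1, skipna=False)
toy["win3_std_target"] = window.std(axis=1, skipna=False)

# For dt == 1 the window holds the targets of dt 2, 3 and 4 (20, 30, 40)
print(toy[toy["dt"] == 1].T)
```

Each extra statistic adds one column per window size, so memory use grows quickly on the full data.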

(4) Model training and test set prediction

LightGBM is chosen as the model. It is commonly used as a baseline in data mining competitions and gives relatively stable scores without much parameter tuning. Also note how the training and validation sets are built: because the data is a time series, it is split strictly by time, with the rows where dt is 30 or less used as the validation set. This ensures there is no data leakage (future data is not used to predict historical data).

    def time_model(clf, train_df, test_df, cols):
        # Split into training and validation sets by time
        trn_x, trn_y = train_df[train_df.dt>=31][cols], train_df[train_df.dt>=31]['target']
        val_x, val_y = train_df[train_df.dt<=30][cols], train_df[train_df.dt<=30]['target']
        # Build the model input data
        train_matrix = clf.Dataset(trn_x, label=trn_y)
        valid_matrix = clf.Dataset(val_x, label=val_y)
        # LightGBM parameters
        lgb_params = {
            'boosting_type': 'gbdt',
            'objective': 'regression',
            'metric': 'mse',
            'min_child_weight': 5,
            'num_leaves': 2 ** 5,
            'lambda_l2': 10,
            'feature_fraction': 0.8,
            'bagging_fraction': 0.8,
            'bagging_freq': 4,
            'learning_rate': 0.05,
            'seed': 2024,
            'nthread' : 16,
            'verbose' : -1,
        }
        # Train the model. Logging and early stopping are passed as callbacks because
        # the verbose_eval and early_stopping_rounds arguments were removed in LightGBM 4.0.
        model = clf.train(lgb_params, train_matrix, 50000, valid_sets=[train_matrix, valid_matrix],
                          categorical_feature=[],
                          callbacks=[clf.log_evaluation(500), clf.early_stopping(500)])
        # Predict on the validation and test sets
        val_pred = model.predict(val_x, num_iteration=model.best_iteration)
        test_pred = model.predict(test_df[cols], num_iteration=model.best_iteration)
        # Offline score (MSE on the validation set)
        score = mean_squared_error(val_y, val_pred)
        print(score)
        return val_pred, test_pred

    lgb_oof, lgb_test = time_model(lgb, train, test, train_cols)
    # Save the result file locally
    test['target'] = lgb_test
    test[['id','dt','target']].to_csv('submit.csv', index=False)

Running this gives the score.
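The time-based split used in time_model can be isolated and checked on its own. The sketch below assumes, as the dt thresholds and the descending sort in the code suggest, that a smaller dt is more recent; split_by_time and its val_max_dt parameter are hypothetical names, and the toy frame is invented for illustration.

```python
import pandas as pd

def split_by_time(df, val_max_dt=30):
    """Hold out the most recent days (dt <= val_max_dt) for validation.

    Assumes a smaller dt means a more recent day, so the remaining
    training rows (dt > val_max_dt) are all older than the validation rows.
    """
    val = df[df["dt"] <= val_max_dt]
    trn = df[df["dt"] > val_max_dt]
    return trn, val

# Toy frame: one house with dt 11..60 (the training data starts at dt 11)
toy = pd.DataFrame({
    "id": ["a"] * 50,
    "dt": list(range(11, 61)),
    "target": [float(v) for v in range(50)],
})

trn, val = split_by_time(toy)

# No overlap in time: every training day is older than every validation day
assert trn["dt"].min() > val["dt"].max()
print(len(trn), "training rows,", len(val), "validation rows")
```

A random split would mix older and newer days in both sets, which is the leakage the strict time split avoids.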

4. Advanced

The importance of feature engineering is self-evident.

I added some more features. I could have added more, but Colab did not have enough memory.

The dataset has a very large number of rows, so there was no room to add more.
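One common way to make room for more feature columns under a memory limit like the one described above is to downcast numeric columns. This is not something the post says it used; the sketch below is only an illustration, downcast_numeric is a hypothetical helper, and float32 trades some precision for the saving.

```python
import numpy as np
import pandas as pd

def downcast_numeric(df):
    """Return a copy with float64 columns stored as float32 and
    int64 columns stored in the smallest integer type that fits."""
    out = df.copy()
    for col in out.select_dtypes(include=["float64"]).columns:
        out[col] = out[col].astype(np.float32)
    for col in out.select_dtypes(include=["int64"]).columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    return out

# Toy frame with invented values, standing in for the feature table
toy = pd.DataFrame({
    "dt": np.arange(1000, dtype=np.int64),
    "target": np.random.rand(1000),
})

small = downcast_numeric(toy)
before = toy.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
```

Applying such a helper right after the lag features are created, and deleting intermediate frames followed by gc.collect(), may leave room for a few more columns.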