
[Machine Learning] Applying Support Vector Machines and Principal Component Analysis in Machine Learning

2024-07-12


The support vector machine (SVM) is a powerful and versatile machine learning model suited to linear and nonlinear classification, regression, and outlier detection. This article introduces the support vector machine algorithm and its implementation in scikit-learn, and briefly explores principal component analysis (PCA) and its application in scikit-learn.

1. Overview of support vector machines

The support vector machine (SVM) algorithm is widely used in machine learning, prized for delivering significant accuracy with relatively little computation. SVM can be used for both classification and regression tasks, but it is most widely applied to classification problems.

What is a support vector machine?

The goal of a support vector machine is to find a hyperplane in N-dimensional space (where N is the number of features) that clearly separates the data points. The hyperplane should separate points of different classes while staying as far as possible from all of them, which makes the classification robust.


Many hyperplanes could separate the data points. The goal is to choose the hyperplane with the maximum margin, i.e., the greatest distance between the two classes. Maximizing the margin helps improve classification accuracy.

Hyperplanes and support vectors


A hyperplane is a decision boundary that separates the data points. Points falling on either side of the hyperplane can be assigned to different classes. The dimension of the hyperplane depends on the number of features: with 2 input features the hyperplane is a straight line; with 3 it is a plane; once the number of features exceeds 3, the hyperplane becomes hard to visualize.


Support vectors are the data points closest to the hyperplane; they influence the hyperplane's position and orientation. Using these support vectors, we maximize the classifier's margin. Removing a support vector changes the position of the hyperplane, so they are crucial to building the SVM. The sketch below shows how to inspect them with scikit-learn.
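As a minimal illustrative sketch (toy two-dimensional data, not the dataset used later in this article), we can fit a linear SVC, inspect its support vectors, and compute the margin width 2 / ||w|| from the learned weights:

import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D data (illustrative only)
rng = np.random.RandomState(0)
X = np.r_[rng.randn(20, 2) - [2, 2], rng.randn(20, 2) + [2, 2]]
y = np.r_[np.zeros(20), np.ones(20)]

clf = SVC(kernel='linear', C=1.0).fit(X, y)

# The training points closest to the hyperplane
print(clf.support_vectors_)

# For a linear SVM the margin width is 2 / ||w||
print("margin width:", 2 / np.linalg.norm(clf.coef_[0]))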

Large margin intuition

In logistic regression, we use the sigmoid function to squash the output of a linear function into the range [0, 1] and assign labels based on a threshold (0.5). In an SVM, we use the value of the linear function directly: if the output is greater than 1, the point is assigned to one class, and if it is less than -1, to the other. The SVM forms the margin range [-1, 1] by setting the output threshold values to 1 and -1, as the sketch below makes concrete.
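Here is a small sketch (same toy data as above) showing that decision_function returns the raw value of the linear function; its sign picks the class, and the support vectors lie on or inside the ±1 margin lines:

import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = np.r_[rng.randn(20, 2) - [2, 2], rng.randn(20, 2) + [2, 2]]
y = np.r_[np.zeros(20), np.ones(20)]

clf = SVC(kernel='linear', C=1.0).fit(X, y)

# Raw values of the linear function w.x + b
scores = clf.decision_function(X)
print(scores[clf.support_])                           # values at or inside +/-1
print(np.all((scores > 0) == (clf.predict(X) == 1)))  # sign matches the class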

2. Data preprocessing and visualization

We will use a support vector machine to predict whether a breast cancer diagnosis is benign or malignant.

Basic information about the dataset

  • The dataset has 30 features, such as:
    • radius (mean distance from the center to points on the perimeter)
    • texture (standard deviation of gray-scale values)
    • perimeter
    • area
    • smoothness (local variation in radius lengths)
    • compactness (perimeter^2 / area - 1.0)
    • concavity (severity of concave portions of the contour)
    • concave points (number of concave portions of the contour)
    • symmetry
    • fractal dimension ("coastline approximation" - 1)
  • The dataset contains 569 samples; the class distribution is 212 malignant and 357 benign.
  • Target classes:
    • malignant
    • benign

Import the required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
sns.set_style('whitegrid')

Load the dataset

from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Create a DataFrame
col_names = list(cancer.feature_names)
col_names.append('target')
df = pd.DataFrame(np.c_[cancer.data, cancer.target], columns=col_names)
df.head()

Data overview

df.info()

print(cancer.target_names)
# ['malignant' 'benign']

# Statistical summary:
df.describe()

Data visualization

Scatterplot matrix of feature pairs
sns.pairplot(df, hue='target', vars=[
    'mean radius', 'mean texture', 'mean perimeter', 'mean area',
    'mean smoothness', 'mean compactness', 'mean concavity',
    'mean concave points', 'mean symmetry', 'mean fractal dimension'
])
Class distribution bar chart
sns.countplot(x=df['target'], label="Count")


Scatterplot of mean area vs. mean smoothness
plt.figure(figsize=(10, 8))
sns.scatterplot(x='mean area', y='mean smoothness', hue='target', data=df)


Heatmap of correlations between variables
plt.figure(figsize=(20,10))
sns.heatmap(df.corr(), annot=True)


3. Model training

In machine learning, model training is a critical step toward solving the problem. Below we show how to use scikit-learn to train models and compare the performance of support vector machines (SVM) with different kernels.

Data preparation and preprocessing

First we need to prepare and preprocess the data. The following code is an example of data preprocessing:

from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = df.drop('target', axis=1)
y = df.target

print(f"'X' shape: {X.shape}")
print(f"'y' shape: {y.shape}")
# 'X' shape: (569, 30)
# 'y' shape: (569,)

pipeline = Pipeline([
    ('min_max_scaler', MinMaxScaler()),
    ('std_scaler', StandardScaler())
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In this code we use MinMaxScaler and StandardScaler, combined in a Pipeline, to scale the data. The data is split into training and test sets, with 30% of the data used for testing. (The pipeline is applied to the split data in Section 4.)

Evaluating model performance

To evaluate model performance, we define a print_score function that prints the accuracy score, classification report, and confusion matrix for the training or test set:

from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import pandas as pd

def print_score(clf, X_train, y_train, X_test, y_test, train=True):
    if train:
        pred = clf.predict(X_train)
        clf_report = pd.DataFrame(classification_report(y_train, pred, output_dict=True))
        print("Train Result:\n================================================")
        print(f"Accuracy Score: {accuracy_score(y_train, pred) * 100:.2f}%")
        print("_______________________________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_______________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_train, pred)}\n")
    else:
        pred = clf.predict(X_test)
        clf_report = pd.DataFrame(classification_report(y_test, pred, output_dict=True))
        print("Test Result:\n================================================")
        print(f"Accuracy Score: {accuracy_score(y_test, pred) * 100:.2f}%")
        print("_______________________________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_______________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_test, pred)}\n")

Support vector machine (SVM)

The support vector machine (SVM) is a powerful classification algorithm whose performance is strongly affected by its hyperparameters. The main SVM parameters, and their impact on model performance, are introduced below (a short sketch follows the list):

  • C parameter: controls the trade-off between classifying training points correctly and keeping the decision boundary smooth. A smaller C lowers the cost (penalty) of misclassification (a softer margin), while a larger C makes misclassification more costly (a harder margin), forcing the model to fit the input data more strictly.
  • gamma parameter: controls the range of influence of a single training sample. A larger gamma confines the influence to nearby points (closer points carry more weight), while a smaller gamma lets the influence reach further (more general solutions).
  • degree parameter: the degree of the polynomial kernel function ('poly'); it is ignored by the other kernels. The optimal hyperparameter values can be found with a grid search.
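As a quick illustrative sketch (hypothetical parameter values, reusing the X_train/X_test split from above), the effect of C and gamma on an RBF SVM can be seen by comparing training and test accuracy across a few settings:

from sklearn.svm import SVC

# Large gamma with large C tends to overfit; very small values underfit
for C in [0.1, 1, 10]:
    for gamma in [0.01, 0.1, 1]:
        clf = SVC(kernel='rbf', C=C, gamma=gamma).fit(X_train, y_train)
        print(f"C={C:<4} gamma={gamma:<5} "
              f"train={clf.score(X_train, y_train):.3f} "
              f"test={clf.score(X_test, y_test):.3f}")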

Linear Kernel SVM

The linear-kernel SVM suits many situations, especially when the dataset has a large number of features. The following is example code using a linear SVM:

from sklearn.svm import LinearSVC

model = LinearSVC(loss='hinge', dual=True)
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

The training and test results are as follows:

Training results:

Accuracy Score: 86.18%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    1.000000    0.819079  0.861809    0.909539      0.886811
recall       0.630872    1.000000  0.861809    0.815436      0.861809
f1-score     0.773663    0.900542  0.861809    0.837103      0.853042
support    149.000000  249.000000  0.861809  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[ 94  55]
 [  0 249]]

Test results:

Accuracy Score: 89.47%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   1.000000    0.857143  0.894737    0.928571      0.909774
recall      0.714286    1.000000  0.894737    0.857143      0.894737
f1-score    0.833333    0.923077  0.894737    0.878205      0.890013
support    63.000000  108.000000  0.894737  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 45  18]
 [  0 108]]

Polynomial Kernel SVM

The polynomial-kernel SVM is suited to nonlinear data.
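For reference, the polynomial kernel computed by SVC is K(x, y) = (gamma * <x, y> + coef0) ** degree; here is a minimal check (toy array, illustrative only) against scikit-learn's pairwise helper:

import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

A = np.array([[0., 1.], [1., 2.]])

# (gamma * <x, y> + coef0) ** degree, as used by SVC(kernel='poly')
K = polynomial_kernel(A, degree=2, gamma=1.0, coef0=1)
print(np.allclose(K, (1.0 * A @ A.T + 1) ** 2))  # True

With that in mind, the following example trains a second-degree polynomial kernel: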

from sklearn.svm import SVC

model = SVC(kernel='poly', degree=2, gamma='auto', coef0=1, C=5)
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

The training and test results are as follows:

Training results:

Accuracy Score: 96.98%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    0.985816    0.961089  0.969849    0.973453      0.970346
recall       0.932886    0.991968  0.969849    0.962427      0.969849
f1-score     0.958621    0.976285  0.969849    0.967453      0.969672
support    149.000000  249.000000  0.969849  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[139  10]
 [  2 247]]

Test results:

Accuracy Score: 97.08%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   0.967742    0.972477   0.97076    0.970109      0.970733
recall      0.952381    0.981481   0.97076    0.966931      0.970760
f1-score    0.960000    0.976959   0.97076    0.968479      0.970711
support    63.000000  108.000000   0.97076  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 60   3]
 [  2 106]]

Radial basis function (RBF) kernel SVM

The radial basis function (RBF) kernel is suited to handling nonlinear data.
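The RBF kernel is K(x, y) = exp(-gamma * ||x - y||^2), so gamma controls how quickly similarity decays with distance; here is a minimal check (toy array, illustrative only) against scikit-learn's pairwise helper:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

A = np.array([[0., 0.], [1., 1.]])
gamma = 0.5

# exp(-gamma * ||x - y||^2), as used by SVC(kernel='rbf')
K = rbf_kernel(A, gamma=gamma)
d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
print(np.allclose(K, np.exp(-gamma * d2)))  # True

The following example trains an SVM with the RBF kernel: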

model = SVC(kernel='rbf', gamma=0.5, C=0.1)
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

The training and test results are as follows:

Training results:

Accuracy Score: 62.56%
_______________________________________________
CLASSIFICATION REPORT:
             0.0         1.0  accuracy   macro avg  weighted avg
precision    0.0    0.625628  0.625628    0.312814      0.392314
recall       0.0    1.000000  0.625628    0.500000      0.625628
f1-score     0.0    0.769231  0.625628    0.384615      0.615385
support    149.0  249.000000  0.625628  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[  0 149]
 [  0 249]]

Test results:

Accuracy Score: 64.97%
_______________________________________________
CLASSIFICATION REPORT:
             0.0         1.0  accuracy   macro avg  weighted avg
precision    0.0    0.655172  0.649661    0.327586      0.409551
recall       0.0    1.000000  0.649661    0.500000      0.649661
f1-score     0.0    0.792453  0.649661    0.396226      0.628252
support    63.0  108.000000  0.649661  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[  0  63]
 [  0 108]]

Summary

From the training and evaluation process above, we can see the performance differences between SVM kernels. The linear-kernel SVM performs well in terms of accuracy and training time, and suits high-dimensional data. The polynomial-kernel and RBF-kernel SVMs can perform better on nonlinear data, but with certain parameter settings they can overfit. Choosing appropriate kernels and hyperparameters is key to improving model performance.

4. SVM data preparation

Numeric input: SVM assumes the input data is numeric. If the input contains categorical variables, they need to be converted into binary dummy variables (one dummy variable per category), as the sketch below shows.

Binary classification: a basic SVM is suited to binary classification problems. Although SVM is mostly used for binary classification, extended versions also exist for regression and multi-class classification.
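A minimal sketch of the dummy-variable conversion (a hypothetical categorical column, using pandas.get_dummies):

import pandas as pd

# Hypothetical categorical feature; SVM needs numeric input
raw = pd.DataFrame({'color': ['red', 'green', 'blue', 'red']})
print(pd.get_dummies(raw, columns=['color']))

Returning to our dataset, which is already numeric, we now apply the preprocessing pipeline defined in Section 3: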

X_train = pipeline.fit_transform(X_train)
X_test = pipeline.transform(X_test)

Model training and evaluation

The following shows the training and test results for the different SVM kernels, now on the scaled data:

Linearis Kernel SVM

print("=======================Linear Kernel SVM==========================")
model = SVC(kernel='linear')
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

Training results:

Accuracy Score: 98.99%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    1.000000    0.984190   0.98995    0.992095      0.990109
recall       0.973154    1.000000   0.98995    0.986577      0.989950
f1-score     0.986395    0.992032   0.98995    0.989213      0.989921
support    149.000000  249.000000   0.98995  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[145   4]
 [  0 249]]

Test results:

Accuracy Score: 97.66%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   0.968254    0.981481  0.976608    0.974868      0.976608
recall      0.968254    0.981481  0.976608    0.974868      0.976608
f1-score    0.968254    0.981481  0.976608    0.974868      0.976608
support    63.000000  108.000000  0.976608  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 61   2]
 [  2 106]]
Polynomial Kernel SVM
print("=======================Polynomial Kernel SVM==========================")
from sklearn.svm import SVC

model = SVC(kernel='poly', degree=2, gamma='auto')
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

Training results:

Accuracy Score: 85.18%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    0.978723    0.812500  0.851759    0.895612      0.874729
recall       0.617450    0.991968  0.851759    0.804709      0.851759
f1-score     0.757202    0.893309  0.851759    0.825255      0.842354
support    149.000000  249.000000  0.851759  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[ 92  57]
 [  2 247]]

Test results:

Accuracy Score: 82.46%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   0.923077    0.795455  0.824561    0.859266      0.842473
recall      0.571429    0.972222  0.824561    0.771825      0.824561
f1-score    0.705882    0.875000  0.824561    0.790441      0.812693
support    63.000000  108.000000  0.824561  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 36  27]
 [  3 105]]
Radial Basis Function Kernel SVM
print("=======================Radial Kernel SVM==========================")
from sklearn.svm import SVC

model = SVC(kernel='rbf', gamma=1)
model.fit(X_train, y_train)

print_score(model, X_train, y_train, X_test, y_test, train=True)
print_score(model, X_train, y_train, X_test, y_test, train=False)

Training results:

Accuracy Score: 100.00%
_______________________________________________
CLASSIFICATION REPORT:
             0.0    1.0  accuracy  macro avg  weighted avg
precision    1.0    1.0       1.0        1.0           1.0
recall       1.0    1.0       1.0        1.0           1.0
f1-score     1.0    1.0       1.0        1.0           1.0
support    149.0  249.0       1.0      398.0         398.0
_______________________________________________
Confusion Matrix: 
 [[149   0]
 [  0 249]]

Test results:

Accuracy Score: 63.74%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   1.000000    0.635294  0.637427    0.817647      0.769659
recall      0.015873    1.000000  0.637427    0.507937      0.637427
f1-score    0.031250    0.776978  0.637427    0.404114      0.502236
support    63.000000  108.000000  0.637427  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[  1  62]
 [  0 108]]

Support vector machine hyperparameter tuning

from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.01, 0.1, 0.5, 1, 10, 100], 
              'gamma': [1, 0.75, 0.5, 0.25, 0.1, 0.01, 0.001], 
              'kernel': ['rbf', 'poly', 'linear']} 

grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=1, cv=5)
grid.fit(X_train, y_train)

best_params = grid.best_params_
print(f"Best params: {best_params}")

svm_clf = SVC(**best_params)
svm_clf.fit(X_train, y_train)
print_score(svm_clf, X_train, y_train, X_test, y_test, train=True)
print_score(svm_clf, X_train, y_train, X_test, y_test, train=False)

Training results:

Accuracy Score: 98.24%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    0.986301    0.980159  0.982412    0.983230      0.982458
recall       0.966443    0.991968  0.982412    0.979205      0.982412
f1-score     0.976271    0.986028  0.982412    0.981150      0.982375
support    149.000000  249.000000  0.982412  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[144   5]
 [  2 247]]

Test results:

Accuracy Score: 98.25%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   0.983871    0.981651  0.982456    0.982761      0.982469
recall      0.968254    0.990741  0.982456    0.979497      0.982456
f1-score    0.976000    0.986175  0.982456    0.981088      0.982426
support    63.000000  108.000000  0.982456  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 61   2]
 [  1 107]]

5. Principal component analysis

Introduction to PCA

Principal component analysis (PCA) is a linear dimensionality reduction technique that projects the data into a lower-dimensional space while preserving the most salient information.

  • Uses singular value decomposition: projects the data into a lower-dimensional space via the singular value decomposition (see the sketch after this list).
  • Unsupervised learning: PCA does not require labeled data for dimensionality reduction.
  • Feature transformation: finds the features with the largest variance in the data.
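As the first bullet notes, PCA can be computed with the singular value decomposition; here is a minimal sketch (toy data) confirming that scikit-learn's PCA matches a manual SVD of the centered data, up to sign:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X_toy = rng.randn(100, 5)

# Manual PCA: center the data, then take the SVD
Xc = X_toy - X_toy.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

pca = PCA(n_components=2).fit(X_toy)

# The principal axes equal the top right-singular vectors (up to sign)
print(np.allclose(np.abs(pca.components_), np.abs(Vt[:2])))  # True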

PCA visualization

Since high-dimensional data is difficult to visualize directly, we can use PCA to find the first two principal components and visualize the data in two-dimensional space. To achieve this, the data first needs to be standardized so that each feature has unit variance.

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Standardize the data
scaler = StandardScaler()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# PCA dimensionality reduction
pca = PCA(n_components=2)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)

# Visualize the first two principal components
plt.figure(figsize=(8,6))
plt.scatter(X_train[:,0], X_train[:,1], c=y_train, cmap='plasma')
plt.xlabel('First Principal Component')
plt.ylabel('Second Principal Component')
plt.show()


Using the first two principal components, we can easily distinguish the different classes of data points in two-dimensional space.

Interpreting the components

Although dimensionality reduction is powerful, the meaning of each component is difficult to interpret directly. Each component corresponds to a combination of the original features, which can be obtained from the fitted PCA object (see the sketch after this list).

Related component properties include:

  • Component scores: the values of the transformed variables.
  • Loadings (weights): the weight of each original feature in a component.
  • Data compression and information preservation: PCA compresses the data while retaining the key information.
  • Noise filtering: noise can be filtered out during dimensionality reduction.
  • Feature extraction and engineering: can be used to extract and construct new features.
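A minimal sketch of inspecting the loadings (assuming the pca object fitted above and the cancer dataset loaded earlier): each row of pca.components_ holds the weight of every original feature in that component:

import pandas as pd

# Rows = principal components, columns = original features
loadings = pd.DataFrame(pca.components_,
                        columns=cancer.feature_names,
                        index=['PC1', 'PC2'])

# Features weighted most heavily in the first principal component
print(loadings.T['PC1'].abs().sort_values(ascending=False).head())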

Tuning the support vector machine (SVM) model

When training a model with a support vector machine (SVM), here on the PCA-reduced data, the hyperparameters need to be tuned to obtain the best model. The following is example code for tuning the SVM parameters with grid search (GridSearchCV):

from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Define the parameter grid
param_grid = {'C': [0.01, 0.1, 0.5, 1, 10, 100], 
              'gamma': [1, 0.75, 0.5, 0.25, 0.1, 0.01, 0.001], 
              'kernel': ['rbf', 'poly', 'linear']} 

# Grid search
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=1, cv=5)
grid.fit(X_train, y_train)
best_params = grid.best_params_
print(f"Best params: {best_params}")

# Train the model with the best parameters
svm_clf = SVC(**best_params)
svm_clf.fit(X_train, y_train)

Training results:

Accuracy Score: 96.48%
_______________________________________________
CLASSIFICATION REPORT:
                  0.0         1.0  accuracy   macro avg  weighted avg
precision    0.978723    0.957198  0.964824    0.967961      0.965257
recall       0.926174    0.987952  0.964824    0.957063      0.964824
f1-score     0.951724    0.972332  0.964824    0.962028      0.964617
support    149.000000  249.000000  0.964824  398.000000    398.000000
_______________________________________________
Confusion Matrix: 
 [[138  11]
 [  3 246]]

Test results:

Accuracy Score: 96.49%
_______________________________________________
CLASSIFICATION REPORT:
                 0.0         1.0  accuracy   macro avg  weighted avg
precision   0.967213    0.963636  0.964912    0.965425      0.964954
recall      0.936508    0.981481  0.964912    0.958995      0.964912
f1-score    0.951613    0.972477  0.964912    0.962045      0.964790
support    63.000000  108.000000  0.964912  171.000000    171.000000
_______________________________________________
Confusion Matrix: 
 [[ 59   4]
 [  2 106]]

6. Summary

In this article we learned the following:

  • Support vector machines (SVM): understanding the basic concepts of SVM and its implementation in Python.
  • SVM kernel functions: including the linear, radial basis function (RBF), and polynomial kernels.
  • Data preparation: how to prepare data for the SVM algorithm.
  • Hyperparameter tuning: tuning SVM hyperparameters with grid search.
  • Principal component analysis (PCA): using PCA to reduce data complexity, and its implementation in scikit-learn.

Reference: Support Vector Machine & PCA Tutorial for Beginner

