Data smoothing (partial)

2024-07-12

1. Moving Average

The moving average is the simplest method for smoothing time series data. It reduces noise by averaging the data points within a window while retaining the trend of the data. The two common variants are the Simple Moving Average (SMA) and the Exponentially Weighted Moving Average (EWMA, also written EMA).

Note: adding plt.rcParams['font.sans-serif'] = ['SimHei'] to the plotting code throughout this article fixes the problem of Chinese titles not being displayed in the figures.

1.1 Simple Moving Average (SMA):

The SMA smooths data by calculating the average of the data points within a fixed window. The size of the window determines the degree of smoothing. A larger window gives a smoother curve but reacts more slowly to changes in the trend, while a smaller window is more sensitive to fluctuations in the data.

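Understanding the code for this example: the blue line plots the array generated in the code. The simple moving average over a window of $n$ points is the mean of the $n$ most recent values:

$$\mathrm{SMA}_t = \frac{x_t + x_{t-1} + \cdots + x_{t-n+1}}{n}$$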

Changing the window size changes the smoothing effect and therefore the smoothness of the red curve. With a window size of 3, the sharpest fluctuations have already disappeared and the overall curve is smooth. As the window size keeps increasing, the curve flattens and shortens until only a small straight line is left. The red curve is the smoothed curve, which serves as the prediction.
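The effect of the window size can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the figures: the function name simple_moving_average and the sample values are made up for the example, and the window is assumed to be a trailing one that only produces a value once it is full.

```python
def simple_moving_average(values, window):
    """Trailing SMA: one average for every full window of consecutive points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


data = [3, 5, 2, 8, 6, 9, 4, 7]  # made-up sample series

# Window 1 returns the data unchanged, window 3 smooths it,
# and a window as long as the series leaves a single averaged value.
for window in (1, 3, len(data)):
    print(window, simple_moving_average(data, window))
```

Each increase in the window size removes one more point from the start of the result, which is one reason a very large window ends up as a short, flat line.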

Note that installing the seaborn and matplotlib packages with python -m pip install matplotlib, pip install matplotlib https://pypi.tuna.tsinghua.edu.cn/simple, or pip install matplotlib may fail, with the following message appearing in the terminal:

Cannot unpack file C:\Users\HONOR\AppData\Local\Temp\pip-unpack-4qkfflip\simple.html (downloaded from C:\Users\HONOR\AppData\Local\Temp\pip-req-build-s6_3j05c, content-type: text/html); cannot detect archive format
ERROR: Cannot determine archive format of C:\Users\HONOR\AppData\Local\Temp\pip-req-build-s6_3j05c

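The message suggests that pip downloaded the mirror's HTML index page and tried to unpack it as a package, which is what happens when the mirror URL is given without -i. The following commands, which pass the mirror with -i, install successfully: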

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn matplotlib

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn seaborn

1.2 Exponentially Weighted Moving Average (EWMA):

The exponentially weighted moving average smooths data by applying exponentially decaying weights to the data points. The most recent data points are given higher weights, while older data points are given lower weights. This makes the EWMA more suitable for tracking fast-changing data.

Understanding the exponentially weighted moving average code: it imports the required packages, generates the array (the blue line), and defines the smoothing factor. The smaller the smoothing factor, the stronger the smoothing effect, and vice versa. So what is the algorithm behind the exponentially weighted moving average? In its standard recursive form, with smoothing factor $\alpha$ ($0 < \alpha \le 1$), observation $x_t$, and smoothed value $S_t$:

$$S_t = \alpha x_t + (1 - \alpha) S_{t-1}$$

The weighted moving average of period $t$, $S_t$, is used as the forecast value for period $t+1$.
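The recursion can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the article's original code: the function name ewma and the sample values are made up, and the first smoothed value is assumed to equal the first observation, which is one common initialisation among several.

```python
def ewma(values, alpha):
    """Recursive EWMA: s_t = alpha * x_t + (1 - alpha) * s_{t-1}, with s_0 = x_0."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [float(values[0])]  # assumed initialisation: s_0 = x_0
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


data = [10, 12, 9, 14, 11]  # made-up sample series

smoothed = ewma(data, alpha=0.5)
forecast_next = smoothed[-1]  # the last smoothed value is the next-period forecast

print(smoothed)
print(forecast_next)
```

With alpha equal to 1 the output reproduces the input, and values closer to 0 give a flatter curve, matching the statement that a smaller smoothing factor smooths more strongly. In pandas, Series.ewm(alpha=..., adjust=False).mean() follows the same recursion.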

2. Exponential Smoothing

Exponential smoothing is a commonly used time series smoothing and forecasting method that can handle data with trends and seasonality. It captures how the data changes by assigning different weights to historical data points, with higher weights for newer data. It is often used to generate forecasts for future time points.

The main characteristics of exponential smoothing include:

Weighted smoothing: Exponential smoothing uses exponentially decaying weights to smooth the data. Newer data points receive higher weights, while older data points receive lower weights. This makes it more sensitive to recent data, so it captures recent trends better.
Three main forms: There are three main forms of exponential smoothing: simple, double, and triple. Each form suits a different type of data and pattern.
- Simple Exponential Smoothing is used to smooth data that has neither trend nor seasonality.
- Double Exponential Smoothing is used to smooth data that has a trend but no seasonality.
- Triple Exponential Smoothing is used to smooth data that has both trend and seasonality.
Recursive update: Exponential smoothing is a recursive method that combines the previous smoothed result with the new data point to generate the smoothed result for the next time point.
Forecasting ability: Exponential smoothing is used not only to smooth data but also to generate forecasts for future points in time. This makes it useful in areas such as demand forecasting, stock price forecasting, and sales forecasting.
Applicability: Exponential smoothing is applicable to stationary and non-stationary time series data, and it can handle trends, seasonality, and noise well when the matching form is chosen.
Example:
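As one possible illustration of the recursive update and the forecasting step, here is a minimal plain-Python sketch of double exponential smoothing (Holt's linear trend method). It is not the article's original example: the function name, the sample values, and the initialisation (level from the first point, trend from the first difference) are assumptions chosen for brevity.

```python
def double_exponential_smoothing(values, alpha, beta, horizon=0):
    """Holt's linear method: track a level and a trend, then extrapolate.

    level_t = alpha * x_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    """
    if len(values) < 2:
        raise ValueError("need at least two points to initialise the trend")
    level = float(values[0])              # assumed initial level
    trend = float(values[1] - values[0])  # assumed initial trend
    smoothed = [level]
    for x in values[1:]:
        previous_level = level
        level = alpha * x + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        smoothed.append(level)
    # Forecast h steps ahead by extending the last level along the last trend.
    forecasts = [level + step * trend for step in range(1, horizon + 1)]
    return smoothed, forecasts


data = [2, 4, 6, 8, 10]  # made-up series with a constant upward trend

smoothed, forecasts = double_exponential_smoothing(data, alpha=0.5, beta=0.5, horizon=2)
print(smoothed)
print(forecasts)
```

On a perfectly linear series the level follows the data exactly and the forecasts continue the line, which makes the sketch easy to check by hand. Seasonal data would need the triple form, which adds a third recursion for the seasonal component.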

Results:

3. Polynomial Fitting

Polynomial fitting is a method of data smoothing and curve fitting that uses a polynomial function to approximate the original data in order to better describe its trend or pattern. The goal is to find a polynomial function that passes close to the given data points and fits them well.

The general form of a polynomial fit is as follows:

$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$

where $x$ is the independent variable, $y$ is the dependent variable that depends on $x$, and $a_0, a_1, \ldots, a_n$ are the polynomial coefficients. By adjusting these coefficients, the polynomial function can be made to fit the data better.

Polynomial fitting is often used in the following situations:

Data smoothing: Polynomial fitting can be used to eliminate noise or fluctuations in the data to obtain a smooth curve.
Trend analysis: Polynomial fitting can be used to identify trends in the data, such as linear trends (first-order polynomials), quadratic trends (second-order polynomials), or higher-order trends.
Curve fitting: Polynomial fitting can be used to fit experimental data to obtain the best fit to a theoretical model or theoretical curve.
Data interpolation: Polynomial interpolation is a special case of polynomial fitting, which estimates intermediate values by fitting a polynomial between known data points.

The key to polynomial fitting is choosing an appropriate polynomial order. Too low an order may not fit the data well, while too high an order may lead to overfitting and make the fit very sensitive to fluctuations in new data. Example with a third-order (cubic) polynomial:
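A cubic fit along these lines can be sketched with NumPy. This is an illustration only: np.polyfit and np.polyval are NumPy functions, while the synthetic curve, the noise level, the random seed, and the helper rss are assumptions made up for the example.

```python
import numpy as np

# Synthetic data: a known cubic plus Gaussian noise (all values are made up).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y_true = 0.5 * x**3 - 2 * x + 1
y = y_true + rng.normal(scale=1.0, size=x.size)

# Least-squares fit of a third-order polynomial.
# polyfit returns the coefficients with the highest power first.
coeffs = np.polyfit(x, y, deg=3)
y_fit = np.polyval(coeffs, x)  # smoothed values on the same x grid


def rss(degree):
    """Residual sum of squares of a least-squares fit of the given degree."""
    fitted = np.polyval(np.polyfit(x, y, deg=degree), x)
    return float(np.sum((y - fitted) ** 2))


# Compare an order that is too low, the matching order, and a higher order.
for degree in (1, 3, 9):
    print(degree, round(rss(degree), 2))
```

The residual on the fitted data never increases as the order goes up, so it cannot be used on its own to choose the order. A higher-order fit that mostly follows the noise is the overfitting case described above, and checking the error on held-out points is one common way to detect it.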
