Commit 9ac4748

Update English README.md
1 parent 50a889d commit 9ac4748

1 file changed: +58 -5 lines changed

README.md

Lines changed: 58 additions & 5 deletions
**Read this in other languages: [简体中文](./README.zh-CN.md)**

The go_numcalc package is developed in Go. Its main function is to perform basic numerical processing, such as data conversion, data grouping, and data smoothing, on numeric data.
All functions support only two data types: Int32 and Float32.
> [!TIP]
> This package is written in Go and contains a C++ portion, i.e. the cgo part.

<!-- TOC -->
* [go_numcalc](#go_numcalc)
  * [Installation](#installation)
  * [Dependencies](#dependencies)
  * [Usage](#usage)
  * [Project structure](#project-structure)
  * [Development](#development)
    * [First phase functions](#first-phase-functions)
    * [Phase II functions (TODO)](#phase-ii-functions-todo)
  * [License](#license)
<!-- TOC -->
## Installation

Use `go get` to install go_numcalc.

```shell
go get github.com/dingyuqi/go_numcalc
```

## Dependencies

There are two main external libraries used in the NumCalc package:

1. Go language: [Gonum](https://www.gonum.org/)
2. C++ language: Armadillo

Armadillo does not need to be installed; it is called from cgo as a static library.

## Usage

The following is a simple example of the `LogInt32()` method in conversion. The other functions are used in the same way as this demo.

1. Call the `NewCalculator()` method in conversion to initialize a numerical conversion object.
2. Call the `LogInt32()` method, which returns the logarithm of each element of the slice `data`.

```go
package main

// ... (imports and setup unchanged, elided in this diff) ...

func main() {
	// ...
	log.Println("result is: ", result)
}
```

> [!TIP]
> This test code is in `example/example_test.go` and can be run directly.
> ```shell
> go build -o example.exe
> ```
> `example_test.go` contains a pure Go implementation of the same function (logarithmic calculation), used to compare the calculation speed with cgo.
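As a rough illustration of what such a pure-Go counterpart might look like (the function name, signature, and base handling below are assumptions for this sketch, not the package's actual API):

```go
package main

import (
	"fmt"
	"math"
)

// logTransform is a hypothetical pure-Go counterpart of LogInt32():
// it returns log(base, x) for every element of data, using the
// change-of-base identity log_b(x) = ln(x) / ln(b).
func logTransform(data []int32, base float64) []float32 {
	out := make([]float32, len(data))
	denom := math.Log(base)
	for i, v := range data {
		out[i] = float32(math.Log(float64(v)) / denom)
	}
	return out
}

func main() {
	// log base 2 of 1, 2, 4, 8
	fmt.Println(logTransform([]int32{1, 2, 4, 8}, 2))
}
```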
## Development

As of August 2023, only the first phase of functions has been implemented, all of it in pure Go.

### First phase functions

| Serial number | Type | Function | Detailed description | Remarks |
|---------------|------|----------|----------------------|---------|
| 1 | Data conversion | Min-max standardization | Performs a linear transformation on the data series so that the processed data all fall within the interval [0, 1] | |
| 2 | Data conversion | Z-score standardization | Subtracts the mean and divides by the standard deviation for each data point so that the processed data approximately follows the standard normal distribution N(0, 1) | |
| 3 | Data conversion | Logarithmic transformation | y = log(base, x) | 1. Negative value handling<br/>2. base value |
| 4 | Data conversion | Square root transformation | y = √x | Negative value handling |
| 5 | Data grouping | Cluster grouping | Uses cluster analysis to group data points into clusters with similar characteristics. Cluster grouping can be used to discover clustering patterns and categories in the data, which is useful for data mining and classification tasks. | 1. Clustering method (random_subset, static_subset, etc.)<br/>2. Number of clusters |
| 6 | Data grouping | Equal width grouping | Divides the value range of the data into intervals of equal width. Simple and intuitive, but may not reflect the distribution characteristics of the data well, especially with imbalanced data or outliers. | Group width |
| 7 | Data grouping | Equal frequency grouping | Divides the data into groups containing the same number of data points. Better reflects the distribution characteristics of the data, but data with many repeated values may cause some groups to contain identical values. | Number of groups |
| 8 | Data grouping | Grouping based on statistics | Divides the data into groups based on quantiles, e.g. quartile or decile grouping. Produces groups of equal data density, which is effective for skewed distributions. | Grouping conditions _(not implemented for now because it overlaps with equal frequency grouping)_ |
| 9 | Outlier judgment | Standard deviation | A data point whose distance from the mean exceeds a threshold is considered an outlier; typically, values more than 3 standard deviations from the mean are considered outliers | 1. true indicates an outlier<br/>2. false indicates a non-outlier<br/>3. Threshold for outlier judgment |
| 10 | Outlier judgment | Box plot | Based on the quartiles and outlier range of the data, values beyond the upper and lower boundaries are considered outliers. | 1. true indicates an outlier<br/>2. false indicates a non-outlier |
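For illustration, here is a minimal pure-Go sketch of the two standardization functions in rows 1 and 2. The names and signatures are hypothetical, not the package's actual API:

```go
package main

import (
	"fmt"
	"math"
)

// minMaxScale linearly rescales data into the interval [0, 1]
// (illustrative counterpart of min-max standardization).
func minMaxScale(data []float32) []float32 {
	lo, hi := data[0], data[0]
	for _, v := range data {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	out := make([]float32, len(data))
	for i, v := range data {
		out[i] = (v - lo) / (hi - lo)
	}
	return out
}

// zScore subtracts the mean and divides by the (population) standard
// deviation, so the result is approximately N(0, 1) distributed.
func zScore(data []float32) []float32 {
	n := float64(len(data))
	var sum float64
	for _, v := range data {
		sum += float64(v)
	}
	mean := sum / n
	var ss float64
	for _, v := range data {
		d := float64(v) - mean
		ss += d * d
	}
	std := math.Sqrt(ss / n)
	out := make([]float32, len(data))
	for i, v := range data {
		out[i] = float32((float64(v) - mean) / std)
	}
	return out
}

func main() {
	fmt.Println(minMaxScale([]float32{2, 4, 6, 10})) // prints [0 0.25 0.5 1]
	fmt.Println(zScore([]float32{1, 2, 3, 4, 5}))
}
```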
### Phase II functions (TODO)

| Serial number | Type | Function | Function description | Remarks |
|---------------|------|----------|----------------------|---------|
| 1 | Data smoothing | Wavelet filtering | Decomposes and reconstructs signals by applying the wavelet transform, removing noise or abrupt changes while retaining the important features of the signal. Wavelet filtering provides good analysis and processing capability in both the time and frequency domains. | Different basis functions strongly affect the results; different data require different basis functions and frequency ranges depending on the analysis requirements.<br/>1. Wavelet basis function (Daubechies, Haar, Morlet)<br/>2. Scale parameter (determines the scaling factor of each wavelet basis function in the transform; a smaller scale captures higher-frequency, detailed signal characteristics, while a larger scale captures lower-frequency, overall-trend characteristics)<br/>3. Decomposition level (determines the order of the wavelet transform; a higher level provides more detailed frequency and scale information)<br/>4. Threshold processing method (keep/discard) |
| 2 | Data smoothing | Moving average | Smooths the data by averaging over a window of a given size around each data point. The window size determines the degree of smoothing; a larger window smooths out more fluctuations. Common variants include the simple moving average and the weighted moving average. | If a full window cannot be constructed at the boundaries, the boundary points are usually dropped from the output, so the output may be shorter than the input.<br/>1. Window size<br/>2. Weights<br/>3. Boundary processing method |
| 3 | Data smoothing | Exponential smoothing | A recursive smoothing method that gives higher weight to recent data. The weight of past observations is controlled by a smoothing coefficient; the larger the coefficient, the greater the influence of recent data. Often used for time series data. | Smoothing coefficient |
| 4 | Data smoothing | Savitzky-Golay smoothing | A smoothing method based on polynomial fitting: the neighboring data around each point are fitted to a polynomial curve. Savitzky-Golay smoothing preserves the overall shape and trend of the data and suppresses noise well. | 1. Smoothing window<br/>2. Polynomial order<br/>3. Derivative order (optional) |
| 5 | Data smoothing | Loess smoothing | Similar to Lowess smoothing, Loess is also a nonparametric local regression method that smooths data by fitting a polynomial to the neighboring data around each point. Unlike Lowess, Loess uses adaptive weighted least squares to better handle nonlinear relationships in the data. | 1. Smoothing coefficient (controls the weight given to past observations)<br/>2. Weighting function (library default) |
| 6 | Data smoothing | Lowess smoothing | A nonparametric local regression method that smooths data by fitting a local linear regression model, using weighted least squares to estimate each smoothed value, with weights assigned by the distance from the data point. | 1. Smoothing coefficient (controls the weight given to past observations)<br/>2. Weighting function (library default) |
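Of the planned smoothing functions, the moving average is the most straightforward. A minimal sketch (the name and signature are hypothetical; boundary points that cannot fill a full window are dropped, as the remarks column notes):

```go
package main

import "fmt"

// movingAverage computes a simple moving average with the given window
// size. Boundary points that cannot fill a full window are dropped, so
// the output has len(data)-window+1 elements.
func movingAverage(data []float32, window int) []float32 {
	if window <= 0 || window > len(data) {
		return nil
	}
	out := make([]float32, 0, len(data)-window+1)
	var sum float32
	for i, v := range data {
		sum += v
		if i >= window {
			sum -= data[i-window] // slide the window: drop the oldest value
		}
		if i >= window-1 {
			out = append(out, sum/float32(window))
		}
	}
	return out
}

func main() {
	fmt.Println(movingAverage([]float32{1, 2, 3, 4, 5}, 3)) // prints [2 3 4]
}
```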
## License

[MIT](https://choosealicense.com/licenses/mit/) License
