Introduction

Column

Raw data distribution

Get to know the data

     ambient   coolant       u_d       u_q motor_speed     torque      i_d
1 -0.7521430 -1.118446 0.3279352 -1.297858   -1.222428 -0.2501821 1.029572
2 -0.7712632 -1.117021 0.3296648 -1.297686   -1.222429 -0.2491333 1.029509
3 -0.7828916 -1.116681 0.3327715 -1.301822   -1.222428 -0.2494311 1.029448
4 -0.7809354 -1.116764 0.3336999 -1.301852   -1.222430 -0.2486364 1.032845
5 -0.7740426 -1.116775 0.3352061 -1.303118   -1.222429 -0.2487008 1.031807
6 -0.7629362 -1.116955 0.3349012 -1.303017   -1.222429 -0.2481970 1.031031
         i_q        pm stator_yoke stator_tooth stator_winding profile_id
1 -0.2458600 -2.522071   -1.831422    -2.066143      -2.018033          4
2 -0.2458323 -2.522418   -1.830969    -2.064859      -2.017631          4
3 -0.2458179 -2.522673   -1.830400    -2.064073      -2.017343          4
4 -0.2469548 -2.521639   -1.830333    -2.063137      -2.017632          4
5 -0.2466097 -2.521900   -1.830498    -2.062795      -2.018145          4
6 -0.2463406 -2.522203   -1.831931    -2.062549      -2.017884          4

Boxplot

Column

Motivation

Temperature estimation techniques for permanent magnet synchronous motors

Abstract: Monitoring critical temperatures in permanent magnet synchronous motors (PMSMs) is crucial to ensure safe operation and maximum device utilization. This project presents an enhanced method to determine the rotor temperature of permanent magnet synchronous machines during dynamic operation. The approach is based on a fundamental wave flux observer and is therefore largely independent of the individual measurement session. For the quantitative analysis, a regression model between the variables is fitted in R and then revised by stepwise regression. Finally, measurement results demonstrate the satisfactory performance of the observer.
Introduction: Nowadays, the permanent magnet synchronous motor (PMSM) is widely used in industrial applications. Owing to the development of high-energy permanent magnet materials over the last decade, this type of motor has become the leader in torque and power density with respect to volume and weight [1]. In automotive applications these characteristics are essential, so the PMSM is often preferred over other designs such as induction or switched-reluctance motors [2]. From a device utilisation point of view, the maximum feasible temperatures in the different motor parts are among the most critical aspects to consider. Besides the risk of partial or total device destruction, intensive thermal stress can shorten the motor lifetime [3]. The stator end-winding temperature is typically of great interest, as its cooling is difficult and a significant part of the copper losses is concentrated there [4]. Consequently, the end-winding thermal dynamics are significant, posing the risk of quickly exceeding the thermal limits of the insulation varnish. Hence, it has become state of the art to install a temperature sensor in the end-winding, allowing easy real-time monitoring.

Variable interpretation.

Ambient: Ambient temperature as measured by a thermal sensor located close to the stator.
Coolant: Coolant temperature. The motor is water cooled. Measurement is taken at the outflow.
u_d: Voltage d-component.
u_q: Voltage q-component.
motor_speed: Motor speed.
torque: Torque induced by current.
i_d: Current d-component.
i_q: Current q-component.
stator_yoke: Stator yoke temperature measured with a thermal sensor.
stator_tooth: Stator tooth temperature measured with a thermal sensor.
stator_winding: Stator winding temperature measured with a thermal sensor.

Summary for Raw data

Column

Linear regression results for Raw data


Call:
lm(formula = pm ~ ambient + coolant + u_d + u_q + motor_speed + 
    torque + i_d + i_q + stator_yoke + stator_tooth + stator_winding)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.76448 -0.29746 -0.00472  0.28467  2.36277 

Coefficients:
                 Estimate Std. Error  t value Pr(>|t|)    
(Intercept)    -0.0009644  0.0004757   -2.027   0.0427 *  
ambient         0.2174087  0.0005707  380.938   <2e-16 ***
coolant        -0.2660965  0.0030472  -87.324   <2e-16 ***
u_d            -0.0364442  0.0010701  -34.057   <2e-16 ***
u_q            -0.3437060  0.0010639 -323.057   <2e-16 ***
motor_speed     0.3329606  0.0017400  191.352   <2e-16 ***
torque         -0.0019567  0.0077177   -0.254   0.7999    
i_d             0.1794808  0.0014143  126.906   <2e-16 ***
i_q             0.0181593  0.0072395    2.508   0.0121 *  
stator_yoke    -1.5580582  0.0094477 -164.914   <2e-16 ***
stator_tooth    4.5504861  0.0119002  382.386   <2e-16 ***
stator_winding -2.2783254  0.0058950 -386.486   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4753 on 998058 degrees of freedom
Multiple R-squared:  0.7722,    Adjusted R-squared:  0.7722 
F-statistic: 3.075e+05 on 11 and 998058 DF,  p-value: < 2.2e-16

Model Performance: Adjusted R-squared: 0.7722; F-statistic: 3.075e+05 on 11 and 998058 DF; p-value: < 2.2e-16; residual standard error: 0.4753 on 998058 degrees of freedom.

Adequacy Check for Raw data

Column

Discussion for Raw data

\(H_0: \beta_1=\beta_2=\dots=\beta_{11}=0\) vs. \(H_a: \beta_i \neq 0\) for at least one \(i \in \{1,2,\dots,11\}\).
From the output above, the F-statistic is 3.075e+05 with a p-value below 2.2e-16, which is smaller than α = 0.01, so we reject \(H_0\) at the 0.01 significance level. There is sufficient evidence that the regression model with the 11 predictors introduced earlier predicts the permanent magnet surface temperature better than simply using its mean.

Discussion for Adequacy Plot

1. Linearity assumption: The residuals “bounce randomly” around the 0 line, which suggests that a linear relationship is a reasonable assumption.
2. Zero-mean assumption: The error term ε has zero mean.
3. Equal-variance assumption: We examine this assumption with the Scale-Location plot, where a horizontal line with equally (randomly) spread points is a good sign. In the plot above the residuals appear randomly spread, which indicates that the equal-variance assumption is not violated.
4. Independence assumption: In this study, the data are not treated as time-series data. If the observations are random, we are willing to believe the residuals are independent.
5. Normality assumption: The histogram is a frequency plot obtained by placing the data in regularly spaced cells and plotting each cell frequency against the cell center. It shows an approximately normal distribution of the residuals produced by the model. However, in the Normal Q-Q plot the points at both ends deviate from the straight line, and the residual plot reveals a more obvious problem: the spread of the residuals follows a curved pattern in X, which violates the constant-variance assumption. Overall, the assumptions of our model are not reasonable here.

Data selection

Column

Histogram by profile_id

Data selection

Correlogram

Linear regression model results for selected data


Call:
lm(formula = pm ~ ambient + coolant + u_d + u_q + motor_speed + 
    torque + i_d + i_q + stator_yoke + stator_tooth + stator_winding)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.33063 -0.08722 -0.00696  0.07996  0.88826 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.0202506  0.0048601   4.167 3.10e-05 ***
ambient         0.1735578  0.0042836  40.517  < 2e-16 ***
coolant         0.0485417  0.0070555   6.880 6.09e-12 ***
u_d             0.0007102  0.0013622   0.521  0.60213    
u_q            -0.0510700  0.0022305 -22.896  < 2e-16 ***
motor_speed     0.0415091  0.0027175  15.275  < 2e-16 ***
torque          0.0320753  0.0097979   3.274  0.00106 ** 
i_d             0.0247856  0.0021825  11.357  < 2e-16 ***
i_q            -0.0265914  0.0090703  -2.932  0.00337 ** 
stator_yoke     0.1262089  0.0167678   7.527 5.33e-14 ***
stator_tooth    0.7788091  0.0183374  42.471  < 2e-16 ***
stator_winding -0.2876892  0.0082552 -34.850  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1188 on 32429 degrees of freedom
Multiple R-squared:  0.9432,    Adjusted R-squared:  0.9432 
F-statistic: 4.898e+04 on 11 and 32429 DF,  p-value: < 2.2e-16

Model Performance: Adjusted R-squared: 0.9432; F-statistic: 4.898e+04 on 11 and 32429 DF; p-value: < 2.2e-16; residual standard error: 0.1188 on 32429 degrees of freedom.

Column

Motivation and methods

In industry, we do not just want to provide qualitative strategies; we prefer more accurate quantitative models. In the previous model the adjusted R-squared is 0.7722, which is an acceptable value. However, the results show that torque is insignificant in this model. The most interesting target features are rotor temperature (“pm”), stator temperatures ("stator_*") and torque; rotor temperature and torque in particular cannot be measured reliably and economically in a commercial vehicle. In addition, the adequacy plot shows that the model does not follow the normality assumption very well. Finally, the large amount of data greatly lengthens the analysis time.
Since each section of the data is independent, we can pick a single section as the object of analysis. We scan the basic performance of each section, then select a few sets of data and check their adequacy manually.

Adequacy Check for selected data

Variable selection

Column

Stepwise


Call:
lm(formula = pm ~ ambient + coolant + u_q + motor_speed + torque + 
    i_d + i_q + stator_yoke + stator_tooth + stator_winding)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.33058 -0.08724 -0.00693  0.07986  0.88822 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.020378   0.004854   4.198 2.70e-05 ***
ambient         0.173478   0.004281  40.524  < 2e-16 ***
coolant         0.048885   0.007025   6.959 3.49e-12 ***
u_q            -0.050909   0.002209 -23.047  < 2e-16 ***
motor_speed     0.041275   0.002680  15.400  < 2e-16 ***
torque          0.028896   0.007669   3.768 0.000165 ***
i_d             0.024614   0.002158  11.408  < 2e-16 ***
i_q            -0.024052   0.007652  -3.143 0.001672 ** 
stator_yoke     0.125832   0.016752   7.511 6.00e-14 ***
stator_tooth    0.778767   0.018337  42.470  < 2e-16 ***
stator_winding -0.287554   0.008251 -34.851  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1188 on 32430 degrees of freedom
Multiple R-squared:  0.9432,    Adjusted R-squared:  0.9432 
F-statistic: 5.388e+04 on 10 and 32430 DF,  p-value: < 2.2e-16

After performing the stepwise regression, we remove the variable u_d.

k-fold cross-validation

   nvmax      RMSE  Rsquared        MAE      RMSESD  RsquaredSD
1      1 0.1268380 0.9352609 0.10392719 0.001736443 0.002142796
2      2 0.1707477 0.8825954 0.13767973 0.002349490 0.004207312
3      3 0.1221789 0.9399125 0.09899039 0.001866271 0.001960172
4      4 0.1202229 0.9418180 0.09724431 0.001692387 0.001848611
5      5 0.1193840 0.9426286 0.09668706 0.001609186 0.001825382
6      6 0.1192133 0.9427959 0.09687475 0.001554147 0.001761385
7      7 0.1190543 0.9429465 0.09667859 0.001539559 0.001781631
8      8 0.1189015 0.9430940 0.09641902 0.001608498 0.001805777
9      9 0.1220524 0.9400506 0.09960232 0.001563465 0.001761644
10    10 0.1187804 0.9432098 0.09630433 0.001626685 0.001813835
         MAESD
1  0.001835277
2  0.002177776
3  0.001600410
4  0.001524632
5  0.001577441
6  0.001515830
7  0.001506123
8  0.001572164
9  0.001569055
10 0.001595466
   nvmax
10    10
Subset selection object
11 Variables  (and intercept)
               Forced in Forced out
ambient            FALSE      FALSE
coolant            FALSE      FALSE
u_d                FALSE      FALSE
u_q                FALSE      FALSE
motor_speed        FALSE      FALSE
torque             FALSE      FALSE
i_d                FALSE      FALSE
i_q                FALSE      FALSE
stator_yoke        FALSE      FALSE
stator_tooth       FALSE      FALSE
stator_winding     FALSE      FALSE
1 subsets of each size up to 10
Selection Algorithm: 'sequential replacement'
          ambient coolant u_d u_q motor_speed torque i_d i_q stator_yoke
1  ( 1 )  " "     " "     " " " " " "         " "    " " " " "*"        
2  ( 1 )  "*"     "*"     " " " " " "         " "    " " " " " "        
3  ( 1 )  "*"     " "     " " " " " "         " "    " " " " " "        
4  ( 1 )  "*"     " "     " " "*" " "         " "    " " " " " "        
5  ( 1 )  "*"     "*"     " " "*" " "         " "    " " " " " "        
6  ( 1 )  "*"     " "     " " "*" "*"         " "    " " " " "*"        
7  ( 1 )  "*"     " "     " " "*" "*"         " "    "*" " " "*"        
8  ( 1 )  "*"     " "     " " "*" "*"         "*"    "*" " " "*"        
9  ( 1 )  "*"     "*"     "*" "*" "*"         "*"    "*" "*" "*"        
10  ( 1 ) "*"     "*"     " " "*" "*"         "*"    "*" "*" "*"        
          stator_tooth stator_winding
1  ( 1 )  " "          " "           
2  ( 1 )  " "          " "           
3  ( 1 )  "*"          "*"           
4  ( 1 )  "*"          "*"           
5  ( 1 )  "*"          "*"           
6  ( 1 )  "*"          "*"           
7  ( 1 )  "*"          "*"           
8  ( 1 )  "*"          "*"           
9  ( 1 )  " "          " "           
10  ( 1 ) "*"          "*"           
   (Intercept)        ambient        coolant            u_q    motor_speed 
    0.02037803     0.17347843     0.04888479    -0.05090859     0.04127511 
        torque            i_d            i_q    stator_yoke   stator_tooth 
    0.02889645     0.02461445    -0.02405227     0.12583154     0.77876719 
stator_winding 
   -0.28755439 

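\(\hat{y}=0.02037803+0.17347843\,\mathrm{ambient}+0.04888479\,\mathrm{coolant}-0.05090859\,u_q+0.04127511\,\mathrm{motor\_speed}\)

\(+0.02889645\,\mathrm{torque}+0.02461445\,i_d-0.02405227\,i_q+0.12583154\,\mathrm{stator\_yoke}\)

\(+0.77876719\,\mathrm{stator\_tooth}-0.28755439\,\mathrm{stator\_winding}\)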

Column

Discussion

After performing the stepwise regression, we remove the variable u_d, and we use ridge regression to test this model for multicollinearity.

Ridge regression

Lambda = 0.01. With such a small penalty we obtain almost the ordinary least squares estimates, which means our model does not suffer from multicollinearity.

[1] 0.01

Conclusion

Column

Simplified model

Final model


Call:
lm(formula = pm ~ ambient + stator_yoke + stator_tooth + stator_winding)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.30314 -0.08852 -0.00872  0.08392  0.95404 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    -0.009424   0.004533  -2.079   0.0376 *  
ambient         0.186009   0.004177  44.529   <2e-16 ***
stator_yoke     0.256954   0.008912  28.832   <2e-16 ***
stator_tooth    0.628525   0.013801  45.543   <2e-16 ***
stator_winding -0.226880   0.006876 -32.994   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1206 on 32436 degrees of freedom
Multiple R-squared:  0.9414,    Adjusted R-squared:  0.9414 
F-statistic: 1.303e+05 on 4 and 32436 DF,  p-value: < 2.2e-16

Model Performance: Adjusted R-squared: 0.9414; F-statistic: 1.303e+05 on 4 and 32436 DF; p-value: < 2.2e-16; residual standard error: 0.1206 on 32436 degrees of freedom.
\(\hat{y}=-0.009424+0.186009\,\mathrm{ambient}+0.256954\,\mathrm{stator\_yoke}+0.628525\,\mathrm{stator\_tooth}-0.226880\,\mathrm{stator\_winding}\)

Column

Conclusion

Conclusion: In this project we built a linear model to predict the permanent magnet temperature of permanent magnet synchronous motors. We first explored the data: it contains 12 measured variables (11 predictors and the response pm) that have been standardized, with most values lying between -2 and 2. The histograms show that the values of the predictors are spread fairly evenly, and I tend to believe their distribution is approximately normal. The raw data set is very large, which makes processing cumbersome, and we were not satisfied with the performance and adequacy plots of the model fitted to it. Fortunately, the raw data consist of 52 independent measurement sets, so we performed data selection and picked the set with the best performance and adequacy plots. We then built the linear model for the selected data and applied forward, backward and stepwise variable selection to remove insignificant variables. We also used ridge regression to check the model for multicollinearity; the results indicate that the model does not suffer from it. As a last step we inspected the correlation map and decided to simplify the model, using only 4 variables to predict the temperature. The final model shows very good performance and acceptable adequacy plots. In future work, we may explore the relationship between torque and the other variables, because the torque distribution is also very important in the electric motor industry.

Reference

1. Data source: https://www.kaggle.com/wkirgsn/electric-motor-temperature.
2. Specht, A., Wallscheid, O., & Böcker, J. (2014, May). Determination of rotor temperature for an interior permanent magnet synchronous machine using a precise flux observer. In 2014 International Power Electronics Conference (IPEC-Hiroshima 2014-ECCE ASIA) (pp. 1501-1507). IEEE.
3. Wallscheid, O., Kirchgässner, W., & Böcker, J. (2017, May). Investigation of long short-term memory networks to temperature prediction for permanent magnet synchronous motors. In 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 1940-1947). IEEE.
4. Wallscheid, O., Huber, T., Peters, W., & Böcker, J. (2014, November). Real-time capable methods to determine the magnet temperature of permanent magnet synchronous motors—A review. In IECON 2014-40th Annual Conference of the IEEE Industrial Electronics Society (pp. 811-818). IEEE.
5. Gaona, D., Wallscheid, O., & Böcker, J. (2017, December). Fusion of a lumped-parameter thermal network and speed-dependent flux observer for PM temperature estimation in synchronous machines. In 2017 IEEE Southern Power Electronics Conference (SPEC) (pp. 1-6). IEEE.
6. Wallscheid, O., & Böcker, J. (2017, May). Fusion of direct and indirect temperature estimation techniques for permanent magnet synchronous motors. In 2017 IEEE International Electric Machines and Drives Conference (IEMDC) (pp. 1-8). IEEE.
7. Gaona, D., Wallscheid, O., & Böcker, J. (2017, December). Sensitivity analysis of a permanent magnet temperature observer for PM synchronous machines using the Monte Carlo method. In 2017 IEEE 12th International Conference on Power Electronics and Drive Systems (PEDS) (pp. 599-606). IEEE.

---
title: "FP"
author: "Sichao Zhou"
output: 
  flexdashboard::flex_dashboard:
    theme: simplex
    orientation: columns
    source_code: embed
---
```{r setup, include=FALSE}
# load necessary packages
library(ggplot2)
library(plotly)
library(plyr)
library(flexdashboard)  ## you need this package to create dashboard

# read the data set here.
temperature <- read.csv("D:/ZSC/Class/MTH543/FinalProject/pmsm_temperature_data.csv", stringsAsFactors = FALSE)
Raw <- temperature
```
Introduction
=======================================================================
Column{.tabset data-width=650}
-----------------------------------------------------------------------
### Raw data distribution  
![optional caption text](/ZSC/Class/MTH543/FinalProject/histfull.png) 


### Get to know the data   

```{r}
head(Raw)
attach(Raw)
```   
    
### Boxplot   

```{r}
temp <- read.csv("D:/ZSC/Class/MTH543/FinalProject/temp.csv", stringsAsFactors = FALSE)
DF <- data.frame(temp)
boxplot(DF, col = rainbow(3, s = 0.5))
```
   
   
Column {.tabset data-width=350} 
-----------------------------------------------------------------------    

### Motivation
**Temperature estimation techniques for permanent magnet synchronous motors**    

**Abstract:** Monitoring critical temperatures in permanent magnet synchronous motors (PMSMs) is crucial to ensure safe operation and maximum device utilization. This project presents an enhanced method to determine the rotor temperature of permanent magnet synchronous machines during dynamic operation. The approach is based on a fundamental wave flux observer and is therefore largely independent of the individual measurement session. For the quantitative analysis, a regression model between the variables is fitted in R and then revised by stepwise regression. Finally, measurement results demonstrate the satisfactory performance of the observer.  
**Introduction:** Nowadays, the permanent magnet synchronous motor (PMSM) is widely used in industrial applications. Owing to the development of high-energy permanent magnet materials over the last decade, this type of motor has become the leader in torque and power density with respect to volume and weight [1]. In automotive applications these characteristics are essential, so the PMSM is often preferred over other designs such as induction or switched-reluctance motors [2]. From a device utilisation point of view, the maximum feasible temperatures in the different motor parts are among the most critical aspects to consider. Besides the risk of partial or total device destruction, intensive thermal stress can shorten the motor lifetime [3]. The stator end-winding temperature is typically of great interest, as its cooling is difficult and a significant part of the copper losses is concentrated there [4]. Consequently, the end-winding thermal dynamics are significant, posing the risk of quickly exceeding the thermal limits of the insulation varnish. Hence, it has become state of the art to install a temperature sensor in the end-winding, allowing easy real-time monitoring.  
    


### Variable interpretation.
   
![optional caption text](/ZSC/Class/MTH543/FinalProject/PMSM1.png)  
**Ambient:** Ambient temperature as measured by a thermal sensor located close to the stator.  
**Coolant:** Coolant temperature. The motor is water cooled. Measurement is taken at the outflow.  
**u_d:** Voltage d-component.  
**u_q:** Voltage q-component.  
**motor_speed:** Motor speed.  
**torque:** Torque induced by current.  
**i_d:** Current d-component.  
**i_q:** Current q-component.  
**stator_yoke:** Stator yoke temperature measured with a thermal sensor.  
**stator_tooth:** Stator tooth temperature measured with a thermal sensor.  
**stator_winding:** Stator winding temperature measured with a thermal sensor.  
 

 





Summary for Raw data
=======================================================================
Column{.tabset data-width=680}
-------------------------------------
    
### Linear regression results for Raw data
    
```{r}
attach(Raw)
Raw.model<-lm(pm~ambient+coolant+u_d+u_q+motor_speed+torque+i_d+i_q+stator_yoke+stator_tooth+stator_winding)
summary(Raw.model)
```   
**Model Performance: Adjusted R-squared: 0.7722; F-statistic: 3.075e+05 on 11 and 998058 DF; p-value: < 2.2e-16; residual standard error: 0.4753 on 998058 degrees of freedom.**
   
### Adequacy Check for Raw data    
![optional caption text](/ZSC/Class/MTH543/FinalProject/RAWAD.png)
```{r}

``` 
 
 
   
Column {.tabset data-width=320}
-------------------------------------
   
### Discussion for Raw data   
![optional caption text](/ZSC/Class/MTH543/FinalProject/HIST1.png)    
$H_0: \beta_1=\beta_2=\dots=\beta_{11}=0$ vs. $H_a: \beta_i \neq 0$ for at least one $i \in \{1,2,\dots,11\}$.  
From the output above, the F-statistic is 3.075e+05 with a p-value below 2.2e-16, which is smaller than α = 0.01, so we reject $H_0$ at the 0.01 significance level. There is sufficient evidence that the regression model with the 11 predictors introduced earlier predicts the permanent magnet surface temperature better than simply using its mean.  
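
The overall F test is equivalent to comparing the fitted model with an intercept-only model. A minimal sketch on simulated stand-in data (the data frame `sim` and its columns `x1`, `x2`, `y` are hypothetical and not part of the motor data set) shows the comparison with `anova()`:

```r
# Simulated stand-in data; all names here are hypothetical
set.seed(1)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 0.5 * sim$x1 - 0.3 * sim$x2 + rnorm(n, sd = 0.5)

null.fit <- lm(y ~ 1, data = sim)        # intercept-only (mean) model
full.fit <- lm(y ~ x1 + x2, data = sim)  # model with all predictors

# Overall F test: compares the two nested models
f.table <- anova(null.fit, full.fit)
f.table

# summary() reports the same F statistic
f.from.summary <- unname(summary(full.fit)$fstatistic["value"])
```

On the motor data, the smaller model would presumably be `lm(pm ~ 1, data = Raw)`.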
 
### Discussion for Adequacy Plot   
**1. Linearity assumption:** The residuals “bounce randomly” around the 0 line, which suggests that a linear relationship is a reasonable assumption.  
**2. Zero-mean assumption:** The error term ε has zero mean.  
**3. Equal-variance assumption:** We examine this assumption with the Scale-Location plot, where a horizontal line with equally (randomly) spread points is a good sign. In the plot above the residuals appear randomly spread, which indicates that the equal-variance assumption is not violated.  
**4. Independence assumption:** In this study, the data are not treated as time-series data. If the observations are random, we are willing to believe the residuals are independent.  
**5. Normality assumption:** The histogram is a frequency plot obtained by placing the data in regularly spaced cells and plotting each cell frequency against the cell center. It shows an approximately normal distribution of the residuals produced by the model. However, in the Normal Q-Q plot the points at both ends deviate from the straight line, and the residual plot reveals a more obvious problem: the spread of the residuals follows a curved pattern in X, which violates the constant-variance assumption. Overall, the assumptions of our model are not reasonable here.  
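
The visual checks listed here can be complemented with a few numerical ones. The following is only a sketch on simulated data (`sim`, `x` and `y` are hypothetical names); the Shapiro-Wilk test and the regression of absolute residuals on fitted values are rough stand-ins chosen for illustration, not the checks used in this project:

```r
# Simulated stand-in data; all names here are hypothetical
set.seed(2)
n <- 500
sim <- data.frame(x = runif(n, -2, 2))
sim$y <- 1 + 2 * sim$x + rnorm(n)
fit <- lm(y ~ x, data = sim)

res <- rstandard(fit)  # standardized residuals

# Zero mean: OLS residuals of a model with an intercept average to zero
mean.resid <- mean(resid(fit))

# Normality: shapiro.test() accepts at most 5000 values, so subsample large fits
idx <- sample(length(res), min(length(res), 5000))
sw <- shapiro.test(res[idx])

# Equal variance (rough check): does |residual| trend with the fitted values?
spread.fit <- lm(abs(res) ~ fitted(fit))
spread.slope.p <- summary(spread.fit)$coefficients[2, 4]

# Residuals vs Fitted, Normal Q-Q and Scale-Location plots
op <- par(mfrow = c(2, 2))
plot(fit, which = 1:3)
par(op)
```

With close to a million rows, as in the raw data, formal tests tend to reject for even tiny deviations, so the plots usually remain the more informative tool.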

Data selection
=======================================================================
Column{.tabset data-width=600}
-------------------------------------  

### Histogram by profile_id
```{r}
library(ggplot2)
ggplot(temperature, aes(x=profile_id))+ geom_histogram(color="darkblue", fill="lightblue",binwidth=1)

``` 
    
### Data selection    
![optional caption text](/ZSC/Class/MTH543/FinalProject/performance.png)

```{r}

```    

### Correlogram
```{r}  
library(corrplot)
M<-cor(temperature)
cor.mtest <- function(mat, ...) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat<- matrix(NA, n, n)
    diag(p.mat) <- 0
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], ...)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
        }
    }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
  p.mat
}
p.mat <- cor.mtest(temperature)
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(M, method="color", col=col(200),  
         type="upper", order="hclust", 
         addCoef.col = "black", # Add coefficient of correlation
         tl.col="black", tl.srt=45, #Text label color and rotation
         # Combine with significance
         p.mat = p.mat, sig.level = 0.01, insig = "blank", 
         # hide correlation coefficient on the principal diagonal
         diag=FALSE 
         )

``` 

### Linear regression model results for selected data   

```{r}   
df <- read.csv("D:/ZSC/Class/MTH543/FinalProject/FinalProject.csv", stringsAsFactors = FALSE)
df<-na.omit(df)
attach(df)
full.model<-lm(pm~ambient+coolant+u_d+u_q+motor_speed+torque+i_d+i_q+stator_yoke+stator_tooth+stator_winding)
summary(full.model)
```   
Model Performance: Adjusted R-squared: 0.9432; F-statistic: 4.898e+04 on 11 and 32429 DF; p-value: < 2.2e-16; residual standard error: 0.1188 on 32429 degrees of freedom.

Column {.tabset data-width=400}
-------------------------------------
   
### Motivation and methods 
In industry, we do not just want to provide qualitative strategies; we prefer more accurate quantitative models. In the previous model the adjusted R-squared is 0.7722, which is an acceptable value. However, the results show that torque is insignificant in this model. The most interesting target features are rotor temperature ("pm"), stator temperatures ("stator_*") and torque; rotor temperature and torque in particular cannot be measured reliably and economically in a commercial vehicle. In addition, the adequacy plot shows that the model does not follow the normality assumption very well. Finally, the large amount of data greatly lengthens the analysis time.  
Since each section of the data is independent, we can pick a single section as the object of analysis. We scan the basic performance of each section, then select a few sets of data and check their adequacy manually.  
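
Scanning the performance of each section can be sketched as fitting the same model within every `profile_id` and collecting the adjusted R-squared. The example below uses simulated data with three hypothetical sessions and a single hypothetical predictor `x`; it illustrates the idea and is not the exact procedure used to produce the performance figure:

```r
# Simulated stand-in: three hypothetical sessions with different noise levels
set.seed(3)
sim <- do.call(rbind, lapply(1:3, function(id) {
  x <- rnorm(100)
  data.frame(profile_id = id, x = x, pm = 0.8 * x + rnorm(100, sd = 0.2 * id))
}))

# Fit the same model within each session and collect the adjusted R-squared
by.profile <- split(sim, sim$profile_id)
adj.r2 <- sapply(by.profile, function(d) {
  summary(lm(pm ~ x, data = d))$adj.r.squared
})

# Rank the sessions from best to worst fit
sort(adj.r2, decreasing = TRUE)
best.id <- names(which.max(adj.r2))
```

The highest-ranked sessions are then natural candidates for the manual adequacy check.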

### Adequacy Check for selected data 
```{r}
unstandardizedPredicted <- predict(full.model)
unstandardizedResiduals <- resid(full.model)
standardizedPredicted <- (unstandardizedPredicted - mean(unstandardizedPredicted)) / sd(unstandardizedPredicted)
standardizedResiduals <- (unstandardizedResiduals - mean(unstandardizedResiduals)) / sd(unstandardizedResiduals)
hist(standardizedResiduals, freq = FALSE, breaks = 40, col = "blue")
curve(dnorm, add = TRUE)
plot(full.model,1)
plot(full.model,2)
plot(full.model,3)
```      

Variable selection
=======================================================================  
Column{.tabset data-width=600}
-------------------------------------
    
### Stepwise 

```{r}
df <- read.csv("D:/ZSC/Class/MTH543/FinalProject/FinalProject.csv", stringsAsFactors = FALSE)
df<-na.omit(df)
attach(df)
library(MASS)
# forward regression model
fit.forward <- stepAIC(full.model, direction = "forward", trace = FALSE)
# backward regression model
fit.backward <- stepAIC(full.model, direction = "backward", trace = FALSE)
# stepwise regression model
fit.stepwise <- stepAIC(full.model, direction = "both", trace = FALSE)
summary(fit.stepwise)
library(leaps)
fit.subsets <- regsubsets(pm~., data=df, nbest=1, nvmax=NULL, method="exhaustive")
```
After performing the stepwise regression, we remove the variable u_d.   
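
One detail of `stepAIC` is worth keeping in mind: with `direction = "forward"` it only adds terms listed in `scope`, and when no scope is given the starting model is also the upper limit, so a forward search started from the full model has nothing to add. A minimal sketch of forward selection from the intercept-only model, on simulated data with hypothetical columns `x1`, `x2`, `x3` and `y`:

```r
library(MASS)

# Simulated stand-in data; x3 is pure noise by construction
set.seed(4)
n <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 0.7 * sim$x1 - 0.4 * sim$x2 + rnorm(n, sd = 0.5)

null.fit <- lm(y ~ 1, data = sim)  # starting point: intercept only

# Forward selection needs an explicit upper scope to know what it may add
fwd <- stepAIC(null.fit, direction = "forward",
               scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
               trace = FALSE)
formula(fwd)
```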


### k-fold cross-validation   

```{r}  
library(caret)
# Set seed for reproducibility
set.seed(2019)
# Set up 20-fold cross-validation
train.control <- trainControl(method = "cv", number = 20)
# Train the model
model.stepwise <- train(pm ~., data = df,
                    method = "leapSeq", 
                    tuneGrid = data.frame(nvmax = 1:10),
                    trControl = train.control
)
model.stepwise$results
model.stepwise$bestTune
summary(model.stepwise$finalModel)
coef(model.stepwise$finalModel, 10)
new.model<-lm(pm~ambient+coolant+u_q+motor_speed+torque+i_d+i_q+stator_yoke+stator_tooth+stator_winding)
```
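$\hat{y}=0.02037803+0.17347843\,\mathrm{ambient}+0.04888479\,\mathrm{coolant}-0.05090859\,u_q+0.04127511\,\mathrm{motor\_speed}$  

$+0.02889645\,\mathrm{torque}+0.02461445\,i_d-0.02405227\,i_q+0.12583154\,\mathrm{stator\_yoke}$  

$+0.77876719\,\mathrm{stator\_tooth}-0.28755439\,\mathrm{stator\_winding}$   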
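
The RMSE and RMSESD columns of the cross-validation table are the mean and standard deviation of the per-fold errors. The mechanics can be sketched in base R on simulated data (`sim`, `x1`, `x2` and `y` are hypothetical names, and 10 folds are used here only to keep the example small):

```r
# Simulated stand-in data; all names here are hypothetical
set.seed(5)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 0.6 * sim$x1 + 0.2 * sim$x2 + rnorm(n, sd = 0.3)

k <- 10
fold <- sample(rep(seq_len(k), length.out = n))  # random fold label per row

# Hold out each fold once, fit on the rest, score on the held-out rows
rmse <- sapply(seq_len(k), function(i) {
  train <- sim[fold != i, ]
  test  <- sim[fold == i, ]
  fit   <- lm(y ~ x1 + x2, data = train)
  sqrt(mean((test$y - predict(fit, newdata = test))^2))
})

cv.rmse    <- mean(rmse)  # corresponds to the RMSE column
cv.rmse.sd <- sd(rmse)    # corresponds to the RMSESD column
```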

Column {.tabset data-width=400}
-------------------------------------
   
### Discussion  
```{r}  
plot(fit.subsets, scale = "adjr2")
pairs(~i_q+u_d)

```   
After performing the stepwise regression, we remove the variable u_d, and we use ridge regression to test this model for multicollinearity.

### Ridge regression    
   
Lambda = 0.01. With such a small penalty we obtain almost the ordinary least squares estimates, which means our model does not suffer from multicollinearity.
```{r}
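library(glmnet)
# Standardize every column; keep the result separate so df itself is unchanged
df.scaled <- as.data.frame(scale(df))
y <- df.scaled$pm
# Predictor matrix: every column except the response pm
X <- as.matrix(df.scaled[, setdiff(names(df.scaled), "pm")])
# OLS reference fit for comparison with the ridge estimates
fit_OLS <- lm(y ~ X)
# Candidate penalties from 10^3 down to 10^-2
lambdas <- 10^seq(3, -2, by = -0.1)
set.seed(2019)
# Ridge regression (alpha = 0) with a cross-validated choice of lambda
cv_fit <- cv.glmnet(X, y, alpha = 0, lambda = lambdas)
plot(cv_fit, cex.lab = 1.5)
opt_lambda <- cv_fit$lambda.min
opt_lambda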
```    
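
A more direct diagnostic for multicollinearity is the variance inflation factor, $VIF_j = 1/(1-R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the remaining predictors. This is an alternative check and not the method used above; values above roughly 10 are commonly read as a warning sign. A base R sketch on simulated data with hypothetical columns `x1`, `x2`, `x3`:

```r
# Simulated stand-in predictors; x2 is nearly collinear with x1 by construction
set.seed(6)
n <- 300
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
X.sim <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j on the others
vif <- sapply(names(X.sim), function(v) {
  others <- setdiff(names(X.sim), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X.sim))$r.squared
  1 / (1 - r2)
})
round(vif, 1)
```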
Conclusion
=======================================================================  
Column{.tabset data-width=600}
-------------------------------------  
### Simplified model
```{r} 
#panel.cor <- function(x, y){
#    usr <- par("usr"); on.exit(par(usr))
#    par(usr = c(0, 1, 0, 1))
#    r <- round(cor(x, y), digits=2)
#    txt <- paste0("R = ", r)
#    cex.cor <- 0.8/strwidth(txt)
#    text(0.5, 0.5, txt, cex = cex.cor * r)
#}
#upper.panel<-function(x, y){
#  points(x,y)
#}
#pairs(pm~ambient+stator_yoke+stator_tooth+stator_winding, 
#      lower.panel = panel.cor,
#      upper.panel = upper.panel)
f <- read.csv("D:/ZSC/Class/MTH543/FinalProject/fff.csv", stringsAsFactors = FALSE)
library("PerformanceAnalytics")
chart.Correlation(f, histogram=TRUE, pch=19)

```

### Final model  
```{r}  
fff<-lm(pm~ambient+stator_yoke+stator_tooth+stator_winding)
summary(fff)
```   
**Model performance:** Adjusted R-squared: 0.9414; F-statistic: 1.303e+05 on 4 and 32436 DF, p-value: < 2.2e-16; residual standard error: 0.1206 on 32436 degrees of freedom.   
$\hat{y}=-0.009424+0.186009\,\mathrm{ambient}+0.256954\,\mathrm{stator\_yoke}+0.628525\,\mathrm{stator\_tooth}-0.226880\,\mathrm{stator\_winding}$  
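
As a quick worked example, the fitted equation above can be evaluated directly. The sketch below copies its coefficients into a named vector and applies them to the first row of the data preview shown in the "Get to know the data" tab. The names `beta`, `predict_pm`, and `row1` are hypothetical helpers, and because that row may not belong to the data set the model was fitted on, the result only illustrates the arithmetic.

```r
# Coefficients copied from the final model equation.
beta <- c(intercept      = -0.009424,
          ambient        =  0.186009,
          stator_yoke    =  0.256954,
          stator_tooth   =  0.628525,
          stator_winding = -0.226880)

# predict_pm is an illustrative helper: intercept plus the weighted predictors.
predict_pm <- function(newdata, beta) {
  vars <- setdiff(names(beta), "intercept")
  x <- as.matrix(newdata[, vars, drop = FALSE])
  drop(beta[["intercept"]] + x %*% beta[vars])
}

# First row of the standardized data preview (profile_id 4).
row1 <- data.frame(ambient = -0.7521430, stator_yoke = -1.831422,
                   stator_tooth = -2.066143, stator_winding = -2.018033)
predict_pm(row1, beta)
```

With the fitted `lm` object available, `predict(fff, newdata = row1)` would be the usual way to get the same kind of prediction without copying coefficients by hand.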

Column {.tabset data-width=400}
-------------------------------------

### Conclusion   
**Conclusion:** In this project we built a linear model to predict the temperature of a permanent magnet synchronous motor (the response `pm`). We first explored the data: it contains 12 predictors, it has already been standardized, and most predictor values lie between -2 and 2. The histograms show that the predictor values are spread fairly evenly, and we tend to believe that their distributions are approximately normal.

The raw data set is too large to process conveniently, and we were not satisfied with the performance or the adequacy plots of the model fitted to the original data. Fortunately, the raw data are composed of 52 independent sets of data, so we selected the set with the best performance and adequacy plots and built the linear model on it.

We then applied forward, backward, and stepwise variable selection to remove insignificant variables, and we used ridge regression to check the model for multicollinearity. The results show that the model does not suffer from multicollinearity. As a last step we inspected the correlation chart and decided to simplify the model: the final model uses only four variables to predict the temperature, and it shows very good performance and acceptable adequacy plots.

In future work, we may explore the relationship between torque and the other variables, because the torque distribution is also very important in the electric motor industry.

### References   
**References:**   

1. Data source: https://www.kaggle.com/wkirgsn/electric-motor-temperature.
2. Specht, A., Wallscheid, O., & Böcker, J. (2014, May). Determination of rotor temperature for an interior permanent magnet synchronous machine using a precise flux observer. In 2014 International Power Electronics Conference (IPEC-Hiroshima 2014-ECCE ASIA) (pp. 1501-1507). IEEE.
3. Wallscheid, O., Kirchgässner, W., & Böcker, J. (2017, May). Investigation of long short-term memory networks to temperature prediction for permanent magnet synchronous motors. In 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 1940-1947). IEEE.
4. Wallscheid, O., Huber, T., Peters, W., & Böcker, J. (2014, November). Real-time capable methods to determine the magnet temperature of permanent magnet synchronous motors: A review. In IECON 2014-40th Annual Conference of the IEEE Industrial Electronics Society (pp. 811-818). IEEE.
5. Gaona, D., Wallscheid, O., & Böcker, J. (2017, December). Fusion of a lumped-parameter thermal network and speed-dependent flux observer for PM temperature estimation in synchronous machines. In 2017 IEEE Southern Power Electronics Conference (SPEC) (pp. 1-6). IEEE.
6. Wallscheid, O., & Böcker, J. (2017, May). Fusion of direct and indirect temperature estimation techniques for permanent magnet synchronous motors. In 2017 IEEE International Electric Machines and Drives Conference (IEMDC) (pp. 1-8). IEEE.
7. Gaona, D., Wallscheid, O., & Böcker, J. (2017, December). Sensitivity analysis of a permanent magnet temperature observer for PM synchronous machines using the Monte Carlo method. In 2017 IEEE 12th International Conference on Power Electronics and Drive Systems (PEDS) (pp. 599-606). IEEE.

![optional caption text](/ZSC/Class/MTH543/FinalProject/thx.jpg)