Preliminary Construction of Prognostic Risk Early Warning Model for Renal Cell Carcinoma with Type 2 Diabetes Mellitus Based on SMOTE Algorithm

with type 2 diabetes mellitus combined with renal cell carcinoma are closely related to poor prognosis. Based on this, the individualized early warning model established with the synthetic minority oversampling technique (SMOTE) algorithm helps identify patients at high risk of poor prognosis early.

Compared with renal cell carcinoma (RCC) patients without type 2 diabetes mellitus (T2DM), RCC patients with T2DM have significantly shorter overall survival and higher tumor recurrence and mortality [1,2]. However, other studies [3,4] have found no statistical correlation between diabetes mellitus and overall survival in renal cancer patients undergoing surgery. This suggests that the occurrence of RCC induced by T2DM and its influence on prognosis may result from the combined action of many complex factors. It is therefore necessary to analyze the risk factors for poor prognosis in patients with RCC complicated with T2DM and to establish an individualized early warning model, so that interventions can be taken to improve prognosis and survival. In this study, the clinical data of patients with RCC complicated with T2DM were collected and an early warning model of poor prognosis was established based on the Synthetic Minority Oversampling Technique (SMOTE) algorithm, in order to provide a reference for prognostic analysis in patients with renal cancer.

Research objects:
Inclusion criteria: all patients were diagnosed with RCC by histopathological examination; RCC was diagnosed for the first time and the patients had received no radiotherapy, chemotherapy or surgery before treatment; the patients had T2DM, with the clinical diagnosis complying with the relevant standards in the guidelines for the prevention and treatment of type 2 diabetes; and the clinical data of the patients were complete.
Exclusion criteria: patients with metastatic renal cancer; patients with expected survival ≤2 y.
A total of 157 renal cancer patients with T2DM who met the above criteria were included, including 105 males (66.9 %) and 52 females (33.

Follow-up and grouping:
All patients were regularly reviewed and followed up after surgery. Follow-up was conducted once a month during the first 6 mo after surgery and every 3 mo thereafter, until June 2022. Endpoints were defined as tumor-specific death, postoperative tumor recurrence or postoperative distant metastasis.

Collection of clinical data:
The following clinical data were collected through the hospital electronic medical record system: gender, age, body mass index (BMI), duration of T2DM (time from the first diagnosis of T2DM to surgery), preoperative fasting blood glucose, preoperative glycosylated haemoglobin A1c (HbA1c), blood glucose control method, tumor location, tumor pathological type, Tumor-Node-Metastasis (TNM) T stage, surgical method and the occurrence of end-point events.

SMOTE oversampling algorithm:
SMOTE oversampling was implemented using Statistical Package for the Social Sciences (SPSS) Modeler 18.1. In this study, the minority group is the poor prognosis group, and the sample multiple (n) is the ratio of the number of patients in the good prognosis group to the number of patients in the poor prognosis group, rounded to the nearest integer. The specific process of SMOTE oversampling [5] is: (1) calculate the k nearest neighbors of each sample in the poor prognosis group; (2) randomly select a sample j from the k nearest neighbors of sample i in the poor prognosis group; (3) calculate the difference Q between sample j and sample i over all variable attributes; (4) randomly generate a value R between 0 and 1; (5) generate a new sample = sample i + R×Q; (6) repeat steps (2) to (5) until n new samples have been generated for sample i; (7) repeat steps (2) to (6) until all samples of the poor prognosis group have been processed. This method essentially performs intra-class interpolation on the minority class samples; it does not change the original spatial boundary of the samples and has high reliability and validity.
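The interpolation steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the SPSS Modeler implementation used in the study: the function name `smote_oversample`, the choice k=5, the Euclidean distance and the random five-variable feature matrix are all hypothetical, while the group sizes (130 and 27) are the ones reported in this paper.

```python
import numpy as np

def smote_oversample(minority, n_multiple, k=5, seed=0):
    """Create n_multiple synthetic samples per minority sample by
    interpolating toward a randomly chosen one of its k nearest neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    m = len(minority)
    k = min(k, m - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class (assumed metric).
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    synthetic = []
    for i in range(m):
        for _ in range(n_multiple):
            j = rng.choice(neighbours[i])   # random neighbour j of sample i
            q = minority[j] - minority[i]   # difference Q over all attributes
            r = rng.random()                # random R in [0, 1)
            synthetic.append(minority[i] + r * q)  # new sample = i + R x Q
    return np.vstack(synthetic)

# Group sizes reported in the paper; the feature values are random placeholders.
n_good, n_poor = 130, 27
n_multiple = round(n_good / n_poor)  # 130/27 is about 4.81, rounded to 5
X_poor = np.random.default_rng(1).normal(size=(n_poor, 5))
X_new = smote_oversample(X_poor, n_multiple)
X_expanded = np.vstack([X_poor, X_new])
print(n_multiple, X_expanded.shape)
```

Because every synthetic point lies on a segment between two existing minority samples, the expanded set stays inside the original range of the minority class. Plain interpolation yields fractional values for 0/1 variables, so categorical attributes would need separate handling; how SPSS Modeler treated them is not stated in the source.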
The univariate analysis of prognosis in renal cancer patients with T2DM is shown in Table 1. Variables that differed significantly between the two groups in the univariate analysis were used as independent variables, and the occurrence of end-point events as the dependent variable, in a binary logistic regression analysis (see Table 2 for the assignment of variables). The results suggest that a longer course of T2DM, high preoperative HbA1c, high body mass index (BMI) and TNM stage T3/T4 are independent risk factors for poor prognosis in renal cancer patients with T2DM (p<0.05), and that metformin treatment is an independent protective factor (p<0.05). The probabilistic prediction model is $P_1 = 1/\left[1 + e^{-(-13.084 - 0.438X_1 + 0.446X_2 + 0.096X_3 - 0.781X_4 + 1.155X_5)}\right]$. The Hosmer-Lemeshow test was performed on the model, and the results indicated that the coefficient of determination was $R^2 = 0.692$, as shown in Table 3.
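To make the prediction formula concrete, the sketch below evaluates the logistic model with the intercept and coefficients reported above and converts each coefficient into an odds ratio. The function names and the example input vectors are hypothetical; the mapping of X1 to X4 onto clinical variables is not recoverable from this text (only X5, TNM stage coded 1 for T3/T4 and 0 for T1/T2, is stated in the paper), so the inputs are placeholders rather than real patient values.

```python
import math

# Intercept and coefficients for X1..X5 as reported for the model.
INTERCEPT = -13.084
COEFS = [-0.438, 0.446, 0.096, -0.781, 1.155]

def predicted_probability(x):
    """Return P = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for one patient."""
    if len(x) != len(COEFS):
        raise ValueError("expected one value per model variable")
    logit = INTERCEPT + sum(b * v for b, v in zip(COEFS, x))
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratios():
    """Odds ratio per one-unit increase of each variable: exp(b_i)."""
    return [math.exp(b) for b in COEFS]

# Hypothetical inputs: identical except for X5 (TNM stage T1/T2 vs T3/T4).
patient_t12 = [5.0, 30.0, 28.0, 0.0, 0.0]
patient_t34 = [5.0, 30.0, 28.0, 0.0, 1.0]
print(predicted_probability(patient_t12), predicted_probability(patient_t34))
print(odds_ratios())
```

A positive coefficient gives an odds ratio above 1 (a risk factor) and a negative coefficient gives an odds ratio below 1 (a protective factor), which is how the signs in the formula relate to the risk and protective factors named in the text.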
Based on the independent risk factors screened in 2.2, oversampling was performed with the SMOTE algorithm. In this study, the minority group is the poor prognosis group (27 cases). The sample multiple is n = good prognosis group / poor prognosis group = 130/27 ≈ 5, so the poor prognosis group should be expanded to 27 + 27×5 = 162 cases. At this point, the ratio of the good prognosis group to the poor prognosis group was close to 1 (0.80). The logistic regression model was refitted to the oversampled data, and the results are shown in Table 4. The early warning model based on the SMOTE oversampling algorithm is $P_2 = 1/\left[1 + e^{-(-13.084 - 0.438X_1 + 0.446X_2 + 0.096X_3 - 0.781X_4 + 1.155X_5)}\right]$ (the assignment of each variable is the same as before). The Hosmer-Lemeshow test was performed on the model, and the results indicated that the coefficient of determination was $R^2 = 0.833$, which was higher than that of the P1 model, as shown in Table 4. The Receiver Operating Characteristic (ROC) curve analysis of prediction models P1 and P2 is shown in Fig. 1.

Table 2 (recoverable fragment of the variable assignment): a treatment variable is coded 1 if used and 0 if not used; TNM stage (X5) is coded 1 for T3/T4 and 0 for T1/T2.

occurred during the follow-up period, of which 19 cases were tumor-related deaths. The patient survival rate was 87.9 % (138/157), lower than the 96.2 % reported in the literature, which suggests that RCC patients with T2DM have a shorter survival time [9]. A systematic review in China compared the overall survival, cancer-specific survival and tumor-free survival of RCC patients with and without T2DM. It found that all of these survival times were significantly shorter in patients with T2DM and that T2DM was significantly correlated with poor prognosis [10].

However, the specific clinical factors through which T2DM accelerates poor outcomes in RCC patients are not yet fully understood. This study further explored the risk factors for poor prognosis in patients with T2DM and RCC through univariate and regression analysis. The results showed that a longer course of T2DM, high preoperative HbA1c, high BMI and TNM stage T3/T4 were independent risk factors for poor prognosis in patients with T2DM and RCC (p<0.05). Another study [11] also found that a T2DM duration of more than 5 y in female patients can significantly increase the risk of RCC and is closely related to all-cause mortality. This study showed that the risk of poor prognosis in RCC patients with T2DM increased approximately 2-fold for each additional year of T2DM duration. A longer duration of T2DM means longer early exposure to hyperinsulinemia, and studies have confirmed that insulin can not only promote the mitosis and proliferation of tumor cells by inducing insulin-like growth factor 1, but also up-regulate the expression of epidermal growth factors, which promotes tumor angiogenesis [12].

RCC accounts for about 77.4 % of malignant tumors of the urinary system [6]. Although the incidence and mortality of renal cancer in China are lower than the world average, they are increasing year by year. According to one survey [7], the incidence of RCC in China in 2015 was 21 % higher than in 2003-2007, while the mortality of RCC has increased significantly since 1992; renal cancer mortality in male patients increases every year, by as much as 2.85 %. At present, the specific pathogenesis of renal cancer is not completely clear, but some studies suggest that genetic factors, environmental factors and comorbid chronic diseases play an important role in the occurrence and development of RCC. Among them, T2DM, with metabolic syndrome as its main manifestation, is considered an independent risk factor for renal cancer, and renal cancer patients with T2DM have a worse prognosis [8]. One study found that the risk of RCC in male diabetic patients is as high as 86 % and that diabetes is an independent risk factor for tumor occurrence; regular screening for renal disease in elderly diabetic patients is therefore considered helpful for the early diagnosis and treatment of RCC. However, scholars abroad have found no difference in overall survival between RCC patients with and without T2DM. The impact of T2DM on the occurrence and prognosis of RCC is thus complex, and the correlation between the two remains controversial.
In this study, 157 patients with RCC complicated with T2DM were followed up for 2 to 36 mo. The results showed that 27 cases of end-point events

BMI is an important indicator of the degree of obesity. Patients with a high BMI may be in a state of metabolic syndrome. The tumor microenvironment is an important functional site for the proliferation and metastasis of malignant tumor cells. In patients with hyperglycemia and disordered lipid metabolism, the metabolism of perirenal tissue increases and produces large amounts of metabolites, which are released into the tumor microenvironment and promote the growth, invasion and metastasis of RCC [13]. In the state of hyperglycemia, immune dysfunction can also weaken the body's immune surveillance, resulting in immune escape and continued growth of tumor cells.
In addition, this study found that in T2DM patients with RCC, hypoglycemic treatment with metformin was an independent protective factor against poor prognosis (p<0.05).
Metformin is a biguanide derivative originally derived from the traditional medicinal herb goat's rue (Galega officinalis). It inhibits hepatic gluconeogenesis and glycogenolysis, promotes the uptake and utilization of glucose by peripheral target tissues and inhibits intestinal glucose absorption, thereby lowering blood glucose. It is currently the first-line treatment for T2DM. Current studies suggest that metformin has significant anti-tumor effects on various tumors such as lung cancer, liver cancer and breast cancer [14]. A follow-up study of the risk of RCC in 115 923 T2DM patients found that the risk of RCC in patients using metformin was significantly lower than in patients who had never used metformin. Domestic scholars systematically evaluated the effect of metformin on the survival time of T2DM patients with RCC and found that metformin therapy can significantly prolong overall survival, reduce the risk of death and help improve prognosis [15]. These studies all suggest that metformin can significantly reduce the risk of RCC in T2DM patients and improve their prognosis, which is consistent with the results of this study. At present, the specific mechanism of metformin's anti-tumor action is not fully understood and it may be related to activating

In conclusion, the course of T2DM, preoperative HbA1c and BMI levels, TNM stage and the use of metformin for hypoglycemic therapy are closely related to the prognosis of patients with T2DM and RCC. Based on this, the individualized early warning model established with the SMOTE oversampling algorithm can significantly improve the predictive performance for poor prognosis. However, because of the small sample size, the short follow-up time and the lack of external validation in this study, the model still needs to be validated by large-sample, multicenter, prospective studies before it is used routinely in clinical practice.

Fig. 1: ROC curve analysis of different early warning models (P1 and P2) for predicting poor prognosis in renal cancer patients with T2DM

TABLE 4: LOGISTIC REGRESSION ANALYSIS BASED ON SMOTE OVERSAMPLING ALGORITHM

Unbalanced sample size is a common statistical problem in the medical field. It not only affects the specificity and sensitivity of prediction results, but also reduces the prediction accuracy of regression models. SMOTE is an oversampling algorithm and an effective method for processing imbalanced data. It can expand a data set with a small sample size without changing the original spatial boundary of the samples, and it has high reliability and validity. Based on the SMOTE oversampling algorithm, this study effectively expanded the sample size of the poor prognosis group. After the new data were generated, the regression model was fitted again. According to the ROC analysis, the Area Under the Curve (AUC) and the coefficient of determination of the P2 model were 0.934 and 0.833, higher than the 0.734 and 0.692 of the P1 model. This suggests that the regression model fitted after SMOTE oversampling had better predictive performance for poor prognosis in patients with T2DM and RCC, and that a higher proportion of the variance of the dependent variable was explained by the independent variables through the regression relationship.
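As an illustration of how an AUC such as those compared above can be computed from predicted probabilities, the sketch below uses the rank interpretation of the AUC: the probability that a randomly chosen poor prognosis patient receives a higher score than a randomly chosen good prognosis patient. The function name `roc_auc` and the toy labels and scores are hypothetical and are not data from this study.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]    # scores of patients with an end-point event
    neg = scores[~y_true]   # scores of patients without an end-point event
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("both classes must be present")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: 1 = end-point event occurred, 0 = no event.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted))  # 3 of the 4 pairs are ordered correctly: 0.75
```

Comparing every positive with every negative is quadratic in the sample size, which is acceptable for cohorts of a few hundred patients such as this one.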
AMP-Activated Protein Kinase (AMPK) to induce tumor cell apoptosis and cell cycle arrest, promoting