Identify Diabetes-related Targets based on ForgeNet_GPC
- Authors: Yang B. (1), Wang L. (2), Bao W. (3)
- Affiliations:
  - (1) School of Information Science and Engineering, Zaozhuang University
  - (2) School of Information Science and Engineering, Zaozhuang University
  - (3) School of Information and Electrical Engineering, Xuzhou University of Technology
- Issue: Vol 20, No 7 (2024)
- Pages: 1042-1054
- Section: Chemistry
- URL: https://ruspoj.com/1573-4099/article/view/644530
- DOI: https://doi.org/10.2174/0115734099258183230929173855
- ID: 644530
Abstract
Background: Research on potential therapeutic targets and new mechanisms of action can greatly improve the efficiency of new drug development.
Aims: Polygenic genetic diseases, such as diabetes, are caused by the interaction of multiple gene loci and environmental factors.
Objective: In this study, a disease target identification algorithm based on protein recognition is proposed.
Materials and Methods: Targets related and unrelated to the treatment of diabetes are collected from literature databases. The proteins corresponding to each target are queried to construct a protein dataset. Six protein feature extraction algorithms (AAC, CKSAAGP, DDE, DPC, GAAP, and TPC) are used to obtain feature vectors for each protein, and these are merged into a single full feature vector per protein.
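The feature-merging step can be illustrated with a minimal sketch. It covers only two of the six descriptors, AAC (amino acid composition) and DPC (dipeptide composition), using their commonly used definitions; the function names, the toy sequence, and the handling of non-standard residues are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    counts = Counter(seq.upper())
    # Assumption: non-standard residues are ignored in the normalization.
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Assumption: pairs containing a non-standard residue are ignored.
    total = sum(counts[d] for d in DIPEPTIDES)
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def merged_features(seq):
    """Concatenate the per-descriptor vectors into one full feature vector."""
    return aac(seq) + dpc(seq)


features = merged_features("MKTAYIAKQR")  # hypothetical toy sequence
print(len(features))  # 420 = 20 (AAC) + 400 (DPC)
```

The remaining descriptors would be appended to the same list in the same way, so the full vector length is the sum of the individual descriptor lengths.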
Results: A novel classifier (forgeNet_GPC) based on forgeNet and a Gaussian process classifier (GPC) is proposed to classify the proteins.
Conclusion: In forgeNet_GPC, forgeNet is used to select the important features, and GPC performs the classification. The experimental results show that forgeNet_GPC outperforms 22 classifiers in terms of ROC-AUC, PR-AUC, MCC, Youden Index, and Kappa.
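For readers unfamiliar with the evaluation metrics named above, the following sketch computes four of them (ROC-AUC, MCC, Youden Index, and Cohen's Kappa) from binary labels using their standard textbook definitions. It is not the authors' evaluation code; the function names and the toy labels are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def youden_index(tp, fp, fn, tn):
    """Youden's J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = diabetes-related target, 0 = unrelated.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(mcc(*counts), youden_index(*counts), cohen_kappa(*counts))
```

MCC, Youden Index, and Kappa all depend on a fixed decision threshold, whereas ROC-AUC is computed from the ranking of the classifier scores, which is one reason the two kinds of metric are usually reported together.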
About the authors
Bin Yang
School of Information Science and Engineering, Zaozhuang University
Email: info@benthamscience.net
Linlin Wang
School of Information Science and Engineering, Zaozhuang University
Author for correspondence.
Email: info@benthamscience.net
Wenzheng Bao
School of Information and Electrical Engineering, Xuzhou University of Technology
Author for correspondence.
Email: info@benthamscience.net