Identify Diabetes-related Targets based on ForgeNet_GPC


Abstract

Background: Research on potential therapeutic targets and new mechanisms of action can greatly improve the efficiency of new drug development.

Aims: Polygenic diseases, such as diabetes, are caused by the interaction of multiple gene loci and environmental factors.

Objective: In this study, a disease target identification algorithm based on protein recognition is proposed.

Materials and Methods: Targets related and unrelated to the treatment of diabetes are collected from literature databases. The proteins corresponding to each target are queried to construct a protein dataset. Six protein feature extraction algorithms (AAC, CKSAAGP, DDE, DPC, GAAP, and TPC) are used to obtain feature vectors for each protein, and these are merged into a single full feature vector per protein.
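To make the feature construction step concrete, the following is a minimal sketch of two of the six descriptors, amino acid composition (AAC) and dipeptide composition (DPC), and of how per-descriptor vectors can be concatenated into one full vector. The function names and the toy sequence are hypothetical, and the sketch assumes the standard frequency-based definitions of AAC and DPC; the study's exact implementation and the remaining four descriptors are not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def clean(seq):
    """Upper-case the sequence and drop non-standard residues (an assumption)."""
    return "".join(r for r in seq.upper() if r in AMINO_ACIDS)


def aac(seq):
    """Amino acid composition: relative frequency of each residue (20 values)."""
    seq = clean(seq)
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: relative frequency of each adjacent pair (400 values)."""
    seq = clean(seq)
    total = len(seq) - 1
    counts = {}
    for i in range(total):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    return [
        counts.get(a + b, 0) / total if total > 0 else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def full_feature_vector(seq):
    """Concatenate the per-descriptor vectors into one merged vector."""
    return aac(seq) + dpc(seq)


vec = full_feature_vector("MKTAYIAKQR")  # hypothetical toy sequence
print(len(vec))  # 20 AAC values + 400 DPC values = 420
```

Adding further descriptors would follow the same pattern: each returns a fixed-length list, and the merged vector is their concatenation in a fixed order.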

Results: A novel classifier (forgeNet_GPC) based on forgeNet and a Gaussian process classifier (GPC) is proposed to classify the proteins.
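The two-stage idea behind such a classifier, selecting features and then classifying with a GPC, can be sketched with scikit-learn. This is not forgeNet: a random forest importance ranking stands in for forgeNet's feature selection, the data are synthetic, and every hyperparameter below is an assumption chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the merged protein feature vectors (not real data).
X, y = make_classification(
    n_samples=200, n_features=60, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(
    # Stage 1 (stand-in for forgeNet): keep features whose random forest
    # importance is at or above the median importance.
    SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=0),
        threshold="median",
    ),
    # Stage 2: Gaussian process classifier on the selected features.
    GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0),
)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1
print(proba.shape, int(model[0].get_support().sum()))
```

Wrapping both stages in one pipeline means the feature selection is fitted on the training split only, which avoids leaking test information into the selection step.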

Conclusion: In forgeNet_GPC, forgeNet is used to select the important features, and GPC is used to solve the classification problem. The experimental results show that forgeNet_GPC outperforms 22 classifiers in terms of ROC-AUC, PR-AUC, MCC, Youden Index, and Kappa.
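Three of the reported metrics (MCC, Youden Index, and Kappa) depend only on the binary confusion matrix, so a short sketch may help readers reproduce them. The function name and the counts below are hypothetical and are not results from the study; ROC-AUC and PR-AUC additionally require ranked prediction scores and are not covered here.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Return (MCC, Youden index, Cohen's kappa) from confusion matrix counts."""
    n = tp + fn + fp + tn

    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Youden index J = sensitivity + specificity - 1.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    youden = sensitivity + specificity - 1.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_chance) / (1.0 - p_chance) if p_chance != 1.0 else 0.0

    return mcc, youden, kappa


# Hypothetical counts, for illustration only.
mcc, youden, kappa = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(round(mcc, 3), round(youden, 3), round(kappa, 3))
```

All three values range up to 1 for a perfect classifier and sit near 0 for chance-level predictions, which is why they are often reported alongside the AUC measures.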

About the authors

Bin Yang

School of Information Science and Engineering, Zaozhuang University

Email: info@benthamscience.net

Linlin Wang

School of Information Science and Engineering, Zaozhuang University

Author for correspondence.
Email: info@benthamscience.net

Wenzheng Bao

School of Information and Electrical Engineering, Xuzhou University of Technology

Author for correspondence.
Email: info@benthamscience.net

Copyright (c) 2024 Bentham Science Publishers