dc.contributor.advisor | Jjingo, Daudi | |
dc.contributor.advisor | Niyonzima, Nixon | |
dc.contributor.author | Sabakaki, Peter Ziribagwa | |
dc.date.accessioned | 2022-12-09T10:12:21Z | |
dc.date.available | 2022-12-09T10:12:21Z | |
dc.date.issued | 2022 | |
dc.identifier.citation | Sabakaki, P. Z. (2022). Using Gene and miRNA expression profiles in Breast cancer risk prediction : a machine learning-based approach (Unpublished master's dissertation). Makerere University, Kampala, Uganda. | en_US |
dc.identifier.uri | http://hdl.handle.net/10570/11087 | |
dc.description | A dissertation submitted to the Directorate of Research and Graduate Training in partial fulfilment of the requirements for the award of Master of Science Degree in Bioinformatics of Makerere University. | en_US |
dc.description.abstract | Breast cancer (BC) incidence is commonly accompanied by poor treatment outcomes due to late diagnosis. BC risk prediction is important for timely diagnosis, but its clinical utility is limited by high risk of bias. Integrating genetic and epigenetic data has been suggested as a way to optimize risk prediction, but its applicability is challenged by high dimensionality. This study presents an integrative approach for analyzing gene and miRNA expression profiles that combines the dimension-reduction power of differential expression, weighted gene co-expression network and functional enrichment analysis to identify gene-miRNA signatures for BC risk prediction. A signature of eleven genes (CCNA2, CDK1, CCNB2, PLK1, MAD2L1, PTTG1, CCNB1, CCNE2, BUB1, CDC25A, CCNE1 and AURKA) and three miRNAs (hsa-mir-429, hsa-mir-449a and hsa-mir-137) was identified. Its expression profile among BC tumor and healthy samples was used to train and compare eight machine learning classifiers with repeated stratified cross-validation. The Random Forest, XGBoost and K-Nearest Neighbour classifiers showed the best performance (AUC: 0.997 ± 0.002, 0.995 ± 0.005 and 0.990 ± 0.007, respectively). Their predictions were averaged in a “soft” voting classifier, and the final model attained an AUC of 0.998 ± 0.001 on an external set, with a small trade-off between training and validation performance. These results indicate a possibly reliable risk prediction model for guiding BC prevention and timely diagnosis. Components of the identified signature might also be potential therapeutic targets for BC. | en_US |
dc.description.sponsorship | Fogarty International Center of the National Institutes of Health (NIH) and Health Professions Education and Training (HEPI-SHSSU) | en_US |
dc.language.iso | en | en_US |
dc.publisher | Makerere University | en_US |
dc.subject | miRNA | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Genes | en_US |
dc.subject | Breast cancer | en_US |
dc.title | Using Gene and miRNA expression profiles in Breast cancer risk prediction : a machine learning-based approach | en_US |
dc.type | Thesis | en_US |