Please use this identifier to cite or link to this item: http://dspace.univ-guelma.dz/jspui/handle/123456789/13427
Title: K-means & K-mers pour le regroupement et la comparaison de grands ensembles de séquences biologiques [K-means and k-mers for clustering and comparing large sets of biological sequences]
Authors: BOUSMAT, YASSINE
Keywords: Biological sequence, Multiple sequence alignment, Clustering, Local search, Machine learning, Metaheuristic.
Issue Date: 2022
Publisher: Université de Guelma
Abstract: Bioinformatics plays a central role in extracting as much information as possible from biological data. Although established methods remain useful, they can no longer keep up with the volume of data produced by ever-growing high-throughput sequencing projects. Sequence clustering is one of the most important areas of bioinformatics. In this work, we focus on clustering sequences to support multiple sequence alignment algorithms as the scale of biological sequence sets grows with demand in computational biology. We present a clustering method based on the K-means algorithm, guided by the k-mers of the sequences to be aligned. We also integrate this method into a multiple alignment strategy to reduce execution time without losing quality. We tested the approach on a multi-core processor using a set of benchmarks from the literature, and compared our results with those produced by the UClust clustering algorithm. The results show that our approach falls short of UClust in computation time, while maintaining accuracy on all tested benchmarks.
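Illustrative sketch: the abstract names the idea (K-means clustering guided by k-mers) but gives no details of the thesis's implementation. The Python below shows one common way such a scheme can look, under assumptions that are not taken from the thesis: a DNA alphabet, k = 3, normalized k-mer frequency vectors, squared Euclidean distance, and random initial centroids. The names kmer_vector, sq_dist, and kmeans are hypothetical.

```python
from itertools import product
import random


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Return the normalized k-mer frequency vector of seq (assumed DNA alphabet)."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0.0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing symbols outside the alphabet
            counts[j] += 1.0
    total = sum(counts) or 1.0  # avoid division by zero for very short sequences
    return [c / total for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd-style K-means; returns one cluster label per input vector."""
    rng = random.Random(seed)
    # Assumption: initial centroids are distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    labels = [-1] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In a pipeline of the kind the abstract describes, each resulting cluster could then be passed to a multiple sequence alignment tool separately. How the thesis combines the per-cluster alignments is not stated in this record.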
URI: http://dspace.univ-guelma.dz/jspui/handle/123456789/13427
Appears in Collections: Master

Files in This Item:
File: BOUSMAT_YASSINE_F5.pdf
Size: 2.41 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.