Please use this identifier to cite or link to this item: http://dspace.univ-guelma.dz/jspui/handle/123456789/15004
Full metadata record
dc.contributor.author: Khaled Khodja, Anfel
dc.date.accessioned: 2023-11-23T11:25:44Z
dc.date.available: 2023-11-23T11:25:44Z
dc.date.issued: 2023
dc.identifier.uri: http://dspace.univ-guelma.dz/jspui/handle/123456789/15004
dc.description.abstract: Feature selection is a crucial pre-processing step in machine learning. Its aim is to reduce the feature space, speed up the learning process, and improve the performance of classification algorithms while avoiding overfitting. Statistical methods such as Information Gain (IG), the Chi-squared test (Chi2), and the Improved Gini Index (IGI) have proved effective at finding the most representative attributes in text corpora, with a lower execution time than methods based on information theory. However, these methods can retain a large number of redundant attributes, which can degrade the performance of classification algorithms. In this work, we aim to eliminate this redundancy by measuring the correlation between attributes that have similar or close IG scores. Correlation can be assessed using the mutual information between attributes. Attributes that are strongly related to the target variable (the class) and weakly correlated with the other attributes are thus considered the most informative.
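The redundancy-elimination idea described in the abstract can be sketched as a greedy filter: score each attribute by information gain (its mutual information with the class), then keep an attribute only if its mutual information with every already-kept attribute stays below a threshold. The following is a minimal NumPy sketch, not the thesis's actual procedure: the function names and the 0.5-bit threshold are illustrative assumptions, and the full method restricts the pairwise check to attributes with similar IG scores rather than comparing against all kept attributes.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete vectors."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

def select_non_redundant(X, y, k, redundancy_threshold=0.5):
    """Rank features by IG (MI with the class), then greedily keep a feature
    only if its MI with every already-kept feature is below the threshold."""
    ig = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    selected = []
    for j in np.argsort(ig)[::-1]:        # best IG first
        if all(mutual_information(X[:, j], X[:, s]) < redundancy_threshold
               for s in selected):
            selected.append(j)
        if len(selected) == k:
            break
    return selected

# Example: features 0 and 1 are identical copies of the class label, so
# only one of them survives, alongside the weakly informative feature 2.
X = np.array([[0, 0, 1], [1, 1, 0], [1, 1, 1],
              [0, 0, 0], [1, 1, 1], [0, 0, 0]])
y = np.array([0, 1, 1, 0, 1, 0])
print(select_non_redundant(X, y, k=2))
```

The greedy pass embodies the abstract's criterion directly: high MI with the class ranks a feature early, and high MI with an already-selected feature marks it redundant and drops it.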
dc.language.iso: fr
dc.publisher: University of Guelma
dc.subject: feature selection, mutual information, correlation, redundancy, classification, text
dc.title: Sélection et élimination des attributs redondants pour la classification des gros corpus textuels (Selection and elimination of redundant attributes for the classification of large text corpora)
dc.type: Working Paper
Appears in Collections: Master

Files in This Item:
File: KHALED KHODJA_ANFAL_F5.pdf (8.32 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.