Conference papers

Sparse Multiple Correspondence Analysis

Abstract : Multiple Correspondence Analysis (MCA) is the method of choice for the multivariate analysis of categorical data. In MCA each qualitative variable is represented by a group of binary variables (with a coding scheme called “complete disjunctive coding”) and each binary variable has a weight inversely proportional to its frequency. The data matrix concatenates all these binary variables, and once normalized and centered this data matrix is analyzed with a generalized singular value decomposition (GSVD) that incorporates the variable weights as constraints (or “metric”). The GSVD is, of course, based on the plain SVD and so MCA can be sparsified by extending algorithms designed to sparsify the SVD. To do so requires two additional features: to include weights and to be able to sparsify entire groups of variables at once. Another important feature of such a sparsification should be to preserve the orthogonality of the components. Here, we integrate all these constraints by using an exact projection scheme onto the intersection of subspaces (i.e., balls) where each ball represents a specific type of constraints. We illustrate our procedure with the data from a questionnaire survey on the perception of cheese in two French cities.
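The non-sparse pipeline summarized in the abstract (complete disjunctive coding, weights inversely proportional to category frequency, centering, then an SVD in the weighted metric) can be sketched as follows. This is a minimal illustration of plain MCA only, not the authors' sparse algorithm: it includes no sparsity or group constraints and no projection step. The function names `disjunctive_coding` and `mca` and the toy data are assumptions made for the example.

```python
import numpy as np

def disjunctive_coding(data):
    """Complete disjunctive coding: one binary column per category level."""
    blocks = []
    for j in range(data.shape[1]):
        levels = np.unique(data[:, j])
        # Compare each observation with each level of variable j.
        blocks.append((data[:, j][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def mca(data):
    """Plain MCA as the SVD of the weighted, centered indicator matrix."""
    X = disjunctive_coding(data)
    Z = X / X.sum()                # normalize so that all entries sum to 1
    r = Z.sum(axis=1)              # row masses
    c = Z.sum(axis=0)              # column masses (proportional to frequency)
    # Center, then rescale by 1/sqrt(mass): this turns the GSVD with
    # inverse-mass metrics into an ordinary SVD.
    S = (Z - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * s) / np.sqrt(r)[:, None]      # row (observation) factor scores
    G = (Vt.T * s) / np.sqrt(c)[:, None]   # column (category) factor scores
    return X, r, s, F, G

# Hypothetical toy data: 6 respondents, 2 categorical variables.
toy = np.array([["a", "x"], ["a", "y"], ["b", "x"],
                ["b", "y"], ["a", "x"], ["b", "y"]])
X, r, s, F, G = mca(toy)
```

Dividing by the square roots of the masses is what gives rare categories a larger weight, which matches the inverse-frequency weighting described in the abstract. A sparse variant would add constraints on the singular vectors at the SVD step.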
Document type :
Conference papers
Contributor : Arnaud Gloaguen
Submitted on : Thursday, December 3, 2020 - 9:28:01 AM
Last modification on : Thursday, December 10, 2020 - 3:41:28 AM




  • HAL Id : pasteur-03037346, version 1


Vincent Guillemot, Julie Le Borgne, Arnaud Gloaguen, Arthur Tenenhaus, Gilbert Saporta, et al. Sparse Multiple Correspondence Analysis. 52èmes Journées de Statistique, May 2020, Nice, France. ⟨pasteur-03037346⟩


