Graph sketching-based Space-efficient Data Clustering

Morvan, Anne; Choromanski, Krzysztof; Gouy-Pailler, Cedric; Atif, Jamal (2018), Graph sketching-based Space-efficient Data Clustering, in Ester, Martin; Pedreschi, Dino, Proceedings of the 2018 SIAM International Conference on Data Mining, SIAM - Society for Industrial and Applied Mathematics : Philadelphia, p. 10-18. 10.1137/1.9781611975321.2

Type
Conference paper
Date
2018
Conference title
2018 SIAM International Conference on Data Mining
Conference date
2018-05
Conference city
San Diego
Conference country
United States
Book title
Proceedings of the 2018 SIAM International Conference on Data Mining
Book author
Ester, Martin; Pedreschi, Dino
Publisher
SIAM - Society for Industrial and Applied Mathematics
Published in
Philadelphia
ISBN
978-1-61197-532-1
Number of pages
764
Pages
10-18
Publication identifier
10.1137/1.9781611975321.2
Author(s)
Morvan, Anne

Choromanski, Krzysztof

Gouy-Pailler, Cedric

Atif, Jamal
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets under tight space constraints, as is the case in many real-world applications where analysis algorithms are deployed directly on resource-limited mobile devices collecting the data. We present DBMSTClu, a new space-efficient, density-based, non-parametric method working on a Minimum Spanning Tree (MST) recovered from a limited number of linear measurements, i.e. a sketched version of the dissimilarity graph between the N objects to cluster. Unlike the k-means, k-medians or k-medoids algorithms, it does not fail at distinguishing clusters with particular shapes, thanks to the ability of the MST to express the underlying structure of a graph. No input parameter is needed, in contrast to DBSCAN or the Spectral Clustering method. An approximate MST is retrieved by following the dynamic semi-streaming model: the dissimilarity graph is handled as a stream of edge weight updates, which is sketched in one pass over the data into a compact structure requiring O(N polylog(N)) space, far better than the theoretical memory cost O(N^2) of the full dissimilarity graph. Taking the recovered approximate MST as input, DBMSTClu then successfully detects the right number of nonconvex clusters by performing relevant cuts on it in time linear in N. We provide theoretical guarantees on the quality of the clustering partition and also demonstrate its advantage over the existing state of the art on several datasets.
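The general idea of clustering by cutting MST edges can be illustrated with a minimal sketch. It is not DBMSTClu itself, and it makes several simplifying assumptions: the MST is computed exactly with Prim's algorithm on the full Euclidean dissimilarity graph (no sketching, so it uses O(N^2) time and does not reproduce the space savings described above), and the number of cuts is supplied by hand through the hypothetical parameter `n_cuts`, whereas the abstract states that DBMSTClu needs no input parameter. The function names `mst_edges` and `cut_clusters` are illustrative only.

```python
import math

def mst_edges(points):
    """Exact MST via Prim's algorithm on the complete Euclidean graph.

    Returns N-1 edges as (weight, u, v) tuples. This is a stand-in for
    the approximate, sketch-based MST described in the abstract.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known link from each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def cut_clusters(n, edges, n_cuts):
    """Remove the n_cuts heaviest MST edges and label connected components.

    Assumption: cutting the heaviest edges is a simple heuristic chosen
    for illustration, not the cut criterion used by DBMSTClu.
    """
    kept = sorted(edges)[: len(edges) - n_cuts]
    root = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]

# Two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cut_clusters(len(points), mst_edges(points), n_cuts=1)
print(labels)
```

Removing one edge from a tree always splits it into exactly two components, so each cut adds one cluster. This is why a method operating on the MST only has to decide which edges to cut rather than search over arbitrary partitions.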
Subjects / Keywords
space constraints; resources-limited mobile devices; DBMSTClu; clustering partition; Spectral Clustering method; data cluster

Related items

Showing items related by title and author.

  • Graph-based Clustering under Differential Privacy
    Pinot, Rafael; Morvan, Anne; Yger, Florian; Gouy-Pailler, Cédric; Atif, Jamal (2018) Conference paper
  • Structured adaptive and random spinners for fast machine learning computations
    Bojarski, Mariusz; Choromanska, Anna; Choromanski, Krzysztof; Fagan, Francois; Gouy-Pailler, Cédric; Morvan, Anne; Sakr, Nourhan; Sarlos, Tamas; Atif, Jamal (2017) Conference paper
  • Multi-dimensional signal approximation with sparse structured priors using split Bregman iterations
    Isaac, Yoann; Barthélemy, Quentin; Gouy-Pailler, Cédric; Sebag, Michèle; Atif, Jamal (2017) Article accepted for publication or published
  • Theoretical evidence for adversarial robustness through randomization
    Pinot, Rafaël; Meunier, Laurent; Araújo, Alexandre; Kashima, Hisashi; Yger, Florian; Gouy-Pailler, Cedric; Atif, Jamal (2019) Conference paper
  • A unified view on differential privacy and robustness to adversarial examples
    Pinot, Rafaël; Yger, Florian; Gouy-Pailler, Cedric; Atif, Jamal (2019) Conference paper