Trends of Evolutionary Machine Learning to Address Big Data Mining

Ben Hamida, Sana; Benjelloun, Ghita; Hmida, Hmida (2021), Trends of Evolutionary Machine Learning to Address Big Data Mining, in Inès Saad; Camille Rosenthal-Sabroux; Faiez Gargouri; Pierre-Emmanuel Arduin, Information and Knowledge Systems. Digital Technologies, Artificial Intelligence and Decision Making, Springer International Publishing : Berlin Heidelberg, p. 85-99. 10.1007/978-3-030-85977-0_7

Type
Conference paper
External document link
https://hal.science/hal-03363083v1
Date
2021
Conference title
5th International Conference, ICIKS 2021
Conference date
2021-06
Conference city
Virtual event
Conference country
France
Book title
Information and Knowledge Systems. Digital Technologies, Artificial Intelligence and Decision Making
Book author
Inès Saad; Camille Rosenthal-Sabroux; Faiez Gargouri; Pierre-Emmanuel Arduin
Publisher
Springer International Publishing
Published in
Berlin Heidelberg
ISBN
978-3-030-85976-3
Number of pages
185
Pages
85-99
Publication identifier
10.1007/978-3-030-85977-0_7
Author(s)
Ben Hamida, Sana
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Benjelloun, Ghita
PSL Research University, UMR3306
Hmida, Hmida
Université de Tunis El Manar [UTM]
Abstract (EN)
Improving decisions by better mining the data available in an information system is a common goal in many decision-making environments. However, the complexity and large size of the data collected in modern systems make this goal a challenge for mining methods. Evolutionary Data Mining Algorithms (EDMA), such as Genetic Programming (GP), are powerful meta-heuristics with empirically proven efficiency on complex machine learning problems. They are expected to be applied to real-world big data tasks and everyday applications. Thus, like all machine learning techniques, they need to scale to big data sets. This paper reviews some solutions that could help EDMA deal with Big Data challenges. Two solutions are then selected and explained. The first is an algorithmic manipulation that introduces the active learning paradigm through active data sampling. The second is a processing manipulation that achieves horizontal scaling by distributing the processing over networked nodes. This work explains how each solution is introduced into GP. In preliminary experiments, the extended GP is applied to two complex machine learning problems: the Higgs Boson classification problem and the Pulsar detection problem. Experimental results are then discussed and compared to assess the efficiency of each solution.
Subjects / Keywords
Big data mining; Genetic Programming; Data sampling; Apache Spark; Horizontal parallelization; Active Learning
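
Illustrative sketch (not the authors' implementation)
The abstract names active data sampling as one way to let an evolutionary learner cope with large training sets. The Python sketch below is only a rough illustration of the general idea, under stated assumptions: it uses plain random sampling renewed at every generation, and a toy population of numeric thresholds rather than GP trees. The function names (draw_sample, accuracy, evolve_thresholds) and all parameter values are hypothetical and do not come from the paper.

```python
import random


def draw_sample(data, size, rng):
    """Return a fresh random subset of the training data (without replacement)."""
    return rng.sample(data, min(size, len(data)))


def accuracy(classifier, sample):
    """Fraction of (x, label) pairs in the sample that the classifier gets right."""
    return sum(classifier(x) == label for x, label in sample) / len(sample)


def evolve_thresholds(data, generations=20, pop_size=10, sample_size=50, seed=0):
    """Toy evolutionary loop: each individual is a threshold t predicting x > t.

    Fitness is measured on a small sample that is redrawn every generation,
    so no generation ever scans the full data set.
    """
    rng = random.Random(seed)
    population = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        sample = draw_sample(data, sample_size, rng)  # renewed every generation
        ranked = sorted(
            population,
            key=lambda t: accuracy(lambda x: x > t, sample),
            reverse=True,
        )
        parents = ranked[: pop_size // 2]
        # Children are mutated copies of the surviving parents, clipped to [0, 1].
        children = [min(1.0, max(0.0, p + rng.gauss(0.0, 0.05))) for p in parents]
        population = parents + children
    # The full data set is scanned only once, to pick the final individual.
    return max(population, key=lambda t: accuracy(lambda x: x > t, data))


if __name__ == "__main__":
    # Synthetic data, assumed for the example: the true boundary is x > 0.5.
    points = [(i / 1000, i / 1000 > 0.5) for i in range(1000)]
    best = evolve_thresholds(points)
    print(best, accuracy(lambda x: x > best, points))
```

The design point is that each generation's cost depends on sample_size rather than on the size of the full data set. The horizontal scaling mentioned in the abstract is a separate concern and is not covered by this sketch.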

Related items

Showing items related by title and author.

  • Hierarchical Data Topology Based Selection for Large Scale Learning
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2016) Conference paper
  • Tuning Active Sampling Techniques for Evolutionary Learner from Big Data Sets: Review and Discussion
    Ben Hamida, Sana; Rukoz, Marta (2016) Conference paper
  • Scale Genetic Programming for large Data Sets: Case of Higgs Bosons Classification
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2018) Article accepted for publication or published
  • Bi-objective CSO for Big Data Scientific Workflows scheduling in the Cloud: case of LIGO workflow
    Bousselmin, K.; Ben Hamida, Sana; Rukoz, Marta (2020) Conference paper
  • Adaptive sampling for active learning with genetic programming
    Ben Hamida, Sana; Hmida, Hmida; Borgi, Amel; Rukoz, Marta (2019) Article accepted for publication or published