Reuse-based Optimization for Pig Latin

Camacho-Rodríguez, Jesús; Colazzo, Dario; Herschel, Melanie; Manolescu, Ioana; Roy Chowdhury, Soudip (2014), Reuse-based Optimization for Pig Latin, BDA'2014: 30e journées Bases de Données Avancées, 2014-10, Grenoble-Autrans, France

Type
Conference paper
External document link
https://hal.inria.fr/hal-01086497
Date
2014
Conference title
BDA'2014: 30e journées Bases de Données Avancées
Conference date
2014-10
Conference city
Grenoble-Autrans
Conference country
France
Author(s)
Camacho-Rodríguez, Jesús
Inria Saclay - Ile de France
Colazzo, Dario
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Herschel, Melanie
Inria Saclay - Ile de France
Manolescu, Ioana
Inria Saclay - Ile de France
Roy Chowdhury, Soudip
Inria Saclay - Ile de France
Abstract (EN)
Pig Latin has become a popular language within the data management community interested in the efficient parallel processing of large data volumes. The dataflow-style primitives of Pig Latin provide an intuitive way for users to write complex analytical queries, which are in turn compiled into MapReduce jobs. Currently, subexpressions occurring repeatedly in Pig Latin scripts are executed as many times as they occur, leading to avoidable MapReduce jobs. The current Pig Latin optimizer is not capable of recognizing, and thus optimizing, such repeated subexpressions. We present a novel approach for identifying and reusing common subexpressions occurring in Pig Latin scripts. In particular, we lay the foundation of our reuse-based algorithms by formalizing the semantics of the Pig Latin query language with extended nested relational algebra for bags. Our algorithm, named PigReuse, operates on the algebraic representations of Pig Latin scripts, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and merges other equivalent expressions to share their results. Our experimental results demonstrate the efficiency and effectiveness of our reuse-based algorithms and optimization strategies.
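The idea of spotting repeated subexpressions can be illustrated with a minimal sketch. This is not PigReuse and not the algebra formalized in the paper: the tuple encoding of operators, the function names, and the sample file name are all hypothetical, and the cost-based selection step is left out. It only shows one simple way to find sub-trees that occur more than once across several query plans.

```python
from collections import Counter

# Hypothetical operator trees: (operator, argument, *children).
# Illustrative encoding only; it is not the algebra used in the paper.
def subexpressions(expr):
    """Yield expr and every nested sub-tree (children start at index 2)."""
    yield expr
    for child in expr[2:]:
        yield from subexpressions(child)

def shared_subexpressions(scripts):
    """Return the sub-trees that occur more than once across all scripts."""
    counts = Counter(sub for script in scripts for sub in subexpressions(script))
    return {sub: n for sub, n in counts.items() if n > 1}

# Two toy plans that both filter the same (made-up) input.
load = ("LOAD", "visits.csv")
filtered = ("FILTER", "year == 2014", load)
q1 = ("GROUP", "user", filtered)
q2 = ("DISTINCT", "", filtered)

shared = shared_subexpressions([q1, q2])
for sub, n in shared.items():
    print(n, sub[0], sub[1])
```

Here the LOAD and FILTER sub-trees are reported as shared, so a reuse-aware planner could evaluate them once and feed the result to both plans. Structural equality of tuples stands in for the equivalence test; a real optimizer would need a semantic notion of equivalence.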
Subjects / Keywords
experiments; PigLatin; reuse-based optimization; optimization

Related items

Showing items related by title and author.

  • Reuse-based Optimization for Pig Latin
    Camacho-Rodríguez, Jesús; Colazzo, Dario; Herschel, Melanie; Manolescu, Ioana; Chowdhury, Soudip Roy (2016) Conference paper
  • PAXQuery: A Massively Parallel XQuery Processor
    Camacho-Rodríguez, Jesús; Colazzo, Dario; Manolescu, Ioana (2014) Conference paper
  • PAXQuery: Parallel Analytical XML Processing
    Camacho-Rodríguez, Jesús; Colazzo, Dario; Manolescu, Ioana; Naranjo, Juan A. M. (2015) Conference paper
  • PAXQuery: Efficient Parallel Processing of Complex XQuery
    Camacho-Rodríguez, Jesús; Colazzo, Dario; Manolescu, Ioana (2015) Article accepted for publication or published
  • PAXQuery: Efficient Parallel Processing of Complex XQuery
    Camacho-Rodríguez, Jesús; Colazzo, Dario; Manolescu, Ioana (2014) Conference paper