hal.structure.identifierInria Saclay - Ile de France
dc.contributor.authorCamacho-Rodríguez, Jesús*
hal.structure.identifierLaboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
dc.contributor.authorColazzo, Dario*
hal.structure.identifierInria Saclay - Ile de France
dc.contributor.authorHerschel, Melanie*
hal.structure.identifierInria Saclay - Ile de France
dc.contributor.authorManolescu, Ioana
HAL ID: 742652
ORCID: 0000-0002-0425-2462
*
hal.structure.identifierInria Saclay - Ile de France
dc.contributor.authorRoy Chowdhury, Soudip*
dc.date.accessioned2017-08-29T12:51:23Z
dc.date.available2017-08-29T12:51:23Z
dc.date.issued2014
dc.identifier.urihttps://basepub.dauphine.fr/handle/123456789/16648
dc.language.isoenen
dc.subjectexperimentsen
dc.subjectPigLatinen
dc.subjectreuse-based optimizationen
dc.subjectoptimizationen
dc.subject.ddc005.7en
dc.titleReuse-based Optimization for Pig Latinen
dc.typeCommunication / Conférence
dc.description.abstractenPig Latin has become a popular language within the data management community interested in the efficient parallel processing of large data volumes. The dataflow-style primitives of Pig Latin provide an intuitive way for users to write complex analytical queries, which are in turn compiled into MapReduce jobs. Currently, subexpressions occurring repeatedly in Pig Latin scripts are executed as many times as they occur, leading to avoidable MapReduce jobs. The current Pig Latin optimizer is not capable of recognizing, and thus optimizing, such repeated subexpressions. We present a novel approach for identifying and reusing common subexpressions occurring in Pig Latin scripts. In particular, we lay the foundation of our reuse-based algorithms by formalizing the semantics of the Pig Latin query language with extended nested relational algebra for bags. Our algorithm, named PigReuse, operates on the algebraic representations of Pig Latin scripts, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and merges other equivalent expressions to share its result. Our experimental results demonstrate the efficiency and effectiveness of our reuse-based algorithms and optimization strategies.en
dc.identifier.urlsitehttps://hal.inria.fr/hal-01086497en
dc.subject.ddclabelOrganisation des donnéesen
dc.relation.conftitleBDA'2014: 30e journées Bases de Données Avancéesen
dc.relation.confdate2014-10
dc.relation.confcityGrenoble-Autransen
dc.relation.confcountryFranceen
dc.relation.forthcomingnonen
dc.description.ssrncandidatenonen
dc.description.halcandidatenonen
dc.description.readershiprechercheen
dc.description.audienceInternationalen
dc.relation.Isversionofjnlpeerreviewednonen
dc.date.updated2017-08-29T12:16:28Z
hal.author.functionaut
hal.author.functionaut
hal.author.functionaut
hal.author.functionaut
hal.author.functionaut

