On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage

Belhajjame, Khalid (2022), On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage, Journal of Data and Information Quality, 14, 1, p. 1–27. 10.1145/3460207

Type
Article accepted for publication or published
Date
2022
Journal name
Journal of Data and Information Quality
Volume
14
Number
1
Publisher
ACM - Association for Computing Machinery
Pages
1–27
Publication identifier
10.1145/3460207
Author(s)
Belhajjame, Khalid
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
Workflows have been adopted in several scientific fields as a tool for the specification and execution of scientific experiments. In addition to automating the execution of experiments, workflow systems often include capabilities to record provenance information, which contains, among other things, the data records used and generated by the workflow as a whole and by its component modules. It is widely recognized that provenance information can be useful for the interpretation, verification, and re-use of workflow results, which justifies sharing and publishing it among scientists. However, workflow executions in some branches of science can manipulate sensitive datasets that contain information about individuals. To address this issue, we investigate in this article the problem of anonymizing the provenance of workflows. In doing so, we consider a popular class of workflows in which each invocation of a component module uses and generates collections of data records, as opposed to a single data record. The solution we propose offers guarantees of confidentiality without compromising lineage information, which provides transparency as to the relationships between the data records used and generated by the workflow modules. We provide algorithmic solutions that show how the provenance of a single module and of an entire workflow can be anonymized, and we present the results of experiments that we conducted to evaluate them.

Related items

Showing items related by title and author.

  • Lineage-Preserving Anonymization of the Provenance of Collection-Based Workflows
    Belhajjame, Khalid (2020) Conference paper
  • Static analysis of Taverna workflows to predict provenance patterns
    Alper, Pinar; Belhajjame, Khalid; Goble, Carole (2017) Article accepted for publication or published
  • LabelFlow: Exploiting Workflow Provenance to Surface Scientific Data Provenance
    Alper, Pinar; Belhajjame, Khalid; Goble, Carole; Karagoz, Pinar (2015) Conference paper
  • UP & DOWN: Improving Provenance Precision by Combining Workflow- and Trace-Level Information
    Dey, Saumen; Belhajjame, Khalid; Koop, David; Song, Tianhong; Missier, Paolo; Ludäscher, Bertram (2014) Conference paper
  • SHARP: Harmonizing and Bridging Cross-Workflow Provenance
    Gaignard, Alban; Belhajjame, Khalid; Skaf-Molli, Hala (2017) Conference paper