
Perspectives on automated composition of workflows in the life sciences

Anna-Lena Lamprecht 1,*; Magnus Palmblad 2,*; Jon Ison 3,*; Veit Schwämmle 4,*; Mohammad Sadnan Al Manir 5; Ilkay Altintas 6; Christopher Baker 7; Ammar Ben Hadj Amor 8; Salvador Capella-Gutierrez 9; Paulos Charonyktakis 10; Michael Crusoe 11; Yolanda Gil 12; Carole Goble 13,14; Timothy Griffin 15; Paul Groth 16; Hans Ienasescu 17; Pratik Jagtap 15; Matúš Kalaš 18; Vedran Kasalica 1; Alireza Khanteymoori 19; Tobias Kuhn 11; Hailiang Mei 2; Hervé Ménager 20; Steffen Möller 21; Robin Richardson 22; Vincent Robert 8; Stian Soiland-Reyes 13,16; Robert Stevens 13; Szoke Szaniszlo 8; Suzan Verberne 23; Aswin Verhoeven 2; Katherine Wolstencroft 23
Abstract: Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have brought the long-standing vision of automated workflow composition back into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the “big picture” of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
Document type: Journal articles

https://hal-pasteur.archives-ouvertes.fr/pasteur-03680115
Contributor: Hervé Ménager
Submitted on: Friday, May 27, 2022 - 2:52:03 PM
Last modification on: Thursday, September 1, 2022 - 4:00:22 AM
Long-term archiving on: Tuesday, August 30, 2022 - 10:08:22 AM

File

f1000research-10-57615.pdf
Publication funded by an institution

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle, Mohammad Sadnan Al Manir, et al. Perspectives on automated composition of workflows in the life sciences. F1000Research, Faculty of 1000, 2021, 10, pp.897. ⟨10.12688/f1000research.54159.1⟩. ⟨pasteur-03680115⟩
