Skip to the content.

Workshop on Machine Learning Good Practices

DOME recommendations for better Machine Learning in Computational Biology

virtually co-located with the 21st European Conference on Computational Biology (ECCB 2022)

This discussion-based workshop will focus on (i) Standards for reporting Machine Learning approaches on Life Sciences, particularly the DOME recommendations, (ii) FAIRness for Machine Learning, and (iii) Best practices around Machine Learning.

Description

Large amounts of biological data are continuously created, processed and transformed. Machine Learning (ML) algorithms have become one of the preferred approaches for data processing and understanding given their capacity to deal with large amounts of data. In Life Sciences, ML is used in a variety of fields with the potential of resulting in ground-breaking medical applications. However, there are some issues around reproducibility, transparency, explainability and so on that could, at least partially, be alleviated by consistently applying good practices reflected in good descriptions and documentation wrt the ML model. This workshop/SIG deals with good practices, guidelines and metadata for FAIR, reproducible and transparent ML in Life Sciences. In this first edition, we will focus on the Data, Optimization, Model and Evaluation (DOME) recommendations for supervised ML in biology which aim at facilitating the assessment of the quality and reliability of reported models.

Agenda

Starting at (CEST) Length Activity Participant
13:30 10’ Welcoming and introduction to the workshop Leyla Jael Castro
13:40 15’ Standards for reporting ML in LS Fotis Psomopoulos
13:55 15’ FAIRness for ML Daniel S. Katz
14:10 05’ Comfort break All
14:15 15’ Challenges for AI in life sciences Macha Nikolski
14:30 15’ Invited talk Purvesh Khatri
14:45 15’ Invited talk Chas Nelson
15:00 05’ Comfort break All
15:05 10’ Overview on challenges and opportunities (input gathered from participants) Leyla Jael Castro
15:15 60’ Panel discussion on challenges and opportunities Invited speakers and panelists
16:15 05’ Comfort break  
16:20 10’ Wrap-up Leyla Jael Castro
16:30 30’ Coffee break and finding the next workshop location All

Invited speakers

Fotis Psomopoulos
Fotis Psomopoulos
Principal Investigator at the Institute of Applied Biosciences, Centre for Research and Technology Hellas (CERTH). His research interests lay across three major pillars; (1) Bioinformatics (Phylogenetic profiles and RNA-Seq data analysis) (2) data mining methods and big data techniques for Life Sciences data, and (3) Grid / Cloud Computing, and EGI resources in particular.
Daniel S. Katz
Daniel S. Katz
Daniel S. Katz is Chief Scientist at the National Center for Supercomputing Applications (NCSA), Research Associate Professor in Computer Science (CS), Research Associate Professor in Electrical and Computer Engineering (ECE), Research Associate Professor in the School of Information Sciences (iSchool), and Faculty Affiliate in Computational Science and Engineering (CSE) at the University of Illinois Urbana-Champaign. He is also a Better Scientific Software (BSSw) Fellow and Guest Faculty at Argonne National Laboratory. His research interests are in applications, algorithms, fault tolerance, and programming in parallel and distributed computing, and policy issues, including citation and credit mechanisms and practices associated with software and data, organization and community practices for collaboration, and career paths for computing researchers. He co-founded the Journal of Open Source Software, the US RSE Association, and the Research Software Alliance (ReSA), and co-leads the FORCE11 Software Citation Implementation Working Group and the FORCE11/RDA/ReSA Fair for Research Software group.
Macha Nikolski
Macha Nikolski
Macha Nikolski is a Senior Research Scientist at the National Center for Scientific Research (CNRS), head of Computation Biology and Bioinformatics, head of Bordeaux Bioinformatics Center. She is also the co-lead of ELIXIR Cancer Focus Group. Her research interests are in the analysis of high-throughput omics and imaging data to elucidate the phenotype of interest. Methodological solutions are often a mix of algorithmics, machine and deep learning
Purvesh Khatri
Purvesh Khatri
Associate Pofessor of Medicine and of Biomedical Data Science at Santford University - KathriLab
Chas Nelson
Chas Nelson
Dr Chas Nelson is Founder & CTO of gliff.ai a company whose mission it is solve AI challenges to enable real-world solutions. gliff.ai offers software solutions that let teamsn come together to embed knowledge into high-quality, auditable medical imaging datasets for building world-changing and trustworthy AI

Organizing Committee

This discussion-focused workshop is organized by the Machine Learning Focus Group, part of the ELIXIR Tools Platform.

Name Affiliation Short description
Jennifer Harrow ELIXIR Hub Tools Platform Coordinator at the ELIXIR Hub, which involves engaging with the ELIXIR community around strategic planning for benchmarking and tools registry software, the Galaxy community to provide workflows and also developing best practices for the developer community in Europe. She has co-organized multiple workshops (e.g., FAIReScience 2021) and hackathons on life sciences (e.g., BioHackathon Europe 2018 to 2022).
Fotis E. Psomopoulos Institute of Applied Biosciences, Centre for Research and Technology Hellas Principal Investigator at the Institute of Applied Biosciences/Centre for Research and Technology Hellas. He has co-organized multiple workshops (e.g., FAIReScience 2021, RDA FAIR principles for Machine Learning 2021).
Silvio Tosatto Università degli Studi di Padova Professor in Bioinformatics and Principal Investigator at the BioComputing UP lab of the Dept. of Biomedical Sciences, Università degli Studi di Padova.
Leyla Jael G. Castro ZB MED Information Centre for Life Sciences, NFDI4DataScience Semantic retrieval team leader at ZB MED Information Centre for Life Sciences in Cologne, Germany. She has participated in initiatives related to FAIRness for research software and recommendations for research open software. She has co-organized multiple workshops (e.g., FAIReScience 2021, DaMaLOS 2020 to 2022) and hackathons on life sciences (e.g., BioHackathon Europe 2018 to 2022).

ELIXIR Europe        INAB    CERTH        Uni Padova

NFDI4DS    ZB MED