Proposal without an assigned student

Generated on 2024-05-07 21:21:50 (Europe/Lisbon).

Internship Title

MDRL - Multi-Objective Deep Reinforcement Learning in Drug Discovery

Areas of Specialization

Intelligent Systems

Internship Location

Laboratory of Artificial Neural Networks (LARN)

Background

The drug design process is lengthy and demands substantial investment. It can be made more efficient through de novo drug design, a computer-aided technique that builds novel chemical molecules with desired pharmacological properties from scratch. However, the resulting molecules are often not feasible to synthesize in the laboratory. One reason is that multiple pharmaceutically relevant parameters are not properly optimized, since only the main functional objective is taken into account.
This proposal puts forward the use of state-of-the-art Deep Learning methods to develop a computational model that accurately generates novel molecules that not only have the predicted activity against a target but also satisfy multiple pharmacological objectives (DEEP-MORL). Multi-objective reinforcement learning (MORL) extends conventional single-objective reinforcement learning (RL) methods to handle two or more objectives simultaneously. Recent applications of RL for tuning generative methods have shown promising results. However, in tasks such as molecule generation, several different properties need to be optimized at the same time (e.g. diversity, drug-likeness, and synthesizability). In addition, these properties often conflict, such as molecular diversity versus drug-likeness, or potency against an intended target versus the absence of toxic effects. For this reason, a MORL approach can be applied more effectively than single-objective RL to fine-tune a pre-trained generator so that it builds molecules with the desired biological and physicochemical characteristics. The innovation of this proposal lies in combining a deep generative method with multi-objective reinforcement learning.
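To make the trade-off between conflicting objectives concrete, the following minimal sketch shows two common ways of treating several property scores at once: a weighted-sum (scalarized) reward and a Pareto-dominance filter. The objective names, the weights, and the assumption that every score lies in [0, 1] are illustrative choices, not part of the proposal, and the proposal does not commit to either strategy.

```python
from typing import Dict, List

# Hypothetical objective names and weights, chosen only for illustration.
WEIGHTS: Dict[str, float] = {
    "activity": 0.5,
    "drug_likeness": 0.3,
    "synthesizability": 0.2,
}


def scalarized_reward(scores: Dict[str, float],
                      weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-objective scores (each assumed to lie in [0, 1])."""
    return sum(weights[name] * scores[name] for name in weights)


def dominates(a: Dict[str, float], b: Dict[str, float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(candidates: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]
```

A weighted sum collapses the objectives into a single number that a standard RL algorithm can maximize, at the cost of fixing the trade-off in advance. The Pareto filter keeps every non-dominated molecule and leaves the choice among trade-offs open.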

Objective

The main objective of this proposal is to develop a deep generative method combined with multi-objective reinforcement learning and to analyse it on a real benchmark database containing molecules and measured biological activity data.
The specific goals of this proposal are:
(i) Construct a data set for the MORL model;
(ii) Perform data pre-processing, normalization and scaling;
(iii) Select appropriate ML algorithms for building the MORL model;
(iv) Perform sampling and model evaluation, and validate the overall model with real data;
(v) Integrate the implemented components into a reusable platform.
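As an illustration of the normalization and scaling step in goal (ii), the sketch below shows min-max scaling and z-score standardization of a list of measured activity values. The function names are hypothetical and the proposal does not specify which scaling scheme will be used; this is only one plausible pre-processing choice.

```python
import statistics
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values: List[float]) -> List[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

Bringing the objectives onto a comparable scale matters in a multi-objective setting, because otherwise the property with the largest numeric range can dominate any combined reward.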

Work Plan - Semester 1

• Overview of drug discovery, including target identification, lead discovery, and lead optimization;
• Overview of machine learning techniques, namely Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and multi-objective reinforcement learning;
• Propose an initial workflow combining a deep generative model with multi-objective reinforcement learning, and prepare the first case study;
• Prepare the intermediate report.

Work Plan - Semester 2

• Select and pre-process a collection of large datasets for experiments;
• Study and select Machine Learning (ML) algorithms for building the generative model that creates valid novel drug molecules (as SMILES strings);
• Study and select feature selection algorithms and multi-objective reinforcement learning models for generating chemically feasible SMILES strings;
• Analyse experimental results: e.g., study parameter values; compare performance of the reduced datasets vs. previous results, etc.;
• Prepare a research paper and the final version of the thesis.
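Since the generative model is expected to emit molecules as SMILES strings, a first practical step is to split a string into tokens for a sequence model and to reject outputs that are obviously malformed. The sketch below is a simplified illustration under stated assumptions: the regular expression covers only bracket atoms, the two-letter halogens Cl and Br, and two-digit ring labels, and the check is purely syntactic. It does not establish chemical validity, for which a cheminformatics toolkit would be needed.

```python
import re
from typing import List

# Simplified token pattern (an assumption, not a full SMILES grammar):
# bracket atoms, Br/Cl, two-digit ring labels such as %12, then any single character.
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens suitable for a sequence model."""
    return _TOKEN.findall(smiles)


def looks_well_formed(smiles: str) -> bool:
    """Syntactic sanity check: balanced parentheses and paired ring-closure labels."""
    depth = 0
    open_rings = set()
    for tok in tokenize(smiles):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:  # a branch was closed before it was opened
                return False
        elif tok.isdigit() or tok.startswith("%"):
            open_rings ^= {tok}  # first occurrence opens the ring, second closes it
    return depth == 0 and not open_rings
```

For example, `tokenize("CC(=O)Cl")` keeps `Cl` as a single token instead of splitting it into `C` and `l`, which is why the two-letter halogens appear before the catch-all in the pattern.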

Conditions

This work will be carried out in the Laboratory of Artificial Neural Networks (LARN) of CISUC, with regular supervision and feedback from the supervisors.
Familiarity with machine learning and data mining algorithms and software tools is essential. Participating students will acquire valuable knowledge and experience in model building and data science by mining massive datasets. These skills are currently in high demand among technology employers because of their relevance to a wide range of applications.

Supervision:
Joel P. Arrais (jpa@dei.uc.pt)
Maryam Abbasi (maryam@dei.uc.pt)

Supervisor

Joel Perdiz Arrais
jpa@dei.uc.pt