Bioinformatics Tutorials

Applications of Machine Learning in Bioinformatics

Drug Repositioning using SVM

  • Drug Repurposing concept
    • Drug Discovery
  • Benefits
    • Less risky
    • Faster
    • Cheaper
    • Creates opportunities to treat rare, acute, and neglected diseases
  • Methods
    • Gene expression before and after drug treatment (Connectivity Map (CMap) dataset)
    • Same target, same drug (Guilt by Association (GBA))
  • Main idea
    • Drug-drug and disease-disease similarity => drug-disease relationship
  • Drug-drug similarity
  • Disease-disease similarity
  • Drug-disease interaction prediction
    • Gold standard for training the SVM
  • Evaluation
    • Leave-one-drug-out cross-validation
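
The evaluation scheme above can be sketched as follows. This is a minimal illustration rather than the speaker's implementation: it assumes the gold standard is a list of (drug, disease, label) triples, and the function name `leave_one_drug_out` and the toy drug and disease names are hypothetical. Each fold holds out every pair that involves one drug, so the classifier is always tested on a drug it has never seen during training.

```python
def leave_one_drug_out(pairs):
    """Yield (held_out_drug, train_pairs, test_pairs) for each drug.

    `pairs` is a list of (drug, disease, label) triples (assumed format).
    All pairs involving the held-out drug go to the test set.
    """
    drugs = sorted({drug for drug, _, _ in pairs})
    for held_out in drugs:
        train = [p for p in pairs if p[0] != held_out]
        test = [p for p in pairs if p[0] == held_out]
        yield held_out, train, test


# Toy gold standard (hypothetical data, for illustration only).
toy_pairs = [
    ("drugA", "disease1", 1),
    ("drugA", "disease2", 0),
    ("drugB", "disease1", 0),
    ("drugC", "disease2", 1),
]

for drug, train, test in leave_one_drug_out(toy_pairs):
    # An SVM would be trained on `train` and scored on `test` here.
    print(drug, len(train), len(test))
```

Splitting by drug rather than by individual pair avoids leaking information about the held-out drug through its other known associations.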

Speaker: Dr. Fatemeh Zare-Mirakabad

The Application of SVM in Protein

  • Secondary structure prediction
    • Input: a sequence over the 20 amino acids
    • Output: secondary structure of each amino acid (Helix (H), Strand (E), Coil (C))
    • Challenge: how to encode the input for the SVM
      • Evolutionary information: multiple sequence alignment with a database
    • Several binary SVMs
      • one-versus-rest (H/~H, E/~E, C/~C)
      • one-versus-one (C/H, C/E, H/E)
  • Fold recognition
    • Input: a segment of a protein sequence
    • Output: fold type of that segment (one of 27 SCOP folds)
    • Challenge: how to encode the input for the SVM
      • Physical and chemical properties: amino acid composition, predicted secondary structure, hydrophobicity, polarity, ...
  • Cleavage site identification
    • Input: protein sequence
    • Output: for every position in the protein, whether it is a cleavage site or not
  • RNA-binding proteins
    • Input: protein sequence
    • Output: whether the protein binds to RNA or not
      • Positive examples (proteins that bind to RNA): UniProt
      • Negative examples (proteins that do not bind to RNA): PDB
    • Challenge: how to encode the input for the SVM
      • Physical and chemical properties: hydrophobicity, polarity, ...
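
The recurring challenge above, turning a protein sequence into a fixed-length numeric input, can be sketched as follows. This is a simplified illustration, not the method from the talk: it uses a plain one-hot sliding window in place of the alignment-derived evolutionary information, and the names `window_features` and `one_vs_rest_labels`, as well as the window size, are assumptions. The second function builds the +1/-1 targets for one of the one-versus-rest binary SVMs (for example H versus not-H).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_features(sequence, position, half_width=2):
    """One-hot encode the residues in a window centred on `position`.

    Positions that fall outside the sequence are encoded as all zeros,
    so every residue gets a vector of length (2*half_width + 1) * 20.
    """
    features = []
    for i in range(position - half_width, position + half_width + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= i < len(sequence):
            idx = AMINO_ACIDS.find(sequence[i])
            if idx >= 0:  # unknown residues stay all-zero
                one_hot[idx] = 1
        features.extend(one_hot)
    return features


def one_vs_rest_labels(structure, positive):
    """Map a structure string (H, E, C per residue) to +1/-1 labels."""
    return [1 if state == positive else -1 for state in structure]


# Toy example (hypothetical sequence and annotation).
seq, ss = "ACDE", "HHEC"
X = [window_features(seq, i, half_width=1) for i in range(len(seq))]
y_helix = one_vs_rest_labels(ss, "H")
```

Each row of `X` paired with the matching entry of `y_helix` would form one training example for the helix-versus-rest SVM; the strand and coil classifiers reuse the same `X` with different labels.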

Speaker: Dr. Fatemeh Zare-Mirakabad

The Application of SVM in Gene expression

  • Input: the gene expression level of each gene for normal and cancer samples
  • Output: whether a new sample is cancerous, determined by identifying the genes involved in the disease
  • Gene expression levels are measured with microarrays
  • Challenge of microarray data: the number of features (genes) is greater than the number of samples
    • Select genes that relate to the disease -> Mutual Information \[I(X;Y) = \sum_x \sum_y p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}\]
      • Mutual information needs discrete data -> normalize the microarray data and bin it into discrete expression levels
      • Measures the relation between normal and cancer samples for each gene (expressed at different levels)
      • Lower mutual information means that the gene is more disease-specific
    • Classify samples as normal or cancer according to the gene expression levels
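
The mutual information formula above can be computed directly from paired discrete observations. The sketch below is an illustration under stated assumptions: the function names, the z-score normalization, and the 0.5 threshold used to bin values into three levels (-1 low, 0 medium, 1 high) are choices made here, not details from the talk. Probabilities are estimated as relative frequencies.

```python
from collections import Counter
from math import log2
from statistics import mean, pstdev


def discretize_levels(values, threshold=0.5):
    """Z-score the values and bin them into -1 (low), 0 (medium), 1 (high)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0] * len(values)  # constant input: everything is "medium"
    levels = []
    for v in values:
        z = (v - mu) / sigma
        levels.append(-1 if z < -threshold else 1 if z > threshold else 0)
    return levels


def mutual_information(xs, ys):
    """Mutual information (in bits) between two paired discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    total = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        total += p_xy * log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return total


# Identical sequences share one full bit; independent ones share none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

Ranking genes by this score and keeping only the most informative ones reduces the number of features before the SVM is trained, which addresses the features-versus-samples imbalance noted above.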

Speaker: Dr. Fatemeh Zare-Mirakabad

The Application of Word2Vec as a Generator in Bioinformatics

Part 1

Speaker: Dr. Fatemeh Zare-Mirakabad

Part 2

Speaker: Dr. Fatemeh Zare-Mirakabad

The Application of UniRep in Bioinformatics

Speaker: Dr. Fatemeh Zare-Mirakabad

The Application of ProGen as a Generator in Bioinformatics

Speaker: Dr. Fatemeh Zare-Mirakabad


Applied Bioinformatics Tools