Tutorial

minimize Introduction

FunEVA was designed to simplify and assist in the selection of functionally relevant Single Nucleotide Polymorphisms (SNP) for large-scale genotyping studies.
This tool is based on many existing tools to perform functional annotation and pathogenicity prediction of each variant.

minimizeGeneral Features

These features are available on the whole website.

Sortable tables

Customize options

Each table is sortable by clicking on the head cell of a column. You can also sort a table by multiples columns, by clicking and sliding or by hidding and showing columns using interactive "Customize options" popups.
Data can be found using search all columns input.


Use the pager to browse the table with the first page, previous page, next page and last page buttons. Choose how many results you want to display.

Popups

Popup
During the analyze of your data popups can appear to help you using FunEVA.
This popups can :
  • warm you that you have validated a new step during the analysis
  • inform you that new informations has been added to result table

Console

Console

Console2
The console shows you the analysis which are processing. It allows you to monitor in real time the progress of your analyzes.
Here you can find:
  • The action FunEVA needs you to realize to proceed to the next step
  • The name of processing analyze
  • The analyze advancing percentage





maximizeFunEVA_prot

INPUT

Format
FunEVA_prot accepts TSV input format with three case-sensitive headers needed :
  • Gene Symbol
  • Var g.
  • Var p.
Var p. refers to the canonical protein sequence from Uniprot. If the protein exists in different isoforms you will be able to link them to Uniprot in the variation report.
You can also specify other optional headers that allow you for example to sort your data more easily (date, age, gender...).
You can download a file example here.

Upload

The loading of your file is possible using the upload form.
Your file can contain up to 3,000 variations. If you have more data to analyze thank you to as many files as you need.
If your file does not contain the headers described above you will be redirected to an error page and your analysis can not be launched.

Pre-computing
After loading your data will be analyzed first to obtain all the information needed to launch modules FunEVA_prot.
This step of pre-computing is divided into three parts :

  • Mapping of Gene Symbol to Gene ID
  • Mapping of Gene ID to UniprotID
  • Parsing of Uniprot XML files
Meanwhile you can select your analysis modules.
After pre-computing it exists several possibilities :
  • No error message :
    • Pre-computing step is finished
  • Redirection to a form to manually adding some Gene IDs
    • You must enter Gene IDs of some Gene Symbol which have not been recognized. This error probably happens because you have entered some Gene Alias instead of Gene Symbols.
    • To correct it please complete the form and reload your file.
  • An error message in console during Mapping of Gene ID to UniprotID
    • This error means that some variations can not be analyzed by FunEVA. This error probably happens when UniprotID doesn't exist for the Gene ID entry.
    • To correct it you must delete this variations and upload your file again.

Analyze

Variation : Amino-acid properties
  • Compare properties of the reference amino acid and the observed amino acid:
    • Chain flexibility
    • Interaction Modes
    • Chain Hbonds
    • Weight
    • Isoelectric Point
    • Hydrophobicity
    • Properties
  • Results available in detailed report of each SNV
Variation: Grantham score/class
The Grantham score caracterizes the difference of amino acids properties giving a pathogenicity score.
Higher is the score higher is the pathogenicity prediction.
  • Grantham class :
    • Score between 0 and 50 : Conservative
    • Score between 50 and 100 : Moderately conservative
    • Score between 100 and 150 : Moderately radical
    • Score higher than 150 : Radical
  • Results are available in table results and in variation report.
Evolutionary analysis: GERP++, PhastCons, PhyloP
  • GERP identifies constrained elements in multiple alignments by quantifying substitution deficits (see this page for more details).
    GERP++>2 in human genome, as this threshold is typically regarded as evolutionarily conserved and potentially functional.
  • PhastCons is a program for identifying evolutionarily conserved elements in a multiple alignment, given a phylogenetic tree. It is based on a statistical model of sequence evolution called a phylogenetic hidden Markov model (phylo-HMM) (see this page for more details)..
  • PhyloP score is based on multiple alignments of 46 genomes. The larger the score, the more conserved the site. It is similar to GERP++ but therefore, even conceptually similar tools may sometimes generate drastically different results.
Algorithmic pathogenicity prediction: SIFT, PolyPhen, Mutation Taster
All these score were retrieved from the dbNSFP (LJB2_DB) starting from June 2013.
  • SIFT predicts whether an amino acid substitution affects protein function. SIFT prediction is based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences, collected through PSI-BLAST.(see this page for more details).
    In previous versions of dbNSFP (ljb_sift), the scores were calculated as 1-SIFT. In the updated version 2 (ljb2_sift), the scores were now the SIFT score itself. This mean a variant with score<0.05 is predicted as deleterious.
  • PolyPhen (Polymorphism Phenotyping) is a tool which predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations (see this page for more details).
  • MutationTaster evaluates disease-causing potential of sequence alterations (see this page for more details).
Functional signatures: Motifs and domains, Regions
  • All the informations about domains, motifs and regions are extracted from international reference databases and are then analysed.
    Example of SORL1 protein with an impacted domain
    Example of SORL1 protein with an impacted domain

    • SMART : SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (see this page for more details).
    • Prosite : PROSITE is a protein database. It consists of entries describing the protein families, domains and functional sites as well as amino acid patterns and profiles in them (see this page for more details).
    • Pfam : Pfam is a large collection of protein families, represented by multiple sequence alignments and hidden Markov models (HMMs) (see this page for more details).
    • Interpro : InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites (see this page for more details)
Interaction, ontologies and pathways
  • Stability effect: Mupro allows assessment of the impact of the amino acid substitution on the protein's stability.
    • The keywords INCREASE or DECREASE mean respectively an increase of the stability or a decrease of the stability. (Example : )
  • Protein structure modeling: the Protein Model Portal
    • Protein Model Portal will try to find an experimental structure of the submitted protein.
    • You will be able to view several templates of 3D proteiin structure ordered by sequence identity.
    • An analysis of amino acids is given showing their several properties with a colored scheme.
  • Protein interaction
    • The drawing of functional protein association networks is done using String database.
    • You will find for every submitted variation known and predicted protein-Protein interactions.
  • Ontologies and pathways
    • Ontologies (GO terms) allow to understand the protein functions (F), general process (P) in which the protein is involved and its localisation (C).
    • The pathways are a list of terms which refers to general functions of the protein linked to major processes of the organism.
  • Mutations repositories links