FLAVIdB: FAQ page

Questions Answers
  • Table 1A. Most common Flavivirus species in FLAVIdB, with their common species abbreviation, number of full proteome sequences, number of partial proteome sequences, total number of entries, number of T-cell and B-cell epitope entries (note that some epitopes overlap strains and species), and number of protein structures.
    Species | Abbreviation | Full proteome sequences | Partial proteome sequences | All proteome sequences | T-cell epitopes, B-cell epitopes, protein structures
    Dengue virus 1 | DENV1 | 1209 | 1085 | 2294 | 112231
    Dengue virus 2 | DENV2 | 835 | 1781 | 2616 | 295491
    Dengue virus 3 | DENV3 | 565 | 1665 | 2230 | 97401
    Dengue virus 4 | DENV4 | 103 | 606 | 709 | 90241
    Japanese encephalitis virus | JEV | 67 | 1238 | 1305 | 14 (B-cell epitopes)
    Kunjin virus | KUNV | 3 | 103 | 106 |
    Kyasanur forest disease virus | KFDV | 2 | 100 | 102 |
    St. Louis encephalitis virus | SLEV | 5 | 234 | 239 | 1 (B-cell epitopes)
    Tickborne encephalitis virus | TBE | 34 | 620 | 654 | 8 (B-cell epitopes)
    West Nile virus | WNV | 177 | 1361 | 1538 | 3438
    Yellow fever virus | YFV | 26 | 389 | 415 | 235

    Table 1B. Less common Flavivirus species in FLAVIdB including their common species abbreviation and total number of entries.
    Species | Abbreviation | Entries
    Murray Valley encephalitis virus | MVEV | 70
    Powassan virus | POWV | 69
    Alkhurma hemorrhagic fever virus | AHFV | 36
    Omsk hemorrhagic fever virus | OHFV | 35
    Kokobera virus | KOKV | 33
    Ilheus virus | ILHV | 22
    Louping ill virus | LIV | 21
    Mosquito flavivirus | MBV | 21
    Usutu virus | USUV | 19
    Aedes flavivirus | MBV | 18
    Tembusu virus | MBV | 15
    Deer Tick virus | DTV | 14
    Rocio virus | ROC | 10
    Langat virus | LGTV | 9
    Sepik virus | SEPV | 9
    Edge Hill virus | EHV | 9
    Karshi virus | KSIV | 8
    Zika virus | ZIKV | 8
    Entebbe bat virus | ENTV | 7
    Alfuy virus | ALFV | 7
    Gadgets Gully virus | GGYV | 7
    Bagaza virus | BAGV | 6
    Wesselsbron virus | WESSV | 6
    Modoc virus | MODV | 6
    Montana myotis leukoencephalitis virus | MMLV | 6
    Bussuquara virus | BSQV | 6
    Saboya virus | SABV | 6
    Kamiti River virus | KRV | 5
    Kedougou virus | KEDV | 5
    Apoi virus | APOV | 5
    Rio Bravo virus | RBV | 5
    Iguape virus | IGUV | 5
    Jugra virus | JUGV | 5
    Kadam virus | KADV | 5
    Meaban virus | MEAV | 5
    Saumarez Reef virus | SREV | 5
    Spondweni virus | SPOV | 5
    Tyuleniy virus | TYUV | 5
    Cacipacore virus | CPCV | 5
    Kumlinge virus | KVE | 5
    Tamana bat virus | TAB | 4
    Aroa virus | AROAV | 4
    Banzi virus | BANV | 4
    Greek goat encephalitis virus | GGE | 4
    Royal Farm virus | RFV | 4
    Uganda S virus | UGSV | 4
    Dakar bat virus | DAKV | 4
    Israel turkey meningoencephalomyelitis virus | ITV | 4
    Koutango virus | KOUV | 4
    Naranjal virus | NJLV | 4
    Ntaya virus | NTAV | 4
    Sal Vieja virus | SVV | 4
    Bouboui virus | BOUV | 3
    Potiskum virus | POTV | 3
    Spanish Sheep encephalitis virus | SSE | 3
    Bukalasa bat virus | BBV | 3
    Carey Island virus | CIV | 3
    Cowbone Ridge virus | CRV | 3
    Jutiapa virus | JUTP | 3
    Negishi virus | NIV | 3
    Phnom Penh bat virus | PPBV | 3
    Sokoluk virus | SOKV | 3
    Stratford virus | STRV | 3
    Yokose virus | YOKV | 3
    Nounane virus | NOUV | 2
    Chaoyang virus | CYV | 2
    Turkish Sheep encephalitis virus | TSEV | 2
    Batu Cave virus | BCV | 2
    Ngoye virus | NGOV | 2
    Yaounde virus | YAOV | 2
    Calbertado virus | CAV | 1
    New Mapoon virus | NMV | 1
    San Perlita virus | SPV | 1
    Sitiawan virus | SV | 1
    T'Ho virus | MBV | 1
    Wang Thong virus | WTV | 1
  • Table 2. The data fields in each entry (the values are included if available).
    Field title | Field content
    Host | Host of collection
    Country | Location of collection
    Year | Time of collection
    Strain | Strain name
    Isolate | Isolate name
    Clone | Clone name
    Nomenclature | Short-hand representation of host, country (ISO code), and year of collection, as well as serotype (where applicable), strain, isolate, and clone name
    Strain type | Information on whether strain is wild type, laboratory strain, or vaccine strain
    Pathology | The morbidity and mortality of the virus
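    As a rough illustration of Table 2, the fields could be modeled as a record in which every value is optional, since values are only included if available. The class name `EntryAnnotation`, the attribute names, and the example values are hypothetical and are not taken from the FLAVIdB implementation.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EntryAnnotation:
    """Hypothetical container for the Table 2 fields; every value may be missing."""
    host: Optional[str] = None          # host of collection
    country: Optional[str] = None       # location of collection
    year: Optional[int] = None          # time of collection
    strain: Optional[str] = None        # strain name
    isolate: Optional[str] = None       # isolate name
    clone: Optional[str] = None         # clone name
    nomenclature: Optional[str] = None  # short-hand summary of the fields above
    strain_type: Optional[str] = None   # wild type, laboratory strain, or vaccine strain
    pathology: Optional[str] = None     # morbidity and mortality of the virus

    def available_fields(self) -> List[str]:
        """Return the names of the fields that actually carry a value."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# Made-up example values, for illustration only.
entry = EntryAnnotation(host="Homo sapiens", country="TH", year=1964)
print(entry.available_fields())
```

    Keeping every field optional mirrors the note that values are included only if available, so a display layer can skip the empty ones.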
  • User manual for keyword search

    The keyword search function can be used to search flavivirus records, T-cell epitopes/HLA ligands, and B-cell epitopes. The two search functions, "Record search" and "Epitope search", can be chosen in the drop-down menu titled "Search". In "Record search", an advanced search function lets users further filter the search results by setting parameters under "Refine your search for flavivirus records". These parameters include "species", "pathology" (morbidity and mortality), "protein" (requiring the presence of a specific protein sequence in the entry), "strain type" (wild type strain, laboratory strain, or vaccine strain), "entry type" (partial proteomes or complete proteomes only), and "host" (Figure 1.1).

    Figure 1.1: The FLAVIdB Record search page.

    After the user submits the search by clicking "search", a list of entries matching the search parameters is displayed. Each entry can be accessed by clicking the accession number in the left column (Figure 1.2).



    Figure 1.2: The results of a keyword search for "clone". Each result is listed with accession number, species, and entry type. The entry is accessed by clicking the accession number.
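    The combination of a keyword with optional refinement parameters can be sketched as follows. This is a minimal illustration over an in-memory list, not the FLAVIdB implementation; the record layout, the function name `record_search`, and the accession numbers are made up for the example.

```python
from typing import Dict, List, Optional

# Hypothetical records; the real FLAVIdB storage layer is not described in this manual.
RECORDS: List[Dict[str, str]] = [
    {"accession": "FV000001", "species": "DENV1", "entry_type": "complete",
     "host": "human", "text": "infectious clone"},
    {"accession": "FV000002", "species": "WNV", "entry_type": "partial",
     "host": "mosquito", "text": "clone derived strain"},
    {"accession": "FV000003", "species": "DENV1", "entry_type": "partial",
     "host": "human", "text": "field isolate"},
]


def record_search(records: List[Dict[str, str]], keyword: str,
                  **filters: Optional[str]) -> List[Dict[str, str]]:
    """Keep records that contain the keyword and match every filter that was set."""
    needle = keyword.lower()
    hits = []
    for rec in records:
        if needle not in rec["text"].lower():
            continue
        # A filter left as None is treated as "not set" and is ignored.
        if all(rec.get(name) == wanted
               for name, wanted in filters.items() if wanted is not None):
            hits.append(rec)
    return hits


for hit in record_search(RECORDS, "clone", species="DENV1"):
    print(hit["accession"], hit["species"], hit["entry_type"])
```

    Each filter only narrows the keyword result, which matches the "refine your search" behavior described above.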

    Each individual entry in FLAVIdB is displayed as shown in Figure 1.3. The proteome sequence annotation can be toggled on or off by clicking "show/hide proteome annotation".



    Figure 1.3: An example of a FLAVIdB entry record.

    If the sequence in the entry harbors any experimentally defined T-cell epitopes, these can be displayed by clicking "show/hide T-cell epitopes", which toggles the list shown in Figure 1.4 on or off. Furthermore, HLA binders can be predicted using the link below the text "Predict HLA binders". This prediction is performed with the standard input parameters for NetMHCpan 2.4.



    Figure 1.4: An example of a list of experimentally defined T-cell epitopes.

    The PubMed ID for each T-cell epitope is found in the reference column and links directly to the article abstract in PubMed. The epitope sequence is also a link; clicking it aligns all sequences in FLAVIdB that harbor that particular sequence. In the resulting MSA, the given epitope sequence is highlighted as shown in Figure 1.5.



    Figure 1.5: MSA of all sequences harboring a specific T-cell epitope sequence (highlighted in yellow). Each of the entries listed in the left column links back to its respective entry record.

  • User manual for sequence similarity search

    Sequence similarity searches of FLAVIdB can be performed using BLAST. The parameters "E value", "word size", "matrix", and "gap cost" can each be set to one of a number of preset values (Figure 2.1).



    Figure 2.1: The BLAST query interface of the FLAVIdB.
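    For readers who want to reproduce a comparable search locally, the four adjustable parameters map onto command-line options of NCBI BLAST+ `blastp`. The sketch below only assembles the command; it assumes BLAST+ is used (this manual does not state which BLAST version runs on the server), and the file name, database name, and parameter values are placeholders.

```python
from typing import List


def build_blastp_command(query_fasta: str, database: str, evalue: float,
                         word_size: int, matrix: str,
                         gap_open: int, gap_extend: int) -> List[str]:
    """Assemble a blastp call; "gap cost" is split into opening and extension penalties."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", database,
        "-evalue", str(evalue),
        "-word_size", str(word_size),
        "-matrix", matrix,
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
    ]


# Placeholder file and database names; the values are illustrative, not FLAVIdB presets.
command = build_blastp_command("query.fasta", "local_flavivirus_db",
                               evalue=0.001, word_size=3, matrix="BLOSUM62",
                               gap_open=11, gap_extend=1)
print(" ".join(command))
```

    The list form can be passed directly to `subprocess.run` on a machine where BLAST+ and a formatted protein database are available.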

    After executing the BLAST search, the user is redirected to the results page (Figure 2.2), where a list of matches is presented in ascending order of score. Each accession number in the list links back to the respective entry record, and tick boxes to the left of the accession numbers allow rapid selection of sequences for MSA.



    Figure 2.2: The output from the FLAVIdB BLAST query.

    As well as the sorted list of matches, the user is presented with detailed results of each pairwise comparison (Figure 2.3).



    Figure 2.3: Details of pairwise BLAST alignment between the query and each match in FLAVIdB.


  • User manual for multiple sequence alignment (MSA)

    MSA can be performed for three or more sequences. Sequences for MSA can be selected using the same filtering parameters as in the keyword search function (Figure 3.1). MSA is performed using MAFFT with the default input parameters.



    Figure 3.1: Sequence selection for MSA using MAFFT.

    The result of the MSA is displayed in Pearson/FASTA format. Positions with no variation are not highlighted (white). At positions with variation, the most frequent residue is highlighted in cyan and the remaining residues are highlighted, in descending order of frequency, in yellow, gray, green, and purple. Any residue less frequent than the purple residue is highlighted in blue (Figure 3.2).



    Figure 3.2: The output of the MAFFT MSA. The output format is Pearson/FASTA and the residues are color coded by frequency: white (no variation), cyan (most frequent residue at a variable position), yellow (second most frequent), gray (third most frequent), green (fourth most frequent), purple (fifth most frequent), and blue (everything less frequent than the fifth most frequent residue).
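    The coloring rule can be sketched per alignment column. The code below follows the description in the text (at a variable position, the most frequent residue is cyan); the function names are hypothetical, ties are broken alphabetically as an assumption, and gap characters are treated like any other residue.

```python
from collections import Counter
from typing import List

# Colors for frequency ranks 1 to 5 at a variable position.
RANK_COLORS = ["cyan", "yellow", "gray", "green", "purple"]


def color_column(column: str) -> List[str]:
    """Return one color name per residue of a single alignment column."""
    counts = Counter(column)
    if len(counts) == 1:
        return ["white"] * len(column)  # no variation at this position
    # Rank residues by descending frequency; ties are broken alphabetically (assumption).
    ranked = sorted(counts, key=lambda res: (-counts[res], res))
    color_of = {res: (RANK_COLORS[i] if i < len(RANK_COLORS) else "blue")
                for i, res in enumerate(ranked)}
    return [color_of[res] for res in column]


def color_alignment(sequences: List[str]) -> List[List[str]]:
    """Color an alignment of equal-length sequences; result is indexed [sequence][position]."""
    columns = [color_column("".join(col)) for col in zip(*sequences)]
    # Transpose the per-column colors back to per-sequence rows.
    return [list(row) for row in zip(*columns)]


for row in color_alignment(["MKA", "MKT", "MRT"]):
    print(row)
```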


  • User manual for variability/conservation analysis tool

    The variability and conservation analysis offers the capacity to analyze the variability and conservation of amino acids in proteins from one or more species. Only one protein can be analyzed at a time, but there is no limit to the number of species that can be analyzed collectively. The dataset for variability and conservation analysis can also be narrowed down by host if, for example, one wishes to analyze only human pathogens (Figure 4.1).



    Figure 4.1: Selection of proteins for variability and conservation analysis. The user can choose one or more species, isolated from one or more different hosts. The analysis can be performed for one protein at a time.

    Once a selection has been made, the user is presented with the results graphically (Figure 4.2). A graph of intra-species variability and conservation is shown for each selected species, along with one graph of inter-species variability and conservation for all species in the selection. The consensus sequence of the protein dataset is displayed on the X axis, the Shannon entropy is displayed on the primary Y axis, and the conservation is displayed on the secondary Y axis. The user can also download the results as a Microsoft Excel (*.xls) file, as well as a text file with the consensus sequence. Furthermore, the MSA of all sequences in the analysis can be toggled on and off.



    Figure 4.2: The report from variability and conservation analysis of 1574 envelope proteins from DENV1. The consensus sequence is printed on the X axis, the Shannon entropy is printed on the primary Y axis, and the conservation is printed on the secondary Y axis.
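    The three plotted quantities can be computed per alignment column as sketched below. Shannon entropy is calculated in bits as H = -sum(p * log2(p)) over the residue frequencies p at a position. Conservation is taken here to be the fraction of sequences carrying the consensus residue, which is an assumption, since this manual does not define it; the function names are likewise hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def column_profile(column: str) -> Tuple[str, float, float]:
    """Return (consensus residue, Shannon entropy in bits, conservation) for one column."""
    total = len(column)
    counts = Counter(column)
    entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
    # Most frequent residue; ties are broken alphabetically (assumption).
    consensus, top = min(counts.items(), key=lambda item: (-item[1], item[0]))
    return consensus, entropy, top / total


def alignment_profile(sequences: List[str]) -> Tuple[str, List[float], List[float]]:
    """Return the consensus sequence plus per-position entropy and conservation."""
    stats = [column_profile("".join(col)) for col in zip(*sequences)]
    consensus = "".join(s[0] for s in stats)
    return consensus, [s[1] for s in stats], [s[2] for s in stats]


# Toy alignment, for illustration only.
consensus, entropy, conservation = alignment_profile(["MKA", "MKT", "MRT", "MKT"])
print(consensus)
print([round(h, 3) for h in entropy])
print(conservation)
```

    A fully conserved position has an entropy of 0 and a conservation of 1; entropy rises as more residues share the position more evenly.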


  • User manual for species classifier

    In the species classification tool, the user is prompted to paste a single query sequence in FASTA format (Figure 5.1). After the job is processed, the results page shows the highest scoring sequence similarity match found in FLAVIdB, as well as the pairwise identity between the query and the highest scoring match. A short text notifies the user of the result of the classification (Figure 5.2).



    Figure 5.1: The species classification tool input form.



    Figure 5.2: Output of the species classification tool. The output is presented as a BLAST report of the highest scoring match, as well as a short text with the result of the classification.
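    The reporting step can be sketched as picking the highest scoring hit and summarizing it in a short text. The sketch assumes the similarity search has already produced a list of hits; the `Hit` structure, the message wording, and the example values are hypothetical, and no classification threshold is applied because none is given in this manual.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One similarity search match (hypothetical structure)."""
    accession: str
    species: str
    score: float
    identity: float  # pairwise identity to the query, in percent


def classify(hits: List[Hit]) -> str:
    """Report the species of the highest scoring match as a short text."""
    if not hits:
        return "No match found; the query could not be classified."
    best = max(hits, key=lambda h: h.score)
    return (f"Query classified as {best.species} "
            f"(best match {best.accession}, {best.identity:.1f}% identity).")


# Made-up hits, for illustration only.
hits = [Hit("FV000010", "DENV2", 812.0, 97.3), Hit("FV000020", "DENV1", 640.5, 71.0)]
print(classify(hits))
```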


  • User manual for block entropy calculation tool

    In the block entropy calculation tool, the user is prompted to select species, host, and protein for the analysis (Figure 6.1).



    Figure 6.1: User input options for T-cell epitope analysis using the block entropy approach.

    The user is presented with a summary report of all blocks in the selection, i.e. the number of peptides in each block, the number of peptides required for coverage of 99% of the block, and which species are covered by the peptide. Each block can be further examined for immuno-functional conservation (by prediction of T-cell epitopes of each peptide in the block), as well as visually inspected with sequence logo and block logo.
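    The per-block numbers in the summary report can be sketched as follows: for each window of the alignment, count the distinct peptides, compute the entropy over whole peptides, and find how many of the most frequent peptides are needed to reach the coverage target. The block length of 9, the function names, and the toy sequences are assumptions for illustration; species coverage per peptide is not modeled.

```python
import math
from collections import Counter
from typing import Dict, List


def block_summary(sequences: List[str], start: int, length: int = 9,
                  coverage: float = 0.99) -> Dict[str, float]:
    """Summarize one block: distinct peptides, peptides needed for coverage, entropy."""
    peptides = Counter(seq[start:start + length] for seq in sequences)
    total = sum(peptides.values())
    entropy = sum(-(n / total) * math.log2(n / total) for n in peptides.values())
    covered, needed = 0, 0
    # Add peptides from most to least frequent until the target coverage is reached.
    for _, count in peptides.most_common():
        covered += count
        needed += 1
        if covered / total >= coverage:
            break
    return {"distinct_peptides": len(peptides),
            "peptides_for_coverage": needed,
            "block_entropy": entropy}


def all_blocks(sequences: List[str], length: int = 9,
               coverage: float = 0.99) -> List[Dict[str, float]]:
    """Slide the window one position at a time over an alignment of equal-length sequences."""
    width = len(sequences[0])
    return [block_summary(sequences, start, length, coverage)
            for start in range(width - length + 1)]


# Toy alignment: 98 identical sequences plus two single-residue variants.
toy = ["MRCVGIGNRD"] * 98 + ["MRCVGIGNKD", "MKCVGIGNRD"]
for block in all_blocks(toy):
    print(block)
```

    Unlike per-position entropy, the block entropy treats each whole peptide as one symbol, so it reflects how many distinct peptide variants actually occur in the window.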

  • User manual for B-cell epitope analysis tool

    In the B-cell epitope characterization tool, BBscore, the user is prompted to choose a dataset for analysis (Figure 7.1). At present (January 2011), analysis is available only for DENV serotypes 1-4 and West Nile virus.



    Figure 7.1: The user input interface for BBscore.

    After submission, the user is presented with a graphical representation of variability and conservation (Figure 4.2), as well as a summary of the B-cell epitope positions in table form. After inspecting the summary, the user can select positions in the E protein sequence for visual inspection (Figure 7.2), after which a 3D visualization of the positions is generated (Figure 7.3).



    Figure 7.2: The BBscore scoring table and interface for epitope position selection.

    After inspecting the scoring table, the user can select positions of interest for visual inspection. Initially, a default value, based on the BBscore selection criteria, is shown in the submission box. However, the user can alter this input after inspection of the scoring table.



    Figure 7.3: 3D visualization of the shared neutralizing features of B-cell epitopes in the input selection. As well as the image shown above, two additional views are presented, as shown in Figure 6.5.


  • User manual for summary workflow

    In the summary workflow, the user is prompted to select species, host, and protein for analysis (Figure 8.1).



    Figure 8.1: The input interface of the summary workflow.

    After submission, the user is presented with a statistical overview of the entries in FLAVIdB matching the input selection, results of the block entropy analysis, results of the T-cell epitope prediction (Figure 1.4) and mapping of known T-cell epitopes (Figure 1.5), results of the B-cell epitope characterization and mapping of known B-cell epitopes (Figure 7.3), and the variability and conservation analysis (Figure 4.2). The output is presented in a printer-friendly format.

  • User manual for query analyzer workflow

    In the query analyzer workflow, the user is prompted to either paste an input sequence or enter a FLAVIdB accession number (FVxxxxxx) (Figure 9.1). The query analyzer workflow is presently (January 2011) restricted to full proteome sequences.



    Figure 9.1: The input form for the query analyzer workflow.
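    The two kinds of input accepted by the form can be told apart with a simple check, sketched below. The pattern assumes that the six x's in FVxxxxxx stand for digits, which this manual does not state; the function name `parse_query` and the example inputs are hypothetical.

```python
import re
from typing import Tuple

# Assumption: "FVxxxxxx" means the letters FV followed by exactly six digits.
ACCESSION_PATTERN = re.compile(r"FV\d{6}")


def parse_query(user_input: str) -> Tuple[str, str]:
    """Return ("accession", id) or ("sequence", residues) for the pasted input."""
    text = user_input.strip()
    if ACCESSION_PATTERN.fullmatch(text):
        return ("accession", text)
    # Otherwise treat the input as a sequence: drop FASTA header lines and whitespace.
    lines = [line.strip() for line in text.splitlines() if not line.startswith(">")]
    return ("sequence", "".join(lines).upper())


print(parse_query("FV000123"))
print(parse_query(">query\nmrcvg\nignrd"))
```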

    Once the selection is submitted, the user will be presented with the results of the T-cell epitope predictions (Figure 1.4) as well as mapping of known T-cell epitopes (Figure 1.5), mapping of known and characterized B-cell epitopes (Figure 7.3), and a variability analysis of all strains of the input species (Figure 4.2). The output is presented in a printer-friendly format, to accommodate printing of the report.


Version 1.1, June 2011. Developed by Bioinformatics Core at Cancer Vaccine Center, Dana-Farber Cancer Institute.