3. Results

3.1. The Overall Workflow

The overall worflow of MoSwA is illustrated in the Figure below. MoSwA Web as well as the stand alone version will output 6 files. We will describe each of them in here.

_images/Fig1_v1.png

3.2. Short Summary

You will be directed to short summary report page once the analysis is done. Additionally, the same report is provided as .html file in the stand alone version. At the top of this report a brief statistical summary is reported to the user, including;

  • The input alignment length

  • The highest support for a position

  • The average support

  • The number of positions that have no support. (Absence of k-mer without indel/gaps or unknown amino acids)

  • The number of positions that have low support. (Based on the user defined value)

  • The number of unique (distinctive) switches. A position can have more than one switch. (i.e: switches involving index, major, minor and unique can happen at a single position)

  • The number of total switches.

An example of this part of the MoSwA report is given below. The sample file is analyzed for all the motif ranks (index, major, minor and unique). You can access the sample file and the alignment details from here.

3.3. Alignment View

Users are able to view this alignment file in the short summary report, as well as download it in .aln format. This file also produced by the stand alone version. In this alignment view asterix(*) indicates motif switch, and colon(:) indicates switches at positions that have low support. We produce this for two reasons;

  1. Provide users an alignment view of all the aligned kmer positions, users are able to observe motif switch positions referenced against a consensus sequence built from the index sequences. This allows users to spot neighboring and clustering motif switches as well as the possible inter motif relationship

  2. Its also to allow assement of neighboring motif switch positions for inter motif switches.

Below is the showcase of the alignment view within the short summary page.

CLUSTAL X (1.81) multiple sequence alignment

index_consensus                     CPYQGRSSFFRNVVWLIKKdnaYPTIKRSYNNTNQEDLLVLWGIHHPNDA
index_switches                      CPYQGRSSFFRNV*WLIKK**aYPTIKRSYNNTNQEDLLVLWGIHHPNDA
major_switches                      CPYQGR***FR***WLIKK**a*PTIKR**NNTNQEDLLVL*GIHH*NDA
minor_switches                      C****R**FFRNVV*LI***n*YP******N*T*****LVLW*I*****A
unique_switches                     *P*Q****FF*N*V*L*******P**K***N*TNQ**LL*L**IHHP***

index_consensus                     AEQTRLYQNPTTYISVGTSTLNQRLVPKIATRSKVNGQSGRMEFFWTILK
index_switches                      AEQTRLYQNPTTYISVGTSTLNQRLVPKIATRSKVNGQSGRMEFFWTILK
major_switches                      AEQTRL**NPTTYISV**S*LNQRLVPK*ATRSK*NGQSGRME*FWTILK
minor_switches                      **Q**L******YI****STLNQ*LVPK**T*SK***QSGRME*FW*I*K
unique_switches                     **QT***Q******S**TSTL***L****ATR******S*******TILK


3.4. Network Plot Topology

MoSwA produces a network dot plot to illustrate the complex interaction of motif switches. You can inspect the network plot right on the MoSwA web as well as download it as html file format. MoSwA stand alone produces this as a seperate html file as well.

The network plot shows only the user defined motif sites (default is all). It should be noted that the positions with a low support value (user defined) are discarded from the network plot. The network graph is dynamic, users can point at a node to see what it is actually representing. Users can download this plot as html file and take advantage of interactive plot or/and they can export the current view of the plot as png. The network can be divided into 5 components;

1. Downgrading Motifs: The higher ranked motif switching to a lower rank such as Index becoming Major (denoted as I => M), Major becoming Minor (denoted as M => Mi) or any loss of the motif which occurs when a motif is present in the previous position but disappears in the subsequent position. Index motif loss is denoted as “I => -“. Downgrading motifs are placed at the top lane of the plot as nodes and it follows a hiearchial order, from the most left to right

Downgrading Motif Order

I => M

I => Mi

I => U

I => -

M => Mi

M => U

M => -

Mi => U

Mi => -

U => -

2. Upgrading Motifs: The vice versa of the downgrading motifs, Major motif becoming Index in the subsequent position (denoted as M => I), Minor becoming Index (denoted as Mi => I) or any gain of the motif, contrary to motif loss. Index motif gain is denoted as “I => +”. Upgrading motifs are placed at the bottom lane of the plot as nodes and just like Downgrading motifs, !!it follows the same hiearchial order!!. This allows users to easily spot the direct replacing switches.

Upgrading Motif Order

M => I

Mi => I

U => I

I => +

Mi => M

U => M

M => +

U => Mi

Mi => +

U => +

3. Positions: The positions involving in motif switch are laid at the center as nodes, and the color of these nodes are based on the number of edges. The more edges (switches) in the position, the lighter the color (dark blue to grey). The example below is to illustrate. The node that represents 336 is colored light because multiple switches are occuring while position 348 is colored dark blue because it has single switch.

_images/networkpos.png

The positions where motif switches happen are laid at the mid-line of the network plot as nodes.

Moreoever, when a user hovers over the positions, an information pops up about what kind of switches are happening at the indicated position. You can inspect the example below.

htmlcap

4. Edges: ……. set as edges. Grey edges indicate true switches, while others psuedo-switches.

i. Splits: is when a distinct variant k-mer separates into two or more distinct variants in the subsequent position while preserving its current motif rank. A split causes a drop in the frequency of the motif. Example is given below.

_images/split.png

Index spits two distincs variants while preserving its motif rank (index), and pushing a switch in major rank.

ii. Mergers: is when a two or more distinct k-mer variants of a given incidence at a position combined/merged to become one in the subsequent position. A merger causes an increase in the frequency of the motif. Example is given below.

_images/merge.png

Two distinct minors are merging with the index motif, causing an increase in the frequency of the motif, opposite of a split.

Below is a more complex mergers example.

_images/merge2.png

Minor_8 and Minor_10 are merging with what was previously a Index motif and causing an increase in the frequency from 129 to 142. However, Minor_2 and Minor_5 are merging with what was previously a Major motif and causing an greater increase in the frequency from 117 to 150. Altough these mergers are what we call pseudo-switches, it gives rise to true switches. The major at the previous position now has the highest frequency and becomes the Index and the Index at the previous position loses its “crown” and becomes a major.

5. Legend: is interactive too, you can double click on a any of the legend to isolate the related trace or single click to hide it.

3.4.1. Network Plot Sample

Hint

Double click on the legend number 1 ( I => M indicating Index to Major) and then when you get the isolated trace, do a single click on the second legend ( M => I indicating Major to Index). Now, you will be inspecting only those that are inter-switching between each other.

3.5. Hotspot Bar Plot

There are several studies suggesting that certain regions of the gene/genome have a higher proclivity for sequence change. Strong mutation hotspots were discovered in the very first significant collection of mutations within a single gene which is not surprising(1). Indeed, there are discovries identified multiple simultaneous mutations occuring in tiny regions while the rest of the gene/genome remains relatively mutation free(2). Since switches refer to mutations, we followed the same ….. We cluster the positions that show motif switches based on the criteria below and later identify the hotspot positions based on the distance between the clusters:

  1. CMSP; is the number of positions that make up a cluster.

  2. CS; is the cluster size (minimum is 7)

  3. CD; is the distance between clusters.

Example of Cluster and Hotspot Plot

_images/ClusterHotSpotPlot.png

The starting and ending positions that make up the cluster shown in square brackets as [start::end]. Red rectangle indicates the hot spot.

3.6. Pairwise Scores

Index switch positions are noteworthy due to instability of the index at such positions are possibly to be avoided as vaccine targets, therefore, we provide a pairwise alignment score based on PAM30 for index switches, to determine the physico-checmical spectrum of similarity/variability between the index sequence and the replacing variant motif sequence. These scores are provided along with number of statistical analyses such as the total number of protein sequences analysed at the aligned k-mer position, the percentage of index, major, minor and unique motifs at the given position. This comma-seperated file format allows users to open with any text editor or excel. You can download the file from web-server, stand-alone version produces the same file in the output folder.

Scores

Position

Current Index

Preceding Index

Score

Support

Index %

Major %

Minor %

Unique %

94

…..I… (Major)

…..A… (Major)

57

2327

35

29

35

1

164

……DNA (Minor)

……NST (Major)

50

2350

22

20

57

1

170

NST…… (Major)

DNA…… (Major)

49

2350

18

14

66

2

The identical amino acids between the replacing index and the previous index are presented as dots. The “Current Index” column shows the present index motif at the given position, the previous motif rank of the sequence is indicated in paranthesis. “Preceding Index” column shows the index sequence from the previous position and the current motif of the sequence is given within paranthesis.

3.7. Raw Results.JSON

All the detailed results are stored in a json file. This file can be downloaded from the webservice and produced by the stand alone version in the output directory as “Results.json”. The structure is explained below, you can download sample .json file from this link..

#This section consists the top part of the MoSwA report
{
    "MoSwa_Output": {
        "Report": { #First part of the report is statistical information about the input
            "kmer_length": 9,
            "tresh_hold": 100,
            "alignment_length": 568,
            "highest_support": 307,
            "average_support": 296,
            "Positions_no_support": 0,
            "Positions_low_support": 0,
            "Total_switches": {
                "Index": 2,
                "Major": 81,
                "Minor": 111,
                "Unique": 156
            },
            "Unique_switches": 237,
            "Results": "Please see the Results Tab"
            "PairWise_Alignments": "Please see the Scores Tab"

        }
    }
}