SeeTCR: an online platform for TCR sequencing data analysis


SeeTCR processes T-cell repertoire data and produces a variety of quantitative indices, graphs and comparisons.

Upload your samples (or load our demo data) to get started.


© 2016 Friedman Lab, Weizmann Institute of Science

Data Statistics

Download csv...

Session Manager

Load samples:

Demo

Edit metadata:

Save session...

Frequency of CDR3 Property

Graph Description

This plot can be used to observe the distribution of a CDR3 sequence property (e.g. V gene, known antigen association) across the dataset. The samples in the dataset can be grouped into categories (e.g. Mouse, Tissue) in order to compare the statistics across groups. The height of each bar indicates the total normalized count (or frequency) of a property value in a sample or, when samples are grouped, the group mean +/- SEM.
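As a rough illustration of the grouped statistic described above, the following sketch computes the mean and SEM of a property's frequency per group. The function name `group_mean_sem` and the dictionary layout are assumptions made for this example, not SeeTCR's actual implementation; samples that lack a property value are assumed to contribute a frequency of 0.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def group_mean_sem(sample_freqs, sample_groups):
    """Return {(group, property_value): (mean, sem)}.

    sample_freqs:  {sample: {property_value: normalized frequency}}
    sample_groups: {sample: group label}
    Both layouts are hypothetical, chosen for this sketch.
    """
    # Every property value seen in any sample.
    props = sorted({p for freqs in sample_freqs.values() for p in freqs})

    # Collect the samples that belong to each group.
    by_group = defaultdict(list)
    for sample, group in sample_groups.items():
        by_group[group].append(sample)

    result = {}
    for group, samples in by_group.items():
        for prop in props:
            # A property absent from a sample counts as frequency 0.
            values = [sample_freqs[s].get(prop, 0.0) for s in samples]
            m = mean(values)
            # SEM = sample standard deviation / sqrt(n); 0 for a single sample.
            sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
            result[(group, prop)] = (m, sem)
    return result
```

With a single sample per group, the SEM is reported as 0 here; a real plot might instead omit the error bar.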
Labels:
Text sizes:
Options:

Plot parameters

Detailed CDR3 AA distribution

Graph Description

Shows the distribution of CDR3 nucleotide (nt) sequences encoding a chosen amino acid (AA) sequence. This plot can be used to detect positive selection for a particular CDR3, since several distinct nt sequences converging on the same AA sequence suggest selection at the protein level. The height of each stacked bar is the total normalized count of the selected CDR3 AA sequence in a sample (x-axis). Each color marks a distinct nt sequence encoding the same CDR3 AA sequence; the nt sequences are listed at the bottom of the plot.
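The data behind such a stacked bar can be sketched as follows: for one chosen AA sequence, sum the normalized counts of each nt sequence within each sample. The row layout (keys `sample`, `CDR3_NT`, `CDR3_AA`, `freq`) and the function name are assumptions for illustration only.

```python
from collections import defaultdict


def nt_variants_for_aa(rows, cdr3_aa):
    """Return {sample: {nt_sequence: summed frequency}} for one CDR3 AA sequence.

    rows: iterable of dicts with the (hypothetical) keys
          "sample", "CDR3_NT", "CDR3_AA" and "freq".
    """
    per_sample = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if row["CDR3_AA"] == cdr3_aa:
            # Each nt sequence becomes one colored segment of the sample's bar.
            per_sample[row["sample"]][row["CDR3_NT"]] += row["freq"]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(variants) for sample, variants in per_sample.items()}
```

The total bar height for a sample is then the sum of the values in that sample's inner dictionary.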
Labels:
Text sizes:
Options:

Plot parameters

Diversity

Graph Description

The plot shows the share of each sample's repertoire that falls within each quantile of its sequence-frequency distribution. The sequences in each sample are first ordered by their frequency and divided into n quantiles. Then the frequencies in each quantile are summed. These sums are represented by the width of the boxes in each bar.
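The procedure above can be sketched in a few lines. This example assumes that "quantiles" means n rank groups containing (nearly) equal numbers of sequences, ordered from most to least frequent; how SeeTCR actually splits the ranks is not stated, and the function name `quantile_shares` is hypothetical.

```python
def quantile_shares(counts, n=5):
    """Split sequences into n rank groups and return each group's share of the total.

    counts: list of per-sequence counts (or frequencies) for one sample.
    Assumption: groups hold (nearly) equal numbers of sequences, largest first.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    ordered = sorted(counts, reverse=True)  # most frequent sequences first
    total = sum(ordered)
    size = len(ordered)
    shares = []
    for i in range(n):
        lo = i * size // n        # first rank in this quantile
        hi = (i + 1) * size // n  # one past the last rank in this quantile
        shares.append(sum(ordered[lo:hi]) / total)
    return shares
```

A sample dominated by a few expanded clones yields a large first share, while an even repertoire yields shares close to 1/n each.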

Parameters

Similarity Heatmaps

Graph Description

The similarity heatmap gives a global view of pairwise similarity across all samples in the dataset. The plot is symmetric, and rows / columns are presented either in their original order or clustered. Each cell at the intersection of a row and a column shows the value of the chosen index for that pair of samples. The higher the value, the more similar the samples. The available similarity indices are:

Jaccard: measures degree of overlap, suitable for CDR3 sequences (calculated with binary=T).

Morisita-Horn: Similar to Jaccard but accounts for CDR3 abundance and sample size. See Morisita's overlap index.

Pearson: Standard Pearson correlation coefficient, takes count / frequency into account.

Spearman: Rank correlation, not affected by count.
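To make the first two indices concrete, here is a minimal sketch using their standard textbook definitions on two samples given as `{CDR3: count}` dictionaries. The function names and the input layout are assumptions for illustration and are not taken from SeeTCR's code. Pearson and Spearman correlations are available in common libraries (for example `scipy.stats.pearsonr` and `scipy.stats.spearmanr`) and are omitted here.

```python
def jaccard(a, b):
    """Binary Jaccard index: shared sequences / all sequences, ignoring counts."""
    present_a = {seq for seq, count in a.items() if count > 0}
    present_b = {seq for seq, count in b.items() if count > 0}
    union = present_a | present_b
    return len(present_a & present_b) / len(union) if union else 0.0


def morisita_horn(a, b):
    """Morisita-Horn overlap: 2*sum(x_i*y_i) / ((D_a + D_b) * X * Y).

    X and Y are the total counts of each sample;
    D_a = sum(x_i^2) / X^2 and D_b = sum(y_i^2) / Y^2.
    """
    total_a = sum(a.values())
    total_b = sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples need a positive total count")
    # Only sequences present in both samples contribute to the cross term.
    cross = sum(count * b.get(seq, 0) for seq, count in a.items())
    d_a = sum(count * count for count in a.values()) / total_a ** 2
    d_b = sum(count * count for count in b.values()) / total_b ** 2
    return 2 * cross / ((d_a + d_b) * total_a * total_b)
```

Both functions return 1 for identical samples and 0 for samples with no shared sequences; unlike Jaccard, Morisita-Horn weights each shared sequence by its abundance in both samples.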

Plot types

CDR3 Search

in loaded data

Enter query:

CDR3 Search

in reference data

Enter query:

Download csv...

Sharing in mice

Sharing in humans

CDR3 Search - * page is under construction *

in TCRdb

search parameters

search results

Add Data to Project

Choose Samples from the table below

Example: 5:10, 12, 17:21
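The selection syntax in the example can be read as a comma-separated list of single row numbers and `start:end` ranges. The parser below is a sketch under the assumption that ranges are inclusive on both ends (as in R's `5:10`); the function name `parse_selection` is hypothetical.

```python
def parse_selection(text):
    """Expand a selection such as "5:10, 12, 17:21" into a list of row numbers.

    Assumption: "a:b" denotes the inclusive range a, a+1, ..., b.
    """
    rows = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas or trailing whitespace
        if ":" in part:
            start, end = (int(piece) for piece in part.split(":"))
            rows.extend(range(start, end + 1))
        else:
            rows.append(int(part))
    return rows
```

Under that assumption, the example selection expands to rows 5 through 10, row 12, and rows 17 through 21.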

Selection Preview


All data

Current Session

Import sequence Annotation

Getting Started

  1. What is SeeTCR?
  2. How to use SeeTCR
    1. Data formats and Preparation
    2. Add data to session
    3. Edit the metadata
    4. Import sequence annotation
    5. Analyze the data
    6. Visualize the data

What is SeeTCR?

SeeTCR is an online app for the analysis of T-cell sequencing data. It takes TCR repertoires as input and returns a collection of quantitative indices, graphs and comparisons. SeeTCR is web-based and requires no prerequisite software.

How to use SeeTCR:

  1. Data formats and Preparation:
    SeeTCR accepts data that contains CDR3 sequences, their frequency, and any additional information. Data can be in Adaptive Biotechnologies’ Immunoseq format, or in csv files that contain: nucleotide sequences (the "CDR3_NT" column), amino acid sequences (the "CDR3_AA" column) and their frequency/copy number (the "count" column). The columns should be named as written in parentheses. Besides the sequence and the frequency, the files may include any other columns, such as V/D/J genes, sequence length, etc. Click here to download an example of the required format.
  2. Add data to session:
    To start a session, click Data and then Add/edit samples. Choose the files you wish to upload (you can select multiple files at once). Alternatively, if you have previously saved sessions, you can load them by clicking "load session". Once the files are loaded, they will appear in the "edit metadata" section. Tip: You can always add files to an existing session, remove specific samples or clear everything and start from the beginning.
  3. Edit the metadata:
    Adding labels is useful for splitting the data into groups or categories and comparing these groups side by side. In many sections you will see an option to split the data by the different categories (see for example the "shared CDR3s" or "Frequency of CDR3 Property" sections). To add a new label, enter a title and click "add category". An empty column will be added to the metadata table, which you can fill manually or by pasting data from the clipboard. The labels can be of any type, like age, health, phenotype etc.
  4. Import sequence annotation (optional):
    If you have additional sequence information, it is possible to add it to the session. This information should be in the form of a csv/txt file that contains the sequences and the annotation. Once the annotations are loaded, you will be able to see them in Data -> View raw data, and in Visualization -> "Frequency of property".
  5. Analyze the data:
    1. Statistics
    2. Sequence sharing
    3. CDR3 search
    4. McPAS-TCR
  6. Visualize the data:
    1. Frequency of Property
    2. Sequence Distribution
    3. Tile Plots
    4. Diversity
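Before uploading, a csv file can be checked against the format described in step 1 (columns "CDR3_NT", "CDR3_AA" and "count"). The following is a minimal sketch using Python's standard `csv` module; the function name `read_repertoire` and the choice to parse "count" as a float are assumptions for this example, not SeeTCR behavior.

```python
import csv
import io

# Column names required by the csv format described in step 1.
REQUIRED_COLUMNS = ("CDR3_NT", "CDR3_AA", "count")


def read_repertoire(handle):
    """Read a repertoire csv from an open text handle and validate its columns.

    Extra columns (V/D/J genes, sequence length, ...) are kept as strings.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))
    rows = []
    for row in reader:
        # "count" may hold a copy number or a frequency, so parse it as float.
        row["count"] = float(row["count"])
        rows.append(row)
    return rows


# Example with an in-memory file; the sequences here are made-up placeholders.
example = io.StringIO(
    "CDR3_NT,CDR3_AA,count,V\n"
    "TGTGCCAGC,CAS,12,TRBV1\n"
    "TGTGCCTCC,CAS,3,TRBV2\n"
)
rows = read_repertoire(example)
```

For a file on disk, the same function can be called with `open(path, newline="")`, which is the form the `csv` module documentation recommends.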

About

SeeTCR is developed and maintained by the Friedman Lab at the Weizmann Institute of Science. It is written in R using the Shiny package.
For questions or comments please contact us at: tal.sagiv@weizmann.co.il

View Raw Data

Choose a sample to view:

Raw Data

Sequence sharing

Subsets

Sharing Levels

Description

Select a category on the right to find shared AA sequences between some or all of its members.

The "sharing levels" box summarizes the number of sequences that were found in exactly one member (level 1), in exactly two members (level 2), and so on.

Click on a subset from the list on the left to view the actual CDR3s.
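The sharing levels described above can be sketched as a simple presence count: for each CDR3 AA sequence, count how many members of the category contain it, then tally how many sequences fall at each level. The function name `sharing_levels` and the input layout are assumptions for illustration.

```python
from collections import Counter


def sharing_levels(samples):
    """Return ({level: number of sequences}, {cdr3: level}).

    samples: {member name: iterable of CDR3 AA sequences} (hypothetical layout).
    A sequence's level is the number of members in which it appears.
    """
    presence = Counter()
    for sequences in samples.values():
        # set() ensures a sequence is counted once per member,
        # however many times it occurs within that member.
        presence.update(set(sequences))
    levels = Counter(presence.values())
    return dict(levels), dict(presence)
```

Sequences whose level equals the number of members are shared by all of them; level 1 sequences are private to a single member.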

CDR3 Glossary

Data Summary

CDR3s


Filter data