

Published by Oxford University Press on behalf of Nucleic Acids Research.

The architecture of mouse and human antibody repertoires is defined by the sequence similarity networks of the clones that compose them. The major principles that define the architecture of antibody repertoires have remained largely unknown. Here, we establish a high-performance computing platform to construct large-scale networks from comprehensive human and murine antibody repertoire sequencing datasets (>100,000 unique sequences). Leveraging a network-based statistical framework, we identify three fundamental principles of antibody repertoire architecture: reproducibility, robustness and redundancy. Antibody repertoire networks are highly reproducible across individuals despite high antibody sequence dissimilarity. The architecture of antibody repertoires is robust to the removal of up to 50–90% of randomly selected clones, but fragile to the removal of public clones shared among individuals. Finally, repertoire architecture is intrinsically redundant. Our analysis provides guidelines for the large-scale network analysis of immune repertoires and may be used in the future to define disease-associated and synthetic repertoires.

The high diversity of antibody repertoires, which is defined by the collection of an individual’s B-cell receptor (BCR) and antibody sequences, plays a major role in providing broad and protective humoral immunity. The source of antibody diversity has long been identified to be the somatic recombination of V-, (D- in the heavy chains) and J-genes 1. Additions and deletions of nucleotides at the junctions of the gene segments further increase diversity 2, 3. Antibody identity (clonality) and antigen specificity are primarily encoded in the highly diverse junctional site of recombination in the variable heavy chain, called the complementarity determining region 3 (CDR3) 4. Thus, the similarity landscape of CDR3 amino acid (a.a.) sequences constitutes the clonal architecture of an antibody repertoire; this architecture reflects the breadth of antigen-binding and therefore correlates with humoral immune protection and function. Understanding sequence-related properties of antibodies is thus valuable for the development of novel therapeutics and vaccines 5, 6. However, due to limitations in technological sequencing depth and algorithmic advances, the fundamental construction principles of antibody repertoire architecture have remained largely unknown, thereby hindering a more profound systems understanding of humoral immunity. Recently, selected aspects of network analysis have been employed to investigate antibody repertoire architecture in health and disease.

Network analysis captures antibody repertoire architecture by representing the similarity landscape of antibody sequences as nodes (antibody clonal sequence) that are connected if sufficiently similar 7, 8, 9, 10, 11, 12 (Fig. …). Sequence-based networks were first used to show immune responses defined by similarity between clones, a proxy for clonal expansion 8. Network connectivity was later also used to discriminate between diverse repertoires of healthy individuals and clonally expanded repertoires from individuals with diseases such as chronic lymphocytic leukemia 7 and HIV-1 infection 10. Thus far, network analysis has mostly been utilized for visualization of network clusters 7, 8, 9, 10, 11, 12. Network visualization limits the informative graphical display of a network to a few hundred antibody clones (100% a.a. identity sequences), thereby preventing the quantitative description of immune repertoire architecture.
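As a rough illustration of the sequence similarity networks referred to above, the following minimal sketch treats each unique clonal sequence as a node and connects two nodes when their edit distance is at most a threshold. The threshold of one edit (`max_distance=1`), the use of Levenshtein distance as the similarity measure, the function names and the toy sequences are all assumptions made for illustration; they are not taken from the text. The size of the largest connected component is used here as one simple, hypothetical readout of how the network changes when a clone is removed.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def build_network(clones, max_distance=1):
    """Return adjacency sets: nodes are unique clones, edges join similar clones.

    The all-pairs comparison is quadratic in the number of clones, so this
    sketch is only suitable for small inputs.
    """
    unique = sorted(set(clones))
    graph = {clone: set() for clone in unique}
    for a, b in combinations(unique, 2):
        # A length difference larger than the threshold rules out an edge.
        if abs(len(a) - len(b)) > max_distance:
            continue
        if levenshtein(a, b) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def largest_component_size(graph, removed=frozenset()):
    """Size of the largest connected component after dropping `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


# Hypothetical toy sequences, invented for this example only.
toy_clones = ["CARDYW", "CARDFW", "CARGFW", "CAKGFW", "CTTTTTTW"]
toy_graph = build_network(toy_clones)
print(largest_component_size(toy_graph))                      # 4
print(largest_component_size(toy_graph, removed={"CARDFW"}))  # 2
```

In this toy case, four sequences form a chain of single-substitution neighbours, and removing one interior node splits the chain. The quadratic pairwise comparison is precisely what becomes impractical at the scale of more than 100,000 unique sequences, which is where a high-performance computing approach is needed.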

As antibodies are a very important tool for diagnosis, therapy, and experimental biology, a large number of antibody structures and sequences have become available in recent years. Therefore, tools that allow the analysis, comparison, and visualization of this large amount of antibody data are crucially needed. We developed the antibody high-density alignment visualization and analysis (Yvis) platform to provide an innovative, robust and high-density data visualization of antibody sequence alignments, called Collier de Diamants. The Yvis platform also provides an integrated structural database, which is updated weekly, and many different search and filter options. This platform can help to formulate hypotheses concerning the key residues in antibody structures or interactions to improve the understanding of antibody properties.
