Mutational Signatures

Pioneering techniques to understand the causes of cancer. Mutagenic compounds leave characteristic patterns of damage throughout our genome, and each compound creates its own distinctive pattern. These patterns are Mutational Signatures. This website is here to share our data and tools so that together, we can cure cancer, faster.

Mutational Signatures (v3.4 - October 2023)

Introduction

Somatic mutations are present in all cells of the human body and occur throughout life. They are the consequence of multiple mutational processes, including the intrinsic slight infidelity of the DNA replication machinery, exogenous or endogenous mutagen exposures, enzymatic modification of DNA and defective DNA repair. Different mutational processes generate unique combinations of mutation types, termed “Mutational Signatures”.

In the past few years, large-scale analyses have revealed many mutational signatures across the spectrum of human cancer types, including the latest effort by the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Network (Alexandrov, L.B. et al., 2020) using data from more than 23,000 cancer patients.

About

COSMIC Mutational Signatures is a resource curated in partnership with COSMIC and Cancer Grand Challenges, and in close association with our collaborators at Wellcome Sanger Institute, the Pillay lab at University College London and the Alexandrov lab at University of California.

Mutational signatures as a collection of operative mutational processes

Mutational processes from different aetiologies are active during the course of cancer development. They can be identified using mutational signatures, due to their unique mutational pattern and specific activity on the genome.

This is illustrated in the figure below using a framework of 6 classes of single base substitutions and three distinct mutational processes, whose respective strengths vary throughout a patient’s life. At the beginning, all mutations are due to the activity of the endogenous mutational process. As time progresses, the other processes become active and the mutational spectrum of the cancer genome continues to change.
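The idea of a spectrum built from several processes can be sketched in a few lines of Python. This is a toy model only: the three signature profiles and the activity values are hypothetical numbers chosen for illustration, not COSMIC signatures, and the names `SIGNATURES` and `spectrum` are assumptions of this sketch.

```python
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# Hypothetical signatures, for illustration only (not COSMIC profiles).
# Each row gives the probability that a mutation caused by that process
# falls into each of the six substitution classes.
SIGNATURES = {
    "endogenous": [0.10, 0.05, 0.60, 0.05, 0.15, 0.05],
    "process_2":  [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],
    "process_3":  [0.05, 0.05, 0.10, 0.30, 0.40, 0.10],
}

def spectrum(activities):
    """Expected mutation count per class: sum over processes of activity x signature."""
    totals = [0.0] * len(CLASSES)
    for process, n_mutations in activities.items():
        for i, probability in enumerate(SIGNATURES[process]):
            totals[i] += n_mutations * probability
    return dict(zip(CLASSES, totals))

# Early in life only the endogenous process contributes mutations.
early = spectrum({"endogenous": 100})
# Later, the other two processes have also contributed (hypothetical counts).
late = spectrum({"endogenous": 150, "process_2": 200, "process_3": 50})

print(early)
print(late)
```

In this toy example the dominant class shifts between the early and late spectra, which is the kind of change the paragraph above describes.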

Collections

Signature-based websites

At COSMIC Signatures we identify signatures from analysis of the PCAWG dataset and through curation of specific papers. Papers are considered particularly (but not exclusively) when they describe a specific exposure that produces signatures not present in the PCAWG dataset. Please note that this catalogue of signatures is not an exhaustive or final set, but a reference set of high confidence signatures that have been curated by experts in the field. We aim to update the catalogue as comprehensively as possible as new data become available and improvements are made to extraction methodologies.

This summary includes the mutational profile, proposed aetiology and tissue distribution of each signature, as well as potential associations with other mutational signatures and how the signature has changed during iterations of analysis.

Currently, six different variant classes are considered, resulting in the following sets of mutational signatures.

SBS Signatures

Single base substitutions (SBS), also known as single nucleotide variants, are defined as the replacement of one nucleotide base with another. When each substitution is referred to by the pyrimidine of the mutated Watson-Crick base pair, there are only six different possible substitutions: C>A, C>G, C>T, T>A, T>C, and T>G. These SBS classes can be further expanded by considering the nucleotide context.

Current SBS signatures have been identified using 96 different contexts (6 substitution classes x 4 possible 5’ bases x 4 possible 3’ bases), considering not only the mutated base, but also the bases immediately 5’ and 3’ to it.
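As a minimal sketch, the 96 contexts can be enumerated by combining the six pyrimidine-based substitutions with every pair of flanking bases. The function name `sbs96_channels` and the `A[C>A]A` label layout are assumptions made for this example.

```python
from itertools import product

BASES = "ACGT"

# The six substitution classes: each pyrimidine reference base (C or T)
# mutating to any of the three other bases.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]

def sbs96_channels():
    """Return one label per context, e.g. 'A[C>A]A' (5' base, substitution, 3' base)."""
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref, alt in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]

channels = sbs96_channels()
print(len(SUBSTITUTIONS), len(channels))
```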

DBS Signatures

Doublet base substitutions (DBS) are generated after the concurrent modification of two consecutive nucleotide bases. There are 78 strand-agnostic DBS mutation types, enumerated here.

More specifically, there are 16 possible source doublet bases (4 x 4). Of these, AT, TA, CG, and GC are their own reverse complement. The remaining 12 can be represented as 6 possible strand-agnostic doublets. Thus, there are 4 + 6 = 10 strand-agnostic source doublets. Because they are their own reverse complements, AT, TA, CG, and GC can each be substituted by only 6 strand-agnostic doublets. Each of the remaining 6 source doublets has 9 possible DBS mutation types (3 x 3). Therefore, in total there are 4 x 6 + 6 x 9 = 78 strand-agnostic DBS mutation types.
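The count of 78 can be checked by brute force: enumerate every doublet change in which both bases differ, then treat a mutation and its reverse complement as the same type. The sketch below assumes an arbitrary canonical form (the alphabetically smaller of the two strand representations), so its labels may not match the ones COSMIC uses; only the counts are meant to agree.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def dbs_types():
    """Return the strand-agnostic doublet substitutions as (ref, alt) pairs."""
    doublets = ["".join(pair) for pair in product(BASES, repeat=2)]
    canonical = set()
    for ref, alt in product(doublets, repeat=2):
        # A doublet substitution changes both bases.
        if ref[0] == alt[0] or ref[1] == alt[1]:
            continue
        # Assumption: keep the alphabetically smaller strand representation.
        canonical.add(min((ref, alt), (revcomp(ref), revcomp(alt))))
    return sorted(canonical)

types = dbs_types()
per_source = Counter(ref for ref, _ in types)
print(len(types), len(per_source))
print(per_source)
```

Grouping the result by source doublet gives 10 source doublets, with 6 types for each of AT, TA, CG and GC and 9 types for each of the others.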

ID Signatures

Small insertions and deletions (ID), also known as indels, are defined as the incorporation or loss of small fragments of DNA (usually between 1 and 50 base pairs) in a specific genomic location.

Although there is no single intuitive and naturally constrained set of ID mutation types (as there arguably are for single base substitutions and doublet base substitutions), a compilation of 83 different types, considering size, nucleotides affected and presence in repetitive and/or microhomology regions, was used to extract mutational signatures. It can be found here.

CN Signatures

Copy number signatures are defined using the 48-channel copy number classification scheme. The scheme incorporates loss-of-heterozygosity status, total copy number state, and segment length to categorise segments from allele-specific copy number profiles (given as major and minor copy number, i.e. non-phased profiles). The signatures displayed here were identified from 9,873 tumour copy number profiles obtained from The Cancer Genome Atlas (TCGA) SNP6 array data spanning 33 cancer types.

SV Signatures

Structural variations (SVs) are large genomic changes typically exceeding 1kb in length, which impact the arrangement of the genome. Types of SVs include large deletions (removal of a genomic segment), tandem duplications (addition of a repeated genomic segment), inversions (flipping of a genomic segment), and translocations (breaking and rejoining of genomic segments at different chromosomal locations).

Deletions, inversions, and tandem duplications are further subdivided by size into five ranges: 1-10Kb, 10-100Kb, 100Kb-1Mb, 1Mb-10Mb, and events larger than 10Mb. SVs are also categorised as clustered or non-clustered depending on the distance between adjacent SVs, resulting in 32 SV types in total.
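One way to arrive at 32 types is sketched below. It assumes that translocations form a single category with no size subdivision, which the text implies but does not state outright; the tuple layout and the function name `sv_types` are likewise choices made for this example.

```python
from itertools import product

SIZED_TYPES = ["deletion", "tandem duplication", "inversion"]
SIZE_RANGES = ["1-10Kb", "10-100Kb", "100Kb-1Mb", "1Mb-10Mb", ">10Mb"]
CLUSTERING = ["clustered", "non-clustered"]

def sv_types():
    """Return (clustering, type, size range) tuples for every SV category."""
    events = list(product(SIZED_TYPES, SIZE_RANGES))  # 3 x 5 = 15 sized events
    events.append(("translocation", None))  # assumption: not subdivided by size
    return [(cluster, kind, size) for cluster in CLUSTERING for kind, size in events]

categories = sv_types()
print(len(categories))
```

Under that assumption the count is (3 x 5 + 1) x 2 = 32.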

These signatures were derived from an analysis of 10,731 whole genome sequenced samples provided by Genomics England, which were organised into 16 distinct tissue groupings.

RNA SBS Signatures

RNA Single Base Substitution (RNA SBS) signatures are defined using a 192-channel scheme that covers the trinucleotide context of every possible single nucleotide change on an RNA molecule. Because RNA molecules are single-stranded, the exact nucleotide change can be inferred, which gives 192 channels. For DNA single base substitutions, by contrast, the strand on which the change occurred cannot always be inferred. The signatures displayed here were identified from 333 non-small cell lung cancer tumours from the TRACERx (TRAcking Cancer Evolution through therapy (Rx)) cohort.
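The difference between the 192-channel and 96-channel schemes comes down to which reference bases are allowed. In the sketch below, the function name `substitution_channels`, the label layout, and the use of the DNA letters A, C, G and T for both schemes are assumptions made for illustration.

```python
from itertools import product

def substitution_channels(ref_bases, alphabet="ACGT"):
    """Return labels like 'A[C>T]G' for every allowed reference base,
    every alternate base, and every pair of flanking bases."""
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref in ref_bases
        for alt in alphabet
        if alt != ref
        for five, three in product(alphabet, repeat=2)
    ]

# Strand known: all four reference bases are distinguishable (12 substitutions).
stranded = substitution_channels("ACGT")
# Strand unknown: purine references are folded onto their pyrimidine partner (6 substitutions).
collapsed = substitution_channels("CT")

print(len(stranded), len(collapsed))
```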

Experimental Signatures

Signatures from Experimental Exposures

The Experimental Signature collection captures the mutational processes from potential cancer risk factors which have been observed in experimental models. These mutational processes leave distinct imprints on the genome due to their unique chemistry and specific activities.

This collection provides the opportunity to observe the similarities and differences in the patterns of mutations resulting from a given exposure across different species and model organisms.

This is a complementary approach to the Human Cancer Signature collection. It could further our understanding of the accumulation of somatic mutations resulting from carcinogenic exposures and their contribution to the hallmarks of cancer.

Explore your own data

Data downloads

Download current COSMIC Mutational Signatures version 3.4 and previous releases here.

SigProfiler tools

The current set of mutational signatures has been extracted using SigProfiler, a compilation of publicly available bioinformatic tools addressing all the steps needed for signature identification. SigProfiler functionalities include mutation matrix generation from raw data and signature extraction, among others.
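To give a feel for what mutation matrix generation involves, here is a minimal standalone sketch that counts single base substitutions per trinucleotide context. It is not SigProfiler code and does not use the SigProfiler API; the input format (a reference trinucleotide plus an alternate base), the function names, and the three example mutations are all hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def sbs96_label(trinucleotide, alt):
    """Label a substitution by its pyrimidine-centred trinucleotide context."""
    if trinucleotide[1] in "AG":
        # Purine reference base: describe the same event on the opposite strand.
        trinucleotide = revcomp(trinucleotide)
        alt = revcomp(alt)
    five, ref, three = trinucleotide
    return f"{five}[{ref}>{alt}]{three}"

def count_contexts(mutations):
    """Count mutations per context for one sample (one column of a mutation matrix)."""
    return Counter(sbs96_label(tri, alt) for tri, alt in mutations)

# Hypothetical input: (reference trinucleotide, alternate base) pairs.
mutations = [("ACA", "T"), ("TGT", "A"), ("TTG", "C")]
counts = count_contexts(mutations)
print(counts)
```

The first two example mutations describe the same change seen from opposite strands, so they are counted in the same context.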

Versions

COSMIC Mutational Signatures version 3.4 is the latest release.

Version 3 was released as part of COSMIC release v89 (May 2019), updated to version 3.1 in COSMIC release v91 (June 2020), to version 3.2 in COSMIC release v93 (March 2021), to version 3.3 in COSMIC v95 (May 2022) and most recently version 3.4 in COSMIC v98.

Version 2 signatures (March 2015), which were part of earlier COSMIC releases, can still be consulted: