Frequently asked questions

If the explanations on this page do not answer your question, do not hesitate to contact us.
Please note that the help file provides extensive information on how to search and retrieve data, and on database annotations and contents.

Annotations
Mutations screened from RNA
Description of deletions and insertions

Data analysis
How can I perform custom analyses?
How to retrieve a specific p53 mutation frequency and residual activity?
Where can I find reports on database analyses?
How to retrieve the type of somatic mutations found in a specific type of cancer?
How to retrieve the TP53 status of a specific cell-line?
How to retrieve the summary of annotations contained in the database for a specific mutation?
How to retrieve the functional properties and biological activities of specific mutants?
How to retrieve the type of tumors in which a specific somatic mutation occurs?
How to retrieve the type of tumors associated with a specific germline mutation?
How to retrieve the frequency (prevalence) of mutated samples for specific types of cancer?

Web site
Where do I find reference sequences for TP53?

Database contents
Why are some mutations present in the somatic dataset not retrieved with the 'Mutation Validation' option?
Why are mutation numbers retrieved from the mutation prevalence and mutation spectrum datasets different?

 

Mutations screened from RNA

All mutations are annotated in the database at the genomic level. For mutations identified from RNA screening, annotations may not be accurate. For example, a mutation described as a deletion of exon 5 at the RNA level might in fact be a point mutation located in a splice site at the genomic level (inducing skipping of an exon). Note that this concerns only a small fraction of the data included in the database. You may exclude studies that screened RNA by using the 'Advanced search' option.

Description of deletions and insertions

The exact locations of deletions, insertions and complex variations are often poorly described in original reports (often reported at the codon level but not at the genomic level). Because we annotate mutations at the genomic level, annotations for these mutations are not precise. For example, if a deletion is described as a deletion of one nucleotide at codon 158, it is entered in the database as a deletion of the first nucleotide of codon 158, while it may in fact be the second or third nucleotide that is actually deleted.
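To make the ambiguity concrete, the short Python sketch below lists the three coding-sequence positions covered by a codon. It assumes the usual coding numbering in which the first base of codon 1 is c.1, and the helper name codon_to_cdna_positions is ours, not part of the database. It is an illustration only, not the procedure used to annotate entries.

```python
def codon_to_cdna_positions(codon: int) -> tuple[int, int, int]:
    """Return the three coding-sequence (c.) positions covered by a codon.

    Assumes c.1 is the first base of codon 1 (hypothetical helper).
    """
    if codon < 1:
        raise ValueError("codon numbers start at 1")
    first = 3 * (codon - 1) + 1
    return (first, first + 1, first + 2)


# A one-nucleotide deletion "at codon 158" could affect any of these positions;
# the database records the first one when the report gives no further detail.
positions = codon_to_cdna_positions(158)
print(positions)  # (472, 473, 474)
```

The same arithmetic gives c.637 as the first base of codon 213, which matches the example used further down this page.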

Data analysis

How can I perform custom analyses?

The analyses that can be performed with the web-based tools are limited. If you want to perform other types of analysis, you may download the dataset that you need. Data in the database are organized in different datasets that provide different types of information related to gene variations. All datasets can be downloaded.
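As a rough illustration of a custom analysis on a downloaded dataset, the Python sketch below tallies the values found in one column of a delimited text file. The column names (variant, tumor_type) and the three sample rows are invented placeholders; the actual file format and column names of the downloadable datasets should be taken from the help file.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows standing in for a downloaded dataset file.
SAMPLE = """variant,tumor_type
variant_1,tumor_A
variant_1,tumor_B
variant_2,tumor_B
"""


def tally(handle, column):
    """Count how many rows carry each value of the given column."""
    return Counter(row[column] for row in csv.DictReader(handle))


counts = tally(io.StringIO(SAMPLE), "tumor_type")
print(counts.most_common())  # [('tumor_B', 2), ('tumor_A', 1)]
```

To run this on a real download, replace io.StringIO(SAMPLE) with an opened file and adjust the column name and delimiter to match that file.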

How to retrieve a specific p53 mutation frequency and residual activity?

Select the "Database search"/"Mutation validation" option; select a specific mutation with available criteria; click on "Submit".

Where can I find reports on database analyses?

Several analyses of different datasets of the IARC TP53 database have been published. PubMed links are provided here.

How to retrieve the type of somatic mutations found in a specific type of cancer?

Select the "Database search"/"Mutation spectrum" option; select a specific type of tumor with available criteria; choose among the following display options:
- "Mutation pattern": display the proportion of all mutations classified by their nature (base change, insertions, deletions, ...) in the selected set of tumors (% shown is the number of mutations of each class divided by the total number of mutations selected);
- "Mutation effect": display the proportion of mutations classified according to their effect on protein sequence;
- "Codon distribution": display the proportion of single base substitutions at each codon position;
- "Function pattern": display the proportion of single amino-acid substitutions labeled according to their effect on protein sequence and activities, and ordered according to mutation rate estimates.
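The percentage shown for "Mutation pattern" is the count of each class divided by the total number of mutations selected. The Python sketch below reproduces that calculation on a made-up list of class labels; the function name and the input values are illustrative assumptions, not the web site's own code.

```python
from collections import Counter


def mutation_pattern(classes):
    """Return the percentage of mutations in each class (count / total * 100)."""
    counts = Counter(classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Made-up selection of four mutations, labeled by their nature.
selected = ["base change", "base change", "deletion", "insertion"]
pattern = mutation_pattern(selected)
print(pattern)  # {'base change': 50.0, 'deletion': 25.0, 'insertion': 25.0}
```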

How to retrieve the TP53 status of a specific cell-line?

Select the "Database search"/"Cell lines" option; enter the name or characteristics of a specific cell-line with available criteria; click on the "Submit" button.

How to retrieve the summary of annotations contained in the database for a specific mutation?

Select the "Database search"/"Mutation validation" option; select a specific mutation with available criteria; click on "Submit".

How to retrieve the functional properties and biological activities of specific mutants?

Select the "Database search"/"Function analysis" option; select (a) mutation(s) with available criteria; click on "Submit".

How to retrieve the type of tumors in which a specific somatic mutation occurs?

Select the "Database search"/"Tumor spectrum" option; enter the description of a specific mutation with available criteria; click on the "Somatic mutations" button.

How to retrieve the type of tumors associated with a specific germline mutation?

Select the "Database search"/"Tumor spectrum" option; enter the description of a specific mutation with available criteria; click on the "Germline mutations" button.

How to retrieve the frequency (prevalence) of mutated samples for specific types of cancer?

Select the "Database search"/"Mutation prevalence" option; select a specific type of cancer (and optionally a population and mutation detection method) with available criteria; click on the "Submit" button.

Web site

Where do I find reference sequences for TP53?

Reference sequences for TP53 gene and p53 protein can be found here.

Database contents

Why are some mutations present in the somatic dataset not retrieved with the 'Mutation Validation' option?

There may be two reasons for this: (1) Mutations retrieved with this option only include gene variations that are fully described, while in the dataset of somatic mutations some mutations are not fully described. (2) Somatic mutations may be reported in individuals with different SNP status. If a mutation is close to a SNP, it may have a different impact on the protein sequence depending on the SNP status. For example, the mutation c.637C>T, on the first base of codon 213, will result in a p.R213X change in the protein sequence if the SNP present on the third base of the codon is an A (CGA>TGA), while it will result in a p.R213W change if the SNP present on the third base of the codon is a G (CGG>TGG). Since most data and annotations presented on the mutation validation page are related to the reference sequence, the mutation validation tool only displays mutations described from the reference sequence. However, mutations not described from the reference sequence are included in the somatic dataset. Thus, in the example above, since the reference sequence contains an A at the third position of codon 213, the p.R213W mutation will be displayed on the FunctionPatternGraph but not with the 'Mutation Validation' option.
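The codon 213 example can be checked with a few lines of Python. The sketch below uses a four-entry excerpt of the standard genetic code (only the codons needed here) and a hypothetical helper, protein_change, to show how the same C>T substitution on the first base gives a different protein change depending on the third base. It is an illustration, not the logic used by the mutation validation tool.

```python
# Excerpt of the standard genetic code; "X" denotes a stop codon.
CODON_TABLE = {"CGA": "R", "CGG": "R", "TGA": "X", "TGG": "W"}


def protein_change(codon, base_index, new_base, codon_number):
    """Describe the protein change caused by substituting one base of a codon."""
    mutant = codon[:base_index] + new_base + codon[base_index + 1:]
    return f"p.{CODON_TABLE[codon]}{codon_number}{CODON_TABLE[mutant]}"


# c.637C>T changes the first base (index 0) of codon 213 from C to T.
for third_base in "AG":
    background = "CG" + third_base
    print(background, protein_change(background, 0, "T", 213))
# CGA p.R213X
# CGG p.R213W
```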

Why are mutation numbers retrieved from the mutation prevalence and mutation spectrum datasets different?

Data in the prevalence dataset are "independent" from data included in the somatic dataset. Numbers differ for the following reasons:
• We retrieve data from papers, and in many papers the only information that can be extracted is the total number of samples analyzed and the total number of samples mutated (mutations are not described in detail). Such mutations cannot be included in the somatic dataset. Thus, numbers in the prevalence table do not match numbers in the somatic dataset (mutation spectrum).
• Numbers by histology may also differ. For example, a paper may contain mutation details for lung ADC, SCC and LCC (which are all non-small cell lung cancers), while the total number of samples analyzed for each histology is not available. In this case, mutations corresponding to ADC, SCC and LCC will be entered in the somatic dataset, but in the prevalence table the prevalence will be indicated only for non-small cell carcinoma (the group that includes the 3 tumor types).
• Cell-lines are not included in the prevalence count.
• Samples with more than one mutation are counted once in the prevalence table, while all mutations are entered in the somatic dataset.
• The prevalence may be missing for some papers that describe mutations included in the somatic dataset. The prevalence dataset was added in 2001, whereas the database started in 1994, and not all papers have been reviewed. The non-reviewed papers correspond mainly to publications that describe fewer than 10 mutations (about 400 papers). For some papers, the prevalence could not be retrieved from the information provided in the publication (about 100 papers).
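The difference between counting mutations and counting mutated samples can be illustrated with a small Python sketch. The records and the number of samples analyzed below are made up, and the function name is hypothetical; the point is only that a sample with two mutations contributes two entries to a mutation count but one to the prevalence.

```python
# Made-up records: one entry per mutation, keyed by sample identifier.
somatic_rows = [
    ("sample_1", "mutation_a"),
    ("sample_1", "mutation_b"),  # second mutation in the same sample
    ("sample_2", "mutation_c"),
]
samples_analyzed = 10  # made-up total number of samples screened


def prevalence_percent(rows, n_analyzed):
    """Percentage of analyzed samples carrying at least one mutation."""
    mutated_samples = {sample for sample, _ in rows}
    return 100.0 * len(mutated_samples) / n_analyzed


mutation_count = len(somatic_rows)
prevalence = prevalence_percent(somatic_rows, samples_analyzed)
print(mutation_count)  # 3
print(prevalence)      # 20.0
```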