Database Statistics

R15, November 2010


The current version of the database is R15, released in November 2010.

The R15 release contains 27580 somatic mutations, 597 germline mutations, functional data on 2314 mutant proteins and TP53 gene status of 2263 cell-lines.

Here are some statistics on the database contents:

Somatic mutations (1)
Publication trends

Tumor site distribution of mutations

TP53 mutation prevalence

Type of mutations

Codon distribution of mutations

Prognostic value of somatic mutations

Cell-lines (included in dataset of somatic mutations)
Tumor site distribution

Type of mutations

Codon distribution of mutations

Germline mutations (2)
Publication trends

Type of mutations

Codon distribution

Tumors associated with TP53 germline mutations

Prevalence of TP53 germline mutations in selected cohorts

Functional properties
Dataset contents

(1) Statistics on previous releases of the somatic dataset:

| Database version | Release date | Mutations count | Ref. count | Last Ref_ID | Added ref. | Added mutations | Deleted mutations* | PubMed search** |
|---|---|---|---|---|---|---|---|---|
| R3 | - | 10411 | 1048 | 1075 | - | - | - | - |
| R4 | July 2000 | 14050 | 1320 | 1369 | 294 | 3798 | 159 | Jan 1998 - Apr 2000 |
| R5 | June 2001 | 15121 | 1412 | 1480 | 111 | 1459 | 388 | May - Dec 2000 |
| R6 | Jan 2002 | 16285 | 1485 | 1571 | 91 | 1549 | 385 | Jan - June 2001 |
| R7 | Sept 2002 | 17689 | 1599 | 1715 | 144 | 1477 | 73 | July 2001 - June 2002 |
| R8 | June 2003 | 18585 | 1680 | 1810 | 95 | 924 | 28 | July 2002 - Feb 2003 |
| R9 | July 2004 | 19809 | 1769 | 1921 | 111 | 1196 | 40 | March - Dec 2003 |
| R10 | July 2005 | 21587 | 1876 | 2055 | 107 | 1788 | 10 | Jan - Dec 2004 |
| R11 | Nov 2006 | 23544 | 1995 | 2221 | 120 | 2014 | 57 | Jan - Dec 2005 |
| R12 | Nov 2007 | 24810 | 2081 | 2349 | 86 | 1331 | 65 | Jan - Dec 2006 |
| R13*** | Nov 2008 | 24806 | 2081 | 2349 | - | - | 4 | - |
| R14 | Nov 2009 | 26597 | 2179 | 2483 | 98 | 1814 | - | Jan - Dec 2007 |
| R15**** | Nov 2010 | 27580 | 2218 | 2564 | 48 | 1021 | 38 | 2008 - 2009**** |

* Data may be deleted if they (1) correspond to duplicate entries or (2) contain errors. Publication of the same set of samples in different papers by the same authors is a serious problem that has led to duplicate entries in the database in the past. We now systematically search the database by author name to identify earlier entries that may correspond to the same dataset. We have also extensively reviewed the entire dataset to find and eliminate such duplicates. Despite these efforts, some duplicates may remain in the database, and identifying them is an ongoing task.
** Papers edited in PubMed at the indicated dates were searched with selected keywords and reviewed to extract relevant data.
*** The dataset of somatic mutations has not been updated.
**** The update only includes a selection of papers (see database developments).
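
The counts in the somatic table above can be cross-checked from one release to the next. The sketch below assumes that each release's mutation count should equal the previous count plus added mutations minus deleted mutations. That rule is an assumption made for illustration and is not stated by the database; the names `ROWS` and `unreconciled` are hypothetical, and "-" cells are treated as zero.

```python
# Rows copied from the somatic table: (version, mutations count, added, deleted).
# None stands for a "-" cell in the table.
ROWS = [
    ("R3", 10411, None, None),
    ("R4", 14050, 3798, 159),
    ("R5", 15121, 1459, 388),
    ("R6", 16285, 1549, 385),
    ("R7", 17689, 1477, 73),
    ("R8", 18585, 924, 28),
    ("R9", 19809, 1196, 40),
    ("R10", 21587, 1788, 10),
    ("R11", 23544, 2014, 57),
    ("R12", 24810, 1331, 65),
    ("R13", 24806, None, 4),
    ("R14", 26597, 1814, None),
    ("R15", 27580, 1021, 38),
]


def unreconciled(rows):
    """Return (version, expected, reported) where the assumed rule does not hold.

    Assumed rule (not from the source): count = previous count + added - deleted.
    """
    mismatches = []
    for (_, prev_count, _, _), (version, count, added, deleted) in zip(rows, rows[1:]):
        expected = prev_count + (added or 0) - (deleted or 0)
        if expected != count:
            mismatches.append((version, expected, count))
    return mismatches


if __name__ == "__main__":
    for version, expected, reported in unreconciled(ROWS):
        print(f"{version}: expected {expected}, table reports {reported}")
```

Under this assumed rule, every release reconciles except R9 (expected 19741, reported 19809) and R14 (expected 26620, reported 26597). The table does not explain these differences.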


(2) Statistics on previous releases of the germline dataset:

| Database version | Release date | Mutations count | Ref. count | Last Ref_ID | PubMed search** |
|---|---|---|---|---|---|
| R4 | July 2000 | 144 | 71 | 74 | Jan 1998 - Apr 2000 |
| R5 | June 2001 | 195 | 84 | 91 | May - Dec 2000 |
| R6 | Jan 2002 | 213 | 90 | 99 | Jan - June 2001 |
| R7 | Sept 2002 | 225 | 97 | 106 | July 2001 - June 2002 |
| R8 | June 2003 | 225 | 98 | 108 | - |
| R9 | July 2004 | 264 | 112 | 125 | July 2002 - Dec 2003 |
| R10 | July 2005 | 283 | 123 | 137 | Jan - Dec 2004 |
| R11 | Nov 2006 | 376 | 142 | 156 | Jan - Dec 2005 |
| R12 | Nov 2007 | 399 | 159 | 173 | Jan 2006 - June 2007 |
| R13 | Nov 2008 | 423 | 164 | 181 | Jul 2007 - July 2008 |
| R14 | Nov 2009 | 535 | 196 | 211 | Aug 2008 - Aug 2009 |
| R15 | Nov 2010 | 597 | 209 | 224 | Sep 2009 - Oct 2010 |

** Papers edited in PubMed at the indicated dates were searched with selected keywords and reviewed to extract relevant data.