Data Submission
How to submit data?


  • Direct submission

  • Reporting in scientific journals

  • Reply to data requests



    You can actively contribute to the development of the IARC TP53 Database by submitting data directly to us at . At present, we only accept data that have been accepted for publication in peer-reviewed journals. Mutations must be described at the nucleotide level.

    To submit data, you can either send your own electronic file or use the IARC submission form: download form and instructions (2 Excel files and 1 Word file in a compressed zip format).

    From R13 onward, the reference genomic sequence used for TP53 is NC_000017 (7512445..7531642).

    A tool is available to help you check and format your data: click here.

    Sequence converter tool: an Excel table giving the genomic numbering of the TP53 gene in both the new (NC_000017-9) and old (X54156) reference sequences can be downloaded here.
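
    As a quick sanity check before submission, a genomic coordinate can be tested against the reference span quoted above. The sketch below is only an illustration: the function names are hypothetical, it assumes coordinates are 1-based with both ends inclusive, and it does not reproduce the conversion to the old X54156 numbering, which is only available through the downloadable table.

```python
# Span of the TP53 reference sequence on NC_000017, as quoted above.
TP53_START = 7512445
TP53_END = 7531642


def in_tp53_span(position: int) -> bool:
    """Return True if a genomic coordinate lies inside the quoted span.

    Assumption: coordinates are 1-based and both ends are inclusive.
    """
    return TP53_START <= position <= TP53_END


def offset_in_span(position: int) -> int:
    """Return the 1-based offset of a coordinate from the start of the span.

    Hypothetical helper for illustration only; this is not the database's
    own numbering scheme.
    """
    if not in_tp53_span(position):
        raise ValueError(f"{position} is outside {TP53_START}..{TP53_END}")
    return position - TP53_START + 1
```

    A coordinate that fails this check has probably been numbered against a different reference sequence and should be converted before submission.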


    Because the IARC TP53 Database policy is to only include mutation data that are published in peer-reviewed literature, the development of the database relies on the quality and accuracy of published records. When reviewing published papers for updates of the database, we encounter several problems:

    • editing errors: errors in the identification of codon numbers and base sequences are frequent;
    • duplicates: some series of samples are reported in several papers without reference to previous publications;
    • loss of data: many interesting studies on large series of patients are not included in the database because the publications do not provide detailed information on each mutation detected but instead report results as summary tables or graphs. More than 5000 mutations (20% of all reported data) could not be included in the database for this reason;
    • loss of information: information on patients (age, sex...) and samples (histology, grade, stage) is very often summarized in a descriptive table and cannot be captured in the database.

    To avoid these problems, we would recommend that authors:

    • (1) clearly identify each re-cited mutation to avoid redundancies in the database;
    • (2) provide unique identifiers for tumor samples;
    • (3) check codon and nucleotide numbers with the tool described above;
    • (4) provide a detailed description of the mutations as recommended above;
    • (5) when available, provide details on tumor samples (site, histology, grade, stage), patient characteristics (age, sex, ethnicity, country of origin, clinical records) and individual exposures to cancer risk factors (tobacco, alcohol, chemicals, etc.). If this information cannot be included in the original publication, it can be directly submitted to .

    Here is the kind of table format we recommend: see pdf
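
    Some of the recommendations above can be checked automatically before a table is sent. The following is a minimal sketch, not the database's own validation tool: the column names (`sample_id`, `codon`, `nucleotide_change`) and the function `check_rows` are hypothetical and do not reflect the layout of the official submission form. It flags rows with missing fields and mutations listed more than once for the same sample.

```python
from collections import Counter

# Hypothetical column names; the official form may use different ones.
REQUIRED_FIELDS = ("sample_id", "codon", "nucleotide_change")


def check_rows(rows):
    """Return a list of human-readable problems found in submission rows."""
    problems = []

    # Every row needs a sample identifier and a nucleotide-level description.
    for index, row in enumerate(rows, start=1):
        for field in REQUIRED_FIELDS:
            value = row.get(field)
            if value is None or not str(value).strip():
                problems.append(f"row {index}: missing {field}")

    # The same mutation reported twice for one sample is a likely duplicate.
    counts = Counter(
        (row.get("sample_id"), row.get("nucleotide_change")) for row in rows
    )
    for (sample_id, change), count in counts.items():
        if sample_id and change and count > 1:
            problems.append(f"sample {sample_id}: {change} listed {count} times")

    return problems


# Invented example rows, for illustration only.
example = [
    {"sample_id": "T1", "codon": 175, "nucleotide_change": "c.524G>A"},
    {"sample_id": "T1", "codon": 175, "nucleotide_change": "c.524G>A"},
    {"sample_id": "T2", "codon": 248, "nucleotide_change": ""},
]
for problem in check_rows(example):
    print(problem)
```

    A check of this kind only catches duplicates within a single table; mutations already reported in an earlier paper still need to be identified by the authors.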


    It is our policy to contact authors systematically to obtain their active collaboration in data collection.
    However, the response rate over the last 2 years has been only 25%.

    We estimate that at least 5000 mutations (around 20% of the data published) could not be entered in the database because no details on the mutations could be extracted from the original papers. These data are thus lost for meta-analysis.

    It would be most useful if publishers and editors could collaborate with the database and request that authors of accepted publications deposit their data in the IARC TP53 Database. A link to the IARC dataset could then be included in the publication as "supplementary data". This strategy would allow journals to avoid publishing long tables giving the details of all mutations reported.