E-Poster Presentation 33rd Lorne Cancer Conference 2021

Wrongly identified nucleotide sequence reagents within the gene-focussed cancer research literature (#135)

Yasunori Park 1, Rachael A West 2,3, Pranujan Pathmendra 4, Bertrand Favier 5, Amanda Capes-Davis 3,6, Guillaume Cabanac 7, Cyril Labbé 8, Jennifer A Byrne 1,9
  1. School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW, Australia
  2. Children's Cancer Research Unit, Kids Research Institute, The Children's Hospital at Westmead, Westmead, NSW, Australia
  3. Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
  4. School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW
  5. Univ. Grenoble Alpes, Team GREPI, Etablissement Français du Sang, La Tronche, France
  6. CellBank Australia, Children’s Medical Research Institute, Westmead, NSW, Australia
  7. Computer Science Department, IRIT UMR 5505 CNRS, University of Toulouse, Toulouse, France
  8. Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
  9. NSW Health Pathology, Camperdown, NSW, Australia

Nucleotide sequence reagents such as shRNA and siRNA reagents and RT-PCR primers are verifiable experimental reagents, because their identities can be independently checked and compared with the associated text descriptors. Although published nucleotide sequences are commonly assumed to be correct, we have reported that wrongly identified nucleotide sequence reagents are frequent in highly similar human gene knockdown studies performed in cancer cell lines (1, 2). We therefore developed a semi-automated fact-checking tool, Seek & Blastn, to verify the targeting or non-targeting status of published nucleotide sequence reagents (2).

We have employed Seek & Blastn (2) to screen different literature corpora, identified by searching PubMed and Google Scholar using defined keywords and/or by PubMed similarity searches. The single gene knockdown corpus includes 174 papers across 83 journals that describe the results of knocking down 17 different human genes in cancer cell lines. We also screened 50 publications that examine the function of human miR-145 in cancer cell lines, and 100 papers that analyse different human genes in combination with either cisplatin or gemcitabine treatment of cancer cell lines. These 150 papers were published across 70 journals. Application of Seek & Blastn with manual verification of results demonstrated that 19% (117/631) of verified sequences were wrongly identified, affecting 44% (77/174) of single gene knockdown papers, and that 8% (165/2116) of verified sequences were wrongly identified, affecting 58% (87/150) of miR-145/gene + chemotherapy papers. Whereas 51% (60/117) of incorrect sequences were repeatedly identified across the 77 single gene knockdown papers, only 11% (18/165) of incorrect sequences were repeatedly identified across the 87 miR-145/gene + chemotherapy papers. In summary, wrongly identified nucleotide sequence reagents can be identified within a subset of gene-focussed cancer research publications across many different journals, raising concerns about the reliability of such publications and their possible influence on future cancer research. The overall predominance of unique incorrect sequence reagents highlights the ongoing need for unbiased literature screening tools such as Seek & Blastn.
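The proportions reported above can be recomputed from the stated counts. The short Python sketch below does only that arithmetic; it is not part of Seek & Blastn. The labels and the `percent` helper are illustrative names, and the final pooled figure assumes that the incorrect sequences in the two corpora do not overlap, which the abstract does not state.

```python
# Counts copied from the abstract: (numerator, denominator) for each reported proportion.
counts = {
    "knockdown corpus, wrongly identified sequences": (117, 631),
    "knockdown corpus, affected papers": (77, 174),
    "miR-145/gene + chemotherapy corpus, wrongly identified sequences": (165, 2116),
    "miR-145/gene + chemotherapy corpus, affected papers": (87, 150),
    "knockdown corpus, repeated incorrect sequences": (60, 117),
    "miR-145/gene + chemotherapy corpus, repeated incorrect sequences": (18, 165),
}


def percent(numerator, denominator):
    """Return the proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


for label, (n, d) in counts.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}%")

# Assumption: pooling both corpora and treating every non-repeated incorrect
# sequence as unique; the abstract reports the two corpora separately.
total_incorrect = 117 + 165
repeated = 60 + 18
unique_share = percent(total_incorrect - repeated, total_incorrect)
print(f"unique incorrect sequences (pooled): {unique_share}%")
```

Under that pooling assumption, roughly 72% (204/282) of incorrect sequences were not repeated, which is consistent with the stated predominance of unique incorrect sequence reagents.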

  1. Byrne, J.A., & Labbé, C. (2017). Striking similarities between publications from China describing single gene knockdown experiments in human cancer cell lines. Scientometrics, 110, 1471-1493.
  2. Labbé, C., Grima, N., Gautier, T., Favier, B., & Byrne, J.A. (2019). Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: the Seek & Blastn tool. PLOS ONE, 14(3), e0213266.