Nucleotide sequence reagents such as shRNA and siRNA reagents and RT-PCR primers represent verifiable experimental reagents, because their identities can be independently checked and compared with associated text descriptors. Although published nucleotide sequences are commonly assumed to be correct, we have reported that wrongly identified nucleotide sequence reagents are frequent in highly similar human gene knockdown studies performed in cancer cell lines (1, 2). We therefore developed a semi-automated fact-checking tool, Seek & Blastn, to verify the targeting or non-targeting status of published nucleotide sequence reagents (2).
We used Seek & Blastn (2) to screen literature corpora identified by searching PubMed and Google Scholar with defined keywords and/or by PubMed similarity searches. The single gene knockdown corpus includes 174 papers across 83 journals that describe the results of knocking down 17 different human genes in cancer cell lines. We also screened 50 publications that examine the function of human miR-145 in cancer cell lines, and 100 papers that analyse different human genes in combination with either cisplatin or gemcitabine treatment of cancer cell lines. These 150 papers were published across 70 journals. Application of Seek & Blastn with manual verification of results demonstrated that 19% (117/631) of verified sequences were wrongly identified, and that these occurred in 44% (77/174) of single gene knockdown papers. Similarly, 8% (165/2116) of verified sequences were wrongly identified, and these occurred in 58% (87/150) of miR-145/gene + chemotherapy papers. Whereas 51% (60/117) of incorrect sequences were repeatedly identified across the 77 single gene knockdown papers, only 11% (18/165) of incorrect sequences were repeatedly identified across the 87 miR-145/gene + chemotherapy papers. In summary, wrongly identified nucleotide sequence reagents can be identified within a subset of gene-focussed cancer research publications across many different journals, raising concerns about the reliability of such publications and their possible influence on future cancer research. The overall predominance of unique incorrect sequence reagents highlights the ongoing need for unbiased literature screening tools such as Seek & Blastn.
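The proportions reported above can be recomputed from the stated counts. The following is a minimal Python sketch for readers who wish to check the arithmetic; it is not part of Seek & Blastn, and the names `REPORTED`, `percent`, `check_all` and `pooled_unique_percent` are hypothetical. The pooled share of non-repeated incorrect sequences is a derived figure that is not stated in the text, and it assumes that the two corpora can simply be added together.

```python
# Counts copied from the text: (numerator, denominator, reported percentage).
REPORTED = {
    "wrong sequences, single gene knockdown": (117, 631, 19),
    "affected papers, single gene knockdown": (77, 174, 44),
    "wrong sequences, miR-145/gene + chemotherapy": (165, 2116, 8),
    "affected papers, miR-145/gene + chemotherapy": (87, 150, 58),
    "repeated wrong sequences, single gene knockdown": (60, 117, 51),
    "repeated wrong sequences, miR-145/gene + chemotherapy": (18, 165, 11),
}


def percent(numerator, denominator):
    """Return the proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


def check_all(reported=REPORTED):
    """Map each label to True if the recomputed percentage matches the text."""
    return {
        label: percent(num, den) == stated
        for label, (num, den, stated) in reported.items()
    }


def pooled_unique_percent():
    """Derived figure (assumption: corpora pooled by simple addition).

    Share of incorrect sequences that were NOT repeatedly identified,
    combining both corpora.
    """
    unique = (117 - 60) + (165 - 18)
    total = 117 + 165
    return percent(unique, total)


if __name__ == "__main__":
    for label, ok in check_all().items():
        print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
    print(f"pooled non-repeated incorrect sequences: {pooled_unique_percent()}%")
```

Under these assumptions, every reported percentage agrees with its counts after rounding, and the pooled calculation is in line with the statement that unique incorrect sequences predominate overall.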