NCBI To Rename Incorrect Genomes…

…microbiologists everywhere rejoice.

I spent part of last week at the ASM Conference on Rapid Next-Generation Sequencing and Bioinformatic Pipelines for Enhanced Molecular Epidemiologic Investigation of Pathogens, which for obvious reasons is also referred to as ASMNGS. Lots of good science (though I like science better at 9am, not 8am. Just saying).

Scott Federhen discussed NCBI’s microbial genomics taxonomy efforts (NCBI is the National Center for Biotechnology Information at NIH). The punchline is this: after consultation with external phylogenetics efforts, NCBI has instituted a policy of correcting genomes that are assigned to the wrong species (e.g., a Klebsiella genome is called an E. coli genome). There are a lot of reasons why genomes would be mislabelled, with the most common reasons being contamination (i.e., someone accidentally mixed together two species and then sequenced them), sample swaps (someone thought he was sequencing sample X, when he sequenced sample Y), data handling fuckups (that’s the highly technical term), or the person who submitted the genome incorrectly identified the genome and didn’t change the submission after sequencing*.

While NCBI is often thought of as a sequence repository (GenBank), it’s actually part of the National Library of Medicine, so changing erroneous genome submissions is a significant shift in policy: imagine if NCBI or NLM changed erroneous articles in PubMed**. That said, the submitters of the genomes are being contacted to inform them of the corrections.

This is a much-needed change. Many research groups as well as public health labs routinely use the genomes in GenBank as part of genomic-based surveillance. Having a few misnamed genomes within a species for which there are hundreds or thousands of genomes might not sound like much, but that can really screw up these systems in any number of ways***.

Personally, some of the things I work on have been hampered by this, so, from my perspective, as well as that of most microbiologists and bioinformaticians, this is a very good development.

*To get into the weeds, you can submit the metadata for a genome (what species it is, where it was isolated, etc.) before any sequencing has begun. Sometimes people fail to correct the metadata after the genome sequencing.

**There are ways to note retractions and for people to leave comments.

***A short, very incomplete, highly technical list of problems:

When trying to type strains by placing them in a genome phylogeny, you could end up misidentifying strains.
If you’re developing a typing system for a species (e.g., cgMLST), you don’t want to include data from a completely different species.
If you’re trying to find reference assemblies to improve your own genome assemblies, including incorrectly assigned taxa can screw things up.
Researchers asking basic research questions can chase after red herrings because they think they’ve found something unusual in species X, when… you don’t have species X.
