DNA metabarcoding (also called amplicon sequencing) is a common approach used to sequence a short standardized region of DNA from the entire sample. This approach is designed to target broad taxonomic groups, rather than individual species, in order to document entire communities of fish, mammals, or marine invertebrates. For example, when a fish metabarcoding approach is used, the resulting data will provide a list of all fish species present in each eDNA sample.
When to use this approach
DNA metabarcoding is used when the objective is to generate broad biodiversity data of all the species present in each sample, rather than focussing on one particular species of interest. This can provide excellent baseline biodiversity data or be used as a part of a long-term monitoring effort to track changes in biodiversity through time.
Input
The starting material for DNA metabarcoding is a standard environmental DNA sample that can be collected from water, soil, air, etc. This approach assumes that there is eDNA from many different organisms present in the sample. For example, a sample of seawater will contain DNA from all the fish species living in the area around where the sample was collected.
Output
The output will consist of a large file of DNA sequences for each sample, corresponding to all the species that were present (see Section 4a). Using bioinformatics pipelines, these raw sequences can be compared against a large reference database to determine which species each sequence belongs to. The final output file is a matrix of Samples x Species, indicating how many DNA sequences from each species were present in each sample (see Section 4a).
Pros
The main advantage of this approach is that it can generate community-level data for all species present in a sample. This is ideal for generating baseline biodiversity data or comparing biodiversity across different locations or through time.
Many studies have compared the results of DNA metabarcoding to conventional visual biodiversity surveys. Generally the eDNA metabarcoding approach identifies more species than a conventional visual survey, can be more rapid and cost effective, and removes the need for expertise in visually identifying the species present.
Cons
Metabarcoding data does have some limitations:
False Negatives
Because this approach simultaneously sequences DNA from many different species, more abundant species will be more common in the data. This means that rare or very small species can be missed. A species present in the environment but not detected in the eDNA data is called a false negative result. As with any detection method (including visual) there will always be some species that are missed during the survey. If the detection of rare species is a priority, then there are different approaches that can be used to increase the probability of detection.
False negatives can also occur in eDNA metabarcoding data when a particular species successfully sequenced, but there is no entry for that species in the reference data base used to decode the DNA sequences. In that case the species, while present, will be listed as ‘unknown’ in the data.
False Positives
It is also possible to get a positive detection for a species that was not actually present in the sample. This is called a false positive result and can occur when the data analysis incorrectly assigns a species to a sequence during processing. These false positive detections can be fixed during the quality control of the data.
Summary
Despite some potential limitations, eDNA metabarcoding represents an efficient cost effective tool for generating large amounts of community-wide data with very little sampling effort. By combining different metabarcoding tools this approach can be used to document many different communities of organisms including fish, aquatic invertebrates, or mammals.
Please see section 3b for an overview of the alternative approach (qPCR), which is used to target a single species of interest.