TY - JOUR
T1 - Probabilistic characterisation of baseline noise in STR profiles
AU - Mönich, Ullrich J.
AU - Duffy, Ken
AU - Médard, Muriel
AU - Cadambe, Viveck
AU - Alfonse, Lauren E.
AU - Grgicak, Catherine
N1 - Publisher Copyright:
© 2015 Elsevier Ireland Ltd. All rights reserved.
PY - 2015/7/27
Y1 - 2015/7/27
N2 - There are three dominant contributing factors that distort short tandem repeat profile measurements, two of which, stutter and variations in the allelic peak heights, have been described extensively. Here we characterise the remaining component, baseline noise. A probabilistic characterisation of the non-allelic noise peaks is not only inherently useful for statistical inference but is also significant for establishing a detection threshold. We do this by analysing the data from 643 single-person profiles for the Identifiler Plus kit and 303 for the PowerPlex 16 HS kit. This investigation reveals that although the dye colour is a significant factor, it is not sufficient to have a per-dye colour description of the noise. Furthermore, we show that on a per-locus basis, out of the Gaussian, log-normal, and gamma distribution classes, baseline noise is best described by log-normal distributions, and we provide a methodology for setting an analytical threshold based on that deduction. In the PowerPlex 16 HS kit, we observe evidence of significant stutter at two repeat units shorter than the allelic peak, which has implications for the definition of baseline noise and signal interpretation. In general, the DNA input mass has an influence on the noise distribution. Thus, it is advisable to study noise and, consequently, to infer quantities like the analytical threshold from data with a DNA input mass comparable to the DNA input mass of the samples to be analysed.
UR - http://www.scopus.com/inward/record.url?scp=84937861050&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937861050&partnerID=8YFLogxK
U2 - 10.1016/j.fsigen.2015.07.001
DO - 10.1016/j.fsigen.2015.07.001
M3 - Article
C2 - 26218981
AN - SCOPUS:84937861050
SN - 1872-4973
VL - 19
SP - 107
EP - 122
JO - Forensic Science International: Genetics
JF - Forensic Science International: Genetics
ER -