The observation that microbial cell-free DNA (cfDNA) is present in biofluids has inspired new avenues for infectious disease testing. Recent studies have demonstrated the utility of cfDNA metagenomic sequencing of blood and urine to detect a wide range of pathogens that cause a variety of complications, including urinary tract infection, blood-borne infection, and deep-seated infection of tissues that would otherwise require invasive biopsies for diagnosis. The sensitivity of metagenomic cfDNA sequencing is partly determined by the efficiency of recovery of microbial vs host cfDNA. We reasoned that the choice of DNA isolation and library preparation methods would strongly affect the sensitivity of metagenomic cfDNA sequencing because the yield of DNA isolation and library preparation protocols depends on the physical length of the assayed DNA, and because microbial cfDNA is more fragmented than host-specific cfDNA. In this study, we characterized the fragment length biases inherent to select DNA isolation and library preparation procedures and developed a model to correct for these biases. Our study demonstrates that substantial gains in microbial and other short fragment recovery can be obtained by easy-to-implement changes in the sample preparation protocol and highlights the need for standardization in the liquid biopsy field.
Urine samples were collected via the conventional clean-catch midstream method used for standard urine culture. For the kidney transplant patient samples, approximately 50 mL of urine was centrifuged on the same day at 3000g for 30 min, and the supernatant was stored in 1 mL aliquots at −80°C. For the tuberculosis patient samples, 10 mL of urine was mixed with 2 mL Streck cell-free DNA urine preserve and centrifuged at 3000g for 30 min at ambient temperature. The supernatant was similarly stored in 1 mL aliquots at −80°C.
Additionally, we included 40 plasma samples from 6 individuals who received double-lung transplants at Stanford University Hospital; these samples were collected in the scope of a previous study (lung transplant recipients). Briefly, peripheral blood was collected in EDTA tubes and centrifuged at 16 000g for 10 min within 24 h after blood collection. Plasma was stored in 1 mL aliquots at −80°C.
In this study, we show that the measured fragment length distributions of urinary and plasma cfDNA and the recovery of microbial- and host-specific cfDNA depend on the choice of pre-analytical variables. Data distortions due to cfDNA isolation can be accounted for by transfer functions that, when applied to measured fragment length distributions, produce a common underlying fragment length distribution. Correction for fragment length biases yields a single distribution that is very short, with a mean fragment length <100 bp for both host- and microbe-specific cfDNA.
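To illustrate how a transfer function can be applied to a measured fragment length distribution, the following is a minimal sketch. It assumes the transfer function can be summarized as a per-length recovery efficiency and that correction amounts to dividing the measured counts by that efficiency and renormalizing; this simple form is an assumption for illustration, not necessarily the model used in this study. The function names and all numerical values are hypothetical.

```python
def correct_length_distribution(measured_counts, recovery_efficiency):
    """Estimate the underlying fragment length distribution.

    Assumption (illustrative only): each length bin was recovered with a
    known efficiency in (0, 1], so dividing the measured counts by the
    efficiency and renormalizing undoes the length bias.
    """
    if len(measured_counts) != len(recovery_efficiency):
        raise ValueError("counts and efficiencies must have the same length")
    if any(e <= 0 for e in recovery_efficiency):
        raise ValueError("recovery efficiencies must be positive")
    corrected = [c / e for c, e in zip(measured_counts, recovery_efficiency)]
    total = sum(corrected)
    if total == 0:
        raise ValueError("measured distribution is empty")
    return [c / total for c in corrected]


def mean_fragment_length(lengths, density):
    """Mean length (bp) of a normalized fragment length distribution."""
    return sum(length * d for length, d in zip(lengths, density))


# Hypothetical example: a kit that recovers short fragments poorly.
lengths = [50, 100, 150, 200]        # bin centers in bp (made up)
measured = [10, 40, 40, 10]          # measured fragment counts (made up)
efficiency = [0.1, 0.4, 0.8, 1.0]    # assumed recovery per length bin

measured_density = [c / sum(measured) for c in measured]
corrected_density = correct_length_distribution(measured, efficiency)

print(mean_fragment_length(lengths, measured_density))
print(mean_fragment_length(lengths, corrected_density))
```

In this toy example the corrected distribution shifts toward shorter fragments, which is the qualitative effect described above: kits that under-recover short fragments make the measured distribution appear longer than the underlying one.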
The performance and sensitivity of a metagenomic sequencing assay are directly correlated with the enrichment of microbial reads over host reads. The cfDNA isolation transfer functions readily account for these differences, with kits favoring shorter fragments recovering over 3-fold more clinically reported microbial reads per human read. However, the sequencing library preparation protocols do not tell a similar, straightforward story. Fragment length biases only partially account for practical differences in library preparation methods, suggesting that the physical configurations of cfDNA constitute another driving force. Because both single-stranded library preparation protocols outperformed the double-stranded library preparation assay in terms of microbial enrichment, we believe that these differences lie in the sensitivity of each assay to different DNA conformations. The double-stranded library preparation protocol is effective at capturing blunt-ended double-stranded fragments, like the synthetic sample used to characterize the transfer functions, but is insensitive to the gamut of forms that might be found in a cfDNA sample. Single-stranded library preparation methods are more sensitive to the full range of conformations present in cfDNA samples, which may contain nicks, fragments with overhangs, and single-stranded DNA. Our study also shows that single-stranded library preparation protocols are more sensitive to the highly fragmented and degraded DNA that composes much of the microbial fraction.
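The enrichment metric discussed above, microbial reads per human read, and the fold difference between two protocols can be computed as in the following minimal sketch. The function names and read counts are hypothetical and chosen only to illustrate the arithmetic; they are not data from this study.

```python
def microbial_per_human(microbial_reads, human_reads):
    """Microbial reads recovered per human (host) read."""
    if human_reads <= 0:
        raise ValueError("human_reads must be positive")
    return microbial_reads / human_reads


def fold_enrichment(ratio_test, ratio_reference):
    """Fold change in microbial enrichment of one protocol over another."""
    if ratio_reference <= 0:
        raise ValueError("reference ratio must be positive")
    return ratio_test / ratio_reference


# Hypothetical read counts for the same sample processed with two kits.
short_fragment_kit = microbial_per_human(microbial_reads=300, human_reads=1_000_000)
reference_kit = microbial_per_human(microbial_reads=100, human_reads=1_000_000)

print(fold_enrichment(short_fragment_kit, reference_kit))
```

Normalizing by human reads rather than by total reads keeps the comparison focused on the relative recovery of microbial vs host cfDNA, which is the quantity the isolation and library preparation choices affect.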
Our work underscores the importance of considering multiple biases introduced in the sample preparation workflow to achieve highly sensitive metagenomic cfDNA sequencing assays. It further demonstrates the need for standardization in the liquid biopsy field, particularly in cases where metagenomic cfDNA sequencing is used to guide clinical decisions or where the biophysical properties of cfDNA are used to inform diagnostic technology development. Our findings are relevant for cfDNA applications in prenatal testing and cancer screening, where differences in fragment lengths have been leveraged to improve diagnostic performance.