Archaeological Wood Species Identification through DNA Barcoding

Article information

J. Conserv. Sci. 2024;40(5):757-767
Publication date (electronic) : 2024 December 20
doi : https://doi.org/10.12654/JCS.2024.40.5.07
1School of Biological Sciences and Technology, College of Natural Sciences, Chonnam National University, Gwangju 61186, Korea
2Hangang Institute of Cultural Heritage, Bucheon 14502, Korea
3PennBIT Co. Ltd., Gwangju 61062, Korea
**Corresponding author E-mail: pennbit.korea@gmail.com Phone: +82-10-3640-2689
*Equally Contributed to this work
Received 2024 November 26; Revised 2024 December 13; Accepted 2024 December 17.

Abstract

Identifying wood species from archaeological artifacts provides crucial information for understanding ancient technological capabilities, resource utilization patterns, paleovegetation, environmental changes, and societal interactions. While traditional microscopic anatomical analysis and common genetic markers for general plants (rbcL, rpoB, matK, atp, and 18S) exhibit limitations in archaeological samples due to DNA degradation and contamination, this study employed the chloroplast trnL gene, which offers shorter target sequences, for wood identification. Metabarcoding analysis targeting the trnL (UAA) intron P6 loop was performed on ancient DNA extracted from wooden components of bronze artifacts (乙-shaped bronze implement, Tubular bronze implement, and Twin-bird Shaped Pommel) excavated from the early Iron Age Namyangju site. Using our classifier with a reference database constructed from NCBI GenBank and an expanded plant database, the analysis identified eight plant families (Asteraceae, Brassicaceae, Convolvulaceae, Marantaceae, Rosaceae, Fabaceae, Poaceae, and Amaryllidaceae). Notably, Rosaceae was detected in all samples (乙-shaped bronze implement: 33.05%, Tubular bronze implement 1: 3.17%, Tubular bronze implement 2: 29.09%, Twin-bird Shaped Pommel: 45.2%), suggesting the use of native woody Rosaceae species. This genetic approach successfully refined the identification of previously unspecified “broadleaf tree/hardwood” specimens to the family level, demonstrating its effectiveness as a complementary method in archaeological wood identification. The results not only provide insights into ancient wood resource utilization patterns but also establish a methodological framework for future archaeobotanical studies using ancient DNA analysis.

1. INTRODUCTION

Identifying wooden artifacts from archaeological sites provides crucial insights into ancient technological capabilities, resource utilization patterns, and cultural exchanges. Identifying wood species from archaeological remains is particularly significant because it reveals sophisticated woodworking techniques, specialized tool production, and trade networks during pivotal historical transitions (Haneca et al., 2009; Lo et al., 2018). This analysis becomes especially critical in East Asia when examining artifacts from the early Iron Age (500-1 B.C.), when technology and cultural exchange evolved rapidly across the Korean Peninsula (Yi, 2015).

Notably, various types of iron swords (especially iron swords with a Twin-bird Shaped Pommel), 乙-shaped bronze implements, Tubular bronze implements, iron swords with a Bronze Ring Pommel, iron daggers, and bronze rings were buried as grave goods at the Geumnam-ri site (37°35’N, 127°13’E) (Figures 1 and 2). The 乙-shaped bronze and Tubular bronze implements had previously been excavated only in the North Korean region, centered on the northwestern region, and are presumed to be Gojoseon-style horse tools. We estimated the age of the burials to be up to the 2nd century B.C. using AMS dates, chronologies of artifacts, and the grave structure. This represents one of the crucial discoveries revealing the process of iron culture diffusion into South Korea (Hangang Institute of Cultural Heritage, 2022).

Figure 1.

Archaeological site location and sampling process at the Namyangju study site. (A) Topographical map showing the location of the archaeological site (marked in red) in relation to surrounding geographical features and waterways. Aerial view of Namyangju Geumnam-ri archaeological site along the Bukhangang-river. The yellow box indicates the excavation location of bronze artifacts containing wooden components. (B) Documentation of wood sample collection process following contamination prevention protocols. The sampling procedure was conducted using tools to minimize environmental DNA contamination.

Figure 2.

Archaeological artifacts and sampling locations from the Namyangju site. Rows (A) and (B) present the 乙-shaped bronze implements and Tubular bronze implements, and the Twin-bird Shaped Pommel, respectively. (A) The excavation diagram of the burial site, associated artifacts including pottery vessels and bronze artifacts with rings, a schematic diagram showing the wooden sample locations (YBI_HW, TBI_HW_1, and TBI_HW_2) from the 乙-shaped bronze implement and Tubular bronze implements in red circles, and a photograph of the bronze implements. (B) The excavation diagram of the burial site, associated artifacts including pottery vessels and bronze implements, a schematic diagram showing the wooden sample location (TBSP_HW) from the Twin-bird Shaped Pommel in a red circle, and a photograph of the Twin-bird Shaped Pommel with wooden pieces.

After recovery, during the conservation process, the 乙-shaped bronze implements, Tubular bronze implements, and Twin-bird Shaped Pommel excavated from the Geumnam-ri site in Namyangju were found to have wood embedded inside them. A specialized wood conservation institution subsequently conducted a conservation analysis and evaluated the wood species. The analysis indicated that the wood of the 乙-shaped bronze implements and the Tubular bronze implements was broadleaf (hardwood), whereas the wood inside the Twin-bird Shaped Pommel belonged to the Sangsuri subgenus (oak, Quercus sp.) (Hangang Institute of Cultural Heritage, 2022).

Traditional wood species identification methods primarily rely on anatomical analysis through microscopic examination of transverse, radial, and tangential sections (Nilsson and Rowell 2012). While these approaches have provided valuable insights, they are limited when analyzing degraded archaeological specimens in which cellular structures may be compromised. Furthermore, the accuracy of morphological identification can be significantly affected by preservation conditions, post-depositional processes, and contamination from the surrounding burial environment (Blanchette 2000; Pedersen et al., 2016).

Because the species of the wood embedded in the 乙-shaped bronze implements and Tubular bronze implements is an important basis for determining whether the artifacts were suitable for use as horse tack, it remained necessary to identify the wood using an approach other than the traditional wood species identification method.

Recent advances in molecular biological techniques based on DNA sequence analysis offer new possibilities for species identification in archaeological woods (Gugerli et al., 2005; Laiou et al., 2013). The application of DNA barcoding, specifically targeting chloroplast genes such as rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit), matK (maturase K), and trnL (transfer RNA leucine), has emerged as a promising tool for species identification (Taberlet et al., 2007; Jiao et al., 2014). The P6 loop region in the trnL gene, with relatively short sequences (10-143 base pairs), has demonstrated particular effectiveness in analyzing degraded DNA from archaeological specimens, achieving successful amplification and allowing identification of all families and most genera (>75%) in samples older than 10,000 years (Sønstebø et al., 2010). Hence, we analyzed the wood species of the artifacts using DNA barcoding, which is, to our knowledge, the first such application in Korea.

2. MATERIALS AND METHODS

2.1. Wood samples and ancient DNA (aDNA) purification for next-generation sequencing

The Geumnam-ri archaeological site (37°37’N, 127°14’E) is located at 367-1 Geumnam-ri, Hwado-eup, Namyangju-si, Gyeonggi-do, on a natural embankment formation within the alluvial plain of the Bukhangang-river (Figure 1A). Four hardwood specimens, each measuring 2-3 mm³, were collected and stored at -80°C until DNA extraction (Figure 2). Initially, surface contaminants were removed by treating the wood samples with 5% sodium hypochlorite for 5 minutes, followed by rinsing three times with autoclaved water. The wood pieces were then disrupted mechanically using a bead-beater (MP-Bio, USA) and chemically in lysis buffer containing 10 mM Tris, 2 mM EDTA, and 1% SDS at 56°C for 30 minutes. The aDNA dissolved in the solution was subsequently purified using a commercial DNA extraction kit (QIAGEN, Hilden, Germany) following the manufacturer’s recommended procedure (Jiao et al., 2015). Before library preparation for sequencing, the quantity and quality of the wood aDNA were assessed using a Bioanalyzer (Agilent Technologies, Santa Clara, CA). All aDNA purification processes were performed in a laminar flow clean bench to prevent microbial contamination.

2.2. Library preparation, quality control, and DNA Sequencing

The sequencing libraries were prepared following the Illumina 16S Metagenomic Sequencing Library protocols to amplify the trnL region using trnL_F and trnL_R primers. Initial PCR amplification was performed using 10 ng of wood aDNA in a reaction mixture containing 5x reaction buffer, 1 mM of dNTP mix, 500 nM of each forward and reverse primer, and Herculase II fusion DNA polymerase (Agilent Technologies, Santa Clara, CA). The first PCR amplification consisted of initial denaturation at 95°C for 3 minutes, followed by 25 cycles of denaturation at 95°C for 30 seconds, annealing at 55°C for 30 seconds, and extension at 72°C for 30 seconds, with a final extension at 72°C for 5 minutes. The universal primers with Illumina adapter overhang sequences were: trnL_F amplicon PCR Forward Primer (5’-GGGYAATCCTGAGCCAAA-3’) and trnL_R amplicon PCR Reverse Primer (5’-CATTGAGTCTCTGCACCTATC-3’) (Taberlet et al., 2007).
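The forward primer contains the IUPAC degenerate base Y (C or T), so it stands for two concrete oligonucleotides. As a minimal sketch (the helper names `expand_degenerate` and `reverse_complement` are illustrative and not part of the study's pipeline), the primers listed above can be expanded and reverse-complemented as follows; the reverse complement is what one would search for on the forward strand of a reference sequence.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code (degenerate codes map to degenerate codes).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def expand_degenerate(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) nucleotide sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


# Primer sequences as given in the text (adapter overhangs not included).
TRNL_F = "GGGYAATCCTGAGCCAAA"
TRNL_R = "CATTGAGTCTCTGCACCTATC"

forward_variants = expand_degenerate(TRNL_F)
reverse_site = reverse_complement(TRNL_R)
print(forward_variants)
print(reverse_site)
```
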

The first PCR products were purified using AMPure beads (Agencourt Bioscience, Beverly, MA). Subsequently, 10 μL of the purified products underwent a second PCR amplification with NexteraXT Indexed Primers to construct the final library. The second PCR used the same thermal cycling conditions as the first but with only 10 cycles. The resulting products were purified again using AMPure beads. Library quantification was performed by qPCR using KAPA Library Quantification kits for Illumina Sequencing platforms. Quality assessment was conducted using the TapeStation D1000 ScreenTape (Agilent Technologies, Waldbronn, Germany). Final sequencing was performed on the MiSeq™ platform (Illumina, San Diego, USA).

2.3. Bioinformatic analysis of metabarcoding data

To establish a comprehensive reference database for plant species identification, we retrieved trnL sequences from NCBI GenBank using query parameters that targeted chloroplast sequences and excluded environmental and unverified samples. The retrieved sequences underwent initial dereplication using RESCRIPt (Robeson et al., 2021), after which the trnL g-h regions were extracted using both a PCR primer-pair search and Edgar’s method (Edgar 2010). This dual approach was implemented to maximize coverage, as the primer-pair search alone yielded limited data due to incomplete primer sequence information in GenBank entries. After thorough quality control and dereplication, the final curated reference database comprised 119,271 unique trnL g-h sequences representing 39,328 species.
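The primer-pair search can be thought of as an in silico PCR: locate the forward primer, locate the reverse complement of the reverse primer downstream, and keep what lies between them. The sketch below is a simplified illustration under stated assumptions: it requires exact primer matches (real tools tolerate mismatches), the function name `extract_amplicon` is hypothetical, and the toy record is invented for demonstration. It also shows why records lacking an intact primer site are missed by this search alone.

```python
import re

# Regex fragments for the IUPAC codes that occur in the two primers.
IUPAC_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]"}

TRNL_F = "GGGYAATCCTGAGCCAAA"       # forward primer from the text
TRNL_R = "CATTGAGTCTCTGCACCTATC"    # reverse primer from the text


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(IUPAC_REGEX[base] for base in primer)


def reverse_complement(seq):
    """Reverse complement for plain A/C/G/T sequences."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extract_amplicon(sequence, fwd=TRNL_F, rev=TRNL_R):
    """Return the region between the primer sites (primers excluded).

    Returns None when either primer site is absent from the record.
    """
    fwd_match = re.search(primer_to_regex(fwd), sequence)
    if fwd_match is None:
        return None
    rev_start = sequence.find(reverse_complement(rev), fwd_match.end())
    if rev_start == -1:
        return None
    return sequence[fwd_match.end():rev_start]


# Invented toy record: flank + forward primer + insert + reverse site + flank.
toy_insert = "TCCGTGTTTTGAGAAAACAAG"
toy_record = ("TTACG" + "GGGCAATCCTGAGCCAAA" + toy_insert
              + "GATAGGTGCAGAGACTCAATG" + "GCA")
truncated_record = toy_record[:-10]  # reverse primer site is incomplete

print(extract_amplicon(toy_record))
print(extract_amplicon(truncated_record))
```
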

Sequence analysis was performed within the QIIME2 version 2024.05 framework (Bolyen et al., 2019), beginning with primer trimming using Cutadapt (Martin, 2011). We ran two processing approaches in parallel to ensure robust analysis of the metabarcoding data. The denoising pipeline utilized both DADA2 (Callahan et al., 2016) and Deblur (Amir et al., 2017) plugins for quality filtering, denoising, and chimera removal, generating distinct ASV feature tables. Concurrently, we implemented a clustering-based approach using VSEARCH (Rognes et al., 2016), which processed the sequences through merging and filtering steps. To ensure high-quality clustering and minimize the impact of sequencing errors, we applied a stringent threshold of 99% sequence identity when clustering sequences against our trnL g-h reference dataset, resulting in an OTU feature table.
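For a barcode this short, a 99% identity threshold is very strict. The sketch below works through the arithmetic under one simplifying assumption: identity is taken as matching columns divided by alignment length for equal-length, ungapped sequences (VSEARCH supports several identity definitions, so this is an illustration, not a reproduction of its behaviour). The function names are hypothetical.

```python
def max_differences(alignment_length, identity_percent=99):
    """Largest number of mismatched columns that still meets the threshold.

    matches / length >= p / 100  is equivalent to
    differences <= length * (100 - p) / 100, rounded down.
    """
    return alignment_length * (100 - identity_percent) // 100


def meets_threshold(query, reference, identity_percent=99):
    """Compare two equal-length, ungapped sequences against the threshold."""
    if len(query) != len(reference):
        raise ValueError("this sketch only handles equal-length sequences")
    differences = sum(1 for q, r in zip(query, reference) if q != r)
    return differences <= max_differences(len(query), identity_percent)


# Allowed differences across the 10-143 bp length range reported in the text.
for length in (10, 50, 99, 100, 143):
    print(length, max_differences(length))
```

Under this definition, any sequence shorter than 100 bp must match its reference exactly, and even the longest P6 loop sequences (143 bp) tolerate a single difference.
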

We developed a Naive Bayes classifier for taxonomic classification using the q2-feature-classifier plugin (Bokulich et al., 2018), training it on our curated trnL g-h reference segment dataset. The reliability of the classifier was validated across all taxonomic levels from kingdom to species using standard performance metrics including precision, recall, and F-measure. We applied this trained classifier to both the ASV and OTU feature tables, implementing a stringent filtering criterion that retained only family-level assignments. The taxonomic assignments were then cross-referenced against the Korean Plant Names Index (http://www.nature.go.kr/kpni/SubIndex.do) to identify native wood plant species and provide a reliable framework for further plant biodiversity assessments.
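To make the three performance metrics concrete, the following sketch computes them at a single taxonomic level from paired expected and predicted labels. The definitions used here are an assumption chosen for illustration (precision over sequences that received an assignment, recall over all sequences); the evaluation in the study may count true and false positives differently. The labels are invented toy data.

```python
def classification_metrics(expected, predicted):
    """Precision, recall, and F-measure at one taxonomic level.

    `predicted` may contain None for sequences left unassigned at this level.
    Assumed definitions (illustrative only):
      precision = correct assignments / sequences with any assignment
      recall    = correct assignments / all sequences
      F-measure = harmonic mean of precision and recall
    """
    assigned = [(e, p) for e, p in zip(expected, predicted) if p is not None]
    correct = sum(1 for e, p in assigned if e == p)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy family-level labels: three correct, one wrong, one unassigned.
expected = ["Rosaceae", "Rosaceae", "Fabaceae", "Poaceae", "Asteraceae"]
predicted = ["Rosaceae", "Rosaceae", "Fabaceae", "Asteraceae", None]

precision, recall, f_measure = classification_metrics(expected, predicted)
print(round(precision, 3), round(recall, 3), round(f_measure, 3))
```
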

3. RESULTS

3.1. Initial analysis and sample characterization

The wooden components embedded within bronze artifacts excavated from the Namyangju site, specifically the 乙-shaped bronze implements, Tubular bronze implements, and Twin-bird Shaped Pommel (Figure 2), were subjected to comprehensive analysis. A specialized wood conservation institution had preliminarily classified these samples as hardwood by conventional morphological examination, and had suggested a more specific identification (Quercus sp.) for the pommel specimen only. The degraded condition of the wooden remains and their small sample sizes limited the reliability of traditional anatomical identification methods.

3.2. DNA analysis of archaeological wood samples

To overcome the limitations of conventional microstructure-based identification methods and to determine the precise taxonomic classification of these historically significant samples, we employed DNA metabarcoding analysis targeting the chloroplast trnL (UAA) intron P6 loop region.

Four distinct wood samples were analyzed:

Sample 1: 乙-shaped bronze implement hardwood (YBI_HW)

Sample 2: Tubular bronze implement hardwood 1 (TBI_HW_1)

Sample 3: Tubular bronze implement hardwood 2 (TBI_HW_2)

Sample 4: Twin-bird Shaped Pommel hardwood (TBSP_HW)

DNA extraction yields ranged from 1.06 to 1.3 ng/μL, totaling 60 to 78 ng. The trnL region was successfully amplified in all samples for next-generation sequencing.
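The reported concentrations and totals are linked by total mass = concentration × volume. As a rough check, the sketch below derives the eluate volume implied by the figures above and the volume needed to supply the 10 ng PCR input stated in the Methods. It assumes that the lowest concentration pairs with the lowest total and the highest with the highest, which the text does not state explicitly; the function names are illustrative.

```python
def implied_volume_ul(total_ng, concentration_ng_per_ul):
    """Eluate volume implied by a total yield and a concentration."""
    return total_ng / concentration_ng_per_ul


def template_volume_ul(input_ng, concentration_ng_per_ul):
    """Volume of extract needed to deliver a given DNA input."""
    return input_ng / concentration_ng_per_ul


# Values from the text; pairing low-with-low and high-with-high is assumed.
volume_low = implied_volume_ul(60, 1.06)    # roughly 57 uL
volume_high = implied_volume_ul(78, 1.3)    # roughly 60 uL

# Extract volume required for the 10 ng PCR input at each concentration.
template_at_low_conc = template_volume_ul(10, 1.06)   # roughly 9.4 uL
template_at_high_conc = template_volume_ul(10, 1.3)   # roughly 7.7 uL

print(round(volume_low, 1), round(volume_high, 1))
print(round(template_at_low_conc, 1), round(template_at_high_conc, 1))
```
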

3.3. Overview of metabarcoding analysis pipeline

The overall experimental workflow for plant DNA metabarcoding analysis is illustrated in Figure 3, encompassing sample collection, DNA extraction, barcode amplification, sequencing, and data analysis. The bioinformatic workflow (Figure 4) implemented four sequential steps. In the initial step, primer sequences were trimmed from over 500,000 raw sequencing reads using Cutadapt. Following this, two parallel approaches were implemented: a denoising pipeline utilizing DADA2 and Deblur, and a clustering pipeline using VSEARCH. The denoising pipeline generated 33 ASVs (456,079 reads) with DADA2 and 131 ASVs (10,089 reads) with Deblur, while the clustering approach resulted in 239 OTUs (367,918 reads). Subsequently, a Naive Bayes classifier was trained using the trnL g-h reference segment dataset for taxonomic classification of the identified ASVs and OTUs, providing assignments at the family level of the detected plant sequences.

Figure 3.

Schematic workflow of aDNA metabarcoding process from sampling at the archaeological site to bioinformatic analysis for species identification.

Figure 4.

Workflow diagram of the bioinformatics pipeline for taxonomic classification. The pipeline consists of two main processing streams: (1) Raw FASTQ data processing, which involves adapter and primer trimming followed by parallel denoising using DADA2/Deblur and clustering using VSEARCH methods; (2) Reference database preparation from NCBI GenBank, where trnL g-h sequences were extracted and used to train the Naive Bayes Classifier. These two streams converge for the final taxonomic classification step, where processed sequence data are classified using the trained classifier model. This pipeline integrates both sequence processing and machine learning approaches for accurate taxonomic assignment.

3.4. Performance evaluation of taxonomic classification

A comprehensive reference database was constructed for taxonomic classification, comprising 119,271 unique trnL g-h sequences representing 39,328 species (Figure 5A). The reference sequences ranged from 10 to 143 bp in length, consistent with previously reported trnL g-h region characteristics (Taberlet et al., 2007). The taxonomic composition of the reference database showed hierarchical distribution across different levels, from a single kingdom and phylum to 617 families, 10,216 genera, and 39,328 species. The performance evaluation of the trained Naive Bayes classifier (Figure 5B) revealed high accuracy (F-measure > 0.98) from kingdom to family level, while showing a notable decline in performance at genus and species levels. Based on this evaluation result, we conducted taxonomic classification up to the family level for subsequent analyses.

Figure 5.

Sequence length distribution and taxonomic classification performance analysis. (A) The violin plot shows the trnL g-h sequence length distribution in the reference database. (B) Performance evaluation of the taxonomic classification model across different taxonomic levels. The graph displays three metrics: precision (blue), recall (green), and F-measure (red).

3.5. Taxonomic composition and native woody plant identification

Analysis of taxonomic assignments across different pipelines revealed varying distributions of complete and incomplete taxa (Figure 6A). Complete taxa, defined as those with full classification to the family level, were selected for subsequent analyses, while incomplete taxa were excluded. The proportion of complete taxa was consistently distributed across different analysis methods (Table 1). Thus, using the complete taxa dataset, we examined the relative abundance of plant families across the four samples using different analysis pipelines (Figure 6B). Each pipeline revealed distinct taxonomic compositions, with YBI_HW showing five predominant families: Rosaceae, Asteraceae, Marantaceae, Brassicaceae, and Convolvulaceae. TBI_HW_1 was dominated by Amaryllidaceae and Convolvulaceae, with a minor Rosaceae presence. TBI_HW_2 contained three major families: Convolvulaceae, Rosaceae, and Asteraceae, while TBSP_HW displayed the most diverse composition with Rosaceae, Poaceae, Convolvulaceae, Asteraceae, and Fabaceae.
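The selection of complete taxa and the calculation of family-level relative abundance can be sketched as follows. The taxonomy string format with an `f__` family prefix, the feature identifiers, and the read counts are all assumptions made for illustration; they are not taken from the study's data.

```python
from collections import defaultdict


def family_of(taxonomy):
    """Return the family name from a rank-prefixed taxonomy string, or None.

    Assumes semicolon-separated ranks with an 'f__' prefix for family.
    A missing or empty family rank marks the assignment as incomplete.
    """
    for rank in taxonomy.split(";"):
        rank = rank.strip()
        if rank.startswith("f__") and len(rank) > 3:
            return rank[3:]
    return None


def family_relative_abundance(feature_counts, feature_taxonomy):
    """Percentage of reads per family, using complete taxa only."""
    totals = defaultdict(int)
    for feature_id, count in feature_counts.items():
        family = family_of(feature_taxonomy[feature_id])
        if family is not None:          # incomplete taxa are excluded
            totals[family] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {fam: 100.0 * n / grand_total for fam, n in totals.items()}


# Invented toy sample: asv4 lacks a family rank and is therefore dropped.
counts = {"asv1": 60, "asv2": 20, "asv3": 20, "asv4": 100}
taxonomy = {
    "asv1": "k__Viridiplantae; o__Rosales; f__Rosaceae",
    "asv2": "k__Viridiplantae; o__Fabales; f__Fabaceae",
    "asv3": "k__Viridiplantae; o__Rosales; f__Rosaceae",
    "asv4": "k__Viridiplantae; o__Poales",
}

abundance = family_relative_abundance(counts, taxonomy)
print(abundance)
```
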

Figure 6.

Taxonomic assignment profiling through trnL g-h metabarcoding analysis across four samples. (A) The stacked bar plots show the proportion of complete and incomplete taxa identified by DADA2, Deblur, and Closed Clustering methods. Each row represents different trnL samples (from top to bottom: YBI_HW, TBI_HW_1, TBI_HW_2, TBSP_HW). (B) Relative abundance distribution of plant families detected across the three analysis methods. As shown in the legend, the stacked bar plots display the composition of major plant families identified in each sample, with distinct color coding for different families. Each row corresponds to the same wood samples as in panel A. (C) Venn diagrams illustrating the overlap of unique ASVs detected using the three different methods for each sample. Numbers in each section represent shared or unique ASVs among the three approaches.

Table 1.

The proportion of complete and incomplete taxa

We focused on families consistently detected across all three analysis pipelines to identify potential woody plant species suitable for constructing the handles of bronze artifacts (Table 2, Figure 6C). These commonly identified families were cross-referenced with the Korean Plant Names Index database (http://www.nature.go.kr/kpni/index.do), which revealed that only two families contained substantial numbers of native woody plant species: Rosaceae (117 species) and Fabaceae (41 species). The other commonly detected families contained few or no native woody species: Asteraceae had 4, while Brassicaceae, Convolvulaceae, Marantaceae, Poaceae, and Amaryllidaceae had none. The high abundance of Rosaceae across multiple samples, particularly in YBI_HW and TBSP_HW, combined with its rich diversity of woody species, indicates its potential as a primary source of handle material for the bronze artifacts.
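The two filtering steps described above, keeping families found by every pipeline and then ranking them by the number of native woody species, amount to a set intersection followed by a lookup. In the sketch below the woody species counts are those quoted in the text, whereas the per-pipeline family sets and the cut-off of 10 species are invented for illustration.

```python
# Native woody species per family, as quoted in the text.
NATIVE_WOODY_SPECIES = {
    "Rosaceae": 117, "Fabaceae": 41, "Asteraceae": 4,
    "Brassicaceae": 0, "Convolvulaceae": 0, "Marantaceae": 0,
    "Poaceae": 0, "Amaryllidaceae": 0,
}


def consistent_families(*pipeline_results):
    """Families detected by every pipeline (set intersection)."""
    return set.intersection(*(set(r) for r in pipeline_results))


def woody_candidates(families, min_species=10):
    """Families with at least `min_species` native woody species, best first.

    The default cut-off is a hypothetical value, not one used in the study.
    """
    kept = [f for f in families if NATIVE_WOODY_SPECIES.get(f, 0) >= min_species]
    return sorted(kept, key=lambda f: NATIVE_WOODY_SPECIES[f], reverse=True)


# Hypothetical family detections for one sample under each pipeline.
dada2 = {"Rosaceae", "Asteraceae", "Convolvulaceae", "Fabaceae", "Poaceae"}
deblur = {"Rosaceae", "Convolvulaceae", "Fabaceae", "Poaceae"}
vsearch = {"Rosaceae", "Asteraceae", "Convolvulaceae", "Fabaceae",
           "Poaceae", "Brassicaceae"}

shared = consistent_families(dada2, deblur, vsearch)
candidates = woody_candidates(shared)
print(sorted(shared))
print(candidates)
```
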

Table 2.

The families consistently detected across all three pipelines

4. DISCUSSION

This study demonstrates the successful application of DNA metabarcoding to archaeological wood samples from the Namyangju early Iron Age site, overcoming some limitations of traditional morphological analysis. The predominance of Rosaceae in tool manufacturing, particularly in specialized implements such as the 乙-shaped bronze implements and Twin-bird Shaped Pommel, suggests deliberate selection based on specific mechanical properties. This indicates an advanced understanding of wood characteristics in early Iron Age Korean society.

The trnL gene metabarcoding approach, targeting short sequences (10-143 bp), proved particularly effective in obtaining meaningful results from degraded ancient DNA that has been environmentally compromised over extended periods (Taberlet et al., 2007; Wagner et al., 2018). This success aligns with previous findings showing the P6 loop region in the trnL gene to be effective for plant species identification in environmental samples (Rachmayanti et al., 2009; Jiao et al., 2014). Our comprehensive reference database, constructed from 119,271 unique trnL sequences in NCBI GenBank representing 39,328 species and supplemented with expanded plant databases, provides a broad basis for archaeological wood identification. Moreover, integrating data on Korean native plant species (http://www.nature.go.kr/kpni/SubIndex.do) extended the usefulness of the database to regional archaeological studies. This expanded database enabled more precise taxonomic identification than previous studies, particularly for specimens from the Korean Peninsula.

Analysis of wooden artifacts from the Namyangju early Iron Age site revealed evidence of sophisticated wood selection practices. Eight major families were identified (Asteraceae, Brassicaceae, Convolvulaceae, Marantaceae, Rosaceae, Fabaceae, Poaceae, and Amaryllidaceae), and Rosaceae exhibited a high detection rate in three of the four samples (29.09-45.2%, compared with 3.17% in Tubular bronze implement 1). The notably high proportions of Rosaceae in the 乙-shaped bronze implement (33.05%) and Twin-bird Shaped Pommel (45.2%) are particularly significant. The high detection rate of Rosaceae can be interpreted in terms of the physical characteristics of Rosaceae wood, geographical and environmental influences, or the culture and wood processing technology of the early Iron Age. Based on the list of native woody plants in Korea, these plants are likely to be Crataegus, Prunus, or Pyrus in the Rosaceae family, which have the strength and durability necessary for tool-making, thus providing valuable insights into early Iron Age woodworking expertise. The consistent preference for Rosaceae woods suggests a sophisticated understanding of wood properties and intentional selection practices during tool manufacturing (Parducci et al., 2005). Furthermore, the diversity of the eight identified plant families provides valuable insights into the early Iron Age environment of the Korean Peninsula, suggesting a complex forest ecosystem typical of temperate deciduous regions (Gugerli et al., 2005; Liepelt et al., 2006).

While successful at the family level, the current approach has limitations in providing precise identification at the genus or species level. These constraints primarily originate from DNA degradation, reference database limitations, and potential contamination from environmental DNA in burial sites. To address these challenges, future research should focus on expanding comprehensive reference databases, particularly for East Asian native woody plant species; implementing more stringent contamination prevention protocols; developing improved DNA extraction methods for degraded samples; and integrating multiple genetic markers complementary to trnL (Laiou et al., 2013). Moreover, recent developments in sequencing technologies, particularly the MinION sequencing platform (Oxford Nanopore Technologies), offer promising solutions for improving taxonomic resolution (Santos et al., 2020).

This research establishes a robust methodological framework for future studies on archaeological wood. Using aDNA analysis to identify wood species at the family level creates new possibilities for understanding historical wood selection practices, trade networks, and technological development in ancient societies (Akhmetzyanov et al., 2020; Deguilloux et al., 2006). Furthermore, this approach bridges traditional archaeological methods with modern molecular techniques, providing a more comprehensive understanding of ancient craftsmanship and resource utilization patterns.

5. CONCLUSION

Here, we successfully implemented a trnL-targeted metabarcoding analysis to identify wood from archaeological artifacts excavated from the Namyangju site in Korea, dating to the early Iron Age (2∼3 B.C.). Our approach, which utilizes the P6 loop region in the chloroplast trnL gene, proved effective for identifying degraded archaeological wood samples at the family level despite the challenges typically associated with ancient DNA analysis. The metabarcoding analysis of the trnL g-h region was particularly successful in identifying Rosaceae as the predominant family across multiple artifacts. While our trnL classifier showed robust performance at the family level, the current methodology has limitations for identification at the species level. Nevertheless, successful identification at the family level represents a significant advancement over traditional morphological analysis.

Acknowledgements

This research was supported by the Global-Learning & Academic research institution for Master’s⋅PhD students, and Postdocs (LAMP) Program of the National Research Foundation of Korea (NRF) grant funded by the Ministry of Education (No. RS-2024-00442775 to CP).

References

Akhmetzyanov L., Copini P., Sass-Klaassen U., Schroeder H., de Groot G.A., Laros I., Daly A.. 2020;DNA of centuries-old timber can reveal its origin. Scientific Reports 10(1)
Amir A., McDonald D., Navas-Molina J.E., Morton J.T., Thompson L.R., Hyde E.R., Gonzalez A., Knight R.. 2017;Deblur rapidly resolves single-nucleotide community sequence patterns. mSystems 2:e00191-16.
Blanchette R.A.. 2000;A review of microbial deterioration found in archaeological wood from different environments. International Biodeterioration & Biodegradation 46:189–204.
Bokulich N.A., Kaehler B.D., Rideout J.R.. 2018;Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6:90.
Bolyen E., Rideout J.R., Dillon M.R.. 2019;Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37:852–857.
Callahan B.J., McMurdie P.J., Rosen M.J., Han A.W., Johnson A.J.A., Holmes S.P.. 2016;DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods 13(7):581–583.
Deguilloux M.F., Bertel L., Celant A., Pemonge M.H., Sadori L., Magri D., Petit R.J.. 2006;Genetic analysis of archaeological wood remains: first results and prospects. Journal of Archaeological Science 33(9):1216–1227.
Edgar R.C.. 2010;Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26(19):2460–2461.
Gugerli F., Parducci L., Petit R.J.. 2005;Ancient plant DNA: review and prospects. New Phytologist 166(2):409–418.
Haneca K., Čufar K., Beeckman H.. 2009;Wood anatomy as a tool for archaeological wood identification: implementation and relevance. Journal of Archaeological Science 36:1–14.
Hangang Institute of Cultural Heritage. 2022. Excavation Report Vol. 117 Geumnam-ri Site Namyangju (2). Bucheon: p. 405–638. 405-423, 636-638.
Jiao L., Liu X., Jiang X., Yin Y.. 2015;Extraction and amplification of DNA from aged and archaeological Populus euphratica wood for species identification. Holzforschung 69(8):925–931.
Jiao L., Yin Y., Cheng Y.. 2014;DNA barcoding for identification of the endangered species Aquilaria sinensis: comparison of data from heated or aged wood samples. Holzforschung 68(4):487–494.
Laiou A., Aconiti M.L., Piredda R., Bellarosa R., Simeone MC.. 2013;DNA barcoding as a complementary tool for conservation and valorisation of forest resources. ZooKeys 365:197–213.
Liepelt S., Sperisen C., Deguilloux M.F.. 2006;Authenticated DNA from ancient wood remains. Annals of Botany 98(5):1107–1111.
Lo M.A., Balletti F., Pelosi C.. 2018;Wood in cultural heritage properties and conservation of historical wooden artefacts. European Journal of Science and Theology 14(2):161–171.
Martin M.. 2011;Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet Journal 17(1):10–12.
Nilsson T., Rowell R.. 2012;Historical wood - structure and properties. Journal of Cultural Heritage 13:5–9.
Parducci L., Suyama Y., Lascoux M.. 2005;Ancient DNA from pollen: a genetic record of population history in Scots pine. Molecular Ecology 14(9):2873–2882.
Pedersen M.W., Ruter A., Schweger C.. 2016;Postglacial viability and colonization in North America’s ice-free corridor. Nature 537(7618):45–49.
Rachmayanti Y., Leinemann L., Gailing O.. 2009;DNA from processed and unprocessed wood: Factors influencing the isolation success. Forensic Science International: Genetics 3(3):185–192.
Robeson M.S., O’Rourke D.R., Kaehler B.D., Ziemski M., Dillon M.R., Foster J.T.. 2021;RESCRIPt: reproducible sequence taxonomy reference database management. PLoS Computational Biology 17(11)
Rognes T., Flouri T., Nichols B., Quince C., Mahé F.. 2016;VSEARCH: a versatile open source tool for metagenomics. PeerJ 4:e2584.
Santos A.R., Barrientos L., Martinez-Urtaza J.. 2020;Computational methods for 16S metabarcoding studies using Nanopore sequencing data. Computational and Structural Biotechnology Journal 18:296–305.
Sønstebø J.H., Gielly L., Brysting A.K., Elven R., Edwards M., Haile J.. 2010;Using next-generation sequencing for molecular reconstruction of past Arctic vegetation and climate. Molecular Ecology Resources 10:1009–1018.
Taberlet P., Coissac E., Pompanon F., Gielly L., Miquel C., Valentini A.. 2007;Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding. Nucleic Acids Research 35:e14.
Wagner S., Lagane F., Seguin-Orlando A.. 2018;High-throughput DNA sequencing of ancient wood. Molecular Ecology 27(5):1138–1154.
Yi K.. 2015;Transition of ancient Korean wood technology during the Iron Age. Asian perspectives 54(1):185–206.

Article information Continued

Figure 1.

Archaeological site location and sampling process at the Namyangju study site. (A) Topographical map showing the location of the archaeological site (marked in red) in relation to surrounding geographical features and waterways. Aerial view of Namyangju Geumnam-ri archaeological site along the Bukhangang-river. The yellow box indicates the excavation location of bronze artifacts containing wooden components. (B) Documentation of wood sample collection process following contamination prevention protocols. The sampling procedure was conducted using tools to minimize environmental DNA contamination.

Figure 2.

Archaeological artifacts and sampling locations from the Namyangju site. Row (A) presents the 乙-shaped bronze implement and Tubular bronze implements; row (B) presents the Twin-bird Shaped Pommel. (A) Excavation diagram of the burial site; associated artifacts, including pottery vessels and bronze artifacts with rings; schematic diagram showing the wooden sample locations (YBI_HW, TBI_HW_1, and TBI_HW_2) on the 乙-shaped bronze implement and Tubular bronze implements in red circles; and photograph of the bronze implements. (B) Excavation diagram of the burial site; associated artifacts, including pottery vessels and bronze implements; schematic diagram showing the wooden sample location (TBSP_HW) on the Twin-bird Shaped Pommel in a red circle; and photograph of the Twin-bird Shaped Pommel with wooden pieces.

Figure 3.

Schematic workflow of aDNA metabarcoding process from sampling at the archaeological site to bioinformatic analysis for species identification.

Figure 4.

Workflow diagram of the bioinformatics pipeline for taxonomic classification. The pipeline consists of two main processing streams: (1) Raw FASTQ data processing, which involves adapter and primer trimming followed by parallel denoising using DADA2/Deblur and clustering using VSEARCH methods; (2) Reference database preparation from NCBI GenBank, where trnL g-h sequences were extracted and used to train the Naive Bayes Classifier. These two streams converge for the final taxonomic classification step, where processed sequence data are classified using the trained classifier model. This pipeline integrates both sequence processing and machine learning approaches for accurate taxonomic assignment.
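As a rough illustration of the classification step in this workflow, the sketch below shows the general idea of a naive Bayes sequence classifier. It is a toy version, not the trained classifier used in the study: it assumes overlapping k-mer counts as features (k = 4), uniform priors, and add-one smoothing, and the function names (`kmers`, `train`, `classify`), taxon labels, and sequences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=4):
    """Return all overlapping k-mers of a DNA sequence."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(references, k=4):
    """Count k-mers per taxon; `references` is a list of (taxon, sequence)."""
    counts = defaultdict(Counter)
    for taxon, seq in references:
        counts[taxon].update(kmers(seq, k))
    return counts


def classify(counts, query, k=4):
    """Return the taxon with the highest log-likelihood (uniform priors)."""
    vocab_size = 4 ** k  # number of possible DNA k-mers
    best_taxon, best_score = None, -math.inf
    for taxon, kmer_counts in counts.items():
        total = sum(kmer_counts.values())
        # Add-one smoothing keeps unseen k-mers from zeroing the likelihood.
        score = sum(
            math.log((kmer_counts[km] + 1) / (total + vocab_size))
            for km in kmers(query, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon, best_score


# Made-up toy sequences and labels, NOT real trnL g-h reference data.
toy_refs = [
    ("FamilyA", "ATCCGTATTATAGGAACAATAATTTTATTTTCTAGAAAAGG"),
    ("FamilyB", "GGGCAATCCTGAGCCAAATCCGTGTTTTGAGAAAACAAGGGG"),
]
model = train(toy_refs)
label, _ = classify(model, "ATCCGTATTATAGGAACAATAATT")
print(label)
```

In practice a confidence threshold would normally be applied so that weakly supported queries are left unassigned at deeper ranks; that step is omitted here for brevity.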

Figure 5.

Sequence length distribution and taxonomic classification performance analysis. (A) The violin plot shows the trnL g-h sequence length distribution in the reference database. (B) Performance evaluation of the taxonomic classification model across different taxonomic levels. The graph displays three metrics: precision (blue), recall (green), and F-measure (red).
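The three metrics in panel (B) can be illustrated with a small calculation. The sketch below assumes one common convention, which may differ from the exact definitions used for the figure: precision is the share of assigned sequences that are correct, recall is the share of all sequences that are correctly assigned, and the F-measure is their harmonic mean. The labels are invented, and `None` marks a sequence left unassigned at the evaluated rank.

```python
def precision_recall_f(true_labels, predicted_labels):
    """Precision, recall and F-measure at a single taxonomic rank.

    Assumed convention: a prediction of None means "unassigned".
    """
    total = len(true_labels)
    assigned = sum(1 for p in predicted_labels if p is not None)
    correct = sum(
        1 for t, p in zip(true_labels, predicted_labels) if p is not None and p == t
    )
    precision = correct / assigned if assigned else 0.0
    recall = correct / total if total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical family-level labels for five test sequences.
truth = ["Rosaceae", "Rosaceae", "Poaceae", "Fabaceae", "Asteraceae"]
predicted = ["Rosaceae", "Rosaceae", "Poaceae", "Poaceae", None]

p, r, f = precision_recall_f(truth, predicted)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Here three of the four assigned sequences are correct (precision 0.75) and three of the five sequences overall are recovered (recall 0.60).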

Figure 6.

Taxonomic assignment profiling through trnL g-h metabarcoding analysis across four samples. (A) Stacked bar plots showing the proportion of complete and incomplete taxa identified by the DADA2, Deblur, and closed clustering methods. Each row represents a different sample (from top to bottom: YBI_HW, TBI_HW_1, TBI_HW_2, TBSP_HW). (B) Relative abundance distribution of plant families detected by the three analysis methods. The stacked bar plots display the composition of major plant families identified in each sample, color-coded by family as shown in the legend. Rows correspond to the same samples as in panel A. (C) Venn diagrams illustrating the overlap of unique ASVs detected by the three methods for each sample. Numbers in each section represent ASVs shared among, or unique to, the three approaches.
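The counts in a three-way Venn diagram such as panel (C) follow from plain set operations. The sketch below is a minimal illustration, assuming each method's features carry comparable identifiers (for example, a hash of the sequence); the helper `venn3_regions` and the identifiers are hypothetical.

```python
def venn3_regions(a, b, c):
    """Return the size of each of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": len(a - b - c),
        "only_b": len(b - a - c),
        "only_c": len(c - a - b),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Hypothetical feature identifiers for one sample, one set per method.
dada2 = {"asv1", "asv2", "asv3", "asv4"}
deblur = {"asv2", "asv3", "asv5"}
vsearch = {"asv3", "asv4", "asv6", "asv7"}

regions = venn3_regions(dada2, deblur, vsearch)
for name, size in regions.items():
    print(name, size)
```

The seven region sizes always sum to the size of the union of the three sets, which is a quick consistency check on any such diagram.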

Table 1.

The proportion of complete and incomplete taxa

| Sample | Pipeline | Complete taxa (n, %) | Incomplete taxa (n, %) |
|---|---|---|---|
| YBI_HW | DADA2 | 59,834 (56.09%) | 46,844 (43.91%) |
| YBI_HW | Deblur | 1,086 (52.59%) | 979 (47.41%) |
| YBI_HW | VSEARCH (closed clustering) | 50,562 (52.67%) | 45,438 (47.33%) |
| TBI_HW_1 | DADA2 | 90,834 (79.35%) | 23,635 (20.65%) |
| TBI_HW_1 | Deblur | 1,472 (53.08%) | 1,301 (46.92%) |
| TBI_HW_1 | VSEARCH (closed clustering) | 72,123 (91.35%) | 6,828 (8.65%) |
| TBI_HW_2 | DADA2 | 44,786 (37.24%) | 75,467 (62.76%) |
| TBI_HW_2 | Deblur | 713 (27.86%) | 1,846 (72.14%) |
| TBI_HW_2 | VSEARCH (closed clustering) | 37,674 (40.14%) | 56,185 (59.86%) |
| TBSP_HW | DADA2 | 64,286 (56.06%) | 50,393 (43.94%) |
| TBSP_HW | Deblur | 1,213 (45.06%) | 1,479 (54.94%) |
| TBSP_HW | VSEARCH (closed clustering) | 53,658 (54.14%) | 45,450 (45.86%) |

Taxonomic classification results for four samples processed using three pipelines. The table reports the number of reads assigned to complete and incomplete taxa and their relative proportions (%).
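The percentages in Table 1 are each count divided by the sum of the complete and incomplete counts for that sample and pipeline. A minimal sketch of the arithmetic, using the YBI_HW DADA2 row from the table (the function name `proportions` is only illustrative):

```python
def proportions(complete, incomplete):
    """Return (complete %, incomplete %) rounded to two decimals."""
    total = complete + incomplete
    return (
        round(100 * complete / total, 2),
        round(100 * incomplete / total, 2),
    )


# YBI_HW, DADA2: 59,834 complete and 46,844 incomplete (Table 1).
complete_pct, incomplete_pct = proportions(59834, 46844)
print(complete_pct, incomplete_pct)
```

For this row the total is 106,678, which gives the 56.09% and 43.91% reported in the table.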

Table 2.

Families consistently detected across all three pipelines

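| Sample | Consistently detected taxa | Relative abundance (%) |
|---|---|---|
| YBI_HW | Asteraceae | 19.69 |
| YBI_HW | Brassicaceae | 15.65 |
| YBI_HW | Convolvulaceae | 12.96 |
| YBI_HW | Marantaceae | 17.79 |
| YBI_HW | Rosaceae | 33.05 |
| TBI_HW_1 | Amaryllidaceae | 49.23 |
| TBI_HW_1 | Convolvulaceae | 47.6 |
| TBI_HW_1 | Rosaceae | 3.17 |
| TBI_HW_2 | Asteraceae | 27.38 |
| TBI_HW_2 | Convolvulaceae | 41.32 |
| TBI_HW_2 | Rosaceae | 29.09 |
| TBSP_HW | Asteraceae | 3.41 |
| TBSP_HW | Convolvulaceae | 23.04 |
| TBSP_HW | Fabaceae | 2.67 |
| TBSP_HW | Poaceae | 24.93 |
| TBSP_HW | Rosaceae | 45.2 |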

Consistently detected taxa and their relative abundances in each of the four samples. Consistently detected taxa are those observed in all three pipelines (DADA2, Deblur, and VSEARCH). Relative abundance (%) is reported from the DADA2 pipeline.