GWPD: A Multifunctional Platform to Unravel Biological Risk Factors in Global Engineered Water Systems

Aijie Wang , Fang Huang , Wenxiu Wang , Yanmei Zhao , Yiyi Su , Zelin Lei , Rui Gao , Yu Tao , Jun Wei , Haoyi Cheng , Jinsong Liang , Bin Liang , Jianhua Guo , Jiping Jiang , Lu Fan , Shu-Hong Gao

Engineering ›› 2025, Vol. 51 ›› Issue (8) : 270 -278.

PDF (4268KB)
Engineering ›› 2025, Vol. 51 ›› Issue (8) :270 -278. DOI: 10.1016/j.eng.2024.04.022
Research
research-article
GWPD: A Multifunctional Platform to Unravel Biological Risk Factors in Global Engineered Water Systems
Author information +
History +
PDF (4268KB)

Abstract

Engineered water systems such as wastewater treatment plants (WWTPs) are potential reservoirs of various biological risk factors (BRFs), including pathogens, antibiotic resistance genes (ARGs), and virulence factors (VFs). Currently, a BRF database relevant to engineered water systems on a global geographic scale is lacking. Here, we present the Global Wastewater Pathogen Database (GWPD), an online database that provides information on the diversity, abundance, and distribution of BRFs from 1302 metagenome samples obtained from 186 cities, 68 countries, and six continents. We sorted these samples into six types: sewer networks, influent, anoxic activated sludge, oxic activated sludge, effluent, and receiving/natural waters. In total, 476 pathogens, 442 ARGs, and 246 VFs were identified. As a multifunctional database, GWPD provides an interactive visualization of these BRFs in a world map, an information retrieval interface, and an online one-click service for BRF annotation from metagenome sequencing data. GWPD is built on a web service framework, which can be readily extended to future versions of GWPD by adding more functional modules and connecting to other data sources, such as epidemic databases, to support risk assessment and control in the context of “One Health.”

Graphical abstract

Keywords

Pathogen / Antibiotic resistance genes / Biological risk factors / Metagenome / Engineered water system / Wastewater treatment

Cite this article

Download citation ▾
Aijie Wang, Fang Huang, Wenxiu Wang, Yanmei Zhao, Yiyi Su, Zelin Lei, Rui Gao, Yu Tao, Jun Wei, Haoyi Cheng, Jinsong Liang, Bin Liang, Jianhua Guo, Jiping Jiang, Lu Fan, Shu-Hong Gao. GWPD: A Multifunctional Platform to Unravel Biological Risk Factors in Global Engineered Water Systems. Engineering, 2025, 51(8): 270-278 DOI:10.1016/j.eng.2024.04.022

登录浏览全文

4963

注册一个新账户 忘记密码

1. Introduction

Diverse biological risk factors (BRFs), including pathogenic microorganisms, antibiotic resistance genes (ARGs), and virulence factors (VFs), have been frequently detected in engineered water systems such as wastewater treatment plants (WWTPs) [1], [2], [3]. Even though WWTP effluents are usually disinfected, disinfection-resistant pathogens are discharged to natural water bodies or transmitted to atmosphere environments, causing health risks to human communities [4], [5], [6]. For example, the bacterial strains Pseudomonas aeruginosa PAO1 and Escherichia coli (E. coli) CGMCC1.3373 are chlorine-resistant [7], and tetracycline-resistant Enterobacter-1 shows high tolerance to ultraviolet disinfection [8]. The tracing of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in sewer networks and WWTP effluents during the coronavirus disease 2019 (COVID-19) pandemic further highlighted the nonnegligible biological threat in engineered water environments [9], [7]. As there is an increasing demand for the use of reclaimed wastewater (i.e., treated effluent water) as a secondary water resource to address the water crisis [10], there is an urgent need to understand the occurrence and dispersal patterns of BRFs in global engineered water systems to alleviate biosafety concerns.

Accurate identification of BRFs from environmental samples is a prerequisite for revealing a real picture of BRF diversity in water systems. Conventional methods, including cultivation and quantitative polymerase chain reaction (qPCR) approaches, such SARS-CoV-2 in sewage, are limited by their ability to target only a specific pathogen or a small spectrum of BRFs in samples, missing the majority of BRFs [11], [12]. Untargeted methods analyze either 16S ribosomal RNA (rRNA) gene-based or metagenome-based sequencing data. For example, the pipeline for Multiple Bacterial Pathogen Detection and the Mypathogen database contain pipelines for bacterial pathogen detection from 16S rRNA gene sequencing data [13], [14]. However, 16S-based approaches are limited in taxonomic resolution and often cannot classify pathogens to the species level [15], [16]. Moreover, viral and eukaryotic pathogens will be missed by such approaches.

To date, online pipelines and platforms for identifying BRFs from metagenome data are still scarce. The metagenome analysis tools of the PathoSystems Resource Integration Center identify ARGs and VFs by read mapping using k-mer alignment (KMA) [17], [18], while the ARGs online analysis pipeline (ARGs-OAP) v3.0 annotates ARGs from metagenomic reads by mapping with Basic Local Alignment Search Tool X (BLASTX) [19]. However, platforms specifically designed for analyzing and illustrating the geographic distribution of BRFs, including pathogenic microbes, ARGs, and VFs, in global engineered water systems based on metagenomic data are still lacking.

In this study, we constructed a multifunctional platform, the Global Wastewater Pathogen Database (GWPD), to analyze and illustrate the vast diversity of BRFs in global engineered water systems. Its metagenome-based searching approach employs optimized bioinformatic tools to simultaneously identify bacterial, viral, and eukaryotic pathogens, along with ARGs and VFs. The web implementation of this database provides an interactive visualization of the distribution of these BRFs, an information search tool, and an analytical pipeline for annotating user-provided metagenomic data. GWPD will support the surveillance of BRFs, especially those related to wastewater-based epidemiology, and ultimately facilitate risk control and biosafety related to water-environment-health in the context of “One Health” [20]. GWPD is freely available at http://gwpd.hitsz.edu.cn. It has already garnered more than 150 registered users globally, with more than 30 000 accesses by Jan 2024.

2. Methods

2.1. The construction of a knowledge dictionary of BRFs

We collected information on 1908 bacterials, virals, and eukaryotic pathogens related to 157 hosts and 889 diseases available from the Nation Center for the Biotechnology Information (NCBI), the Centers for Disease Control and Prevention (CDC), the Chinese Centers for Disease Control and Prevention, the International Committee on Taxonomy of Viruses (ICTV), and the Pathogen–Host Interactions Database (PHI-base) [21]. The potential hosts and infectious diseases, reflecting the possible risks of pathogens, are shown in Table S1 in Appendix A.

In addition, information on 2373 ARGs belonging to 47 ARG classes (Fig. 1) was obtained from the Comprehensive Antibiotic Resistance Database (CARD) [22], and the class of each ARG is listed in Table S2 in Appendix A. Moreover, 541 VFs were collected from the Virulence Factor Database (VFDB) [23]. These information elements construct the knowledge dictionary of GWPD.

2.2. Metagenomic data collection

For data retrieval, relevant publications on WWTP-related metagenomic studies were retrieved from the Web of Science using the following search formula: topics (TS) = (pathogen OR pathogenic microorganism OR pathogenic bacteria OR pathogenic virus OR pathogenic fungi OR microbial OR microorganism) AND TS = (ARG OR resistance gene OR antibiotic resistance gene OR antimicrobial resistance OR resistome MGE OR mobile genetic element OR resistant) AND TS = (VF OR VF gene OR VFG OR virulence factor) AND TS = (metagenome OR metagenomics OR metagenomic sequencing OR high-throughput OR NGS OR next-generation sequencing OR shot-gun OR Illumina OR short reads OR metagenome assembly) AND TS = (wastewater treatment plant OR wastewater OR WWTPs OR full scale OR activated sludge OR influent OR effluent OR disinfect OR receiving water OR natural water OR river OR lake OR ocean) AND TS = (function OR functional gene OR functional microorganism). The data filtering criteria for the identified projects were as follows: ① Sequencing method: shotgun Illumina sequencing, with a sequencing depth of no less than 10 GB; ② sampling date: within the last 15 years (May 2009–Dec 2023); and ③ sample types: from engineered water systems, including wastewater sewer networks, influent, anoxic activated sludge, oxic activated sludge, effluent, or receiving water/natural water. Metadata of the relevant samples were obtained, including the sample types and geographic information (i.e., location, latitude, and longitude).

The sequencing datasets were downloaded from the NCBI Sequence Read Archive, which includes the following projects: PRJNA506137, PRJEB39738, PRJEB37889, PRJEB28815, PRJNA484416, PRJNA688020, PRJNA672704, PRJNA230567, PRJEB28796, PRJEB37137, PRJNA526679, PRJNA428383, PRJNA524094, PRJNA556302, PRJEB28033, PRJNA542960, PRJNA473136, PRJNA549409, PRJNA420682, PRJNA406858, PRJNA432264, PRJNA438174, SRP107015, PRJNA505617, PRJEB13831, PRJNA524456, SRP127721, PRJNA377521, SRP159184, PRJNA434744, PRJNA400857, PRJNA342017, PRJNA288131, SRP067931, PRJEB14051, PRJNA529711, PRJNA507800, PRJNA482680, PRJEB36775, PRJNA661613, PRJNA414521, PRJNA577390, and PRJNA419931.

2.3. The pipeline for analyzing metagenome data

The data processing procedure is shown in Fig. 2. The raw reads were trimmed and quality-filtered by using flexbar (v3.5.0) [24] and skewer (v0.2.2) [25] with the parameters of average Q-score higher than 30 and read length higher than 75 nucleotides (nt). The community composition was classified by using MetaPhlAn (v3.0) [26]. The composition of the pathogenic microbes was analyzed by MegaPath (v2.0) [27] with the default parameters. Information on pathogenic microbes was generated from the MegaPath results according to our knowledge dictionary by species name matching.

For the annotation of ARGs and VFs, clean reads were assembled by using MEGAHIT [28] with k-mer parameters of 21, 29, 39, 59, 79, 99, 119, and 141, and assemblies longer than 2 kB were retained for further analysis. Putative coding sequences were predicted by using Prodigal (v2.6.3, meta) [29] and then annotated for ARGs by using DeepARG [30] with a probability threshold of 0.8 and for VFs by using Diamond BlastP [31] against the VFDB database [23] with an e-value cutoff of 1 × 10−5. For ARG and VF abundance estimation, clean reads were mapped to the ARG- and VF-containing contigs by using BWA-MEM (v2.3.2) [18]. Reads with an aligned region lower than 90% of their length and with an identity lower than 99% were removed by using Bamm. The mapped read numbers were counted by using BBMap [32]. The number of copies per genome of the ARGs and VFs in each sample were calculated via normalization to the average coverage of conserved single-copy genes, namely, COG0048, COG0049, COG0087, COG0088, COG0091, COG0093, COG0094, COG0096, COG0097, COG0099, COG0100, COG0102, COG0184, COG0186, COG0256, and COG0522.

2.4. Database reconstruction and web implementation

The PostgreSQL database was adopted to store the data from the knowledge dictionary and annotation results of the pathogenic microorganisms, ARGs, and VFs in the samples. PostGIS (a plugin in PostgreSQL) was used to save the geographic information. Both exact and regular expression matching were used in the Searching function by utilizing Axios.

Python was utilized to implement the backend of the website. The web interface was implemented via code running in Vue.js, Cascading Style Sheets (CSS), Hyper Text Markup Language (HTML), and JavaScript by using Elements. Echarts was used for heatmap and stacked chart visualization. A fragmented upload approach was utilized in the annotation module to stage and record the number of slices to achieve breakpoint resumption.

Task submission triggers file transfer by the Open Geographic Modeling and Simulation (OpenGMS) model and initiates an asynchronous computation task in the Celery assignment queue [33], which reflects the task status. The results are saved in the PostgreSQL database after the computation. The web service has been tested in the web browsers Google Chrome (v117.0.5938.89), Microsoft Edge (v113.0.1774.35), and Mozilla Firefox (v18.05).

3. Results

3.1. Database statistics

To construct the knowledge dictionary (Fig. 2), which serves as the reference database for annotation, we searched and sorted 1908 pathogenic microorganisms in total, including bacteria, fungi, viruses, and metazoa, from public data sources (e.g., PHI-base [21], CDC, NCBI, and ICTV), of which 991 are bacteria, 514 are fungi, 399 are viruses, and four are metazoa. These pathogens can potentially infect 157 host species and result in 889 diseases. Additionally, GWPD harbors information on 2373 ARGs belonging to 47 ARG classes from the CARD [22] and 541 VFs from the VFDB [23] (Tables 1 and Table 2). The information on all these BRFs has been integrated into the knowledge dictionary (Fig. 2).

We then collected 1302 metagenomic samples with metadata from 186 cities in 68 countries on six continents, including 413 from Asia, 406 from Europe, and 320 from North America, followed by South America (96), Africa (56), and Oceania (11). Specifically, the datasets include 770 samples from 21 developed countries, 503 from 39 developing countries, and 29 from eight low-income countries. The largest quantity of samples was distributed in the US (307), followed by China (285), Germany (102), Argentina (86), and Sweden (73) (Fig. S1 in Appendix A). We noticed that samples from low-income countries are underrepresented in the current version of the GWPD. All of the samples were assigned to six types: sewer network, WWTP influent, anoxic activated sludge, oxic activated sludge, WWTP effluent, and receiving/natural water (Fig. S2 in Appendix A). Specifically, among the effluent samples, seven were collected post-disinfection by different technologies, including four by ultraviolet light, two by ozone, and one by chlorine. A total of 476 pathogens, 442 ARGs belonging to 20 classes, and 246 VFs were identified in the collected samples (Table 1).

3.2. Interactive illustration of the global distribution of BRFs

Interactive interfaces are available on GWPD for users to visualize the global distribution of BRFs in engineered water systems (Fig. 3 and Fig. S3 in Appendix A). Specifically, the interactive homepage presents the 30 most abundant BRFs on the world map and the map of China (Fig. S3). The font size of the words in the word cloud is proportional to the relative abundance of the BRFs. A heatmap will be generated when users click the names of the BRFs in the word cloud, showing the distribution and abundance of the specific BRFs across countries worldwide or provinces nationally in China (Fig. S3). Potential hosts and diseases associated with the top 30 pathogens are listed in Table S1.

Moreover, the numbers of cities, WWTPs, samples, and identified pathogenic microorganisms are displayed by clicking a country/region on the world map or a province on the map of China. The homepage can guide users to a specific webpage by selecting a specific country/region or a Chinese province on the map. The webpage shows the composition of the ten most abundant BRFs for each sample type collected in a province/state of a country in a city within a Chinese province (Fig. 3).

Overall, the dominant pathogens vary across continents (Fig. S4 in Appendix A). Streptococcus suis (S. suis), E. coli, and Klebsiella pneumoniae are predominant in Africa and Europe, while E. coli, Pseudomonas mendocina (P. aeruginosa), and Pseudomonas alcaligenes (P. alcaligenes) are abundant in Asia. In North America, the dominant pathogens are S. suis, E. coli, and Bacteroides uniformis. In Oceania, S. suis, E. coli, and Acinetobacter hemolyticus are predominant, while in South America, E. coli has the highest abundance, followed by P. aeruginosa and Gordonia bronchialis. Moreover, ARGs commonly found and prevalent in wastewater treatment processes are prevalent in GWPD [34], [35]. Six continents share two highly abundant ARGs, ksgA (an aminoglycoside) and bacA (a peptide), while other dominant ARGs differ among continents (Fig. S4 in Appendix A). For instance, aadA is the most abundant ARG in Africa, followed by OXA and sul1. ksgA, ompR, and bacA are predominant in Asia, while ksgA, kdpE, and bacA are highly abundant in Europe. The most abundant ARGs in North America, Oceania, and South America are ksgA, msrE, and OXA, respectively. Finally, the dominant VFs across continents exhibit a highly uniform pattern, including EF-Tu, Type IV pili, flagella, capsules, LPS, Hsp60, and HSI-I T6SS-secreted effectors, indicating the probable prevalence of these VFs in global engineered water systems (Fig. S4).

3.3. Information browsing and retrieval

Two methods are used by users to retrieve GWPD data: searching/browsing by location and searching/browsing by the names of pathogenic microorganisms (Fig. 4). When searching/browsing by location, information on the province/state or city of interest is shown, including information on the identified pathogenic microorganisms, ARGs, and VFs, as well as the potential hosts and diseases of these BRFs according to the GWPD knowledge dictionary. Alternatively, users can search by pathogen name, where the specific pathogens are shown, including potential hosts, diseases, ARGs, and associated VFs.

3.4. Online service for automatic metagenome annotation

The annotation function is designed to provide a one-step analysis service for BRF annotation for researchers who lack advanced bioinformatics skills (Fig. 5(a)). After registration and login, Illumina paired-end metagenomic sequencing reads can be uploaded with one click, and the analytical pipeline will then be automatically triggered (Figs. 5(b) and (c)). Detailed notes, such as the naming scheme, file formats, and sizes, are introduced in the example. Currently, we have implemented 256 Central Processing Units (CPUs) specifically for the online analysis function, and 32 CPUs are allocated for each task. Thus, eight tasks are supported simultaneously. The average duration for a task involving 10 GB of raw sequencing data is approximately 12 h. Task status can be tracked, and the annotated results in the form of Excel sheets will be sent to users as e-mail attachments when the task is finished. In addition, users can log into GWPD to visualize the preliminary analysis results, consisting of the compositions of microorganisms, pathogen species, and ARGs and VFs (Fig. 5(d)).

3.5. Other functionalities

The “Help” page frequently offers questions with answers and website navigation. A comment window is also provided to welcome any public queries and suggestions for further website improvement. The “About” page introduces our project mission and the management and technical support teams of the GWPD. The citations of the publicly available data used in the GWPD are also provided.

3.6. Extensibility of the GWPD platform

GWPD is designed to be extensible, which provides feasible interfaces for future updates from a technique perspective (Fig. 6). Echart, Axios, OpenGMS, and Vue.js were used to visualize abundance, retrieve search results, conduct annotations, and add functional modules, respectively. Based on the existing tools and development framework, subdatabases of BRFs from other aquatic environments and more functional modules for data analysis (e.g., source tracking and quantitative microbiological risk assessment [36]) can be included in further updated versions of the GWPD. We also plan to connect remote public databases relevant to wastewater epidemiology to integrate new cloud computing modules for cross-database BRF analysis.

4. Discussions

GWPD provides a comprehensive catalog of BRFs in global engineered water systems, including WWTP-related samples. The geographic distribution of BRFs among countries and continents implies their potential threat to human health. However, the variations in dominant pathogens and ARGs among continents may be attributed to uneven sampling (such as limited sample collection in Africa and Oceania, particularly in low-income countries) (Figs. S1 and S2). In addition, we still have limited information on BRFs from specific samples related to WWTPs, including sewer networks, effluents from different disinfection technologies (e.g., chlorine, ultraviolet, ozone, chlorine dioxide, combined ozone/UV disinfection, etc.), and receiving waters. For example, during the COVID-19 pandemic, studies have demonstrated the transmission of SARS-CoV-2 in the feces of infected individuals to WWTPs via sewer networks [37], [38]. Monitoring sewage from sewer networks for pathogen tracing enables real-time surveillance of communities [7]. Furthermore, some strains of E. coli and P. aeruginosa have chlorine resistance and ultraviolet tolerance. These highly risky pathogens may further leak into receiving waters and threaten human communities [4]. Thus, it is important to analyze the occurrence of BRFs after disinfection with different technologies to evaluate removal efficiency before discharge to receiving waters.

GWPD provides an illustration of the relative abundance of potential pathogenic microorganisms, ARGs, and VFs in global engineered water systems. These original data are valuable resources for further in-depth analyses. For instance, samples with geographic information can be analyzed together to trace the dissemination of specific pathogens by using tools such as Source Tracker [39]. In addition, the QMRA can be used to estimate the risk of each sample, specific habitat, or even a particular city via pathogen dissemination from source to sink [36]. Epidemic data of clinically relevant pathogens can be included in the analysis of environmental data in the scheme of water-based epidemiology.

Currently, the annotation service only supports paired-end Illumina short reads but future versions of GWPD will be implemented with the capacity to process Pacific Biosciences (PacBio, USA) and Oxford Nanopore (UK) reads considering the rapid progress of long-read sequencing technologies [40]. Long-read data may provide more accurate (e.g., strain-level) identification of pathogens. Moreover, long-read data can potentially link ARGs and VFs to their host genomes, enabling tracing of the transmission of these BRFs between pathogens [41]. Furthermore, alternative software tools in the annotation pipeline (e.g., Bracken [42] and mOTUs [43] for microbiome taxonomic profiling and ARGs-OAP v3.0 [19] for ARG annotation, quantification, and risk ranking) can be used in the future to facilitate methodological comparisons and standardization.

In conclusion, GWPD provides high-throughput metagenome-based identification and visualization of BRFs in global engineered water systems. The potential risks from the BRFs revealed in this database warrant further research on source tracking, risk assessment, and epidemic analysis. This database will also provide primary data and encourage the optimization of BRF-control technologies in WWTPs and other engineered water systems.

CRediT authorship contribution statement

Aijie Wang: Conceptualization, Formal analysis, Project administration, Supervision, Visualization, Writing – review & editing. Fang Huang: Data curation, Formal analysis, Investigation, Software, Visualization, Writing – original draft. Wenxiu Wang: Writing – review & editing, Software. Yanmei Zhao: Software, Writing – review & editing. Yiyi Su: Writing – review & editing. Zelin Lei: Writing – review & editing. Rui Gao: Writing – review & editing. Yu Tao: Writing – review & editing. Jun Wei: Writing – review & editing. Haoyi Cheng: Writing – review & editing. Jinsong Liang: Writing – review & editing. Bin Liang: Writing – review & editing. Jianhua Guo: Writing – review & editing. Jiping Jiang: Writing – review & editing. Lu Fan: Conceptualization, Funding acquisition, Methodology, Project administration, Software, Supervision, Writing – review & editing. Shu-Hong Gao: Conceptualization, Funding acquisition, Investigation, Project administration, Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (52321005, 52293441, 52293443, and 52230004), the Shenzhen Stability Support Key Program in Colleges and Universities of China (GXWD20231127195344001), the Natural Science Foundation of Guangdong Basic and Applied Basic Research Foundation (2024A1515010085), the Shenzhen Overseas High-level Talents Research Startup Program (20200518750C), the Shenzhen Science and Technology Program (KQTD20190929172630447), and the Open Project of the Key Laboratory of Environmental Biotechnology, CAS (KF2021006). The GWPD website was supported by Shenzhen Zhishu Environmental Technology Co., Ltd., China.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.eng.2024.04.022.

References

[1]

Bogler A, Packman A, Furman A, Gross A, Kushmaro A, Ronen A, et al.Rethinking wastewater risks and monitoring in light of the COVID-19 pandemic.Nat Sustain 2020; 3(12):981-990.

[2]

Xiao Y, Zhao F, Peñuelas J, Huang Q, Zhu YG.Super pathogens from environmental biotechnologies threaten global health.Natl Sci Rev 2021; 8(9):nwab110.

[3]

Liang J, Mao G, Yin X, Ma L, Liu L, Bai Y, et al.Identification and quantification of bacterial genomes carrying antibiotic resistance genes and virulence factor genes for aquatic microbiological risk assessment.Water Res 2020; 168:115160.

[4]

Karaolia P, Vasileiadis S, Michael SG, Karpouzas DG, Fatta-Kassinos D.Shotgun metagenomics assessment of the resistome, mobilome, pathogen dynamics and their ecological control modes in full-scale urban wastewater treatment plants.J Hazard Mater 2021; 418:126387.

[5]

Garner E, McLain J, Bowers J, Engelthaler DM, Edwards MA, Pruden A.Microbial ecology and water chemistry impact regrowth of opportunistic pathogens in full-scale reclaimed water distribution systems.Environ Sci Technol 2018; 52(16):9056-9068.

[6]

Jia S, Shi P, Hu Q, Li B, Zhang T, Zhang XX.Bacterial community shift drives antibiotic resistance promotion during drinking water chlorination.Environ Sci Technol 2015; 49(20):12271-12279.

[7]

Larsen DA, Wigginton KR.Tracking COVID-19 with wastewater.Nat Biotechnol 2020; 38(10):1151-1153.

[8]

Li M, Yang YF, Lu Y, Zhang DY, Liu Y, Cui XF, et al.Natural host-environmental media-human: a new potential pathway of COVID-19 outbreak.Engineering 2020; 6(10):1085-1098.

[9]

Amman F, Markt R, Endler L, Hupfauf S, Agerer B, Schedl A, et al.Viral variant-resolved wastewater surveillance of SARS-CoV-2 at national scale.Nat Biotechnol 2022; 40(12):1814-1822.

[10]

Editorial N.Water crisis: how local technologies can help solve a global problem.Nature 2023; 620(7972):7.

[11]

Wiesinger-Mayr H, Vierlinger K, Pichler R, Kriegner A, Hirschl AM, Presterl E, et al.Identification of human pathogens isolated from blood using microarray hybridisation and signal pattern recognition.BMC Microbiol 2007; 7(1):78.

[12]

Ji Y, Zhang B, Van SF, Horn WP, Woodnutt G, et al.Identification of critical staphylococcal genes using conditional phenotypes generated by antisense RNA.Science 2001; 293(5538):2266-2269.

[13]

Yang XR, Jiang GF, Zhang YZ, Wang NQ, Zhang YL, Wang XF, et al.MBPD: a multiple bacterial pathogen detection pipeline for One Health practices.iMeta 2023; 2(1):e82.

[14]

Zhang TT, Miao JJ, Han N, Qiang YJ, Zhang W.MPD: a pathogen genome and metagenome database.DATABASE OXFORD 2018; 2018:bay055.

[15]

Miao J, Han N, Qiang Y, Zhang T, Li X, Zhang W.16SPIP: a comprehensive analysis pipeline for rapid pathogen detection in clinical samples based on 16S metagenomic sequencing. BMC Bioinformatics, 18 (S16) (2017), p. 568

[16]

Wensel CR, Pluznick JL, Salzberg SL, Sears CL.Next-generation sequencing: insights to advance clinical investigations of the microbiome.J Clin Invest 2022; 132(7):154944.

[17]

Davis JJ, Wattam AR, Aziz RK, Brettin T, Butler R, Butler RM, et al.The PATRIC Bioinformatics Resource Center: expanding data and analysis capabilities.Nucleic Acids Res 2020; 48(D1):D606-D612.

[18]

Clausen PTLC, Aarestrup FM, Lund O.Rapid and precise alignment of raw reads against redundant databases with KMA.BMC Bioinformatics 2018; 19(1):307.

[19]

Yin XL, Zheng XW, Li LG, Zhang AN, Jiang XT, Zhang T.ARGs-OAP v3.0: antibiotic-resistance gene database curation and analysis pipeline optimization.Engineering 2023; 27:234-241.

[20]

Kim DW, Cha CJ.Antibiotic resistome from the One-Health perspective: understanding and controlling antimicrobial resistance transmission.Exp Mol Med 2021; 53(3):301-309.

[21]

Urban M, Cuzick A, Seager J, Wood V, Rutherford K, Venkatesh SY, et al.PHI-base in 2022: a multi-species phenotype database for Pathogen-Host Interactions.Nucleic Acids Res 2022; 50(D1):D837-D847.

[22]

Alcock BP, Raphenya AR, Lau TTY, Tsang KK, Bouchard M, et al.CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database.Nucleic Acids Res 2020; 48(D1):D517-D525.

[23]

Liu B, Zheng D, Zhou S, Chen L, Yang J.VFDB 2022: a general classification scheme for bacterial virulence factors.Nucleic Acids Res 2022; 50(D1):D912-D917.

[24]

Roehr JT, Dieterich C, Reinert K.Flexbar 3.0-SIMD and multicore parallelization.Bioinformatics 2017; 33(18):2941-2942.

[25]

Jiang H, Lei R, Ding SW, Zhu S.Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads.BMC Bioinformatics 2014; 15(1):182.

[26]

Buchfink B, Reuter K, Drost HG.Sensitive protein alignments at tree-of-life scale using DIAMOND.Nat Methods 2021; 18(4):366-368.

[27]

Leung CM, Li D, Xin Y, Law WC, Zhang Y, Ting HF, et al.MegaPath: sensitive and rapid pathogen detection using metagenomic NGS data. BMC Genomics, 21 (S6 Suppl 6) (2020), p. 500

[28]

Li D, Liu CM, Luo R, Sadakane K, Lam TW.MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.Bioinformatics 2015; 31(10):1674-1676.

[29]

Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ.Prodigal: prokaryotic gene recognition and translation initiation site identification.BMC Bioinformatics 2010; 11(1):119.

[30]

Arango-Argoty G, Garner E, Pruden A, Heath LS, Vikesland P, Zhang L.DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data.Microbiome 2018; 6(1):23.

[31]

Nousias O, Montesanto F.Metagenomic profiling of host-associated bacteria from 8 datasets of the red alga Porphyra purpurea with MetaPhlAn3.Mar Genomics 2021; 59:100866.

[32]

Gruber-Vodicka HR, Seah BKB, Pruesse E.phyloFlash: rapid small-subunit rRNA profiling and targeted assembly from metagenomes.mSystems 2020; 5(5):e00920-e01020.

[33]

Jiang JP, Pang TR, Zhang FY, Men YL, Yadav H, et al.Pathway to encapsulate the surface water quality model and its applications as cloud computing services and integration with EDSS for managing urban water environments.Environ Modell Softw 2022; 148:105280.

[34]

Raza S, Shin H, Hur HG, Unno T.Higher abundance of core antimicrobial resistant genes in effluent from wastewater treatment plants.Water Res 2022; 208:117882.

[35]

Munk P, Brinch C, M FDøller, Petersen TN, Hendriksen RS, Seyfarth AM, et al.Genomic analysis of sewage from 101 countries reveals global landscape of antimicrobial resistance.Nat Commun 2022; 13(1):7251.

[36]

Wu B, Wang C, Zhang C, Sadowsky MJ, Dzakpasu M, Wang XC.Source-associated gastroenteritis risk from swimming exposure to aging fecal pathogens.Environ Sci Technol 2020; 54(2):921-929.

[37]

Kumar M, Thakur AK, Mazumder P, Kuroda K, Mohapatra S, Rinklebe J, et al.Frontier review on the propensity and repercussion of SARS-CoV-2 migration to aquatic environment.J Hazard Mater Lett 2020; 1:100001.

[38]

Bhowmick GD, Dhar D, Dibyojyoty N, Ghangrekar MM, Banerjee R, Das S, et al.Coronavirus disease 2019 (COVID-19) outbreak: some serious consequences with urban and rural water cycle.npj Clean Water 2020; 3(1):32.

[39]

Knights D, Kuczynski J, Charlson ES, Zaneveld J, Mozer MC, Collman RG, et al.Bayesian community-wide culture-independent microbial source tracking.Nat Methods 2011; 8(9):761-763.

[40]

Zhao N, Cao J, Xu J, Liu B, Liu B, Chen D, et al.Targeting RNA with next- and third-generation sequencing improves pathogen identification in clinical samples.Adv Sci 2021; 8(23):2102593.

[41]

Dai D, Brown C, Bürgmann H, Larsson DGJ, Nambi I, Zhang T, et al.Long-read metagenomic sequencing reveals shifts in associations of antibiotic resistance genes with mobile genetic elements from sewage to activated sludge.Microbiome 2022; 10(1):20.

[42]

Lu J, Breitwieser FP, Thielen P, Salzberg SL.Bracken: estimating species abundance in metagenomics data.PeerJ Comput Sci 2017; 3:e104.

[43]

Milanese A, Mende DR, Paoli L, Salazar G, Ruscheweyh HJ, Cuenca M, et al.Microbial abundance, activity and population genomic profiling with mOTUs2.Nat Commun 2019; 10(1):1014.

RIGHTS & PERMISSIONS

THE AUTHOR

PDF (4268KB)

Supplementary files

SI

5498

Accesses

0

Citation

Detail

Sections
Recommended

/