Genetic Variations of Candida glabrata Clinical Isolates from Korea using Multi-locus Sequence Typing

Background: Although Candida albicans is considered to be the major fungal pathogen of candidemia, severe infections by non-albicans Candida (NAC) spp. have been on the increase in recent years. Among NAC spp., C. glabrata has emerged as the second most common pathogen. Unlike other Candida spp., it is often resistant to various azole antifungal agents, such as fluconazole. However, few studies have been conducted to investigate its population structure, epidemiology, and basic biology. Recently, multi-locus sequence typing (MLST) has been developed as a highly useful and portable molecular typing technique. Methods: In the present study, MLST was performed on a total of 102 C. glabrata clinical isolates obtained from various types of clinical specimens. The fungal internal transcribed spacer (ITS) region was amplified and sequenced to identify and confirm the C. glabrata clinical isolates. For MLST, six housekeeping genes, namely 1,3-beta-glucan synthase (FKS), 3-isopropylmalate dehydrogenase (LEU2), myristoyl-CoA:protein N-myristoyltransferase (NMT1), phosphoribosyl-anthranilate isomerase (TRP1), UTP-glucose-1-phosphate uridylyltransferase (UGP1), and orotidine-5'-phosphate decarboxylase (URA3), were amplified and sequenced. The results were analyzed using the C. glabrata MLST database. Results: Of a total of 3,345 base pairs of DNA sequence, 49 (1.5%) were variable nucleotide sites, and a total of 12 different sequence types (STs) were identified among the 102 clinical isolates. ST138 was the most predominant ST in this study, accounting for 52.9% (54/102) of isolates, followed by ST63 with 23.5% (24/102). Conclusion: These data demonstrate that ST138 was the most predominant ST in Korea. Further, we found eight undetermined STs (USTs), seven of which were subsequently assigned ST numbers by the PubMLST database. The data from this study might provide a fundamental database for further studies on C. glabrata, including its epidemiology and evolution. Furthermore, the data might also contribute to the development of novel antifungal agents and diagnostic tests.



Introduction
Candida species belong to the normal flora of the vaginal tract, the gastrointestinal tract, and the oral cavity in humans [1,2]. Occasionally, however, Candida spp. cause serious infections, ranging from mucosal to systemic infections [1,2]. Fungal infections caused by Candida spp. have increased significantly, especially in patients with acquired immune deficiency syndrome (AIDS) and other immunocompromised individuals, including intensive care and elderly patients [2,3]. In addition, candidemia is associated with a high mortality rate of approximately 30 to 40% in hospitalized patients and is difficult to treat, thus increasing the cost of medical care [2,3].
C. glabrata has emerged as the second or third most common Candida pathogen after C. albicans in the United States, depending on the site [3,4,5]. Despite its increased prevalence, there have been relatively few studies on the population structure, epidemiology, and basic biology of C. glabrata compared to those conducted on other Candida spp. [2,4,6].
As mentioned above, C. albicans was considered to be the major fungal pathogen of candidemia in the past [6,7]. However, as the number of severe infections caused by non-albicans Candida (NAC) spp. has increased, the focus of studies has shifted from C. albicans to NAC spp. such as C. glabrata in recent years [5,8]. Furthermore, since C. glabrata infections are often resistant to azole antifungal drugs, especially fluconazole, it is important to distinguish NAC spp. from C. albicans to ensure appropriate antifungal therapy and clinical management [2,5,9]. Thus, the discrimination of subtypes within these species is required for investigating their epidemiology and evolutionary biology [7,10,11].
In recent years, there has been substantial progress in the development of molecular methods for typing subspecies and strains of fungi [12]. For instance, pulsed-field gel electrophoresis (PFGE) compares total DNA band patterns with or without restriction enzyme digestion, while multilocus variable-number tandem-repeat (VNTR) analysis examines length variations in six to nine PCR-amplified loci that contain polymorphic tandem repeats. Further, random amplification of polymorphic DNA compares banding patterns following PCR with a nonspecific primer. Finally, multilocus enzyme electrophoresis examines differences in the electrophoretic mobility of multiple core metabolic enzymes. These four approaches have limitations, such as low reproducibility and portability [13,14], and the results obtained in different laboratories are difficult to compare [15,16].
Among genotyping methods, multilocus sequence typing (MLST) is a useful tool that records single nucleotide polymorphisms as allele numbers, which are stored in an online database (PubMLST), and determines the differences between closely related isolates with respect to their geographical origins, sources, and other properties [7]. In addition, the database can be accessed by laboratories worldwide [11,15].
In the present study, MLST targeting six independent housekeeping genes, namely 1,3-beta-glucan synthase (FKS), 3-isopropylmalate dehydrogenase (LEU2), myristoyl-CoA:protein N-myristoyltransferase (NMT1), phosphoribosyl-anthranilate isomerase (TRP1), UTP-glucose-1-phosphate uridylyltransferase (UGP1), and orotidine-5'-phosphate decarboxylase (URA3), was performed on a total of 102 C. glabrata clinical isolates from various clinical specimens such as blood, urine, and other body fluids in Korea, and the results were analyzed using the C. glabrata MLST database (http://pubmlst.org/cglabrata/). The aim of the study was to discriminate sequence types (STs) within C. glabrata using the common MLST scheme and to identify the most prevalent ST among C. glabrata isolates in Korea.

Clinical strains
A total of 102 C. glabrata clinical isolates were provided by the Korean Culture Collection of Medical Fungi (KCMF); these isolates had been collected from tertiary hospitals in Korea. The isolates were obtained from a wide variety of clinical samples, including blood, catheterized urine, bile, and other body fluids (Table 1).

Genomic DNA extraction from fungal isolates
Genomic DNA (gDNA) of the C. glabrata clinical isolates was extracted using an I-genomic BYF DNA Extraction Mini Kit (iNtRON Inc., Seongnam, Korea) according to the manufacturer's instructions [17]. The concentration and purity of the gDNA were checked by measuring the 260/280 optical density ratio using a NanoDrop 2000 Spectrophotometer (Thermo Scientific, Wilmington, DE, USA). The extracted gDNA was stored at 4°C until use.

Polymerase chain reaction and sequence analysis of the fungal ITS region for precise identification of C. glabrata clinical isolates
The fungal internal transcribed spacer (ITS) region, the conserved region between the 18S and 28S ribosomal RNA (rRNA) genes, was amplified and sequenced using the primer pairs listed in Table 2. Target amplification was carried out in a 20 μL reaction mixture containing 10 μL of Prime Taq Premix (Genet Bio Inc., Daejeon, Korea), 5 μL of distilled ultra-pure water, 1 μL of each primer (10 pmol/μL), and 3 μL of genomic DNA template. The PCR conditions were as follows: initial denaturation at 94°C for 1 min; 30 cycles of denaturation at 94°C for 30 sec, annealing at 57°C for 30 sec, and extension at 72°C for 45 sec; final extension at 72°C for 7 min; and holding at 4°C. The amplified products were visualized by gel electrophoresis to confirm the presence of the desired product.
The resulting amplicons were purified and sequenced by Macrogen Inc. (Daejeon, Korea). All sequences with low-quality bases in the chromatogram were re-sequenced to obtain high-quality results.
The obtained sequences were aligned with reference sequences in the GenBank database using the basic local alignment search tool (BLAST) at the National Center for Biotechnology Information (NCBI), and percent homology scores were generated for precise identification of the C. glabrata clinical isolates.

MLST analysis for identifying sequence type of C. glabrata clinical isolates
Table 3 shows the primers used for the amplification and sequence analysis of the six C. glabrata housekeeping gene fragments: FKS, LEU2, NMT1, TRP1, UGP1, and URA3. For PCR amplification, the 20 μL final mixture contained 10 μL of Prime Taq Premix, 5 μL of distilled ultra-pure water, 1 μL each of forward and reverse primer (10 pmol/μL), and 3 μL of genomic DNA template.
To amplify each of the six genes, the PCR conditions were as follows: 7 min at 94°C, 30 cycles of 1 min at the relevant annealing temperature (Table 3), and 1 min at 74°C, followed by 10 min at 74°C. The resulting samples were analyzed by gel electrophoresis. The PCR products of all loci were purified and sequenced using the reverse sequencing primer at Macrogen Inc. The obtained sequences were analyzed using the C. glabrata MLST database (http://pubmlst.org/cglabrata/). Each unique sequence at a locus defined an allele number, and each unique combination of alleles was assigned an ST.
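The rule that each unique sequence at a locus defines an allele and each unique combination of alleles defines an ST can be illustrated with a minimal sketch. It is not the PubMLST implementation: the function names, the toy sequences, and the locally generated numbers are assumptions made for illustration, whereas real allele and ST numbers are assigned by the curated C. glabrata MLST database.

```python
from typing import Dict, Tuple

# Locus order used to build the allelic profile (the six loci named in the text).
LOCI = ("FKS", "LEU2", "NMT1", "TRP1", "UGP1", "URA3")


def assign_alleles(isolates: Dict[str, Dict[str, str]]) -> Dict[str, Tuple[int, ...]]:
    """Number each distinct sequence per locus and return one profile per isolate.

    Numbers are local and arbitrary (order of first appearance), not PubMLST IDs.
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in LOCI}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in LOCI:
            seq = seqs[locus].upper()
            known = allele_ids[locus]
            if seq not in known:
                known[seq] = len(known) + 1  # new sequence -> new allele number
            profile.append(known[seq])
        profiles[name] = tuple(profile)
    return profiles


def assign_sts(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Give every distinct allelic profile its own (local) sequence type number."""
    st_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for name, profile in profiles.items():
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1
        result[name] = st_ids[profile]
    return result


# Hypothetical toy sequences (far shorter than real MLST fragments).
base = {"FKS": "ACGT", "LEU2": "GGCA", "NMT1": "TTAA",
        "TRP1": "CCGG", "UGP1": "ATAT", "URA3": "GCGC"}
toy = {
    "iso1": dict(base),
    "iso2": dict(base),                  # identical to iso1 at all six loci
    "iso3": dict(base, FKS="ACGA"),      # one nucleotide differs in FKS
}
profiles = assign_alleles(toy)
sts = assign_sts(profiles)
```

In this toy example, iso1 and iso2 share the same ST, while the single substitution in FKS gives iso3 a new allele and therefore a different ST.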

Data Analysis
The concatenated sequences of the six target loci (3,345 bp) were aligned using Molecular Evolutionary Genetics Analysis (MEGA) v. 7.0 software [20]. To assess relatedness among isolates of the same species, a phylogenetic tree was constructed using the unweighted pair group method with arithmetic mean (UPGMA) with 1,000 bootstrap replicates. The eBURST package (http://eburst.mlst.net/) was then used to group related isolates into clonal complexes.
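The idea behind grouping STs into clonal complexes can be sketched as follows. This is a simplified illustration and not the eBURST program itself: it assumes a group definition in which STs are linked when they share alleles at five of the six loci (single-locus variants), and both the profiles and the ST labels are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def shared_loci(a: Tuple[int, ...], b: Tuple[int, ...]) -> int:
    """Count the loci at which two allelic profiles carry the same allele."""
    return sum(x == y for x, y in zip(a, b))


def clonal_complexes(profiles: Dict[str, Tuple[int, ...]],
                     min_shared: int = 5) -> List[List[str]]:
    """Link STs sharing at least `min_shared` loci and return the linked groups.

    min_shared=5 with six loci is an assumed single-locus-variant threshold.
    """
    parent = {st: st for st in profiles}

    def find(st: str) -> str:
        # Union-find root lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if shared_loci(profiles[a], profiles[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for st in profiles:
        groups.setdefault(find(st), []).append(st)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical allelic profiles (FKS, LEU2, NMT1, TRP1, UGP1, URA3).
toy_profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 2, 1, 1, 1),  # single-locus variant of ST_A
    "ST_C": (1, 1, 2, 1, 3, 1),  # single-locus variant of ST_B
    "ST_D": (4, 5, 6, 7, 8, 9),  # shares no alleles: singleton
}
complexes = clonal_complexes(toy_profiles)
```

Here ST_A, ST_B, and ST_C fall into one group because each is linked to another member by a single-locus difference, while ST_D remains a singleton.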

Results of PCR and sequence analysis of the fungal ITS region for species identification
Electrophoresis on a 1.5% TBE agarose gel showed that the amplified fungal ITS region was 978 bp in size and that each amplicon produced a single clear band (data not shown). When the amplified PCR products were verified by comparison using the GenBank BLAST tool, all clinical isolates used in

Results of PCR and sequence analysis, and obtaining allele number of six housekeeping genes for the MLST analysis
To perform the MLST analysis, six housekeeping genes of the 102 C. glabrata clinical isolates were amplified by PCR. The amplified fragments of FKS, LEU2, NMT1, TRP1, UGP1, and URA3 were 589 bp, 512 bp, 607 bp, 419 bp, 616 bp, and 602 bp in size, respectively, which matched the expected sizes, and each produced a clear band (data not shown).
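As a quick consistency check, the six fragment lengths listed above add up to the 3,345 bp combined sequence used in this study, and the 49 variable sites reported in the abstract correspond to about 1.5% of that length. The short sketch below only reproduces this arithmetic; the variable names are illustrative.

```python
# Fragment lengths (bp) as reported for the six MLST loci.
fragment_bp = {"FKS": 589, "LEU2": 512, "NMT1": 607,
               "TRP1": 419, "UGP1": 616, "URA3": 602}

total_bp = sum(fragment_bp.values())  # length of the concatenated sequence
variable_sites = 49                   # variable nucleotide sites (from the abstract)
variable_pct = round(100 * variable_sites / total_bp, 1)

print(f"Concatenated length: {total_bp} bp")
print(f"Variable sites: {variable_sites} ({variable_pct}%)")
```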

Sequence type and cluster of C. glabrata clinical isolates
The MLST scheme revealed a high diversity of C. glabrata isolates with a total of 12 STs, 8 of which were undetermined STs (USTs) that had not been reported in previous studies.
ST138, one of these USTs, was the most predominant ST in this study, comprising 54 clinical isolates (52.9%), followed by ST63 with 24 clinical isolates (23.5%). In addition, ST55, ST22, and ST43 comprised 3 (2.9%), 6 (5.9%), and 2 (2.0%) clinical isolates, respectively; ST139 was identified in 6 isolates (5.9%) and ST140 in 2 isolates (2.0%). The remaining 5 STs (UST1, ST141, ST142, ST143, and ST144) were each identified only once (1.0%) (Table 5). The combined sequence (3,345 bp) of the six housekeeping genes was used for the phylogenetic tree analysis. With the exception of 3 outliers, the isolates were divided into 2 major clusters: cluster 1 and cluster 2 (Figure 1). Cluster 1 consisted of ST138 and cluster 2 consisted of ST63 (Figures 2 and 3).
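The ST distribution can be recomputed directly from the isolate counts given in this section. The sketch below tabulates the counts and derives each percentage out of 102 isolates; only the variable names are assumptions. With this denominator, two isolates correspond to about 2.0% and a single isolate to about 1.0%.

```python
# Isolate counts per sequence type, as reported in this section.
st_counts = {
    "ST138": 54, "ST63": 24, "ST22": 6, "ST139": 6, "ST55": 3,
    "ST43": 2, "ST140": 2, "UST1": 1, "ST141": 1, "ST142": 1,
    "ST143": 1, "ST144": 1,
}

total_isolates = sum(st_counts.values())
percent = {st: round(100 * n / total_isolates, 1) for st, n in st_counts.items()}

# Print the distribution from most to least frequent ST.
for st, n in sorted(st_counts.items(), key=lambda item: -item[1]):
    print(f"{st:>6}: {n:3d} isolates ({percent[st]:.1f}%)")
```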

Discussion
C. glabrata is a highly opportunistic pathogen of the urogenital tract and the bloodstream in humans [21]. It is especially prevalent in the elderly and in the human immunodeficiency virus-positive population [22]. Although candidiasis is frequently treated with azole antifungal agents, treatment failure has become a serious concern because azole-resistant clinical isolates have emerged following widespread and long-term use of these agents. Nevertheless, few studies have been conducted on the population structure, epidemiology, and basic biology of C. glabrata.
Healthcare-associated infections may be endogenous in origin or nosocomially transmitted, and the only way to distinguish them is through strain typing. MLST directly examines the DNA sequence variations in a set of housekeeping genes and characterizes strains by their unique allelic profiles. The principle of MLST is simple, involving PCR amplification followed by DNA sequence analysis. Nucleotide differences between strains can be examined at a variable number of genes depending on the desired degree of discrimination. MLST schemes now exist for a number of important bacterial pathogens including Neisseria meningitidis, Streptococcus pneumoniae, Staphylococcus aureus, Streptococcus pyogenes, and Campylobacter jejuni. The technique has also been used to assess genetic relatedness among strains of Candida spp. including C. glabrata, C. albicans, C. tropicalis, and C. krusei. However, MLST scheme for C.
Hence, in this study, the first MLST analysis of the yeast pathogen C. glabrata in Korea was performed and evaluated. Six loci were selected for this study, as recommended by previous studies. While ST3 was reported as the prevalent ST by Dodgson et al. [6], the data in this study demonstrate that ST138 was the most predominant ST. Additionally, a total of 12 STs were defined among the 102 clinical isolates, including 8 USTs, all but one of which were subsequently assigned ST numbers.

Conclusion
In conclusion, prevalent and novel C. glabrata STs were found in the present study. The data might provide a fundamental database for further studies on C. glabrata, including its epidemiology and evolution. Furthermore, these data might also contribute to the development of novel antifungal agents and diagnostic tests. It might even be possible to discover the virulence factors associated with disease, which population genetic studies currently struggle to monitor.