Introduction

The North Eastern Himalayan (NEH) region of India is a repository of biodiversity, covering a diverse array of germplasm and underutilized crops with high nutritional value and resilience to environmental stresses [1,2,3]. Perilla frutescens L., belonging to the mint family Lamiaceae and native to East Asia, including China, Japan, Korea, and India, is one such herbaceous plant widely cultivated for its culinary, medicinal, and industrial applications [4]. Perilla, also known as shiso, egoma, Kkaennip, suzi, or beefsteak plant, is used in culinary applications such as garnishes, flavoring agents, and pickling. It also has medicinal uses, including anti-inflammatory, anti-hyperglycemic, and anti-allergic treatments, with derived products like perilla oil and perilla seed powder being used in food and health supplements [5,6,7]. Cultivation of perilla in the NEH region, particularly in states like Arunachal Pradesh, Manipur, Nagaland, Meghalaya, and Sikkim, is important for enhancing nutritional diversity, preserving traditional practices, and improving economic opportunities [3].

Perilla seeds are rich in polyunsaturated fatty acids (PUFAs): they contain 35–45% oil, with exceptionally high levels of the omega-3 fatty acid alpha-linolenic acid (ALA, 18:3), ranging from 53% to 64% [8]. Additionally, perilla seeds have a high protein content of 15.70–23.90%, predominantly located in the seed kernel, surpassing conventional oil sources like mustard, cotton, linseed, sunflower, coconut, and almond in protein quality and content [9,10,11]. These seeds also contain phenolic compounds and essential minerals such as calcium, iron, zinc, magnesium, and phosphorus, enhancing their health-promoting benefits [12].

Perilla-based food products offer diversification by introducing unique flavors and nutritional profiles, enriching functional foods and nutraceuticals with omega-3 fatty acids, antioxidants, anti-inflammatory compounds, and health-promoting minerals [8]. Understanding nutritional composition patterns in diverse germplasm requires pattern recognition and multivariate data analysis (MVDA) techniques such as Hierarchical Cluster Analysis (HCA), Principal Component Analysis (PCA), and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) [13,14,15,16]. However, comprehensive data on the nutritional diversity and distribution patterns of popularly grown Indian perilla genotypes are still limited. Different germplasm can exhibit varying nutritional values owing to genetic diversity, environmental influences such as soil type, climate, and cultivation practices, and genotype-environment interactions. This genetic and environmental interplay affects nutrient composition, making it essential to identify nutritionally rich genotypes for breeding programs targeting specific agroclimatic zones [17, 18].

Therefore, this study conducts a comprehensive analysis of 27 key nutritional parameters across 45 diverse perilla genotypes from the NEH region, utilizing MVDA techniques (HCA and PCA) to elucidate data structure and distribution. Additionally, correlation analysis and TOPSIS-based ranking are performed to understand the relationships among nutritional traits and to identify superior genotypes. This investigation provides valuable insights into nutritional composition and diversity in perilla germplasm, thus facilitating the selection of superior varieties for nutritional enhancement and sustainable agricultural development in the NEH region.

Materials and Methods

The Materials and Methods section is provided as supplementary material.

Results and Discussion

Nutritional Parameters Analysis

The descriptive statistics for 13 biochemical parameters and 14 minerals are presented in Tables S1 and S2, respectively. Significant variability (in %) was observed in perilla germplasm for biochemical parameters such as moisture, ash, oil, protein, total soluble sugar (TSS), starch, phenols, ferric reducing antioxidant power (FRAP), palmitic acid, stearic acid, oleic acid, linoleic acid, and linolenic acid. Similarly, significant variability was observed in minerals such as aluminium, calcium, cobalt, chromium, copper, iron, potassium, magnesium, manganese, molybdenum, sodium, nickel, phosphorus, and zinc. This high variability can be attributed to differences in genetic background, environmental conditions (geographical location, soil, and climate), agronomic practices, and gene-environment-management interactions, all of which influence nutrient uptake and accumulation [19, 20].
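Per-trait variability of this kind is commonly summarised by the range, the mean, and the coefficient of variation. The following is a minimal sketch of such a summary using only the Python standard library. The function name `describe` and the five oil values are hypothetical and are not taken from Table S1.

```python
from statistics import mean, stdev

def describe(values):
    """Return min, max, mean, and coefficient of variation (%) for one trait."""
    m = mean(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": m,
        "cv_percent": 100.0 * stdev(values) / m,  # sample SD relative to the mean
    }

# Hypothetical oil contents (%) for five genotypes; NOT values from Table S1.
oil = [30.0, 38.0, 40.0, 42.0, 50.0]
summary = describe(oil)
print(summary)
```

A higher coefficient of variation for a trait indicates a wider relative spread among genotypes, which is what makes a trait a useful target for selection.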

Perilla seeds have a total lipid content of ~40.00%, of which neutral lipids constitute 91.20–93.90%, glycolipids 3.90–5.80%, and phospholipids around 2.00–3.00%. Perilla seed oil (PSO) contains high levels of unsaturated fatty acids, especially linolenic acid (51.20% to 64.00%), linoleic acid (12.20% to 18.60%), and oleic acid (11.90% to 23.80%) [9, 21], surpassing several conventional oil sources such as linseed, sesame, mustard, soy, and sunflower [11]. In agreement with previous studies, we observed very high variability in oil content among the germplasm, ranging from 28.65% (NL-44) to 74.20% (NL-5), with an average of 40.87%. Significant variability was observed in palmitic acid content, ranging from 7.06% in RCPS-410 to 10.74% in MN-13, with an average of 8.27%. Stearic acid content varied from 1.96% in IC-0615390 to 2.29% in MN-13, whereas oleic acid content ranged from 8.11% in RCPS-410 to 13.30% in IC-0615372. Linoleic acid ranged from 15.18% (SK-12) to 22.74% (NL-46), with an average of 18.49%. The variability in linoleic acid content provides several opportunities for industrial applications such as biodiesel production, where linoleic acid acts as a key component in the transesterification process [23]. Significant variability was also observed in linolenic acid content, ranging from 55.47% (MN-13) to 67.07% (RCPS-410), with an average of 60.60%. IC-0615368 (65.26%) and IC-0615370 (64.79%) were also among the genotypes with the highest linolenic acid content. Similar results have been reported by other researchers [10, 11, 24, 25]. Linolenic acid is a key omega-3 fatty acid, and RCPS-410, whose linolenic acid content (67.07%) exceeded that of all other genotypes analyzed, can be a preferred choice for potential cardiovascular and anti-inflammatory benefits and offers several opportunities in the formulation of dietary supplements and pharmaceuticals [6, 8]. Kim et al. [26] found that alpha-linolenic acid (ALA) is the primary fatty acid in perilla, comprising 47.00% to 64.00%, with linoleic acid at 10.00% to 24.00% and oleic acid at 9.00% to 20.00%. Their results align with ours, highlighting significant PSO content diversity (17.00% to 42.70%), thus offering opportunities for targeted oil blending and crop improvement programs [27–29].

High variability was observed in protein content, ranging from 11.06% in NL-5 to 23.12% in NL-44 (average = 17.73%). Similar to our findings, previous studies have also reported that perilla seeds contain a high protein content, ranging from 15.70% to 23.90%, primarily concentrated in the seed kernel [9, 10]. In comparison to conventional oil sources such as mustard (20.00%), cotton (19.40%), linseed (20.30%), sunflower (19.80%), coconut (23.90%), and almond (20.80%), perilla seeds show superior protein quality as well as quantity. Previous studies also suggest that perilla seed by-products (meal and cake) can be utilized as natural antioxidants and incorporated into animal diets to enhance production and health [26, 30, 31]. Therefore, utilizing perilla for the production of protein-rich animal feed can improve livestock nutrition and health and offers a sustainable solution. Moreover, a robust crop improvement program can capitalize on the promising protein-rich accessions identified in our study, such as NL-44 (23.12%), IC-0615374 (21.99%), and AR-1 (20.66%), to further enhance nutritional value, food security, and sustainable agricultural practices (Fig. 1).

Fig. 1

The heat map represents the relative mean nutritional composition of each perilla germplasm. The color scale represents standardized values between +2 and -2. Individual values in the matrix are represented graphically, with colors reflecting the relative abundance of each nutritional attribute

Perilla seeds show potent antioxidant activity attributed to various bioactive compounds, including polyphenols and flavonoids. Polyphenols hinder protein and starch digestibility and serve as metal chelators. Nevertheless, perilla phenols exhibit potent radical scavenging activity, functioning as effective hydrogen peroxide scavengers and robust antioxidants [32]. Consequently, both low and high phenol contents hold distinctive significance from a nutritional point of view. In the present study, significant variability in phenol content was observed, ranging from 0.03% (NL-5) to 0.87% (IC-0615381), with an average of 0.57% (Table S1). FRAP content (in %) also varied significantly, ranging from 0.42 (IC-0615391) to 1.32 (MN-13), with an average of 1.06. The genotypes with the highest FRAP content were MN-13 (1.32), IC-0615367 (1.07), and RCPS-421 (1.04). Based on earlier findings, genotypes with high phenol and FRAP contents could be targeted for the development of antioxidant-rich products, including functional foods, skincare formulations, dietary supplements, herbal medicines, and natural preservatives [33–36].

Minerals are essential for maintaining human health and facilitating metabolic processes, with 23 minerals playing key roles in various biochemical and physiological functions [37]. In this study, significant variability in mineral content was observed among perilla genotypes. Interestingly, MN-13 exhibited high concentrations of Co, Cu, Fe, K, Mn, and Mo, making it a valuable candidate for food fortification to address multiple micronutrient deficiencies. Further information about mineral variability and its significance in perilla germplasm is provided in the supplementary materials.

Multivariate Data Analysis (MVDA)

To extract the data structure and distribution patterns of biochemical traits as well as minerals in perilla germplasm, hierarchical cluster analysis (HCA) and principal component analysis (PCA) techniques were utilized in the present study. In addition, detailed statistical analysis of 27 key nutritional traits across perilla germplasm is depicted in supplementary Table S4.

HCA Analysis

The evaluated germplasm was organized into four clusters, representing their hierarchical grouping, and visualized using a dendrogram (Fig. 2A). Initially, each sample was considered an independent cluster, and clusters were subsequently merged by agglomerative grouping. The resulting clusters, along with the relative mean composition values of their nutritional attributes, are depicted in a normalised heat map (Fig. 2B), and the mean values of each trait in the four clusters are given in Table S3. Cluster I is characterized by germplasm with higher contents of oil (40.73%), phenols (0.63%), FRAP (1.15%), linoleic acid (19.16%), and linolenic acid (60.44%) and low values of palmitic acid, calcium, chromium, iron, potassium, magnesium, phosphorus, and zinc. Therefore, cluster I offers numerous applications across various industries, including cosmetics and skincare formulations, where the seed oil’s combination of high total phenols and linoleic acid can provide antioxidant protection and hydration to the skin, contributing to overall skin health.

Fig. 2

A. The dendrogram represents four clusters depicting perilla germplasm. Each cluster is color-coded for clarity: orange for cluster I, green for cluster II, red for cluster III, and purple for cluster IV. B. The evaluated germplasm is organized into four clusters, depicted hierarchically in a normalised heat map. The values are scaled from +1 to -1

Furthermore, the antioxidant-rich nature of this cluster makes it an excellent candidate for dietary supplements aimed at combating oxidative stress and inflammation, as well as for developing functional foods and pharmaceuticals [38].

Cluster II is characterized by higher content of moisture (7.23%), ash (5.10%), protein (19.72%), TSS (2.64%), FRAP (1.15%), palmitic acid (9.60%), oleic acid (10.99%), and several minerals including Al, Cu, Fe, K, Mg, Mn, Ni, P, and Zn (Table S3, Fig. 2B). Despite lower oil, linoleic, and linolenic acid contents, this cluster’s high protein and oleic acid levels make it ideal for nutritional supplements. Additionally, its high TSS, moisture, essential minerals, and optimal techno-functional properties make it suitable for functional foods, value-added industries, livestock feed, and fortifying products to combat nutrient deficiencies [39].

Cluster III is distinguished by high contents of oil (41.39%), linoleic acid (19.05%), and linolenic acid (60.37%) and by lower contents of ash, protein, starch, phenols, FRAP, and oleic acid (Table S3, Fig. 2B). The germplasm of this cluster can be utilized for producing high-quality cooking oils by blending with other oils and for promoting cardioprotective effects by reducing inflammation, lowering blood pressure, and improving cholesterol levels [28]. Cluster IV, with high oil (41.46%), starch (0.16%), moisture (6.10%), and linolenic acid (61.00%) contents, is ideal for diverse industrial applications. Its high oil content suits culinary uses, including blending with other oils for cooking and baking. Additionally, the linolenic acid-rich oil offers nutraceutical potential for dietary supplements and functional foods, supporting heart health and reducing inflammation [6, 8].
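The agglomerative grouping described in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: Ward linkage on Euclidean distances of z-scored traits is assumed, because the linkage method and distance metric are not specified here. The function name `cluster_germplasm` and the small data matrix are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_germplasm(X, n_clusters=4):
    """Standardize traits, build an agglomerative tree, and cut it into groups."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # z-score each trait (column)
    tree = linkage(Z, method="ward")                   # ASSUMED linkage; Euclidean distance
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return Z, labels

# Hypothetical data: 8 genotypes x 3 traits, built as four well-separated pairs.
X = np.array([
    [40.0, 18.0, 60.0], [40.5, 18.2, 60.3],
    [30.0, 23.0, 56.0], [30.4, 22.8, 56.2],
    [50.0, 12.0, 66.0], [50.3, 12.1, 65.8],
    [45.0, 20.0, 52.0], [45.2, 20.1, 52.3],
])
Z, labels = cluster_germplasm(X)

# Mean standardized value of each trait per cluster (the quantity a cluster heat map shows).
cluster_means = np.array([Z[labels == k].mean(axis=0) for k in np.unique(labels)])
print(labels)
```

Standardizing before clustering keeps traits measured on large scales (such as oil percentage) from dominating traits measured on small scales (such as phenols).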

PCA Analysis

Dominant factors exhibit high component loadings, indicating the significance of each variable to the respective PC. Positive loadings suggest a higher presence of the specific factor on the positive axis of the corresponding PC. In the present study, under PCA (A), oil (FL1: 0.45), phenols (FL2: 0.32), FRAP (FL2: 0.25), linolenic acid (FL1: 0.13, FL2: 0.600), starch (FL4: 0.62), and oleic acid (FL4: 0.41) showed positive factor loadings (Fig. 3A). In PCA (B), Al (FL1: 0.334, FL2: 0.271, FL4: 0.39), Co (FL2: 0.281), Cr (FL1: 0.369), Cu (FL1: 0.29), Fe (FL1: 0.333, FL4: 0.38), K (FL1: 0.320), Mg (FL1: 0.331), Mn (FL1: 0.322, FL2: 0.330), Na (FL3: 0.35), Ni (FL1: 0.26, FL2: 0.26), and Zn (FL3: 0.348) showed positive factor loadings (Fig. 3B). Conversely, negative loadings indicate a higher presence on the negative axis. Correspondingly, under PCA (A), moisture (FL1: -0.23, FL3: -0.45), ash (FL3: -0.57), oil (FL3: -0.32), protein (FL1: -0.43, FL4: -0.21), TSS (FL1: -0.36, FL4: -0.45), starch (FL1: -0.23), phenols (FL1: -0.33), FRAP (FL3: -0.370), palmitic acid (FL1: -0.38, FL2: -0.28, FL4: -0.29), oleic acid (FL3: -0.28), and linoleic acid (FL2: -0.52) showed negative loadings (Fig. 3A). Similarly, under PCA (B), minerals including Ca (FL3: -0.39, FL4: -0.28, FL5: -0.51), Co (FL4: -0.535), Fe (FL3: -0.274), Mg (FL2: -0.35), Mn (FL4: -0.31), Mo (FL2: -0.39), Na (FL5: -0.38), P (FL2: -0.35, FL3: -0.55), and Zn (FL2: -0.32) showed negative loadings.
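To illustrate how factor loadings of this kind arise, the sketch below computes PCA on z-scored traits through a singular value decomposition. It assumes that the reported loadings are unit-length eigenvector coefficients of the correlation matrix, which is not stated here. The function name `pca_loadings` and the data are hypothetical.

```python
import numpy as np

def pca_loadings(X, n_components=2):
    """PCA on the correlation matrix via SVD of z-scored data."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)   # share of total variance per component
    loadings = Vt[:n_components].T    # rows: traits, columns: PCs (unit-length vectors)
    scores = Z @ loadings             # genotype coordinates, as plotted in a biplot
    return loadings, scores, explained[:n_components]

# Hypothetical data: 6 genotypes x 3 traits (oil, protein, phenols); illustration only.
X = np.array([
    [40.0, 18.0, 0.60],
    [45.0, 15.0, 0.40],
    [35.0, 21.0, 0.75],
    [50.0, 13.0, 0.30],
    [38.0, 19.0, 0.65],
    [42.0, 17.0, 0.50],
])
loadings, scores, explained = pca_loadings(X)
print(np.round(loadings, 2))
```

The sign of each component is arbitrary, so only the relative signs of loadings within a PC are meaningful: traits with opposite signs on the same PC vary in opposite directions along that axis.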

Fig. 3

PCA biplots depicting the distribution of (A) nutritional parameters and (B) minerals in perilla germplasm comprising 45 samples

Correlation Analysis

In the present study, Pearson’s correlation test was used to evaluate the relationships among all the nutritional parameters in perilla seeds. Significance levels of p < 0.05, p < 0.01, and p < 0.001 are denoted by *, **, and ***, respectively; Pearson’s r > 0 indicates a positive correlation and r < 0 a negative correlation (Fig. 4). Factors such as genetic variation, bioclimatic variables, growth conditions, agronomic practices, soil type, genotype-environment-management (GEM) interactions, and geographical location may induce considerable variations in the nutritional and biochemical composition of seeds [39].
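The correlation test and asterisk notation described above can be sketched as follows, using `scipy.stats.pearsonr` for the coefficient and its two-sided p-value. The helper names `stars` and `correlate` and the oil and protein values are hypothetical.

```python
import numpy as np
from scipy.stats import pearsonr

def stars(p):
    """Map a p-value to the asterisk notation used for significance levels."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

def correlate(x, y):
    """Return Pearson's r, its two-sided p-value, and the significance label."""
    r, p = pearsonr(x, y)
    return float(r), float(p), stars(p)

# Hypothetical oil and protein contents (%) for eight genotypes; illustration only.
oil = np.array([30.0, 34.0, 38.0, 40.0, 42.0, 45.0, 48.0, 52.0])
protein = np.array([23.0, 21.5, 20.0, 18.5, 18.0, 16.0, 15.5, 13.0])
r, p, label = correlate(oil, protein)
print(f"r = {r:.2f}{label}")
```

The sign of r gives the direction of the relationship, while the asterisks report only how unlikely that r would be if the two traits were uncorrelated.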

Fig. 4

The correlation matrix depicts the values of Pearson’s r using a heat map. Pearson’s r ranges from +1 to -1, with darker colors representing positive correlations and warmer colors representing negative correlations. The significance levels are denoted by asterisks: *, **, and *** indicate p < 0.05, p < 0.01, and p < 0.001, respectively

Oil content was negatively correlated with protein content (r = -0.52), which can be attributed to competition for carbon skeletons during biosynthesis. High protein content implies a greater demand for carbon skeletons, leaving fewer available for lipid biosynthesis and thus reducing oil accumulation. Additionally, enzymes involved in protein synthesis might also compete for substrates required for lipid synthesis, further diminishing oil content. Oil content was also negatively correlated with TSS (r = -0.4), starch (r = -0.39), and phenols (r = -0.56). In agreement with earlier studies, these negative correlations may be due to high oil content altering the food matrix by binding to phenols and other bioactive components, limiting their availability [40, 41]. Additionally, high oil content can encapsulate TSS and starch molecules within the lipid phase, reducing their solubility and measured concentrations.

Linolenic acid was negatively correlated with palmitic acid (r = -0.52), oleic acid (r = -0.48), and linoleic acid (r = -0.76). These negative correlations can be explained by the competitive nature of fatty acid biosynthesis pathways and the regulatory mechanisms governing lipid metabolism. Palmitic acid (16:0) is synthesized through de novo fatty acid synthesis, which competes with the elongation and desaturation steps involved in the biosynthesis of polyunsaturated fatty acids (PUFAs) such as linolenic acid. Increased synthesis of palmitic acid may divert resources and substrates away from these pathways (for example, toward the synthesis of palmitoleic acid), resulting in a negative correlation. Oleic acid (18:1) is typically formed by the desaturation of stearic acid (18:0). This desaturation process, catalyzed by enzymes such as stearoyl-CoA desaturase, can compete with the enzymes involved in the desaturation of linoleic acid to form linolenic acid. Therefore, increased production of oleic acid may inhibit the biosynthesis of linolenic acid, leading to a negative correlation. Linoleic acid (18:2) is synthesized through the desaturation of oleic acid by delta-12 desaturase and is in turn converted to linolenic acid (18:3) by delta-15 desaturase. High delta-15 desaturase activity can enhance the production of linolenic acid, consequently depleting the pool of linoleic acid. This competition for enzymatic resources and substrate availability therefore results in a negative correlation between linolenic acid and linoleic acid levels in perilla seeds.

Protein content showed a positive correlation with calcium, copper, iron, potassium, magnesium, and manganese. These minerals are essential components and cofactors of numerous enzymatic proteins in plants. For instance, iron is integral to catalase, ferritin, superoxide dismutase (SOD), cytochromes, and electron transfer proteins involved in respiration and photosynthesis. It is also crucial for heme and iron-sulfur cluster proteins, which play significant roles in respiration and nitrogen assimilation processes [42]. Calcium activates regulatory proteins such as calmodulin, whereas copper serves as a cofactor of polyphenol oxidase (PPO), cytochrome c oxidase, and SOD and also helps in post-translational modifications [43]. Additionally, magnesium is vital for photophosphorylation and protein synthesis [44]. Furthermore, the TOPSIS analysis categorized the germplasm into five classes based on nutritional attributes and applications, with specific genotypes highlighted for each category, as detailed in the supplementary material section (Fig. S2).
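For readers unfamiliar with TOPSIS, the ranking step can be sketched as below. This is a generic textbook formulation with vector normalization and equal weights, both of which are assumptions. The criteria, weights, and classes actually used are given in the supplementary material and are not reproduced here. The function name `topsis` and the three genotypes are hypothetical.

```python
import numpy as np

def topsis(X, weights, benefit):
    """Score alternatives (rows) by relative closeness to the ideal solution."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                     # weights sum to 1
    benefit = np.asarray(benefit, dtype=bool)
    V = w * X / np.linalg.norm(X, axis=0)               # vector-normalized, weighted matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)           # distance to the ideal solution
    d_neg = np.linalg.norm(V - anti, axis=1)            # distance to the anti-ideal solution
    return d_neg / (d_pos + d_neg)                      # 1 = ideal, 0 = anti-ideal

# Hypothetical genotypes x criteria: linolenic acid (%), protein (%), palmitic acid (%).
X = np.array([
    [66.0, 20.0, 7.5],
    [60.0, 17.0, 8.5],
    [56.0, 14.0, 10.5],
])
# ASSUMED: equal weights; linolenic acid and protein are benefits, palmitic acid is a cost.
closeness = topsis(X, weights=[1, 1, 1], benefit=[True, True, False])
ranking = np.argsort(-closeness)  # row indices from best to worst
print(closeness, ranking)
```

Changing the weights or the benefit/cost designation of a criterion changes the ranking, so the same germplasm can be ranked differently for different end uses.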

Conclusions

The present study highlights the significant potential of perilla seeds across various sectors, driven by their rich content of essential nutrients such as oils, proteins, antioxidants, beneficial fatty acids, and minerals. The analysis revealed diverse genotypes with significant variability in nutritional parameters, leading to the categorization of germplasm into four distinct clusters, each with specific applications: Cluster I is suitable for skincare formulations and dietary supplements, Cluster II for cosmetics and livestock feed, Cluster III for cooking oils and skincare products, and Cluster IV for culinary and nutraceutical uses. Notable genotypes include IC-0615386 and NL-5 for heart health, IC-0615378 and RCPS-421 for functional foods, IC-0615365 and NL-46 for anemia treatment, and IC-0615365 and RCPS-400 for combating protein-energy malnutrition (PEM). Additionally, IC-0615367 and IC-0615391 were found to be optimal for designing food products based on their techno-functional properties. Future research should focus on developing novel cultivars with enhanced nutritional and functional attributes, improving yield and productivity, and addressing challenges such as oil rancidity and market demand.