Articles

Access the latest knowledge in applied science, electrical engineering, computer science and information technology, education, and health.

30,096 Article Results

Transformer-based Hindi image description and storytelling using enhanced attention and FastText embeddings

10.11591/ijai.v15.i2.pp1771-1782
Anjali Sharma , Mayank Aggrwal , Jitin Khanna
This work presents a novel image description generation framework that combines a Transformer-based encoder-decoder architecture with a custom squeeze-and-excitation (SE) attention block integrated into an EfficientNet feature extractor. The decoder uses FastText embeddings specifically trained for Hindi and is evaluated on the Microsoft common objects in context (MS-COCO) dataset. To improve the captioning process, the model incorporates a generative pre-trained transformer (GPT) module to generate narrative descriptions based on the initial captions and applies multiple similarity metrics to assess output quality. The proposed system significantly outperforms existing methods, achieving high bilingual evaluation understudy (BLEU) scores (BLEU-1 to BLEU-4: 83.24, 73.17, 64.56, and 58.22), a consensus-based image description evaluation (CIDEr) score of 81.41, an F1 score of 90.29, and a metric for evaluation of translation with explicit ordering (METEOR) score of 81.18, indicating strong caption accuracy. Furthermore, the model achieves low error rates, with a word error rate (WER) of 15% and a character error rate (CER) of 11%. This work highlights the challenges of applying large-scale datasets like MS-COCO to resource-limited languages and demonstrates the effectiveness of integrating FastText embeddings with transformer-based models for Hindi image captioning.
Volume: 15
Issue: 2
Page: 1771-1782
Publish at: 2026-04-01
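
The word error rate (WER) and character error rate (CER) quoted in the abstract above are conventionally defined as an edit distance divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names are hypothetical, and whitespace tokenization is an assumption that may not match how the paper tokenizes Hindi text.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute, or match if equal
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """CER: character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because both rates are normalized by the reference length, they can exceed 1.0 when the hypothesis contains many insertions.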

Energy-efficient virtual machine allocation using directional and boundary-aware bobcat optimization

10.11591/ijai.v15.i2.pp1286-1299
Nida Kousar Gouse , Gopala Krishnan Chandrasekaran
Cloud computing (CC) has gained significant traction due to its ability to deliver services in a scalable and adaptable manner, catering to diverse user requirements. However, in virtualization technology, one of the primary challenges is managing the energy consumption required to maintain service quality, as it directly impacts the operational expenses of data centers. To address this challenge, this research proposes a directional movement and boundary-aware strategy-based bobcat optimization algorithm (DMBABOA) for energy-efficient virtual machine (VM) allocation aimed at minimizing energy consumption in cloud environments. The directional search and boundary-aware correction enhance convergence and ensure feasible resource distribution. This ensures effective utilization of resources, improved virtualization management, and substantial energy savings. The experimental findings establish that the proposed DMBABOA optimizer reaches a minimum execution time of 134.48 s when the number of VMs is equal to 1,200 with 200 users, compared to existing methods such as the metaheuristic VM allocation approach to power efficiency of sustainable cloud environment (MV-PESC).
Volume: 15
Issue: 2
Page: 1286-1299
Publish at: 2026-04-01

Heart disease detection and classification using grid search with random forest

10.11591/ijai.v15.i2.pp1300-1315
Ramakrishna Reddy Badveli , Nijaguna Gollara Siddappa , Sundeep Kumar Kanipakapatnam
Cardiovascular disease (CVD), commonly referred to as heart disease, is a significant cause of mortality worldwide. Diagnosing heart disease is challenging because of the complexity of patient data, which includes multiple disease categories as well as irrelevant features, making it difficult to achieve high classification accuracy. This research proposes a grid search with random forest (GS-RF) approach, which effectively identifies heart disease and significantly enhances classification accuracy by fine-tuning the random forest (RF) model. It optimizes key hyperparameters such as the number of trees and the number of features, improving model performance. The chaotic maps-based dwarf mongoose optimization (CMDMO) is used for feature selection; it efficiently selects the relevant features and prevents the algorithm from getting trapped in local minima. Grid search ensures that resources are spent on finding the best model rather than on random, less efficient tuning. The proposed GS-RF model demonstrates high classification performance, achieving 99.43% accuracy on the Cleveland dataset and 99.10% accuracy on the Statlog dataset, thereby confirming its robustness and effectiveness across different datasets. The proposed approach is compared with existing classification techniques, such as support vector machine (SVM), to demonstrate its greater accuracy.
Volume: 15
Issue: 2
Page: 1300-1315
Publish at: 2026-04-01

Enhanced VGG-19 model for rice plant disease detection and classification

10.11591/ijai.v15.i2.pp1691-1700
Aye Thida Win , Khin Mar Soe , Myint Myint Lwin
Rice is the main staple food, and rice farming plays a crucial role in Myanmar's agriculture sector. It is also an essential pillar in generating foreign income. However, rice diseases seriously reduce rice production and quality. Early detection of rice diseases is one of the effective ways to reduce disease spread and increase yields. Most Myanmar farmers detect rice diseases based on visual judgment and their experience, which leads to delays in taking effective action. To overcome this challenge, we propose an enhanced rice plant disease classification model as a contribution of artificial intelligence (AI) to Myanmar's agriculture sector. The proposed model enhances the original visual geometry group 19 (VGG-19) by integrating the following algorithms: mixture of Gaussians 2 (MOG2), GrabCut, and relevance estimation with linear feature (RELIEF) for classification. It was trained on 6,326 rice plant images from Kaggle and Eastern Shan State and validated using 5-fold nested cross-validation. The data were split 80:20 for training and testing. The proposed model achieves an experimental result of 98.3% with the lowest standard deviation (0.004) across seven classes, compared with the original VGG-19, MobileNet, EfficientNet, and ResNet50. Future work will expand dataset diversity, enhance early-stage disease prediction, and support mobile diagnostics for real-world agricultural application.
Volume: 15
Issue: 2
Page: 1691-1700
Publish at: 2026-04-01

Automated classification of apple bruises from hyperspectral images: an approach for fruit quality assessment

10.11591/ijai.v15.i2.pp1381-1389
Peddireddy Venkateswara Reddy , Alaguchamy Parivazhagan
Apple bruise detection plays a crucial role in post-harvest quality control; however, conventional manual inspection remains labor-intensive, subjective, and unsuitable for large-scale industrial deployment. This study proposes an automated classification framework for identifying bruised regions in apples using hyperspectral imaging combined with deep learning and adaptive optimization techniques. The proposed model integrates a long short-term memory (LSTM) network optimized using an adaptive sand cat swarm optimization (ASCSO) algorithm, along with a ResNet-50 feature extraction backbone. The adaptive behavior embedded within ASCSO dynamically adjusts the optimization parameters to enhance convergence and prevent premature stagnation during LSTM hyperparameter tuning. Hyperspectral images were processed to extract relevant spectral–spatial features, which were subsequently fed into the optimized classifier. Experimental evaluations demonstrate that the proposed hybrid model significantly outperforms conventional and baseline deep learning approaches, achieving a classification accuracy of 98.0% while maintaining robustness across varying bruise patterns and intensity levels. The results highlight the effectiveness of combining hyperspectral imaging with adaptive deep learning optimization for high-precision fruit quality assessment. This research contributes a reliable, scalable solution for automated bruise detection and quality grading in the fruit supply chain, offering strong potential to reduce post-harvest losses and improve operational efficiency in the agro-food industry.
Volume: 15
Issue: 2
Page: 1381-1389
Publish at: 2026-04-01

A deep learning-based approach for hearing loss detection

10.11591/ijai.v15.i2.pp1701-1708
Deepa Deepa , Manjula Gururaj Rao
Millions of people across the world are affected by hearing loss, and early detection is very important for effective intervention. Traditional hearing screening methods are effective, but they often rely on specialized equipment and clinical resources, making them less accessible to the general population. Hearing loss is a condition that affects the ability to communicate and interact socially, as well as overall quality of life. Advancements in recent years have aimed to enhance the accessibility and efficiency of hearing tests, mainly in remote areas. The accurate classification of hearing loss is essential for effective detection and treatment in audiology. This study presents a deep learning (DL) approach based on a feedforward neural network (FNN). It focuses on common causes such as cerumen impaction, otitis media, and otosclerosis, and explores ways to improve the diagnosis of hearing loss. The goal is to develop solutions that make hearing screenings more accessible and cost-effective for populations with limited access to healthcare resources. The results show the advantages of DL models in supporting automated, accurate classification of hearing loss for intelligent diagnostic systems in audiological healthcare.
Volume: 15
Issue: 2
Page: 1701-1708
Publish at: 2026-04-01

Gradient-based stochastic depth with convolutional neural network for coconut tree leaf disease classification

10.11591/ijai.v15.i2.pp1155-1165
Kavitha Magadi Gopalakrishna , Raviprakash Madenur Lingaraju , Ananda Babu Jayachandra
The coconut palm (Cocos nucifera) is a vital plantation crop, valued for its many uses, from its fruit to its trunk. In recent times, it has been observed that many coconut trees are affected by diseases that reduce production and weaken the strength of the coconut. The classification of coconut leaf diseases is challenging because of intra-class and inter-class variability. To overcome these challenges, this research introduces the gradient-based stochastic depth (GSD) with convolutional neural network (CNN) technique for coconut leaf disease classification. The GSD technique is incorporated into every layer of the CNN, where it calculates the probability using gradient magnitudes and skips layers that contribute minimally to the classification. The images are segmented using the GrabCut segmentation algorithm, which isolates the leaf from the background using graph-based segmentation, helping to differentiate between various disease classes. The GSD with CNN algorithm obtains an accuracy of 96.42%, precision of 96.15%, recall of 95.87%, and F1-score of 95.93% in comparison with existing algorithms.
Volume: 15
Issue: 2
Page: 1155-1165
Publish at: 2026-04-01

Genetic algorithm for generalized time-window assignment problem

10.11591/ijai.v15.i2.pp1261-1274
Ali Kansou , Bilal Kanso , Houssein Wehbe , Haydar Bazzi , Ali Mcheik
This paper presents a hybrid genetic algorithm (GA) for the generalized time-window assignment problem (GTWAP), a complex artificial intelligence (AI) scheduling challenge that involves assigning agents to resources under strict temporal and capacity constraints. Our method integrates problem-specific heuristics and a repair mechanism to generate feasible, high-quality solutions. We provide a mathematical formulation for GTWAP and introduce a new public benchmark set, using CPLEX to obtain exact solutions. Computational experiments demonstrate that our GA is highly competitive with CPLEX, often matching its performance. This effectiveness makes our method a practical and scalable AI-driven tool for complex scheduling in domains like logistics and healthcare.
Volume: 15
Issue: 2
Page: 1261-1274
Publish at: 2026-04-01

Revolutionising essay writing: a systematic review of Google Gemini

10.11591/ijai.v15.i2.pp1839-1850
Shirley Ling Jen , Abdul Rahim Salam , Hamidah Mat , Wong Wei Lun
The emergence of generative artificial intelligence (GenAI) has significantly impacted essay writing in the education sector. This study focuses on Google Gemini as a viable alternative to ChatGPT. A systematic literature review (SLR) was conducted using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) method to investigate existing research on Gemini and its application in essay writing. The review examined articles published from 2022 to August 2024. It focuses on the years, research design, population, and learning theories involved in the use of Gemini. Several stages of the PRISMA method were implemented to filter and collect relevant information, resulting in a comprehensive analysis of articles discussing Gemini’s role in essay writing across various publication platforms. The findings highlight the functions of Gemini in essay writing and provide valuable insights for researchers and practitioners in language teaching and learning. This research aims to enhance understanding and promote the effective use of Google Gemini in education.
Volume: 15
Issue: 2
Page: 1839-1850
Publish at: 2026-04-01

Development of rough set based machine learning approach to screen breast cancer

10.11591/ijai.v15.i2.pp1982-1998
Sangeetha Sivakumar , Shakeela Sathish , Debabrata Datta
One of the major causes of death for women is breast cancer. A substantial number of women diagnosed with breast cancer die due to inaccuracies in diagnosis and delays in treatment. Cancer prediction must be accurate in order to improve treatment quality and patient survival rates. This study evaluates logistic regression (LR), decision tree algorithm (DTA), and adaptive boosting (AdaBoost, or AB, an ensemble learning algorithm) in conjunction with rough set theory (RST) to enhance breast cancer classification using the Wisconsin diagnosis breast cancer dataset (WDBC). By employing rough set approximations, including the upper and lower bounds of features, this study introduces a novel rough AdaBoost (Rough AB) algorithm to improve classification accuracy. Various performance indices are compared across algorithms. The proposed Rough AB algorithm demonstrated superior performance, particularly in prediction accuracy for both benign and malignant cases. It incorporates roughness to determine the starting node of the decision stump, offering a significant improvement in ensemble learning techniques for medical diagnostics. The approach has practical implications for clinical decision-making, potentially enabling more reliable and timely breast cancer diagnoses, which can significantly impact patient outcomes. The proposed method leverages rough set approximations to refine feature selection and improve prediction accuracy. The study also positions RST as an explainable artificial intelligence (XAI) technique, highlighting its interpretability, ethical transparency, and potential integration with deep learning for clinical deployment.
Volume: 15
Issue: 2
Page: 1982-1998
Publish at: 2026-04-01

Semantic-syntactic graph network for aspect-based sentiment analysis

10.11591/ijai.v15.i2.pp1814-1824
Rekha Bdurga Harish , Neelambike Siddalingaiah
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that identifies sentiment polarities toward specific aspects within a sentence. While conventional models have achieved progress, they often neglect to jointly consider both semantic context and syntactic structure, limiting performance in complex linguistic scenarios. Likewise, most recent graph convolutional network (GCN)-based approaches have focused on either semantic or syntactic information individually, leading to suboptimal sentiment classification accuracy. Hence, this work aims to design an effective ABSA model that simultaneously captures both semantic relationships and syntactic dependencies for enhanced aspect-level sentiment analysis. To address these issues of GCN-based approaches, this work proposes a model called sentiment semantic syntactic network (SentSemSynNet), which constructs a unified graph by integrating semantic and syntactic features and applies graph neural networks to learn rich, aspect-specific representations. The model was evaluated on the SemEval2014 restaurant and laptop datasets. It achieved 88.25% accuracy and 82.95% macro-F-score for restaurant, and 84.52% accuracy and 80.26% macro-F-score for laptop. The model’s unique integration of both semantic and syntactic importance through a unified graph structure improved sentiment detection accuracy.
Volume: 15
Issue: 2
Page: 1814-1824
Publish at: 2026-04-01
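
The macro-F-score reported alongside accuracy in the abstract above is usually computed as the unweighted mean of per-class F1 scores, so rare sentiment classes count as much as frequent ones. The sketch below illustrates that standard definition only; `macro_f1` and the example labels are hypothetical and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        # A class with no true positives contributes an F1 of 0.
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical three-class sentiment labels for illustration.
gold = ["pos", "pos", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu"]
score = macro_f1(gold, pred)
```

In this toy example the per-class F1 values are 2/3, 2/3, and 1, so the macro average is 7/9, while plain accuracy is 3/4; the two metrics can diverge further on imbalanced data.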

Efficient text detection and recognition in natural scene images using novel blended ensemble deep learning

10.11591/ijai.v15.i2.pp1664-1679
Rajeswari Reddy Patil , Aradhana Dammergidda
Text detection and recognition in natural scene images is a critical task in computer vision, with applications ranging from document analysis to autonomous navigation. This work presents a robust and efficient pipeline that integrates YOLOv8 for text detection and EasyOCR for recognition, enhanced by an adaptive preprocessing mechanism between the two stages. The YOLOv8 model is trained on a custom dataset with polygonal annotations converted into YOLO format, which ensures precise bounding boxes around the text regions. An adaptive preprocessing module dynamically optimizes the detected regions by adjusting resolution, noise reduction, and orientation before passing them to EasyOCR, significantly improving robustness. The lightweight yet powerful EasyOCR engine then recognizes text across diverse fonts, styles, and orientations. Evaluated on the benchmark Total-Text dataset, the proposed method demonstrates superior performance in detection accuracy, recognition precision, and computational efficiency. Additionally, this work provides a detailed analysis of training metrics to validate the model’s robustness. The proposed system is scalable and can be integrated into real-time applications such as license plate recognition, document digitization, and assistive technologies for the visually impaired.
Volume: 15
Issue: 2
Page: 1664-1679
Publish at: 2026-04-01

Session click sentiment behavior aware personalized recommendations system

10.11591/ijai.v15.i2.pp1539-1547
Suraj Bevinahalli Suresh , Padma Muthalambikashettahally Cheluvegowda
Session-based recommendations use the short-term behavior of users to provide personalized suggestions to consumers on e-commerce platforms. However, the performance of traditional session-based recommendations is significantly impacted by cold-start issues for newly joined users and by sparsity issues, where not enough short-term behavior is available. Deep learning (DL) models such as recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and graph neural networks (GNNs) have been employed to capture session clicks and enhance product recommendation accuracy. However, current methods are significantly affected by the gradient descent problem in reaching convergence for top-K product recommendation. Further, current methods fail to capture product sentiment across inter-session and intra-session clicks. To address these research problems, this work introduces a session click sentiment behavior aware (SCSBA) personalized recommendation system using a novel inter and intra session (IIS)-LSTM model. Finally, the objective function for recommending the top-K items to users is handled by the optimized Bayesian personalized ranking (OBPR) algorithm. Experimental results show that the SCSBA model achieves much better performance than state-of-the-art models on the standard Tmall dataset.
Volume: 15
Issue: 2
Page: 1539-1547
Publish at: 2026-04-01
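
The abstract above ranks top-K items with an optimized Bayesian personalized ranking (OBPR) objective, whose details are not given here. As background only, the sketch below shows the standard BPR pairwise loss that such variants typically start from: the negative log-sigmoid of the score gap between a clicked item and an unclicked one. The function name and the example scores are hypothetical, and this is not the paper's OBPR formulation.

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean standard BPR loss: -log(sigmoid(pos - neg)) over score pairs."""
    if len(pos_scores) != len(neg_scores) or not pos_scores:
        raise ValueError("need equally sized, non-empty score lists")
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        diff = pos - neg
        # -log(sigmoid(d)) equals log(1 + exp(-d)); this form avoids overflow.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)


# Hypothetical model scores for (clicked item, sampled unclicked item) pairs.
well_ranked = bpr_loss([5.0, 4.0], [0.0, 1.0])
badly_ranked = bpr_loss([0.0, 1.0], [5.0, 4.0])
```

The loss shrinks toward zero as clicked items are scored further above unclicked ones, and it equals log 2 when the two scores tie.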

A sequential attention-enhanced deep learning framework for robust potato leaf disease diagnosis under real field conditions

10.11591/ijai.v15.i2.pp1790-1803
Watcharkorn Yoochomboon , Nithizethe Mhuadthongon , Piyaporn Krachodnok
Diagnosing potato leaf diseases from images collected in real-life field settings is challenging, mainly because of uneven lighting, complex backgrounds, and disease symptoms that are often subtle or visually inconsistent. In this study, a deep learning-based framework was developed to support potato leaf disease diagnosis, with particular attention given to improving generalization and interpretation. Several convolutional neural network (CNN) architectures were first examined under the same experimental conditions, and ResNeXt-50 showed the most stable overall performance. The model was then extended by applying efficient channel attention (ECA), followed by spatial attention adapted from the convolutional block attention module (CBAM). Test results indicate that this sequential attention design performs better than the baseline model as well as variants using only a single attention mechanism. Additional evaluation using 300 real-field images collected under different field conditions suggests improved robustness, while visualization results from gradient weighted class activation mapping (Grad-CAM) show clearer focus on lesion-related regions. Overall, the findings suggest that combining channel wise and spatial attention can improve both prediction reliability and interpretability, making the approach suitable for practical agricultural use.
Volume: 15
Issue: 2
Page: 1790-1803
Publish at: 2026-04-01

Unimodal and multimodal techniques for depression diagnosis: a comprehensive survey

10.11591/ijai.v15.i2.pp1947-1954
Swathy Jayasree , Yashawini Sridhar
Depression is a common and major mental health condition that affects individuals across all age groups and backgrounds, severely reducing their physical, emotional, and cognitive functioning. It goes beyond typical mood swings and requires a timely and accurate diagnosis to prevent severe consequences such as suicidal tendencies, self-harm, and long-term mental decline. The improving performance of deep learning and machine learning techniques has significantly enhanced the speed and accuracy of depression diagnosis using both unimodal and multimodal features. This study gives a comprehensive overview of the unimodal and multimodal methods used to diagnose depression in its early stages. Additionally, this survey summarizes the datasets, methods, and limitations of previous work in the domain of depression diagnosis and serves as a suitable reference for future analysis.
Volume: 15
Issue: 2
Page: 1947-1954
Publish at: 2026-04-01
