Articles

Access the latest knowledge in applied science, electrical engineering, computer science and information technology, education, and health.

30,376 Article Results

Javanese and Sundanese speech recognition using Whisper

10.11591/csit.v6i3.p253-261
Alim Raharjo , Amalia Zahra
Automatic speech recognition (ASR) technology is essential for advancing human-computer interaction, particularly in a linguistically diverse country like Indonesia, where approximately 700 native languages are spoken, including widely used languages like Javanese and Sundanese. This study leverages the pre-trained Whisper Small model, an end-to-end transformer pretrained on 680,000 hours of multilingual speech, fine-tuning it specifically to improve ASR performance for these low-resource languages. The primary goal is to increase transcription accuracy and reliability for Javanese and Sundanese, which have historically had limited ASR resources. Approximately 100 hours of speech from OpenSLR were selected, covering both reading and conversational prompts. The data exhibited dialectal variation, ambient noise, and incomplete demographic metadata, necessitating normalization and fixed-length padding. Models were evaluated using the word error rate (WER) metric. Unlike approaches that combine separate acoustic encoders with external language models, Whisper's unified architecture streamlines adaptation for low-resource settings. Evaluated on held-out test sets, the fine-tuned models achieved WERs of 14.97% for Javanese and 2.03% for Sundanese, substantially outperforming baseline systems. These results demonstrate Whisper's effectiveness in low-resource ASR and highlight its potential to enhance transcription accuracy, support language preservation, and broaden digital access for underrepresented speech communities.
Volume: 6
Issue: 3
Page: 253-261
Publish at: 2025-11-01
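
The word error rate (WER) used as the evaluation metric in the abstract above is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5. Corpus-level WER is usually computed by summing edits and reference words over all utterances rather than averaging per-utterance values.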

Hybrid feature fusion from multiple CNN models with bayesian-optimized machine learning classifiers

10.11591/csit.v6i3.p315-325
Dewi Rismawati , Sugiyarto Surono , Aris Thobirin
Information technology advancements have created big data, necessitating efficient techniques to retrieve helpful information. With its capacity to recognize and categorize patterns in data, especially the growing amount of image data, deep learning is becoming a viable option. This research aims to develop a medical image classification model using chest X-ray images with four classes: Covid-19, Pneumonia, Tuberculosis, and Normal. The proposed method combines the advantages of deep learning and machine learning. Three pre-trained CNN models, VGG16, DenseNet201, and InceptionV3, extract features from images. The features generated from each model are fused to enhance the relevant information. Furthermore, principal component analysis (PCA) was applied to reduce the dimensionality of the features, and Bayesian optimization was used to optimize the hyperparameters of three machine learning algorithms: support vector machine (SVM), decision tree (DT), and k-nearest neighbors (k-NN). The resulting classification model was evaluated based on accuracy, precision, recall, and F1-score. The results showed that the proposed model, FF-SVM, achieved an accuracy of 98.79% with precision, recall, and F1-score of 98.85%, 98.82%, and 98.84%, respectively. In conclusion, fusing features extracted from multiple CNN models improved the classification accuracy of each machine learning model and provided reliable and accurate predictions for lung image diagnosis using chest X-ray images.
Volume: 6
Issue: 3
Page: 315-325
Publish at: 2025-11-01

Optimizing energy distribution efficiency in wireless sensor networks using the hybrid LEACH-DECAR algorithm

10.11591/csit.v6i3.p262-273
Muhammad Abyan Nizar Muntashir , Vera Noviana Sulistyawan , Noor Hudallah
A wireless sensor network (WSN) is a network system consisting of various supporting components that relay information to a base station. In operation, data delivery is strongly influenced by energy usage: battery supply is limited, and energy consumption varies with node activity, communication distance, and environmental conditions. To increase performance and energy efficiency, a routing protocol is therefore required that selects the best path through a cluster head (CH). Determining the CH based on energy avoids irregular (random) selection. In this study, the hybrid routing protocol selects the CH based on remaining energy, considering distance, coverage radius, and energy metrics. The system evaluation compares low-energy adaptive clustering hierarchy (LEACH) with the hybrid LEACH-distributed, energy and coverage-aware routing (DECAR) protocol. The results over 300 rounds show that the hybrid achieves a packet delivery ratio close to 100% and a throughput of 78.22 Kbps, while LEACH achieves a packet delivery ratio of 92.18% and a throughput of 247.15 Kbps. The average energy consumption of LEACH is 99.27%, while the hybrid shows much greater efficiency at 30.55%. This study emphasizes the significance of balancing performance and energy consumption in the development of future routing protocols.
Volume: 6
Issue: 3
Page: 262-273
Publish at: 2025-11-01

Predictive model for high-risk healthcare clients and claims frequency

10.11591/csit.v6i3.p346-354
Lenias Zhou , Mainford Mutandavari , Lucia Matondora
Global healthcare spending surged to approximately USD 9.8 trillion in the aftermath of the COVID-19 pandemic, intensifying the need for effective risk management strategies in healthcare insurance. This study proposes a predictive model designed to identify high-risk clients for timely targeted interventions and to forecast claims frequency for optimized resource allocation. A real-world claims dataset from a healthcare insurance provider was utilized. Bayesian optimization was employed to enhance data labelling. A deep learning (DL) model with sigmoid activation was used to classify high-risk clients, while a regression model forecasted claims frequency. The model was trained and validated, and gave an accuracy of 97%, a precision of 95.2%, a recall of 98.1% and an F1-score of 96.6%. The results confirmed the model’s accuracy in identifying high-risk clients and its ability to provide reliable forecasting of future claims frequency. Importantly, the model also provided the reason behind its classification decision, enhancing transparency and trust. This research provides valuable data-driven insights to both the healthcare insurers and clients, giving them the power to stay ahead in managing key risks, which ultimately reduces the cost of healthcare insurance. This work contributed a scalable and interpretable solution for risk prediction in healthcare insurance.
Volume: 6
Issue: 3
Page: 346-354
Publish at: 2025-11-01
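
The F1-score reported in the abstract above can be cross-checked against the reported precision and recall, because F1 is their harmonic mean. The snippet below is a small sketch of that arithmetic only; the function name `f1_score` is illustrative and nothing here reproduces the authors' model or data.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: precision 95.2%, recall 98.1%.
reported_f1 = f1_score(0.952, 0.981)
```

With these inputs the result is roughly 0.966, which is consistent with the 96.6% F1-score stated in the abstract.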

Multi-visual modality for collaborative filtering-based personalized POI recommendations

10.11591/ijeecs.v40.i2.pp978-987
Sudarat Arthan , Kreangsak Tamee
Point-of-interest (POI) recommendation systems help users discover locations that match their interests. However, these systems often suffer from data sparsity due to limited user check-in history. To address this challenge, this study proposed a novel user profiling framework that incorporates multiple visual modalities derived from user-generated photos. Three types of visual-based user profiles were constructed: image label-based, image feature-based, and a fused profile, combining both modalities through score-level fusion. We conducted extensive experiments on two real-world datasets. The results demonstrate that visual-based profiles, particularly the image feature-based profile, consistently improve recommendation performance under sparse data conditions. Although the fused profile offered stable results, it did not consistently outperform the single modality. Furthermore, performance was sensitive to the number of nearest neighbors and the amount of training data. These findings highlight the importance of modality selection and fusion strategy in visual-based POI recommendation systems.
Volume: 40
Issue: 2
Page: 978-987
Publish at: 2025-11-01

Automated defect detection in submersible pump impellers using image classification

10.11591/ijeecs.v40.i2.pp1158-1166
Deepa Somasundaram , V. Pramila , G. Ezhilarasi , D. Lakshmi , P. Kavitha , R. Kalaivani
Casting is a crucial manufacturing process used to produce complex metal parts, but it is often plagued by defects such as cracks, porosity, shrinkage, and cold shuts, which can compromise quality and lead to financial losses. Traditional visual inspection methods for detecting these defects are slow and prone to human error, making them inefficient for large-scale production. This project proposes automating the defect detection process using advanced AI-powered non-destructive testing (NDT) techniques. Specifically, convolutional neural networks (CNNs), a deep learning model, are employed for real-time visual inspection of castings. CNNs, trained on high-resolution images, can accurately identify surface defects such as cracks, scratches, and dimensional irregularities, significantly improving inspection speed and accuracy. The performance metrics of the system include defect detection accuracy, false positive and false negative rates, processing time, and scalability for high-volume production environments. By minimizing human intervention, this automated system reduces error rates, enhances product quality, and lowers production costs. Furthermore, the real-time capabilities of CNNs allow for rapid feedback, preventing defective parts from advancing through the production line. Overall, the integration of AI-based vision systems boosts efficiency, sustainability, and profitability in manufacturing, ensuring castings meet customer specifications with minimal errors.
Volume: 40
Issue: 2
Page: 1158-1166
Publish at: 2025-11-01

Enhancing the ternary neural networks with adaptive threshold quantization

10.11591/ijeecs.v40.i2.pp700-706
Son Ngoc Truong
Ternary neural networks (TNNs) with weights constrained to –1, 0, and +1 offer an efficient deep learning solution for low-cost computing platforms such as embedded systems and edge computing devices. These weights are typically obtained by quantizing the real-valued weights during the training process. In this work, we propose an adaptive threshold quantization method that dynamically adjusts the threshold based on the mean of the weight distribution. Unlike fixed-threshold approaches, our method recalculates the quantization threshold at each training epoch according to the distribution of real-valued synaptic weights. This adaptation significantly enhances both training speed and model accuracy. Experimental results on the MNIST dataset demonstrate a 2.5× reduction in training time compared to conventional methods, with a 2% improvement in recognition accuracy. On the Google Speech Command dataset, the proposed method achieves an 8% improvement in recognition accuracy and a 50% reduction in training time compared to fixed-threshold quantization. These results highlight the effectiveness of adaptive quantization in improving the efficiency of TNNs, making them well-suited for deployment on resource-constrained edge devices.
Volume: 40
Issue: 2
Page: 700-706
Publish at: 2025-11-01
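
The abstract above describes recomputing the ternary quantization threshold from the weight distribution at each epoch. One way such a rule could look is sketched below. This is an illustration under stated assumptions, not the paper's method: the threshold is assumed to be a fixed fraction of the mean absolute weight, and both the function name `ternarize` and the factor `scale=0.7` are hypothetical choices.

```python
def ternarize(weights, scale=0.7):
    """Map real-valued weights to {-1, 0, +1} with a data-dependent threshold.

    Assumption (not from the abstract): threshold = scale * mean(|w|).
    Calling this once per epoch on the current real-valued weights makes
    the threshold follow the weight distribution as training progresses.
    """
    if not weights:
        return [], 0.0
    threshold = scale * sum(abs(w) for w in weights) / len(weights)
    ternary = [
        1 if w > threshold else -1 if w < -threshold else 0
        for w in weights
    ]
    return ternary, threshold


# Toy weights: mean |w| = 0.45, so the threshold is 0.7 * 0.45 = 0.315.
quantized, used_threshold = ternarize([0.9, -0.8, 0.05, -0.1, 0.4])
```

Here the two small weights fall inside the threshold band and become 0, while the remaining three keep only their sign. A fixed-threshold scheme would instead pass a constant in place of the computed `threshold`.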

Maximizing QoS in railway radio networks: leaky cable and ray-tracing for optimal BER on bridges

10.11591/ijeecs.v40.i2.pp678-686
Maksim Sidorovich , Ponomarchuk Yulia
The future railway mobile communication system (FRMCS) standard is crucial for advancing railway communication and implementing intelligent train control systems. This research focuses on development of an efficient modeling method to evaluate and optimize FRMCS performance on railway bridges, particularly under high-density modulation and radio noise interference. The key aspect of this study involves computer modeling of the deployment of a leaky coaxial cable (LCX) and comparison of its performance to traditional methods of radio coverage modeling. Using the single-slot radiation pattern, we evaluate the quality of radio communication by comparison of the bit error rate (BER) metrics for the Ray Tracing propagation model with and without the use of LCX. The results show that the use of LCX significantly reduces BER values, providing a much clearer and more reliable signal. This improvement is crucial for the safety and reliability of railway operations, ensuring effective communication for train control and reducing the risk of accidents in complex and high-demanding transport networks. This research contributes to the optimization of railway information infrastructure, with the aim of ensuring safe, reliable, and efficient operations.
Volume: 40
Issue: 2
Page: 678-686
Publish at: 2025-11-01

MQTT live performance on the INA-CBT communication system: a measurement-based evaluation

10.11591/ijeecs.v40.i2.pp687-699
A. A. N. Ananda Kusuma , Tahar Agastani , Rifqi F. Giyana , Sakinah P. Anggraeni , Arfan R. Hartawan , Toto B. Palokoto , Widrianto S. Pinastiko
Cable-based tsunameters have been deployed in Indonesia under the name of the INA-CBT project. Currently, the system operated at the Labuan Bajo landing station works well and sends aggregated data from the seafloor sensors to a central or read down station in Jakarta for further processing. The current scheme makes use of a publish and subscribe indirect communication among the landing station (LS) as the publisher and various clients as subscribers for the sensor data. Message queue telemetry transport (MQTT) was selected as the application-layer protocol for implementing this communication scheme. This paper presents a measurement-based evaluation of the MQTT live performance by observing the MQTT messages’ latencies received at the subscriber of the INA-CBT’s MQTT broker. The results give insight on the general achievable performance of the INA-CBT communication system in providing reliable data for the tsunami detection system. Furthermore, the results obtained can be used as communication parameters for making a more realistic virtual testbed for designing a more appropriate and scalable CBT system.
Volume: 40
Issue: 2
Page: 687-699
Publish at: 2025-11-01

Exploring stock price portfolio clusters in foreign exchange markets

10.11591/ijeecs.v40.i2.pp735-744
Challa Madhavi Latha , S. Bhuvaneswari , K. L. S. Soujanya , A. Poongodai
This study explores a novel portfolio management approach dividing the currency pairs into clusters of periodic returns. The primary purpose is to improve diversification and risk-return ratios with currencies. This research studied USD, Euro, and Chinese Yuan to collect historical data from April 2012 to March 2022. The present study makes use of K-means clustering to find clusters of assets with similar return patterns, which constitute diversified portfolios. Optimized portfolio vs. benchmark portfolio performance was also evaluated based on critical performance measures like cumulative return, Sharpe ratio, and volatility. The clustering approach was also tested through sensitivity analysis to check how market-specific it is. The results suggest that more clustered portfolios outperform traditional benchmarks and provide a better risk-adjusted return. The conclusion drawn here from the findings is that portfolio segmentation is a superior approach because of risk management in ever-changing volatile markets and identifying situations that link currency pairs. This is beneficial for those investors and portfolio managers looking to maximize their foreign exchange (FOREX) investments by allowing greater visibility into how the market is functioning, which can, in turn, improve decision-making processes. According to the study, portfolio clustering substantially enhances a portfolio's return for the foreign exchange market.
Volume: 40
Issue: 2
Page: 735-744
Publish at: 2025-11-01

Development and integration of a privacy computing gateway for enhanced interoperability

10.11591/ijeecs.v40.i2.pp1011-1022
Akhila Reddy Yadulla , Vinay Kumar Kasula , Bhargavi Konda , Mounica Yenugula , Supraja Ayyamgari
A new privacy computing gateway design is proposed as a solution for secure and efficient interoperability between heterogeneous platforms. The growing importance of data privacy, along with rising collaborative data analysis operations, creates an immediate need for standardized privacy-preserving frameworks that are adaptable to diverse situations. A three-layered architecture consisting of application, protocol, and communication layers is supported by an adaptation mechanism designed for compatibility between separate privacy computing systems. The framework is tested using standard machine learning methods together with horizontal and vertical federated learning, with diverse data quantities and feature distribution patterns. The gateway achieves satisfactory model performance and protects data privacy while maintaining platform interoperability. Area under the curve (AUC) and F1 score metrics show that the proposed system reaches performance equivalent to centralized models when operating within privacy-limited environments. The research introduces an effective solution for securing cross-platform data sharing that will enable secure inter-sector collaboration in finance, healthcare, and government applications.
Volume: 40
Issue: 2
Page: 1011-1022
Publish at: 2025-11-01

Impact of artificial light color on microgreen green spinach growth in an IoT-controlled environment

10.11591/ijeecs.v40.i2.pp619-628
Fadhil Azmi Ihsan , Devi Fitrianah
This study investigates the effect of different artificial light colors (red-blue and white) on the growth of green spinach microgreens in an internet of things (IoT) based controlled environment with integrated sensors: DHT22 for temperature and humidity, and YL-69 for soil moisture. The experiment compared plant growth in two lighting scenarios over 10 days, evaluating parameters including plant height and number of leaves. Results indicate that spinach microgreens grown under red-blue LED light achieved a slightly higher average height (4.6 cm) and more leaves (50) than those grown under white LED light (average height 4.5 cm, 36 leaves). Although the difference between the two lighting conditions appears minor, a t-test was conducted to determine statistical significance. The results show that the difference in the number of leaves is statistically significant, suggesting that morphological responses, particularly leaf growth, take precedence over vertical stem elongation as an adaptive strategy to optimize environmental conditions.
Volume: 40
Issue: 2
Page: 619-628
Publish at: 2025-11-01

Improved YOLOv8 for rail squat detection and identification

10.11591/ijeecs.v40.i2.pp1129-1140
Van-Dinh Do , Phuong-Ty Nguyen , Minh-Tuan Ha
Rail transport plays a vital part in the country's economy by ensuring the safe movement of both goods and passengers. Therefore, maintaining rail safety through consistent surface defect inspection is extremely important. However, squat defect detection on rail surfaces faces considerable difficulties due to weather impacts, lighting changes, and variations in image contrast. These challenges hinder the accuracy and reliability of traditional inspection methods. To solve this problem, this study proposes an improved YOLOv8 model for the identification and classification of squat defects. The methodology involves capturing images of the rail track, preprocessing them to enhance image quality, labeling squat defects for training purposes, and training the proposed model using the labeled dataset. The improved YOLOv8 model incorporates enhancements such as multi-scale convolution modules and attention mechanisms to improve feature extraction and defect recognition. Experimental results demonstrate the effectiveness of the proposed method, achieving an accuracy of 0.92 in detecting and categorizing squat defects. These findings highlight the potential of the proposed approach to enhance railway safety by providing a reliable and efficient solution for rail surface inspection.
Volume: 40
Issue: 2
Page: 1129-1140
Publish at: 2025-11-01

Precision in 3D positional forecasting with machine learning and deep temporal architectures

10.11591/ijeecs.v40.i2.pp601-609
P. Sirish Kumar , V. B. S. Srilatha Indira Dutt , Sai Kiran Oruganti
We present a comparative analysis of traditional machine learning (ML) models, random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB), and deep learning (DL) architectures, convolutional neural networks (CNN), and bidirectional long short-term memory (BiLSTM) for high-precision 3D positional forecasting. Conventional approaches often underperform when modeling complex spatiotemporal dependencies, limiting their use in dynamic systems such as robotics and autonomous vehicles. This study highlights BiLSTM's advantage in learning bidirectional temporal features, achieving superior R² scores and stable prediction intervals compared to both classical ML and spatially-focused CNN models. Uncertainty metrics, prediction interval coverage probability (PICP), and mean prediction interval width (MPIW) provide additional insight into model reliability. Experiments on a 22-hour GPS dataset confirm that BiLSTM achieves both high accuracy and predictive confidence, underscoring its suitability for real-world trajectory forecasting.
Volume: 40
Issue: 2
Page: 601-609
Publish at: 2025-11-01
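
The two uncertainty metrics named in the abstract above have standard definitions: prediction interval coverage probability (PICP) is the fraction of true values that fall inside their prediction intervals, and mean prediction interval width (MPIW) is the average interval width. The sketch below follows those common definitions; the function names and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
def picp(y_true, lower, upper):
    """Fraction of true values lying inside [lower, upper] (bounds inclusive)."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)


def mpiw(lower, upper):
    """Average width of the prediction intervals."""
    if not lower:
        raise ValueError("interval bounds must not be empty")
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Toy example: four positions along one axis with their interval bounds.
y = [1.0, 2.0, 3.0, 4.0]
lo = [0.5, 2.5, 2.0, 3.0]
hi = [1.5, 3.5, 4.0, 5.0]
coverage = picp(y, lo, hi)   # second point falls below its interval
width = mpiw(lo, hi)
```

In this toy case three of four points are covered, so PICP is 0.75, and the mean width is 1.5. The two metrics are read together: a model can trivially raise PICP by widening its intervals, so high coverage is only informative alongside a narrow MPIW.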

Deep-learning-based hand gestures recognition applications for game controls

10.11591/ijeecs.v40.i2.pp883-897
Huu-Huy Ngo , Hung Linh Le , Man Ba Tuyen , Vu Dinh Dung , Tran Xuan Thanh
Hand gesture recognition is among the emerging technologies of human-computer interaction, and an intuitive and natural interface is preferable for such applications to a total solution. It is also widely used in multimedia applications. In this paper, a deep learning-based hand gesture recognition system for controlling games is presented, showcasing its significant contributions toward advancing the frontier of natural and intuitive human-computer interaction. It utilizes MediaPipe to get real-time skeletal information of hand landmarks and translates the gestures of the user into smooth control signals through an optimized artificial neural network (ANN) that is tailored for reduced computational expenses and quicker inference. The proposed model, which was trained on a carefully selected dataset of four gesture classes under different lighting and viewing conditions, shows very good generalization performance and robustness. It gives a recognition rate of 99.92% with much fewer parameters than deeper models such as ResNet50 and VGG16. By achieving high accuracy, computational speed, and low latency, this work addresses some of the most important challenges in gesture recognition and opens the way for new applications in gaming, virtual reality, and other interactive fields.
Volume: 40
Issue: 2
Page: 883-897
Publish at: 2025-11-01