Information Technology and Control https://itc.ktu.lt/index.php/ITC <p>The periodical journal <em>Information Technology and Control / Informacinės technologijos ir valdymas</em> covers a wide range of problems in computer science and control systems. All articles should be prepared in accordance with the requirements of the journal. Please use the <a style="text-decoration: underline;" href="https://itc.ktu.lt/public/journals/13/Guidelines for Preparing a Paper for Information Technology and Control (5).doc.rtf">„Article Template“</a> to prepare your paper properly. Together with your article, please submit a signed <a href="https://itc.ktu.lt/public/journals/13/info/Authors_Guarantee_Form_ITC.DOCX">Author's Guarantee Form</a>.</p> en-US <p>Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.</p> robertas.damasevicius@ktu.lt (Prof. Robertas Damaševičius) itc@ktu.lt (Vilma Sukackė) Wed, 25 Jun 2025 00:00:00 +0300 OJS 3.2.1.1 http://blogs.law.harvard.edu/tech/rss 60 ADFN: Adaptive Dynamic Fusion Network for Real-time Multispectral Object Detection https://itc.ktu.lt/index.php/ITC/article/view/39803 <p>Multispectral object detection leverages the complementary strengths of infrared (IR) and visible (VIS) modalities to improve detection accuracy. However, existing approaches often lack adaptability to dynamic lighting conditions or fail to achieve real-time performance due to their complexity. We propose the Adaptive Dynamic Fusion Network (ADFN), a novel architecture that integrates adaptive multi-path computation and attention-guided feature fusion to address these challenges. ADFN incorporates Collaborative and Alternating Attention (CAA) modules for efficient feature alignment and an Adaptive Dynamic Pathway (ADP) strategy that dynamically adjusts computational pathways according to lighting conditions, optimizing the balance between accuracy and efficiency. Experiments on the FLIR2 and LLVIP datasets demonstrate that ADFN achieves superior mAP@50-95 and real-time performance, showcasing its robustness and efficiency across diverse environments. ADFN offers a practical solution for multispectral object detection under dynamic lighting conditions and resource constraints.</p> Lin Yang, Gangzhu Qiao Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39803 Mon, 14 Jul 2025 00:00:00 +0300 A Prediction Method for Highway Traffic Flow Based on the IHPO-VMD-LSTM-Informer Model https://itc.ktu.lt/index.php/ITC/article/view/39228 <p>Accurate and timely predictions of highway traffic flow are crucial for implementing intelligent highway management. This paper introduces a novel prediction approach for highway traffic flow employing the IHPO-VMD-LSTM-Informer model, with the aim of enhancing prediction accuracy. Initially, key indicators measuring highway traffic are identified, and Nonlinear Principal Component Analysis (NPCA) is applied to minimize the dimensionality and interdependence among these indicators. This reduction replaces the original complex indicators with a smaller number of principal components, thereby simplifying the structure of the feature matrix. Subsequently, Variational Mode Decomposition (VMD) processes historical highway traffic flow data, enhanced by a strategically improved Hunter-Prey Optimization (HPO) algorithm. This optimization enables adaptive parameter adjustment for the VMD, allowing effective decomposition of highway traffic flow time series data. The Sample Entropy (SE) of the Intrinsic Mode Functions (IMFs) from this decomposition is combined with the key indicators to form a comprehensive feature matrix.
Then, the predictive module combines a Long Short-Term Memory (LSTM) network with the Informer architecture to accurately predict highway traffic flow from the feature matrix. The effectiveness of the proposed model is verified using the public KDD CUP 2017 motorway traffic dataset. The results indicate that the proposed model outperforms existing models in prediction accuracy, achieving a MAPE of 8.09 and an RMSE of 2.84, thus significantly advancing intelligent highway management.</p> Ruinan Wang, Yan Cao, Xingyu Ji, Di Qiao Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39228 Mon, 14 Jul 2025 00:00:00 +0300 Improved Agricultural Machinery Navigation Algorithm Based on Machine Learning and Machine Vision Technology https://itc.ktu.lt/index.php/ITC/article/view/37912 <p>The automatic navigation of agricultural machinery is one of the important directions in intelligent agriculture research. Automatic planning of navigation routes is the key to realizing automated agricultural production. Considering the complexity of the agricultural production environment, an agricultural machinery navigation model is constructed based on binocular vision technology, and an optimized BP network is used to calibrate the binocular vision model. Given the difficulty traditional machine vision technology has in crop identification, RGB color-space techniques are used for image segmentation and noise processing, and an optimized S-RANSAC algorithm is used to extract image features. The experimental results showed that in the multi-algorithm rice field image feature matching test, the S-RANSAC algorithm accurately identified differences in seedling color, shape, and hydrological environment, whereas the other algorithms were unable to identify complex environmental features.
At the same time, in the complex agricultural environment positioning test, the maximum error of the S-RANSAC algorithm was 4.16 m, better than the 5.17 m of SURF, giving it the best positioning performance. The proposed technology thus performs well in practical scenarios, providing an important technical reference for the intelligent development of agriculture and the innovation of visual navigation technology.</p> Fengwu Zhu, Weijian Zhang, Qinglai Zhao, Xianzhang Meng, Chunkai Zhao, Weizhi Feng Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/37912 Mon, 14 Jul 2025 00:00:00 +0300 Multi-strategy Hybrid Improved Intelligent Algorithm for Solving UAV-MTSP https://itc.ktu.lt/index.php/ITC/article/view/40640 <p>Unmanned aerial vehicles (UAVs) have been increasingly used in fire monitoring and rescue operations, offering flexibility and efficiency. However, determining the shortest path for all UAVs to visit all regions, known as the Multiple Traveling Salesman Problem (MTSP), is a crucial issue for saving time and energy. This paper proposes a novel hybrid heuristic algorithm, MCPWOA, to solve the MTSP with a focus on UAV path planning applications. The algorithm integrates the Whale Optimization Algorithm (WOA), Crested Porcupine Optimizer (CPO), Chaotic Mapping Strategy (CMS), Arcsine Control Strategy (ACS) and Reverse Learning Strategy (RLS) to diversify the initial population and achieve rapid exploration. The algorithm's performance is evaluated on the CEC2022 benchmark function set for function minimization and on the TSPLIB dataset for UAV-MTSP solution finding. Results indicate that MCPWOA outperforms the existing WOA, CPO, and other advanced algorithms on most tests, showing higher convergence accuracy. Moreover, MCPWOA's effectiveness is demonstrated in actual UAV fire monitoring and rescue path planning, enhancing fire response efficiency through optimized UAV configuration and task allocation.</p> Zixin Wang, Danqing Wang, Jiguang Yu Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40640 Mon, 14 Jul 2025 00:00:00 +0300 Gas Hydrate Pipeline Is Optimized: Levy Flight, Cauchy Mechanism, and Perception Probability https://itc.ktu.lt/index.php/ITC/article/view/38663 <p>Pipelines used for the hydraulic lifting of gas hydrate particles from deep-sea gas hydrates consume a large quantity of energy, so the efficiency of resource exploitation is very low and it is challenging to maintain an efficient gas supply. Therefore, this article aims to optimize and analyze the rigid-pipe hydraulic lifting process, an essential part of a deep-sea gas hydrate extraction system. First, the objective function is constructed from the relationships between the extraction system's parameters, with specific energy consumption taken as the objective for deep-sea gas hydrate extraction. Then, the range of each parameter is determined according to the extraction system's actual situation. Next, an improved crow search algorithm with a hybrid strategy covering dynamic perception probability, Levy flight, and a Cauchy variation mechanism is employed to solve the optimization model. Finally, the improved crow search algorithm is applied to the experimental settings and compared with other optimization algorithms. The experimental results show that the proposed method, namely the improved crow search algorithm, has good computational efficiency, can effectively optimize the parameters of the deep-sea natural gas hydrate system, and is robust to numerical fluctuations of the parameters. Thus, the performance of the pipeline is improved and the energy consumption of the system is effectively reduced, providing a theoretical reference for the development of deep-sea gas hydrate. The proposed algorithm, I-CSA, can effectively handle larger sample data and maintain high computational efficiency with lower MAPE as the sample size increases, supporting the deep exploitation and utilization of deep-sea gas hydrate.</p> Dawei Qin, Lanlan Chen Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/38663 Mon, 14 Jul 2025 00:00:00 +0300 MEA-IFE: An Improved Multi-modal Fusion Framework Based on DCNN-BERT-BiLSTM and Its Application in Sentiment Analysis https://itc.ktu.lt/index.php/ITC/article/view/39960 <p>In the real world, emotional data often comes from multiple heterogeneous sources, making it difficult for unimodal approaches to capture emotional information fully. Existing sentiment analysis models struggle with accuracy when handling complex emotional expressions. Accordingly, this paper proposes a multi-modal sentiment analysis framework, MEA-IFE, characterized by effective feature extraction and high predictive accuracy. To mitigate potential information loss and expression limitations in BERT-BiLSTM during text feature extraction, MEA-IFE introduces a parallel structure of SK-Net and BiLSTM, enhancing the ability to extract multi-dimensional text features. Additionally, it integrates the ECA mechanism to improve the capture of essential information in text. For images, MEA-IFE incorporates a Vision Transformer, combining CNN and Transformer architectures to better capture both global and detailed image features. During the feature fusion phase, MEA-IFE employs a multi-head attention mechanism to dynamically integrate text and image features, exploring the interactive potential between different modalities. Experiments performed using the Kaggle text dataset and the FER2013 image dataset demonstrate an accuracy of up to 98.00%, validating its effectiveness.
When compared with models such as AM-MF, AMSAER, HAN-CA-SA, and TBGAV, MEA-IFE shows outstanding performance across accuracy, precision, recall, and F1 score, with respective improvements of 0.40%, 0.20%, 0.75%, and 0.52%. The model also excels in the AUC metric, further confirming its advantages. The proposed MEA-IFE model possesses high predictive accuracy and strong feature integration capabilities, meeting the precision demands of complex multi-modal sentiment tasks.</p> Hongfei Ye, Xiaochen Xiao Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39960 Mon, 14 Jul 2025 00:00:00 +0300 Method of Ship Target Oblique Frame Detection in Lightweight SAR Image Based on Recurrent Neural Network https://itc.ktu.lt/index.php/ITC/article/view/37944 <p>When ship targets appear in SAR images at different angles, their shapes and contours may change significantly. Current target box detection algorithms typically match and recognize targets using templates with fixed shapes and orientations; when the angle of a ship target changes, these templates may no longer apply, degrading detection performance and making it difficult to accurately identify and locate targets. Therefore, to solve this angle sensitivity problem, a method of ship target oblique frame detection in lightweight SAR images based on a recurrent neural network is studied. Using a recurrent neural network, a framework for oblique frame detection in lightweight SAR images is established that ensures detection accuracy, significantly reduces the demand for computing resources, and achieves more efficient detection. In this framework, SAR images enter at the input layer and are transmitted to the hidden layer; a lightweight convolutional neural network is used as the hidden layer, and a channel attention mechanism is introduced to improve the extraction of useful ship target features. The output layer processes the ship target features, predicts the ship target center point heat map, and calculates the oblique frame vertex coordinates from the heat map, providing better adaptability to ship targets that tilt or rotate in the SAR image, solving the angle sensitivity problem, and completing the oblique frame detection. The cubature Kalman filter algorithm is used to train the recurrent neural network and optimize the network weights, improving the detection accuracy of the ship target oblique frame. Experiments show that this method effectively extracts ship target features, accurately detects the oblique frame of ship targets under different backgrounds, and remains robust under different occlusion rates.</p> Liang Huang, Xufang Zhu, Bing Luo Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/37944 Mon, 14 Jul 2025 00:00:00 +0300 Bi-Encoder Polyp Net: A Novel Architecture for Enhanced Polyp Segmentation in Endoscopic Images https://itc.ktu.lt/index.php/ITC/article/view/41107 <p>Automatic polyp segmentation in endoscopic images holds critical clinical value for early colorectal cancer diagnosis. While existing segmentation models have achieved notable progress, two key challenges persist in algorithmic performance improvement. First, dynamic adjustments of colonoscope tip orientation during examinations induce viewpoint variations, which amplify polyp appearance diversity and hinder robust feature learning. Second, the inherent similarity between polyps and surrounding tissues leads to blurred boundaries.
Although convolutional neural networks (CNNs) have demonstrated significant advancements, their limitations in modeling global dependencies and reliance on aggressive downsampling operations often cause redundant network structures and local detail loss. To address these bottlenecks, we propose Bi-Encoder Polyp Net – a novel parallel architecture integrating Pyramid Vision Transformer and ResNet. This dual-branch design effectively captures global contextual dependencies while preserving low-level spatial details. A feature alignment module bridges the semantic gap between dual-branch feature maps, and an iterative semantic embedding unit further injects high-level semantic information into aligned low-level features. Extensive experiments across five public polyp segmentation benchmarks validate the network’s effectiveness, demonstrating superior capability in processing real-world colonoscopy images.</p> Qiqiang Duan, Cong Gu Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/41107 Mon, 14 Jul 2025 00:00:00 +0300 Yolov5-based Intelligent Detection Method for Retail Goods https://itc.ktu.lt/index.php/ITC/article/view/40728 <p><span class="TextRun SCXW110788693 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW110788693 BCX0" data-ccp-parastyle="Body Text">In the current context, intelligent unmanned retail checkout systems offer the prospect of efficient and innovative development. This study proposes an enhanced lightweight YOLOv5 merchandise detection and recognition method. 
The method introduces </span><span class="NormalTextRun SpellingErrorV2Themed SCXW110788693 BCX0" data-ccp-parastyle="Body Text">SELayer</span><span class="NormalTextRun SCXW110788693 BCX0" data-ccp-parastyle="Body Text"> and a multi-headed self-attentive module of Transformer in YOLOv5 to enable the network to focus more on essential factors such as commodities when performing retail merchandise </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW110788693 BCX0" data-ccp-parastyle="Body Text">detection, and</span><span class="NormalTextRun SCXW110788693 BCX0" data-ccp-parastyle="Body Text"> improve the recognition performance of the model. Also, the Ghost module is introduced to reduce network parameters and computation, increase computation speed and reduce latency. We </span><span class="NormalTextRun SCXW110788693 BCX0" data-ccp-parastyle="Body Text">validated</span><span class="NormalTextRun SCXW110788693 BCX0" data-ccp-parastyle="Body Text"> the performance of the approach on a public dataset. Compared with the existing YOLOv5 model, the model achieves a 0.9% improvement in detection accuracy and a 27.7% reduction in GFLOPs. 
With this study, we optimise the problem of small batch identification of retail goods, providing a basis for automated processing of intelligent retail supply and marketing systems with practical implications.</span></span><span class="EOP SCXW110788693 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Zixin Jiang Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40728 Mon, 14 Jul 2025 00:00:00 +0300 Neural Networks and Ensemble Model to Automatic Music Coordination: A Performance Comparison https://itc.ktu.lt/index.php/ITC/article/view/36737 <p><span class="TextRun SCXW133869063 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text">In order to</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> solve the problems of low classification accuracy, </span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text">poor quality</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> of generated music, and insufficient consideration of the order and duration of notes in music coordination, this paper adopts a long short-term memory network (LSTM) and ensemble model based on the combination of timing and self-attention mechanism. 
The experimental model uses the LSTM network to automatically learn the </span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text">important features</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> of </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW133869063 BCX0" data-ccp-parastyle="Body Text">notes, and</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> introduces the timing and self-attention mechanism to enhance the model's ability to pay attention to the note sequence and features, and better capture the long-distance dependencies and emotional changes in music. Compared with the traditional model, the model used in this paper is more detailed in considering the order and duration of </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW133869063 BCX0" data-ccp-parastyle="Body Text">notes, and</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> combines emotional labels with audio data to improve the quality of music generation. The experiment is verified by the three music datasets of Lim, </span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text">Rhyu</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> and Lee. The ensemble model combined with LSTM and self-attention mechanism in this paper performs well in comprehensive evaluation scores and chord classification accuracy, which is significantly improved compared with the traditional LSTM model. The novelty lies in the better integration of the timing relationship and emotional information of the note sequence, which improves the performance of music coordination. The model in this paper achieved 43 points (out of 50 points) and 95.6% in comprehensive evaluation score and chord classification accuracy, respectively. 
The chord classification accuracy was significantly improved by 3.3% compared with LSTM. It also has unique advantages in model structure design and feature integration, especially in the introduction of timing and self-attention mechanisms, and the combination of emotional labels. It has achieved better results and brought </span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text">new ideas</span><span class="NormalTextRun SCXW133869063 BCX0" data-ccp-parastyle="Body Text"> and methods to the field of music generation.</span></span><span class="EOP SCXW133869063 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Lu Wang Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36737 Mon, 14 Jul 2025 00:00:00 +0300 A Two-stage Cattle Face Recognition Method Based on Target Detection and Recognition Network https://itc.ktu.lt/index.php/ITC/article/view/35918 <p><span class="TextRun SCXW181036564 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract" data-ccp-parastyle-defn="{&quot;ObjectId&quot;:&quot;fe0017e7-5150-5d7c-ac64-5f509ec11579|1&quot;,&quot;ClassId&quot;:1073872969,&quot;Properties&quot;:[469775450,&quot;Abstract&quot;,201340122,&quot;2&quot;,134234082,&quot;true&quot;,134233614,&quot;true&quot;,469778129,&quot;Abstract&quot;,335572020,&quot;1&quot;,268442635,&quot;18&quot;,335559705,&quot;1033&quot;,335551547,&quot;1033&quot;,335559739,&quot;200&quot;,335559738,&quot;600&quot;,335551550,&quot;6&quot;,335551620,&quot;6&quot;,469777841,&quot;Times New Roman&quot;,469777842,&quot;Times New Roman&quot;,469777843,&quot;SimSun&quot;,469777844,&quot;Times New Roman&quot;,469769226,&quot;Times New 
Roman,SimSun&quot;]}">T</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">raditional methods </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">of cattle management</span> <span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">have problems such as high error rates, easy failure of tags, and the need to consume a lot of time and </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">manpower</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> costs. </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">However</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">,</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> a</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">s one of the biological characteristics, the recognition of cattle face is one of the important technical means to achieve inte</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">lligent farming, </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">accurate</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> feeding, and health management of cattle. </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">Thus,</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> the article proposed improved algorithms based on YOLOv7 and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">VoVNet</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> for cattle face detection and recognition using a contactless approach. 
For the improved YOLOv7 cattle face detection</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> model, the </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">efficient layer aggregation networks (</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">ELAN</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">)</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> structures in the backbone and neck networks were replaced with the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">ConvNeXt</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> network and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">CoTNet</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> Transformer module, respectively, aiming to improve the detection speed and robustness while reducing co</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">mputation. The </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">SimAM</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> (A Simple, Parameter-Free Attention Module)</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> attention mechanism, considering both spatial and channel dimensions, was introduced in the neck network to enhance feature representation without adding extra parameters to the original netw</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">ork. 
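For intuition, the parameter-free weighting of SimAM can be sketched in a few lines. This is a NumPy illustration of the published SimAM energy formulation, not the authors' code; the `e_lambda` stabiliser value and the single-image `(C, H, W)` shape are assumptions.

```python
import numpy as np

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map (sketch)."""
    _, h, w = x.shape
    n = h * w - 1                                  # neurons per channel minus the target
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2                              # squared deviation from the channel mean
    v = d.sum(axis=(1, 2), keepdims=True) / n      # channel variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5         # inverse energy per neuron
    return x * (1.0 / (1.0 + np.exp(-e_inv)))      # sigmoid-gated features, same shape
```

Because the gate is derived from the feature statistics themselves, the module adds no learnable parameters, which is exactly why it can be inserted into the neck network for free.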
Experimental results on the constructed facial detection dataset of Holstein and Simmental beef cattle showed that the improved CCS-YOLOv7 cattle face detection model achieved a precision of 99.43% and a recall rate of 99.10%, with significantly impro</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">ved detection speed and reduced model size. As for the improved </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">VoVNet</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> cattle face recognition model, </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">residual connections</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> (RC)</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> were added from the input to the output of the </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">One-Shot Aggregation (</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">OSA</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">)</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> modules of </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">VoVNet</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> to enhance the representation of deep features. 
The </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">Efficient Channel Attention (</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">ECA</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">)</span> <span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">was added to the final feature extraction layer of the OSA modules to improve the feature extraction capability for cattle face image classification. E</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">xperimental results on the facial recognition dataset of Holstein dairy cows and Simmental beef cattle, built upon the improved CCS-YOLOv7 cattle face detection model, </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">demonstrated</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> that the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW181036564 BCX0" data-ccp-parastyle="Abstract">VoVNet</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">-ECA-RC model achieved a precision of 99.37% for cattle face</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> recognition with a final model size of 41.4MB. 
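As background, the channel-gating idea behind ECA can be sketched as follows. This is an illustrative NumPy sketch only: the learned 1-D convolution is replaced by a fixed averaging kernel, so the weights are assumptions rather than the trained module.

```python
import numpy as np

def eca(x, k=3):
    """Efficient Channel Attention sketch for a (C, H, W) feature map."""
    c = x.shape[0]
    gap = x.mean(axis=(1, 2))                      # squeeze: global average pool -> (C,)
    padded = np.pad(gap, k // 2, mode="edge")
    kernel = np.ones(k) / k                        # stand-in for the learned conv1d weights
    conv = np.array([padded[i:i + k] @ kernel for i in range(c)])
    gate = 1.0 / (1.0 + np.exp(-conv))             # sigmoid channel gates in (0, 1)
    return x * gate[:, None, None]                 # reweight channels, shape preserved
```

The 1-D convolution over the pooled channel descriptor is what keeps ECA cheap: it captures local cross-channel interaction with only `k` weights instead of a full channel-mixing layer.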
Therefore, the proposed architectures can serve as a reference for non-contact individual recognition </span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract">in intelligent</span><span class="NormalTextRun SCXW181036564 BCX0" data-ccp-parastyle="Abstract"> farming.</span></span><span class="EOP SCXW181036564 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:454,&quot;335559737&quot;:130,&quot;335559738&quot;:131,&quot;335559739&quot;:0,&quot;335559740&quot;:219}"> </span></p> Piaoyi Zheng, Minghui Deng, Junjie Gong, Guiping Li, Yanling Yin Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35918 Mon, 14 Jul 2025 00:00:00 +0300 Integration of Explainable AI with Deep Learning for Breast Cancer Prediction and Interpretability https://itc.ktu.lt/index.php/ITC/article/view/39443 <p><span class="TextRun SCXW256411650 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW256411650 BCX0">The present paper proposes an integrated breast cancer diagnosis approach that combines ML, DL, and Explainable AI methods using the Breast Cancer Wisconsin (Diagnostic) Data Set. We compare standard machine learning approaches, namely Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR), with more intricate techniques based on deep learning. Although ML models help understand the problem, a DL model may be more </span><span class="NormalTextRun SCXW256411650 BCX0">appropriate when</span><span class="NormalTextRun SCXW256411650 BCX0"> the data’s dimensionality and complexity are high. Addressing these limitations, we present a new Hybrid Explainable Attention Mechanism (HEAM) for DL models that utili</span><span class="NormalTextRun SCXW256411650 BCX0">ses attention to improve performance. 
This method is applied in CNNs</span><span class="NormalTextRun SCXW256411650 BCX0"> with saliency maps and Grad-CAM methods to show clinical users which parts of the input the model bases its predictions on, such as characteristics of cell nuclei in images. Using the Breast Cancer Wisconsin dataset, the novel deep learning model with HEAM enhancement is tested against traditional ML models for breast cancer classification. The findings of this investigation provide evidence that HEAM not only boosts the prediction accuracy to 99.5% but also enhances the model by providing sound visual attention that explains each prediction, thereby improving the clinical relevance of the model. </span></span><span class="EOP SCXW256411650 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:720}"> </span></p> A. Rhagini, S. Thilagamani Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39443 Mon, 14 Jul 2025 00:00:00 +0300 Single-Pulse Detection Method of Radar Weak Target Based on a Two-Stage Deep Neural Network https://itc.ktu.lt/index.php/ITC/article/view/40167 <p><span class="TextRun SCXW40850234 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">With the increasing prevalence of drones in low</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">altitude airspace, the radar detection of weak targets with a low signal</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">to</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun 
SCXW40850234 BCX0" data-ccp-parastyle="Body Text">noise ratio (SNR) still poses a crucial challenge. Traditional constant false alarm rate (CFAR) methods </span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">encounter</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text"> issues of high false alarms and low accuracy when the SNR is below </span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">15 dB.</span> <span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">This paper puts forward a two</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">stage deep neural network to improve weak target detection by emulating human visual perception. In the first stage (coarse detection), potential targets are rapidly localized through grid</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">based regression. 
In the second stage (fine detection), depth</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">wise separable convolution (DSC) and residual connections are </span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">utilized</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text"> for </span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">accurate</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text"> classification.</span> <span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">Experimental results show that, at an SNR of -20 dB, the detection rate of the proposed method is 20% higher than that of CFAR methods, and the inference speed is 3.66 times faster than that of single</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">stage networks. Ablation studies confirm the efficiency improvements brought by the coarse detection network. 
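The efficiency gain from depthwise separable convolution can be illustrated with a simple parameter count. This is a generic sketch of the DSC idea, not the paper's network; the example channel and kernel sizes are assumptions.

```python
def conv_params(c_in, c_out, k):
    """Weight counts (bias ignored) for a standard vs. a depthwise separable conv layer."""
    standard = c_in * c_out * k * k                    # one k x k filter per (in, out) pair
    depthwise_separable = c_in * k * k + c_in * c_out  # per-channel k x k + 1x1 pointwise
    return standard, depthwise_separable

std, dsc = conv_params(64, 128, 3)  # e.g. a 3x3 layer mapping 64 -> 128 channels
```

For this hypothetical layer the factorisation cuts the weights from 73,728 to 8,768, roughly an 8x reduction, which is why DSC is a common choice when inference speed matters.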
This approach offers a robust solution for real</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">-</span><span class="NormalTextRun SCXW40850234 BCX0" data-ccp-parastyle="Body Text">time drone surveillance in complex and cluttered environments.</span></span><span class="EOP SCXW40850234 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559739&quot;:0,&quot;335559740&quot;:218}"> </span></p> Mingjie Qiu, Jianming Wang, Guangxin Wu Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40167 Mon, 14 Jul 2025 00:00:00 +0300 SAEDF: A Synthetic Anomaly-Enhanced Detection Framework for Detection of Unknown Network Attacks https://itc.ktu.lt/index.php/ITC/article/view/40247 <p><span class="TextRun SCXW236665459 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text">Detecting unknown cyber-attacks (i.e., zero-day) is difficult because network environments change </span><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text">frequently</span><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text"> and there are few </span><span class="NormalTextRun SpellingErrorV2Themed SCXW236665459 BCX0" data-ccp-parastyle="Body Text">labeled</span><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text"> examples of anomalies. Traditional methods for detecting anomalies often struggle to handle unknown attack types and work effectively with complex, high-dimensional data. 
To overcome these problems, we propose </span><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text">a new approach</span><span class="NormalTextRun SCXW236665459 BCX0" data-ccp-parastyle="Body Text"> called the synthetic attack-enhanced detection framework (SAEDF). SAEDF combines synthetic anomaly generation, flexible feature extraction, and unsupervised anomaly detection. The framework employs a model known as the adaptive and dynamic generative variational autoencoder (ADGVAE). This model generates realistic synthetic attacks and adapts its structure to work effectively with datasets of varying complexity. This helps the model work well with a wide range of attack patterns while still being efficient. Tests on benchmark datasets show that SAEDF performs better than other methods. It achieves higher scores for F1, Recall, and has a much lower rate of false positives. These results show that SAEDF is effective in finding unknown attacks, improving detection accuracy, and handling complex and changing network traffic.</span></span><span class="EOP SCXW236665459 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:862,&quot;335559737&quot;:130,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Kai Liang, Chuanfeng Li, Qiong Duan Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40247 Mon, 14 Jul 2025 00:00:00 +0300 Learn from Adversarial Examples: Learning-Based Attack on Time Series Forecasting https://itc.ktu.lt/index.php/ITC/article/view/37758 <p><span class="TextRun SCXW125103179 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW125103179 BCX0">Adversarial</span> <span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW125103179 BCX0">attack</span> <span class="NormalTextRun SCXW125103179 BCX0">in</span> <span class="NormalTextRun 
SCXW125103179 BCX0">time series forecasting (TSF) has attracted growing interest in recent years. While some black-box attack methods have been proposed for TSF, they require continuous queries to the target model, and the computational cost increases as model and data complexity grow. In fact, the perturbations generated by these methods exhibit characteristic patterns, especially under an L0-norm constraint, and such patterns can be captured and learned by a model. In this study, we propose Learning-Based Attack (LBA), a novel black-box adversarial attack method for TSF tasks that focuses on the adversarial examples themselves, i.e., the perturbed data. By utilizing a model to learn adversarial examples and generate similar ones, we achieve performance comparable to the original attack methods while significantly reducing the number of queries to the target model, ensuring high efficiency and stealthiness. We evaluate our method on several public datasets. In this paper, we learn the adversarial samples produced by the n-Values Time Series Attack (nVITA), a sparse black-box attack for TSF. The results show that we can effectively learn the attack information and generate similar adversarial samples with lower computational overhead, thus achieving both stealthiness and efficiency. Furthermore, we verify the transferability of our method and find that it can also be applied to attack other models. Our code is available on GitHub. </span></span></p> Youbang Xiao, Zhongguo Yang, Qi Zou, Peng Zhang Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/37758 Mon, 14 Jul 2025 00:00:00 +0300 An Early Warning Model for Industrial Network Security Issues: A Crafted Strategy for High Accuracy Based on Machine Learning Approach https://itc.ktu.lt/index.php/ITC/article/view/39543 <p>Industrial networks have become important infrastructure. As they develop, their cybersecurity problems become more and more prominent: attacks on networks are advancing faster than ever, and their destructive force keeps growing. The available early warning technology for industrial network security issues therefore requires greater accuracy and timeliness, since serious delays occur in real cases. The article proposes a high-accuracy strategy based on a machine-learning algorithm. 
Two significant parts of the problem are the nonlinear, high-dimensional data with diverse feature characteristics in cyber-attacks and the low training efficiency of conventional early warning models in predicting attacks. Thus, the manuscript suggests a feature selection method based on the Tuna Swarm Optimization (TSO) algorithm to filter out redundant features and reduce the data’s dimensionality. Then, the Extreme Learning Machine (ELM) and Auto-Encoder (AE) are combined to construct the Extreme Learning Machine-Auto Encoder (ELM-AE) model, which serves as the basis of the early warning model for industrial network security. Afterward, the improved Whale Optimization Algorithm (I-WOA) is used to optimize the parameters of the ELM, yielding the optimized model. Finally, the optimized model is applied as an early warning method to detect attacks on industrial cybersecurity systems. The proposed model is then tested by constructing an evaluation index system that measures how effectively the early warning system functions. The experimental results show that the proposed warning model for industrial network security issues achieves high warning accuracy and efficiency concurrently, providing an advanced early warning model for network attacks. 
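For readers unfamiliar with ELMs, the core idea, a fixed random hidden layer with a closed-form least-squares readout, can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions (hidden size, tanh activation, random seed); the AE coupling and I-WOA parameter tuning described in the abstract are omitted.

```python
import numpy as np

def elm_fit(X, Y, hidden=64, seed=0):
    """Train a basic Extreme Learning Machine: random hidden layer, closed-form readout."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))  # random, untrained input weights
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)                     # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y               # Moore-Penrose least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Because only `beta` is fitted, and in a single linear solve, training is far faster than backpropagation, which is the efficiency property the early warning model relies on.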
With 92.64% precision and a 51.84 s average execution time, the proposed model outperforms the other methods.</p> Xiang Le, Yong Zhao Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39543 Mon, 14 Jul 2025 00:00:00 +0300 Hybrid Attention Approach for Source Code Comment Generation https://itc.ktu.lt/index.php/ITC/article/view/36699 <p><span class="TextRun SCXW190754959 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">Currently, developers are often </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">obligated</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> to enhance code quality. High-quality code is often accompanied by comprehensive summaries, including code documentation and function explanations, which are invaluable for maintenance and further development. Regrettably, few software projects provide sufficient code comments owing to the </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">high costs</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> associated with human </span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">labeling</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">. Contemporary researchers in software engineering concentrate on methods for automated comment generation. Initial algorithms depended on handwritten templates or information retrieval methods. With the advancement of machine learning, researchers instead construct automated models based on machine translation. 
Nonetheless, the produced code comments </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">remain</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> inadequate owing to the significant disparity between code structure and natural language. This study introduces a unique deep learning model, At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">, which </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">utilizes</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> hybrid attention for the automated creation of source code comments. Utilizing two separate LSTM encoders, our approach integrates essential tokens from source code functions with the code structure, represented by a corresponding Abstract Syntax Tree. In contrast to earlier data-driven models, our </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">methodology</span> <span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">utilizes</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> code syntax and semantics in the generation of comments. The hybrid attention method, used for comment creation for the first time to our knowledge, enhances the quality of code comments. The tests </span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">demonstrate</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> that At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> is efficacious and surpasses other prevalent methodologies. 
</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">Machine comments from Seq2Seq and CODE-NN disregard the code structure that underlies </span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">DeepCom</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> and At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text">. At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> has 59.3%, 36.4%, 43.3%, and 13.1% higher comment BLEU values than baseline models for a 5-line function. Even though model performance declines as comment length grows, At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen's</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> comments often outperform others. 5–10-word machine comments work best. For reference length 10, At-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW190754959 BCX0" data-ccp-parastyle="Body Text">ComGen</span><span class="NormalTextRun SCXW190754959 BCX0" data-ccp-parastyle="Body Text"> has 38.2%, 23.7%, 9.3%, and 4.4% greater BLEU values than the other baseline models. 
</span></span><span class="EOP SCXW190754959 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Yao Meng Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36699 Mon, 14 Jul 2025 00:00:00 +0300 Embedding Numerical Features and Meta-Features in Tabular Deep Learning https://itc.ktu.lt/index.php/ITC/article/view/39134 <p><span class="TextRun SCXW135651063 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">Tabular data is ubiquitous in real-world applications, and an increasing number of deep learning approaches have been developed for tabular </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">data prediction. Among these approaches, embedding techniques serve as both a common and essential </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">component</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">. However, the design of tabular embedding paradigms </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">remains</span> <span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">relatively limited</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">, and there is a lack of systematic evaluation </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">regarding</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text"> the performa</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">nce of many existing methods in specific scenarios. 
In this paper, we focus on embedding numerical features and meta-features. To enrich the embedding methods for numerical features, we propose an ordering-oriented regularization technique applicable to pi</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">ecewise linear embeddings, along with an unsupervised feature grouping method to </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">facilitate</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text"> partial embedding sharing. We </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">demonstrate</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text"> that these methods contribute to building more efficient and lightweight embedding modules. Importantly, we highlight orde</span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">ring and sharing as two promising directions in the design of embeddings for numerical features. 
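A piecewise linear embedding of a scalar feature can be sketched as a bin-fraction encoding. This is a generic illustration of the scheme, not the authors' implementation; the handling of values outside the edges is an assumption.

```python
import numpy as np

def piecewise_linear_encode(x, edges):
    """Encode scalar x against sorted bin edges: 1.0 for bins entirely below x,
    a fractional fill for the bin containing x, and 0.0 for bins above it."""
    out = np.zeros(len(edges) - 1)
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        if x >= hi:
            out[i] = 1.0                      # bin entirely below x
        elif x > lo:
            out[i] = (x - lo) / (hi - lo)     # partial fill in x's own bin
    return out
```

Unlike a raw scalar, this vector is monotone component-wise in `x`, which is the ordering property that regularization techniques for such embeddings aim to preserve.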
Additionally, we address several evaluation gaps: we assess the robustness of existing embeddings for numerical features and evaluate a set of general designs </span><span class="NormalTextRun SCXW135651063 BCX0" data-ccp-parastyle="Body Text">separately for data type embeddings and positional embeddings, providing insights into their practical applications and further developments.</span></span><span class="EOP SCXW135651063 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:2,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:862,&quot;335559737&quot;:130,&quot;335559738&quot;:131,&quot;335559740&quot;:18}"> </span></p> Xingyu Ma, Bin Yao Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/39134 Mon, 14 Jul 2025 00:00:00 +0300 ORPTQ: An Improved Large Model Quantization Method Based on Optimal Quantization Range https://itc.ktu.lt/index.php/ITC/article/view/40573 <p><span class="TextRun SCXW244387556 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="none"><span class="NormalTextRun SCXW244387556 BCX0" data-ccp-parastyle="Body Text">Quantization reduces model storage by </span><span class="NormalTextRun SCXW244387556 BCX0" data-ccp-parastyle="Body Text">representing</span> <span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW244387556 BCX0" data-ccp-parastyle="Body Text">the model</span><span class="NormalTextRun SCXW244387556 BCX0" data-ccp-parastyle="Body Text"> in low bits. It can help to improve the application capability of transformer-based large models </span></span><span class="TextRun SCXW244387556 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW244387556 BCX0" data-ccp-parastyle="Body Text">and make it possible to deploy them on resource-limited systems such as PCs</span><span class="NormalTextRun SCXW244387556 BCX0" data-ccp-parastyle="Body Text"> and mobile devices. 
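As background, the basic uniform weight quantization that such methods refine can be sketched as a round-to-nearest baseline. This is illustrative only, not ORPTQ itself; the bit width and symmetric max-based scaling are assumptions.

```python
import numpy as np

def quantize(w, bits=4):
    """Uniform symmetric weight quantization to signed integers (sketch)."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit signed values
    scale = np.abs(w).max() / qmax             # map the weight range onto [-qmax, qmax]
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float64) * scale
```

The round-trip error of this baseline is bounded by half the step size `scale / 2`; choosing a better quantization range (rather than the raw maximum) is precisely the kind of refinement range-optimizing methods pursue.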
</span></span>The current best weight-only quantization method uses second-order information to fine-tune the weights step by step during the quantization process, compensating for the quantization errors that have already occurred. By adjusting the remaining elements through algebraic transformations at each step, the method minimizes the functional loss of the weights caused by quantization. However, its performance deteriorates rapidly when the weight adjustment deviates too far from the starting point, especially in low-bit quantization (e.g., 4 bits or fewer). To satisfy the mathematical prerequisite of this method during quantization, this paper introduces two parameters, α and β, to adjust the quantization range based on the second-order method, and presents three approaches for finding their optimal values. The experimental results show that the proposed method significantly outperforms the original second-order method in low-bit quantization. The code for this paper is available at github.com/t-scen/ORPTQ. <span class="EOP SCXW244387556 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Shicen Tian, Kejie Huang Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40573 Mon, 14 Jul 2025 00:00:00 +0300 WNASNet: Wavelet-Guided Neural Architecture Search for Efficient Single-Image De-raining 
https://itc.ktu.lt/index.php/ITC/article/view/40643 <p>On rainy days, the uncertain shape and distribution of rain streaks can cause the images captured by RGB image-based measurement equipment to be blurred and distorted. The wavelet transform is extensively utilized in conventional image-enhancement techniques because of its capacity to deliver both spatial- and frequency-domain information and its multidirectional and multiscale characteristics. In image de-raining, the distribution of rain streaks is intricately linked to both spatial-domain characteristics and frequency-domain attributes. Nonetheless, deep learning-based rain removal models predominantly depend on the spatial characteristics of the image, and RGB data is sometimes insufficient to differentiate rain streaks from image details, resulting in the loss of essential image information during the rain removal process. To overcome this limitation, we have created a lightweight single-image rain removal model named the wavelet-enhanced neural architecture search network (WNASNet). This technique isolates image features from rain-affected images and can eliminate rain artifacts more efficiently. The proposed WNASNet makes three notable contributions. First, it utilizes the wavelet transform to extract multi-frequency feature components and allocates a distinct feature search block (FSB) to each component, facilitating the identification of task-specific feature extraction networks to enhance de-raining efficacy. Second, we present a straightforward yet efficient wavelet feature fusion technique (SFF) that selectively employs high- and low-frequency features during the inverse wavelet transform. This method maintains de-raining efficacy while substantially decreasing computational complexity relative to conventional frequency-blending techniques. Comprehensive studies on four synthetic and two real-world datasets illustrate the superior performance of WNASNet across many evaluation measures, including PSNR, SSIM, LPIPS, NIQE, and BRISQUE, thereby verifying its efficacy and robustness for single-image de-raining tasks. </p> Wenyin Tao, Qiang Chen, Chunjiang Yu Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/40643 Mon, 14 Jul 
2025 00:00:00 +0300 Enhancing Open-Set Few-Shot Object Detection with Limited Visual Prompts https://itc.ktu.lt/index.php/ITC/article/view/41078 <p>The text-prompt-based open-vocabulary object detection model effectively encapsulates the abstract concepts of common objects, thereby overcoming the limitation of pre-trained models that are restricted to detecting a fixed, predefined set of categories. However, due to data scarcity and the constraints of textual descriptions, representing rare or complex objects through text alone remains challenging. In this study, we propose an open-set detection model that supports both visual and textual prompt queries (VTP-OD) to enhance few-shot object detection. A small number of visual prompts not only provide rich class-wise visual features, which enhance class textual representations, but also enable flexible extension to new classes for different downstream tasks. Specifically, we incorporate two cross-attention-based adaptation modules to adapt the pre-trained vision-language model so that it supports both text and visual queries. These modules facilitate (i) visual fusion between a limited number of visual prompts and query images and (ii) visual-language fusion between class-aware visual features and the textual representations of the classes. Subsequently, the model undergoes prompt tuning on the available few-shot downstream data to adapt to the target detection tasks. Experimental results demonstrate that our model outperforms the pre-trained model on the LVIS and COCO benchmarks. Furthermore, we validate its effectiveness on the real-world CoalMine dataset. </p> Qinghua Yang, Yan Tian, Jing Sun, Fangyuan He Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/41078 Mon, 14 Jul 2025 00:00:00 +0300 Towards Real-World Power Grid Scenarios: Video Action Detection with Cross-scale Selective Context Aggregation https://itc.ktu.lt/index.php/ITC/article/view/41005 <p><span class="TextRun SCXW73769680 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW73769680 BCX0" data-ccp-parastyle="Body Text">In this study, we propose a single-stage model for video action detection and POWER, a real-world action detection dataset collected from real power operation scenarios. While previous studies have made significant progress in overall classification and localization performance, they often struggle with actions of short duration, hindering the application of these approaches. 
To address this, we introduce the Cross-scale Selective Context Aggregation Network (CSCAN), which focuses on improving the detection of short actions. The network integrates three key components: 1) a cross-scale feature conduction structure combined with a tailored alignment mechanism; 2) a selective context aggregation module based on a gating mechanism; and 3) an effective scale-invariant consistency training strategy that enables the model to learn scale-invariant action representations. We evaluated our method on the self-collected POWER dataset and on the most widely used action detection benchmarks, THUMOS14 and ActivityNet v1.3. The extensive results show that our model outperforms other approaches, especially in detecting real-world short actions, demonstrating the effectiveness of our approach.</span></span><span class="EOP SCXW73769680 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:861,&quot;335559737&quot;:132,&quot;335559738&quot;:131,&quot;335559740&quot;:218}"> </span></p> Lingwen Meng, Siwu Yu, Shasha Luo, Anjun Li Copyright (c) 2025 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/41005 Mon, 14 Jul 2025 00:00:00 +0300
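<p>The ORPTQ abstract in this issue turns on a simple idea: in low-bit weight quantization, the clipping range chosen before rounding largely determines the error. The following is a minimal sketch of uniform asymmetric quantization with two range-scaling parameters; the function names are illustrative, and the exact role the paper assigns to α and β (here, simple scaling of the min/max clipping bounds) is an assumption, not the paper's actual parameterization.</p>

```python
import numpy as np

def quantize_weights(w, bits=4, alpha=1.0, beta=1.0):
    """Uniform asymmetric quantization of a weight tensor.

    alpha and beta scale the lower and upper ends of the clipping
    range before the step size is computed; alpha = beta = 1 recovers
    plain min/max quantization.  (Illustrative sketch only -- not the
    ORPTQ paper's parameterization.)
    """
    lo, hi = alpha * w.min(), beta * w.max()
    qmax = 2 ** bits - 1
    scale = (hi - lo) / qmax
    # Round to the nearest level, clipping outliers that fall outside
    # the shrunken range.
    q = np.clip(np.round((w - lo) / scale), 0, qmax)
    return q.astype(np.int32), scale, lo

def dequantize(q, scale, lo):
    return q * scale + lo

# Shrinking the range (alpha, beta < 1) clips a few outlier weights but
# reduces the rounding error for the bulk of the distribution -- the
# trade-off that motivates searching for an optimal range.
w = np.random.randn(256)
q, scale, lo = quantize_weights(w, bits=4, alpha=0.8, beta=0.8)
w_hat = dequantize(q, scale, lo)
```

<p>In this sketch the three "approaches to seek optimal values" mentioned in the abstract would correspond to different search strategies over (α, β), e.g. grid search minimizing the reconstruction error between w and w_hat.</p>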