Information Technology and Control https://itc.ktu.lt/index.php/ITC <p>Periodical journal <em>Information Technology and Control / Informacinės technologijos ir valdymas</em> covers a wide field of computer science and control systems related problems. All articles should be prepared considering the requirements of the journal. Please use <a style="font-size: normal; text-decoration: underline;" href="https://itc.ktu.lt/public/journals/13/Guidelines for Preparing a Paper for Information Technology and Control (5).doc.rtf">„Article Template“</a> to prepare your paper properly. Together with your article, please submit a signed <a href="https://itc.ktu.lt/public/journals/13/info/Authors_Guarantee_Form_ITC.DOCX">Author's Guarantee Form</a>.</p> en-US <p>Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.</p> vacius.jusas@ktu.lt (Prof. Vacius Jusas) itc@ktu.lt (Vilma Sukackė) Tue, 25 Jun 2024 00:00:00 +0300 OJS 3.2.1.1 http://blogs.law.harvard.edu/tech/rss 60 Optimization of Sewing Equipment Based on Improved Genetic-ant Colony Hybrid Algorithm https://itc.ktu.lt/index.php/ITC/article/view/35943 <p>The optimization of the cutting path of the sample can effectively reduce the cutting time, thereby improving the production efficiency of numerical control processing. This paper comprehensively considers the impact of the cutting order and the position of the knife entry point on the cutting path, converts the cutting path problem into a type of traveling salesman problem (TSP), and proposes an improved genetic-particle swarm optimization algorithm. 
The selection mechanism of the algorithm combines an elitist retention strategy with roulette wheel selection to accelerate the search for the optimal solution; the mutation strategy uses a linearly decreasing mutation rate, which enhances the global search ability; and the ant colony optimization algorithm is introduced to process the fitness function, adjust the population evolution difference, and speed up the optimization process. Through this hybrid algorithm, the cutting order of the sample can be quickly optimized, and the nearest neighbor algorithm is used to determine the position of the knife entry point. Tests are conducted on clothing patterning charts and standard examples. Comparisons with several commonly used algorithms verify the feasibility and effectiveness of this algorithm.</p> Ning Rao, Wenbing Jin, Yuemei Yang, Yihui Liao, Liangjing OuYang Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35943 Wed, 26 Jun 2024 00:00:00 +0300 Unsupervised Anomaly Detection of Industrial Images Based on Dual Generator Reconstruction Networks https://itc.ktu.lt/index.php/ITC/article/view/36018 <p>At present, deep learning techniques are increasingly utilized in computer vision and anomaly detection. To address the limitations of inadequate reconstruction capability and subpar performance in reconstruction-based anomaly detection, this study enhances the existing algorithm and introduces DGRNet, an unsupervised anomaly detection algorithm for industrial images based on dual generator reconstruction networks. The network consists of two generators and a discriminator: a widely recognized denoising diffusion probabilistic model (DDPM) serves as one generator, an autoencoder (AE) as the other, and a decoder as the discriminator. 
The model is tested on the MVTec AD dataset and, without any additional training data, the anomaly detection AUC of DGRNet exceeds that of the reconstruction-based baseline method by 19.6 percentage points. The experimental results show that DGRNet improves the detection performance of unsupervised, reconstruction-based anomaly detection.</p> Cong Gu, Siyv Ren, Qiqiang Duan Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36018 Wed, 26 Jun 2024 00:00:00 +0300 Improved Glowworm Swarm Optimization for Parkinson’s Disease Prediction Based on Radial Basis Functions Networks https://itc.ktu.lt/index.php/ITC/article/view/33368 <p>Parkinson’s disease is caused by a disruption in the chemicals that enable communication between brain cells. The brain’s dopamine cells are responsible for movement control, adaptability, and fluidity. Parkinson’s motor symptoms manifest when 60–80% of these cells are damaged due to insufficient dopamine. Because the disease is believed to start many years before the motor symptoms appear, researchers are working to identify the non-motor symptoms that manifest early, so that the disease can be detected early and its progression stopped. This research presents Parkinson’s disease diagnosis based on deep learning. The suggested diagnosis technique comprises feature selection and classification processes. The proposed model searches for the best subset of characteristics using the Improved Glowworm Swarm Optimization (IGSO) algorithm. Radial Basis Functions Networks (RBFN) classifiers evaluate the chosen features. The suggested model is tested using datasets from Parkinson’s Handwriting samples and Parkinson’s Speech and voice with various sound recordings. With an accuracy of about 95.78%, the suggested algorithm forecasts Parkinson’s disease using the VoicePD dataset more precisely.</p> M. Sivakumar, K. 
Devaki Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/33368 Wed, 26 Jun 2024 00:00:00 +0300 Efficient Guided Grad-CAM Tuned Patch Neural Network for Accurate Anomaly Detection in Full Images https://itc.ktu.lt/index.php/ITC/article/view/34525 <p>Deep learning-based anomaly detection in images has recently gained popularity as an investigative field with many global submissions. To simplify complex data analysis, researchers in the deep learning subfield of machine learning employ Artificial Neural Networks (ANNs) with many hidden layers. The primary goal of anomaly detection is to find data instances that differ significantly from the majority of the data. Many medical imaging applications use convolutional neural networks (CNNs) to examine anomalies automatically. While CNN structures are reliable feature extractors, they encounter challenges when simultaneously classifying and segmenting spots that need removal from scans. To solve these issues, we suggest a separate-and-integrate system divided into two distinct parts: classification and segmentation. Initially, many network architectures are trained independently for each abnormality, and these networks’ main components are combined. A shared component of the branched structure functions for all abnormalities. The final structure has two branches: one has distinct sub-networks, each intended to classify a particular abnormality, and the other segments the various abnormalities. Training deep CNNs directly on high-resolution images necessitates image compression at the input layer, which results in the loss of information necessary for detecting medical abnormalities. A guided GradCAM (GCAM) tuned patch neural network is applied to full-size images for anomaly localization. 
Therefore, the suggested approach merges the pre-trained deep CNNs with class activation mappings and area suggestion systems to construct abnormality sensors and then fine-tunes the CNNs on picture patches, focusing on medical abnormalities instead of training on whole images. A mammogram data set was used to test the deep patch classifier, which had a 99% overall classification accuracy. A brain tumor image data set was used to test the integrated detector’s ability to detect abnormalities, and it did so with an average precision of 0.99.</p> R. Rajkumar, D. Shanthi, K. Manivannan Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/34525 Wed, 26 Jun 2024 00:00:00 +0300 Multi-strategy Improved Pelican Optimization Algorithm for Mobile Robot Path Planning https://itc.ktu.lt/index.php/ITC/article/view/35955 <p>In response to the problems of easily falling into local optima, low path planning accuracy, and slow convergence speed when applying the traditional pelican optimization algorithm to the mobile robot path planning problem, a multi-strategy improved pelican optimization algorithm (MPOA) is proposed. In the initialization stage, chaotic mapping is used to increase the diversity of the pelican population individuals. In the exploration stage, an adaptive feedback adjustment factor is proposed to adjust the local optima of pelican individuals’ positions and balance the algorithm’s local development capability. In the development stage, the Lévy flight strategy is introduced to adjust the domain radius of the pelican population individuals, and the Gaussian mutation mechanism is used to disturb individuals that have fallen into local optima. 
Simulation results show that the improved algorithm achieves significantly better performance and effectively shortens the length of the planned path.</p> Chun Qing Li, Zheng Feng Jiang, Yong Ping Huang Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35955 Wed, 26 Jun 2024 00:00:00 +0300 YOLOv7-PD: Incorporating DE-ELAN and NWD-CIoU for Advanced Pedestrian Detection Method https://itc.ktu.lt/index.php/ITC/article/view/35569 <p>In the pedestrian detection task, the excessive depth of the convolutional network in YOLOv7 results in an abundance of background feature information, making it difficult for the model to accurately locate and detect pedestrians, particularly in small-scale or heavily occluded scenarios. To handle this problem, we propose a pedestrian detection model called YOLOv7-PD to improve the accuracy of detecting small-scale and occluded pedestrians. First, we propose DE-ELAN, an improvement on the existing E-ELAN module that is based on Omni-Dimensional Dynamic Convolution (ODConv). This module leverages four complementary attention types to enhance feature extraction, capturing rich contextual information. Then, we propose a lightweight receptive field enhancement module called light-REFM, which constructs a pyramid structure and acquires fine-grained multi-scale information through dilated convolutions of different sizes. Finally, we propose an improved regression loss function based on the Normalized Wasserstein Distance (NWD) that combines NWD with Complete-IoU (CIoU), enabling precise position and feature capture for small targets. On the Citypersons dataset, YOLOv7-PD outperforms YOLOv7, improving the average precision (AP) by 7% and reducing the miss rate by 2.58%. 
Experiments on three challenging pedestrian detection datasets demonstrate a balance between precision and speed, achieving excellent performance.</p> Yu He, Liang Wan Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35569 Wed, 26 Jun 2024 00:00:00 +0300 GAN-Generated Face Detection Based on Multiple Attention Mechanism and Relational Embedding https://itc.ktu.lt/index.php/ITC/article/view/35590 <p>The rapid development of the Generative Adversarial Network (GAN) makes generated face images increasingly difficult to distinguish visually, and the detection performance of previous methods degrades seriously when the testing samples are out-of-sample datasets or have been post-processed. To address these problems, we propose a new relational embedding network (RENet) based on “what to observe” and “where to attend” from a relational perspective for the task of generated face detection. In addition, we design two attention modules to effectively utilize global and local features. Specifically, the dual-self attention module selectively enhances the representation of local features through both image space and channel dimensions. The cross-correlation attention module computes similarity between images to capture the global information of the output in the image. We conducted extensive experiments to validate our method; the proposed algorithm effectively extracts the correlations between features and achieves satisfactory generalization and robustness in generated face detection. We also explored the design of the model structure and the detection performance on more categories of generated images (not limited to faces). 
The results show that RENet also has good detection performance on datasets other than faces.</p> Junlin Ouyang, Jiayong Ma, Beijing Chen Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35590 Wed, 26 Jun 2024 00:00:00 +0300 Application Simulation Research Based on Visual Image Capture Technology in Sports Injury Rehabilitation https://itc.ktu.lt/index.php/ITC/article/view/35021 <p>To capture and analyze the motion state of patients in real time and improve the evaluation effect of sports injury, the research is based on image recognition in visual image capture technology. Firstly, multiscale attention mechanism was introduced into U-Net image segmentation model to improve the pre-processing of image recognition. Then, the image recognition model of convolutional neural network is optimized by gradient class weighted activation mapping. The combination of the two is applied to the sports injury image processing to verify the effect. The results show that the F1 score and Precision values of the improved segmentation model in the database reach 98.85% and 98.74%, respectively. The segmentation accuracy is obviously improved. The accuracy of the optimized image recognition method in the training set and the test set is about 96% and 98%, respectively. After the combination of the two methods, the processing accuracy of sports injury medical images is 97%, and the running time is within 4s. 
It has high accuracy and processing efficiency, providing a technical and methodological basis for sports injury rehabilitation training.</p> Xun Tang Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35021 Wed, 26 Jun 2024 00:00:00 +0300 Research on Autonomous Mobile Robot Path Planning Based on M-RRT Algorithm https://itc.ktu.lt/index.php/ITC/article/view/36240 <p>The Rapidly-Exploring Random Tree (RRT) algorithm has demonstrated proficiency in adapting to path search challenges within high-dimensional dynamic environments. However, a notable limitation of the RRT algorithm lies in its inability to fulfill the criteria for achieving the shortest and smoothest path for mobile sensing nodes. To address the limitations of the conventional RRT algorithm and enhance path planning for mobile robots, this paper proposes an approach named M-RRT, designed to overcome these shortcomings and optimize the path planning process for mobile sensing nodes. First, the search area is constructed according to the defined coverage density. After searching for a path in the search area, the RRT algorithm uses the greedy method to delete the intermediate nodes in the path and obtains the uniquely optimal path. Finally, the Bezier curve is used to optimize the path, which makes the path shortest and meets the dynamic requirements of the mobile node. 
Simulation results show that M-RRT produces better paths and converges faster than traditional RRT, and so better meets the planning requirements of mobile nodes.</p> Zhuozhen Tang, Hongzhong Ma, Bin Xue Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36240 Wed, 26 Jun 2024 00:00:00 +0300 Helmet Detection Based on Context Enhancement Pyramid Under Surveillance Images https://itc.ktu.lt/index.php/ITC/article/view/35273 <p>Helmet detection is of great significance for realizing the automated management of industrial safety. To address the problem that existing object detection methods have insufficient ability to detect small helmet objects in surveillance images, this paper proposes a helmet detection method based on a context enhancement pyramid to realize the automatic detection of helmet objects. The method helps the network improve position localization for small-scale helmet objects by adding a high-resolution detection layer to YOLOv5. Also, the proposed context enhancement pyramid reduces the semantic differences between different scale features and generates rich contextual features to enhance the network’s discriminative learning ability for small helmet object features. In addition, the proposed multi-scale attention module improves the feature fusion effect in the pyramid network to further capture multi-scale features and expand the receptive field, enhancing the network’s detection precision for helmet objects in surveillance images. 
The experimental analysis shows that the proposed method performs well compared with existing object detection methods on the Safety Helmet Wearing Dataset (SHWD) as well as on the customized dataset.</p> Zhigang Xu, Yugen Li Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35273 Wed, 26 Jun 2024 00:00:00 +0300 Small Sample Time Series Classification Based on Data Augmentation and Semi-supervised Learning https://itc.ktu.lt/index.php/ITC/article/view/35797 <p>Realistic scenarios produce both labeled and unlabeled data; however, labeling time series data poses significant challenges. It is imperative to effectively integrate the relationship between labeled and unlabeled data within a semi-supervised classification model. This paper presents a novel semi-supervised classification method, namely Data Augmentation-Fast Shapelet Semi-Supervised Classification, which employs a data augmentation module to enhance the diversity of data and improve the generalization ability of the model, as well as a feature fusion module to enhance the semi-supervised network. A conditional generative adversarial network is used to synthesize high-quality labeled time series samples to enhance the homogeneous data in the sample space; the fast shapelets method is used to quickly extract the important shape feature vectors in the time series; and self-supervised and supervised learning are combined to fully learn the unlabeled and labeled data of the time series dataset. The joint loss function combines the loss functions of the two networks to optimize multiple objectives. Reinforcement learning is used to determine the weight coefficients of the joint loss function; at the same time, the reward function is modified to bias the supervisory loss, which improves the classification performance of the model under limited labeled data and allows the model to better achieve the semi-supervised classification task. 
The proposed method is validated on the UCR benchmark dataset, Electrocardiogram dataset, and Electroencephalogram dataset; the results show that it classifies the time series more accurately than the comparison methods. Meanwhile, we test the method on a plant electrical signal dataset obtained from actual measurements; the visualization analysis clearly shows the role of the model in the semi-supervised classification task, and the experimental results demonstrate the effectiveness and applicability of the proposed method.</p> Jing-Jing Liu, Jie-Peng Yao, Zhuo Wang, Zhong-Yi Wang, Lan Huang Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35797 Wed, 26 Jun 2024 00:00:00 +0300 A Novel Risk-Perception Model Based on Blockchain for Supply Chain Finance of China Real Estate https://itc.ktu.lt/index.php/ITC/article/view/35774 <p>More than 220 enterprises in China's real estate industry have gone bankrupt, causing serious losses. The National Bureau of Statistics of China showed that the country's investment in property development fell by 8.5% year-on-year, while domestic lending dropped by 11.5% and the use of foreign capital fell by 43%. Against this background, the development of supply chain finance can alleviate the pressure on enterprise funds and stabilize the real estate market. However, risk is the biggest obstacle to the development of supply chain finance, and current research on risk assessment of supply chain finance faces problems such as imprecise classification, slow assessment speed, a small number of samples, and data that is easily tampered with. Therefore, this study integrated graph convolutional neural networks into the smart contracts of the contract layer of blockchain. This integration established a novel intelligent perception model for supply chain finance risk. 
Based on a consortium chain with the government and enterprises as nodes, the model was established, including risk monitoring, assessment, and categorized early warnings. In the risk assessment part, we compared the graph convolutional neural network with a multilayer perceptron and a support vector machine, finding that the accuracy of the graph convolutional neural network is 94%, which is higher than that of the other two models. The intelligent risk-perception model proposed in this paper operates faster than expert judgment assessments used by banks. It also provides accurate risk levels and quantifies the probability of enterprises being classified as high-risk, offering technical support to regulatory authorities in controlling supply chain financial risk.</p> Qiang Fu, Mingxia Li, Weiqiang Li Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35774 Wed, 26 Jun 2024 00:00:00 +0300 An Improved YOLOv5x-Based Algorithm for IC Pin Welding Defects Detection https://itc.ktu.lt/index.php/ITC/article/view/35652 <p>This study suggests an integrated circuit (IC) pin welding defect detection algorithm based on improved YOLOv5x to address the issues of low detection accuracy caused by small target size and dense pin arrangement in IC pin welding defects identification. The ability of the network to extract features is improved by effectively fusing features with various receptive fields through the inclusion of the D-SPP module to merge different channel information. The introduction of the mask self-attention mechanism module increases the network’s capacity to recognize global feature correlations and raises the algorithm’s detection precision. In order to speed up the model’s convergence and tackle the issue of sample imbalance in BBox regression, the Focal-EIoU loss function is applied. 
The detection accuracy and speed are increased by using the k-means++ clustering algorithm to create nine clustering centers to figure out the size of the prior box. According to the results of the experiment, the new method achieves average precisions for short-circuit, missing pin, pin-cocked, and little tin faults in IC pin welding of 96.7%, 94.5%, 95.6%, and 93.3%, respectively. The mean average precision increases to 95.0% at a threshold of 0.5, which is 13.3% and 8.9% greater than YOLOv3 and YOLOv5x, respectively. A reference value for IC pin welding defect identification is provided by the improved algorithm, which has a detection time of 0.142 seconds per image. This meets the speed requirements of IC quality inspection.</p> Xueying Wang, Mengyun Li, Xiaofeng Hu, Bin Guo Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35652 Wed, 26 Jun 2024 00:00:00 +0300 Automated Retinal Image Analysis to Detect Optic Nerve Hypoplasia https://itc.ktu.lt/index.php/ITC/article/view/35152 <p>Identification of the optic disc and fovea is crucial for automating the diagnosis and screening of retinal diseases. Based on quantitative calculations, this study presents a decision support system for doctors that automatically detects optic nerve hypoplasia. For disease diagnosis, a U-Net architecture is used, which uses a pre-trained ResNet encoder to segment the optic disc and fovea structures. An important aspect of the proposed technique is that pretrained ResNet and U-Net are used together, providing robust performance in the detection of optic nerve hypoplasia. Our proposed architecture was tested on retinal images from Messidor, Diaretdb1, DRIVE, HRF, APTOS, and IDRID. In addition, a special database called ONH-NET was created based on 189 retinal images obtained from Düzce University, Department of Ophthalmology. On the Messidor database test images, optic disc detection achieved an IoU score of 0.9069, a sensitivity of 0.9626, a precision of 0.9411, an accuracy of 0.9974, and a Dice coefficient of 0.9505, 
while fovea detection achieved an IoU score of 0.8282, a sensitivity of 0.8442, a precision of 0.8252, an accuracy of 0.8992, and a Dice coefficient of 0.7873. We computed diameter optic disc to macula radius ratios from segmented optic disc and fovea for screening optic nerve hypoplasia and achieved 100% success.</p> Canan Celik, İbrahim Yücadag, Hanife Tuba Akçam Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35152 Wed, 26 Jun 2024 00:00:00 +0300 Optimization of Human Posture Recognition based on Multi-view Skeleton Data Fusion https://itc.ktu.lt/index.php/ITC/article/view/36044 <p>This research introduces a novel method for fusing multi-view skeleton data to address the limitations encountered by a single vision sensor in capturing motion data, such as skeletal jitter, self-pose occlusion, and the reduced accuracy of three-dimensional coordinate data for human skeletal joints due to environmental object occlusion. Our approach employs two Kinect vision sensors concurrently to capture motion data from distinct viewpoints, extract skeletal data, and subsequently harmonize the two sets of skeleton data into a unified world coordinate system through coordinate conversion. To optimize the fusion process, we assess the contribution of each joint based on human posture orientation and data smoothness, enabling us to fine-tune the weight ratio during data fusion and ultimately produce a dependable representation of human posture. We validate our methodology using the FMS public dataset for data fusion and model training. 
Experimental findings demonstrate a substantial enhancement in the smoothness of the skeleton data, leading to enhanced data accuracy and an effective improvement in human posture recognition following the application of this data fusion method.</p> Yahong Xu, Shoulin Wei, Jibin Yin Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36044 Wed, 26 Jun 2024 00:00:00 +0300 Design and Implementation of an English Learning System Based on Intelligent Recommendation https://itc.ktu.lt/index.php/ITC/article/view/35290 <p>The Internet has led to the rapid development of online education, but it has also caused redundancy in educational information. How to choose appropriate courses from a large number of online education resources has become a major problem for current learners. Therefore, the study proposes an English learning system based on efficient and deep matrix decomposition. The results of the experiments showed that, in practical teaching applications, about 57.5% of students with good grades improved their grades after using the English learning system proposed in this study, with only about 17.5% seeing their grades decrease. Of the students with average grades, 67.5% improved their grades after using the system, with only 10% decreasing. Among the students with poor grades, about 50% improved their academic performance through the system, while about 27.5% experienced a decrease. The experiment also tested the efficient deep matrix decomposition model in the learning system: the minimum mean absolute errors of the model on different data sets are about 0.61, 0.69, 0.77 and 0.82, respectively. The minimum root-mean-square errors are about 0.91, 0.98, 1.06 and 1.1, respectively, far lower than those of the other recommendation models. 
The above results show that the system constructed in this paper can recommend courses according to students’ actual learning level, and can effectively improve students’ academic performance in the actual teaching process.</p> Jinli Yuan Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35290 Wed, 26 Jun 2024 00:00:00 +0300 Semantic-Enhanced Variational Graph Autoencoder for Movie Recommendation: An Innovative Approach Integrating Plot Summary Information and Contrastive Learning Strategy https://itc.ktu.lt/index.php/ITC/article/view/35689 <p>This study introduces a novel movie recommender system utilizing a Semantic-Enhanced Variational Graph Autoencoder for Movie Recommendation (SeVGAER) architecture. The system harnesses additional information from movie plot summaries scraped from the internet, transformed into semantic vectors via a large language model. These vectors serve as supplementary features for movie nodes in the SeVGAER-based recommender. The system incorporates an encoder-decoder structure, operating on a user-movie bipartite graph, and employs GraphSAGE convolutional layers with modified aggregators as the encoder to extract latent representations of the nodes, and a Multi-Layer Perceptron (MLP) as the decoder to predict ratings with additional graph-based features. To address overfitting and improve model generalization, a novel training strategy is introduced. We employ a random data splitting approach, dividing the dataset into two halves for each training instance. The model then generates outputs on each half of the data, and a new loss function is introduced to ensure consistency between these two outputs, a strategy that can be seen as a form of contrastive learning. The model’s performance is optimized using a combination of this new contrastive loss, graph reconstruction loss, and KL divergence loss. 
Experiments conducted on the Movielens100k dataset demonstrate the effectiveness of this approach in enhancing movie recommendation performance.</p> Mingye Wang, Xiaohui Hu, Pan Xie, Yao Du Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35689 Wed, 26 Jun 2024 00:00:00 +0300 A Construction Optimization for Laser SLAM Based on Odometer Constraint Fusion https://itc.ktu.lt/index.php/ITC/article/view/32897 <p>The traditional laser SLAM (Simultaneous Localization and Mapping) algorithm uses the global relative poses and local ones to form residual blocks. Its constructed map is not smooth enough, and the constraint construction is too simplistic in some special scenarios. Thus, this paper proposes an odometer constraint fusion method called FOSLAM (Fusion Odometer SLAM) to construct residual blocks between constraints and solve the nonlinear least squares problem with Ceres. The effectiveness and accuracy of this method have been verified through comparative experiments. Experimental results showed that, without increasing the time and space complexity, involving the odometer constraint in the SLAM optimization process improves the convergence of scan matching scores, makes the edges of the constructed grid map smoother, and reduces the jagged phenomenon. In complex scenes, FOSLAM is able to acquire more accurate maps and laser odometer trajectories than the Cartographer method. 
Therefore, it is suitable for use on indoor robots for cleaning and inspection and can be further deployed on autonomous unmanned vehicles involving spatial visualization and neuro-heuristic guidance.</p> Haojun Huang, Puxian Yang, Shengqing Cai, Jixiang Li, Yuda Zheng, Tengyue Zou Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/32897 Wed, 26 Jun 2024 00:00:00 +0300 Elderly Fall Detection Algorithm Based on Improved YOLOv5s https://itc.ktu.lt/index.php/ITC/article/view/36336 <p>Indoor fall detection for the elderly can effectively help treatment after a fall, but many existing detection methods suffer from inconvenient use, a high misjudgement rate, and slow speed. Deep learning methods can effectively solve these problems, and YOLOv5s is a deep learning algorithm that can perform real-time fall detection. In order to achieve a more lightweight model with higher detection accuracy, this paper proposes a fall detection algorithm for the elderly based on improved YOLOv5s, called YOLOv5s-GCC. Firstly, the original Conv and C3 structures in the backbone are replaced by GhostConv and C3GhostV2 structures to make the model more lightweight, which reduces model computation and improves accuracy. Secondly, the lightweight upsampling operator CARAFE is introduced to expand the receptive field for data feature fusion and reduce the loss of feature information in upsampling. Finally, the deepest C3 in the neck is integrated with the CBAM attention mechanism, because the deepest neck receives more abundant feature information, and CBAM can increase the efficiency of the algorithm in extracting important information from the feature map. Experimental results show that the mAP@0.5 of YOLOv5s-GCC on the hybrid open-source fall dataset increases by 1.2% to 0.935, while FLOPs decrease by 29.1%. 
The number of parameters is reduced by 27.5%, and the algorithm has obvious advantages over similar object detection algorithms.</p> Zhongze Luo, Siying Jia, Hongjun Niu, Yifu Zhao, Xiaoyu Zeng, Guanghui Dong Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/36336 Wed, 26 Jun 2024 00:00:00 +0300 An Overview of Behavioral Recognition https://itc.ktu.lt/index.php/ITC/article/view/35769 <p>Human behavior recognition has become a popular research topic in the field of computer vision. With the introduction of deep learning and attention mechanisms, this field has been further promoted. However, issues such as dataset acquisition and preprocessing operations on multimodal datasets, modeling of long time information in videos, and fusion of temporal and spatial information still exist. In this paper, we first outline some video action recognition datasets and related preprocessing techniques, including frame extraction, optical flow extraction, and skeletal feature acquisition. Then, the relevant models are classified and parsed according to their characteristics and the types of input data modalities. In addition, we evaluate the performance of the models on several benchmark datasets to gain a deeper understanding of the model development process. Finally, we summarize the current challenges faced in the field of video behavior recognition, including model timeliness, data set subjectivity, and effective fusion of multi-modal features, and propose possible future improvement directions in order to provide more ideas and methods for subsequent research.</p> Yunjie Xie, Jian Xiang, Xiaoyong Li, Jiawen Duan, Zhiqiang Li Copyright (c) 2024 Information Technology and Control https://itc.ktu.lt/index.php/ITC/article/view/35769 Wed, 26 Jun 2024 00:00:00 +0300