# IMPLEMENTATION OF CORRELATION ANALYSIS TASK IN THE MULTICHANNEL STRUCTURE

# Arūnas Žvironas, Egidijus Kazanavičius

Department of Computer Engineering, Kaunas University of Technology, Studentų 50-214c, LT-51368 Kaunas, Lithuania

**Abstract**. In the general case, digital signal processing may be implemented on multi-channel structures. In most cases such structures have a heterogeneous architecture in which a Kahn network and correlation are used to process the data flows. This paper presents a methodology for the design of heterogeneous systems. The methodology was tested on the design of real devices that control large data flows. Multi-channel structures were used to estimate the influence of the number of channels on the data processing speed and the cost of the task, and to estimate the optimal number of channels.

## 1. Introduction

The design and implementation of heterogeneous systems is a complicated process, because a separate methodology is applied to the design of each functional device. The design and implementation steps must be completed effectively, minimizing design time and expenses. The scientific projects Ptolemy [8], Match [12] and Cameron [5] address the problem of improving design efficiency. These projects analyze the modeling and design of heterogeneous systems of various natures.

Digital signal processing tasks are implemented in several stages: formulation of the global task, investigation and modeling of the respective physical phenomenon, specification of the digital signal processing task, creation of the functional structure of digital signal processing, system coordination, and experimental research. Such tasks have to meet requirements for operating speed, a minimal amount of hardware, and time-to-market.

Only a few of the implementation stages mentioned above can be performed by a single party. The specialization of a company is often quite narrow, so several companies must be brought together in a joint project to solve a particular task. This makes the design process more expensive. Moreover, it is difficult to coordinate the activity of the project members effectively. In this case, different design methods, methodologies and tools, previously created hardware and software modules, and implementation algorithms are used for the design of a heterogeneous digital signal processing (DSP) system. A unified methodology is therefore required to ease the design and implementation of such heterogeneous DSP systems.

From the point of view of efficiency, the design of a physical system (specialized circuits and modules) is especially problematic: it requires a lot of time, knowledge and complex design tools. Usually it is worthwhile only for mass production and is not effective for scientific research or local orders because of the long time-to-market.

Coarse-grained functional elements used to build the system can shorten the time-to-market. Various ready-made modules, architectures for programmable logic arrays, and routines for parametric processors are available on the market. A significant problem is selecting the most appropriate architecture and the elements for its implementation according to the manufacturers' specifications. It is necessary to take into account the requirements of the task: the precision of signal processing, the running speed and the distribution costs. As mentioned above, the time-to-market and the costs must be minimized.

General-purpose methodologies can be used for system design. They use a number of iterations and various heuristic methods at each design stage. However, such methodologies are time consuming. The SPADE methodology [11] uses a different architecture model for each task and evaluates the performance of the architecture. Hence, the methodology requires additional design time, because the application is first mapped onto the model and must later be mapped onto the architecture anyway. The energy-conscious methodology for early design exploration of heterogeneous DSPs [15] estimates the energy consumption and disregards other parameters, which can be of high importance. The design time of dedicated devices can be reduced using dedicated methodologies that are efficient for particular tasks. Such a methodology is proposed by the authors in Section 2.

## 2. Methodology

The heterogeneous structure design methodologies are not effective enough for digital signal processing tasks that serve experimental or scientific purposes or small individual orders. In scientific exploration projects the design objects are sometimes not the final products but additional means of reaching the goal (a testing environment for the design object, a tool for the evaluation of a specific hypothesis). It is important that the object (device) can be reused, and that the designer can modify only some parts of the object, change some functionality and redefine the interconnections. This can minimize the cost of the system.



Figure 1. The methodology of mapping of DSP task into architecture

The methodology proposed in [16], [17] is dedicated to multi-channel DSP task implementation in the architecture. It may be used to speed up the design process of the new hardware. The methodology consists of four stages (Figure 1):

- *Description/specification of DSP task:* the analysis of the task, the definition of the requirements, the system feasibility study.
- *Functional design:* selection of an algorithm and functional structure. The algorithm is mapped in this structure.
- *Selection and acceptance of architecture:* the algorithm is mapped into the described interpreter of Kahn network, which is implemented using one of DSP, FPGA or heterogeneous architecture.
- *Evaluation:* the task requirements (such as processing time, number of resources and cost) are verified.

A formal description of the Kahn process network is proposed. The designer should describe the algorithm as a Kahn process network [16], [17]. The running speed, the cost and the amount of hardware of the designed object are evaluated. After system evaluation, the designer can decide to refine the algorithm or to select other architectural solutions.

## 3. Functional design

The selection of the functional structure of the DSP module is one of the design stages of digital signal processing. Recently, the number of DSP hardware manufacturers offering various DSP systems has been growing.

In the general case, a digital signal processing task may be implemented on a multi-channel structure. The module  $\Phi_{SA}$  is presented in Figure 2. It can consist of a digital signal processor ( $\Phi_{SSP}$ ), a programmable architecture ( $\Phi_{PA}$ ) or an architecture of parallel digital signal processors ( $\Phi_{LSAP}$ ). It may also be a heterogeneous architecture  $\Phi_H$ , which is a combination of  $\Phi_{SSP}$  and  $\Phi_{PA}$ . Low cost and average running speed are typical for modules with  $\Phi_{SSP}$ . Modules with  $\Phi_{LSAP}$  are fast, but they are expensive and more likely to leave resources unused. Modules with  $\Phi_{PA}$  are characterized by maximal functional flexibility and running speed; however, they are rather expensive.



Various methods are used for the analysis of DSP tasks. An algorithm may be described by equations, graphically, in a signal processing description language, or in a general-purpose high-level or low-level language. All choices are suitable, but not every DSP system designer can freely use high-level or low-level languages. A task description by equations is comprehensible and explicit, but in this case the designer has only a functional description.

A graph (*G*) is used for the graphical analysis of the digital signal processing task. Each node of the graph corresponds to a separate process (*V*), and the edges between them represent communication channels (*E*). Depending on the granularity, a node performs a specific DSP function (such as correlation, filtering, discretization and others). The task is described by the graph G = (X, Y, V, E): *V* is the set of nodes, which transforms the input flow *X* into the output flow *Y*;  $E \subseteq V \times V$  is the set of edges representing the channels of data flows. In general, the digital signal processing system may be treated as multichannel. The graph with *M* inputs and *M* outputs is then  $G = (X_M, Y_M, V_M, E_M)$ , where  $V_M$  is a set of *M* (for instance,  $M = 1 \div 10$ ) channel nodes and  $E_M$  is the corresponding set of edges.
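The graph description above can be illustrated with a small data structure. This is only a sketch: the class `TaskGraph`, the helper `multichannel_graph` and the stage names are hypothetical and do not come from the paper. It stores the four sets of G and checks the condition that every edge connects two existing nodes.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """G = (X, Y, V, E): inputs, outputs, process nodes and channel edges."""
    inputs: list
    outputs: list
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # pairs (source node, target node)

    def is_valid(self):
        """Check that E is a subset of V x V."""
        return all(a in self.nodes and b in self.nodes for a, b in self.edges)


def multichannel_graph(m, stages):
    """Build M independent channels, each a chain of the named stages (assumed layout)."""
    graph = TaskGraph(
        inputs=[f"x{ch}" for ch in range(1, m + 1)],
        outputs=[f"y{ch}" for ch in range(1, m + 1)],
    )
    for ch in range(1, m + 1):
        names = [f"{stage}{ch}" for stage in stages]
        graph.nodes.update(names)
        graph.edges.update(zip(names, names[1:]))
    return graph


g = multichannel_graph(3, ["correlation", "maximum"])
print(len(g.nodes), len(g.edges), g.is_valid())  # 6 3 True
```

With three channels and two stages per channel the graph has six nodes and three edges, one edge per channel.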

The parameters of the algorithm  $A = \langle X, Y, T, V, C \rangle$  are the sets of inputs X and outputs Y, the processing time T of the algorithm (its maximum value is specified in the task requirements), a set of control vectors C, which defines the execution order of the functions, and a set of decisions V = (F, x, y, c) (where F is the function executed on the node,  $x \in X$  and  $y \in Y$ ). One functional structure  $S = \langle G^{\Phi}, C^{\Phi} \rangle$  of digital signal processing is selected for the algorithm A; it consists of the Kahn network of modules of type  $\Phi$ , described by the graph  $G^{\Phi}$  with control vector  $C^{\Phi}$ . The module types  $\Phi = \{\Phi_{PA}, \Phi_{SSP}, \Phi_{H}\}$  were described earlier. The graph of the Kahn network  $G^{\Phi}(X^{\Phi}, Y^{\Phi}, V^{\Phi}, E^{\Phi}, C^{\Phi})$  consists of the sets of inputs  $X^{\Phi}$ , outputs  $Y^{\Phi}$ , nodes  $V^{\Phi}$ , interconnections  $E^{\Phi}$  and control vectors  $C^{\Phi}$  of type  $\Phi$ . Later, the algorithm is mapped into the interpreter of the Kahn process network.

## 4. Model of Kahn process network

Selecting the functional structure is part (stage 2) of the functional module design of digital signal processing. The model of the Kahn process network is used for simulating the algorithm (Figure 3). In this model, the behavior of the algorithm is verified. The Kahn network model is defined using the control vector C, the set of decisions V and the number of channels M needed for the implementation of the algorithm. As shown in Figure 3, the model of the Kahn network consists of the function library and the parametric Kahn network with channels 1...M (the number depends on the task specification). The control block is needed to define the interconnections between the nodes of the Kahn network and to define the set of decisions. In the next stage, the algorithm is implemented on the interpreter of the Kahn process network, taking into account the tuple of parameters  $\langle V, C, M \rangle$ . The interpreter is implemented in the DSP module. The Matlab tool, with standard functions and functions described by the designer, is used for the simulation.
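The idea of a function library driven by a control vector can be sketched in a few lines. The following is not the authors' Matlab model; it is a minimal illustration, assuming a single channel arranged as a linear chain of nodes, in which each node is a thread that performs blocking reads from an unbounded FIFO queue (the defining property of a Kahn process). The names `run_network`, `process` and the two library functions are hypothetical.

```python
import queue
import threading

_END = object()  # end-of-stream marker (an assumption of this sketch)


def process(func, inp, out):
    """Kahn-style node: blocking read from `inp`, apply `func`, write to `out`."""
    while True:
        token = inp.get()  # blocks until a token is available
        if token is _END:
            out.put(_END)
            return
        out.put(func(token))


def run_network(functions, control, samples):
    """Run a linear chain of nodes selected by the control vector.

    functions: library of node functions (name -> callable)
    control:   ordered list of function names, i.e. the execution order
    samples:   input flow X; the return value is the output flow Y
    """
    channels = [queue.Queue() for _ in range(len(control) + 1)]
    threads = [
        threading.Thread(
            target=process, args=(functions[name], channels[i], channels[i + 1])
        )
        for i, name in enumerate(control)
    ]
    for t in threads:
        t.start()
    for s in samples:
        channels[0].put(s)
    channels[0].put(_END)
    result = []
    while True:
        token = channels[-1].get()
        if token is _END:
            break
        result.append(token)
    for t in threads:
        t.join()
    return result


library = {"scale": lambda v: 2 * v, "offset": lambda v: v + 1}
output = run_network(library, ["scale", "offset"], [1, 2, 3])
print(output)  # [3, 5, 7]
```

Because every channel is a FIFO with a single writer and a single reader, the output does not depend on thread scheduling. An M-channel model would instantiate M such chains side by side.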



Figure 3. The model of Kahn process network

A prototype of the design makes it possible to explore the task in the real environment. The created model of the Kahn network ensures a quicker transition to the prototype of the task implementation.

## 5. Exploration of signal processing tasks using correlation

Methods of correlation analysis are used in communication tools [4], telecommunications, ultrasonic systems and elsewhere. In the receivers of such devices, matched filters are used for signal detection. The matched filter outputs the complete correlation function.

The calculation of the impulse spread (propagation) duration is used in radar and ultrasonic systems for the estimation of the distance to an object (Fig. 4) [12] and for location detection in wireless systems.



Figure 4. Example of the estimation of distance of an object to barrier

The time duration  $T_r$  of the cross-correlation function (CCF) is equal to the sum of the impulse duration  $T_p$  and the time interval  $T_{skl}$ :

$$T_r = T_{skl} + T_p , \qquad (1)$$

where  $T_{skl} = \frac{2 \cdot D}{v_{skl}}$ ,  $v_{skl}$  is the propagation speed of the impulse and D is the distance between the object and the barrier. Denoting  $\frac{v_{skl}}{2} = \alpha$ , the distance is  $D = T_{skl} \cdot \alpha$ .

*The task of detection of the binary code sequence.* The information transfer model used in various communication systems is presented in Figure 5. The model consists of a transmitter, a receiver and a communication channel, in which the propagating signal is exposed to environmental noise and reflections  $\xi$ . Impulse sequences modulated by Barker and *M*-sequence binary codes are used in ultrasonic and radio measurements [7]:

$$s(t) = \sum_{n=1}^{N_s} c[n] \cdot p(t - n \cdot \tau_s) , \qquad (2)$$

where:

- $N_s$  is the number of impulses in the sequence,
- $p(t - n \cdot \tau_s)$  is the  $n^{th}$  impulse of the sequence p,
- $c[n]$  is the  $n^{th}$  bit of the code c,
- $\tau_s$  is the period of the impulses.



Figure 5. Communication channel

The recurrence period of the impulse sequence is  $N_T = \tau_s / \Delta t_d$ , where  $\Delta t_d$  is the discretization period.

The Barker code is a binary (phase) code whose autocorrelation function has a peak equal to  $N_c$ , where  $N_c$  is the code length. The duration of the Barker code is equal to  $N_c\delta$ , where  $\delta$  is the width of one code element.
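The peak property can be checked numerically. The sketch below uses the standard length-13 Barker code written with +1/-1 elements (inverting all signs gives the same autocorrelation); the function name `autocorrelation` is ours.

```python
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def autocorrelation(code):
    """Aperiodic autocorrelation for lags 0 .. len(code) - 1."""
    n = len(code)
    return [sum(code[i] * code[i + lag] for i in range(n - lag)) for lag in range(n)]


acf = autocorrelation(BARKER_13)
print(acf[0])                        # 13, the code length N_c
print(max(abs(v) for v in acf[1:]))  # 1, the largest sidelobe magnitude
```

The main peak equals the code length, while no sidelobe exceeds 1 in magnitude, which is what makes the code attractive for detection by correlation.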

The signal sent from the transmitter begins with the 13-element Barker code [-1,-1,-1,-1,-1,1,1,-1,-1,1,-1,1,-1] multiplied by a Gaussian radio frequency (RF) impulse (Fig. 6). The Gaussian RF pulse has a center frequency of 40 *kHz* and is sampled at a rate of 4 *MHz*. The reference signal consists of  $N_e = N_c \cdot N_p$  samples, where  $N_p$  is the number of samples of the Gaussian pulse. The period of the pulse is  $\tau_s = 0.2$ ms.
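A reference signal of this kind can be sketched as one Gaussian-windowed carrier burst per code element, each multiplied by the corresponding code bit. The pulse length, the number of carrier cycles and the envelope width below are illustrative assumptions, not the values used in the experiment, and the length-5 Barker code is used only to keep the example short.

```python
import math


def gaussian_rf_pulse(n_p, cycles=2.0):
    """Gaussian-windowed cosine with `cycles` carrier periods over n_p samples."""
    centre = (n_p - 1) / 2.0
    sigma = n_p / 6.0  # assumed envelope width
    return [
        math.exp(-0.5 * ((i - centre) / sigma) ** 2)
        * math.cos(2.0 * math.pi * cycles * (i - centre) / n_p)
        for i in range(n_p)
    ]


def reference_signal(code, pulse):
    """Concatenate one pulse per code element, each multiplied by c[n]."""
    return [bit * value for bit in code for value in pulse]


code = [1, 1, 1, -1, 1]            # length-5 Barker code
pulse = gaussian_rf_pulse(n_p=40)  # N_p = 40 samples (assumed)
ref = reference_signal(code, pulse)
print(len(ref))                    # 200 = N_c * N_p
```

Here the impulse period equals the pulse length, so consecutive pulses are adjacent and the reference has exactly N_c times N_p samples.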



Figure 6. Reference signal

Windows of  $N_w$  samples [9] are used for the detection of the Barker code sequence. The length of every window must satisfy the inequality  $N_w \leq N_r$ , where  $N_r$  is the number of samples of the observed signal. The windows consist of the time slots  $T_k = 1 \div T_w, T_w + 1 \div 2 \cdot T_w, \dots, (M-1) \cdot T_w + 1 \div T_r$ . The correlation function of the reference signal and the signal in each window is calculated (Figure 7).



Figure 7. Calculation of signal correlation function using windows

The signal correlation function is calculated using equation (3) as follows [1],[3],[6]:

$$r_{yp}[k] = \sum_{i=1}^{N_p} p[i] \cdot y[i+k], \quad k = \overline{1..N_T - N_p},$$

$$r_{xy}[k] = \sum_{i=1}^{N_c} c[i] \cdot r_{yp}[(i-1) \cdot N_p + k], \quad k = \overline{1..K}, \qquad (3)$$

where:

- $p[N_p]$  is the pulse,
- $r_{yp}[N_T - N_p]$  is the cross-correlation function of the signals  $y[N_T]$  and  $p[N_p]$ ,
- $c[N_c]: c[i] \in \{-1,1\}$  is the digital code sequence (Barker code) [4].
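The two-stage calculation can be sketched directly from the sums. The code below is a zero-based illustration, assuming the code sum runs over the N_c code elements and that adjacent pulses are spaced N_p samples apart; the function names and the short integer test signal are ours. The last line compares the result with a direct correlation against the full coded reference.

```python
def pulse_correlation(y, p):
    """r_yp[k] = sum_i p[i] * y[i + k] for every k where p fits inside y."""
    n_p = len(p)
    return [sum(p[i] * y[i + k] for i in range(n_p)) for k in range(len(y) - n_p + 1)]


def code_correlation(r_yp, c, n_p):
    """r_xy[k] = sum_i c[i] * r_yp[i * n_p + k], the zero-based second stage."""
    n_k = len(r_yp) - (len(c) - 1) * n_p
    return [sum(c[i] * r_yp[i * n_p + k] for i in range(len(c))) for k in range(n_k)]


p = [1, 2]                                  # toy pulse, N_p = 2
c = [1, 1, -1]                              # length-3 Barker code
ref = [bit * v for bit in c for v in p]     # coded reference signal
y = [0, 0, 0] + ref + [0, 0]                # reference delayed by 3 samples

r_xy = code_correlation(pulse_correlation(y, p), c, len(p))
print(r_xy)                                 # [-2, 0, 6, 15, 6, 0]
print(r_xy == pulse_correlation(y, ref))    # True
```

Splitting the calculation this way reuses a single pulse correlation for all code elements: the second stage needs only N_c additions or subtractions per output sample.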

The correlation function has peaks that make it possible to determine the spread (propagation) duration  $\Delta t$ . The method of detecting the maximum peak of the correlation function is described by the "maximum argument" function:

$$\Delta t = \arg\max_{\tau}(r_{xy}(\tau)) \tag{4}$$

where  $\Delta t$  is the spread duration of the signal [10]. In some tasks, a number of peaks known in advance, or the peaks exceeding a threshold, may be sought [2].
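Both variants of peak search, and the conversion of a delay into a distance, can be sketched as follows. The sample values, the threshold and the speed of 343 m/s are illustrative assumptions; the sketch also assumes that the detected delay is taken as the round-trip time, so that the distance is the delay multiplied by half the propagation speed.

```python
def peak_delay(r_xy, sample_period):
    """Delay of the largest correlation peak in seconds (argmax times sample period)."""
    k_max = max(range(len(r_xy)), key=lambda k: r_xy[k])
    return k_max * sample_period


def peaks_above(r_xy, threshold):
    """Indices of local maxima that exceed the threshold."""
    return [
        k
        for k in range(1, len(r_xy) - 1)
        if r_xy[k] > threshold and r_xy[k] >= r_xy[k - 1] and r_xy[k] > r_xy[k + 1]
    ]


def distance(delay, speed):
    """D = delay * alpha with alpha = speed / 2 (round trip)."""
    return delay * speed / 2.0


r_xy = [-2, 0, 6, 15, 6, 0, 3, 9, 2]
dt = peak_delay(r_xy, sample_period=0.25e-6)  # 4 MHz sampling
print(peaks_above(r_xy, threshold=5))         # [3, 7]
print(round(distance(0.012, 343.0), 3))       # 2.058
```

The threshold variant returns every local maximum above the limit, which is useful when several reflections are expected.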

We used  $N_e = 703$  samples of the reference signal for the experiment. The observed signal had  $N_r = 8651$  samples, and the measurement range was  $T_e = 0.032$ s.

## 6. Selecting of functional structure

For the task described earlier, various structures are selected using the model of the Kahn process network. The maximum number of channels was determined using this model. The number of channels depends on the window size  $N_w$ . Various multichannel structures are proposed for the implementation of the task. One of them is presented in Figure 8.



Figure 8. Multichannel structure with *M* number of Maximum blocks

For instance, to detect the distance to a possible barrier indoors, the signal must be received from various sides. For this task, the multichannel structure is used only for receiving the signal from one side. Every channel receives the signal during its own time slot  $T_k = 1 \div T_w, T_w + 1 \div 2 \cdot T_w, \dots, (M-1) \cdot T_w + 1 \div T_r$ . In this way the number of samples used for calculation in each channel is decreased, resulting in a higher calculation speed.

The result of the maximum peak detection of the correlation function is shown in Figure 9. The peak was detected on the second channel (circled part in Figure 9).



Figure 9. Multi-channel architecture of detection for Barker coded sequence

After the algorithm is mapped and functional testing is done, the tuple of parameters  $\langle V, C, M \rangle$  used by the interpreter of the Kahn process network for the implementation of the task is determined (V is the network of nodes, C is the control vector, M is the number of channels; when  $N_w = 3751$ , M = 3). The experimental results showed that for the described task it is expedient to implement the multichannel structure with M channels using digital signal processors. In this case, additional design is needed, because the overall maximum max( $\Delta t$ ) must be found from the maximum values of the individual channels (Figure 8). For this task, we propose to use a module with an FPGA.
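The additional step of combining the channel maxima can be sketched as a small reduction. The functions and the toy correlation values are hypothetical; each channel reports its largest value together with the position of that value in the whole signal, and the final stage keeps the largest of these.

```python
def channel_peak(r_xy, offset):
    """Return (peak value, index in the whole signal) for one channel."""
    k = max(range(len(r_xy)), key=lambda i: r_xy[i])
    return r_xy[k], offset + k


def global_peak(channels):
    """channels: list of (r_xy, offset) pairs, one per channel."""
    return max((channel_peak(r, off) for r, off in channels), key=lambda item: item[0])


# Three channels with assumed correlation values and start offsets
channels = [([1, 4, 2], 0), ([3, 9, 5], 3), ([2, 2, 6], 6)]
value, index = global_peak(channels)
print(value, index)  # 9 4
```

In this toy case the overall maximum comes from the second channel, at position 4 of the whole signal.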

The dependence of the correlation maximum peak level on the noise level and the length of the Barker code is shown in Figure 10.



Figure 10. The influence of noise level on *CCF* maximum value  $max(r_w)$ 

In practice, the  $2\delta$  rule is used, where  $\delta$  here denotes the standard deviation: a normally distributed value deviates from the mean by no more than  $2\delta$  with a probability of 95% [13]. The maximum value of the correlation function of the received and reference signals, max $(r_{xy})$ , and the standard deviation of the correlation function are calculated. The ratio of max $(r_{xy})$  to twice the standard deviation of the correlation function is  $\frac{\max(r_{xy})}{2\delta} = 2.4066$  when the noise level is twice the normal level and the Barker code length is  $N_c = 5$ . When the noise level is four times the normal level with the same Barker code, the ratio is  $\frac{\max(r_{xy})}{2\delta} = 1.9266$ . With further increases in the noise level, the ratio varied within  $\pm 0.5$ .
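The ratio used above can be computed in a few lines. This sketch assumes the population standard deviation of the correlation samples; the paper does not state which estimator was used, and the input values are invented for illustration.

```python
import math
import statistics


def peak_to_two_sigma(r_xy):
    """Ratio of the correlation maximum to twice its standard deviation."""
    return max(r_xy) / (2.0 * statistics.pstdev(r_xy))


r_xy = [0, 0, 0, 8, 0, 0, 0, 0]
ratio = peak_to_two_sigma(r_xy)
print(round(ratio, 4))  # 1.5119
```

A ratio above 1 means that the peak stands out of the two-standard-deviation band of the correlation function.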

The multichannel structure for peak detection with one input channel is shown in Figure 11. The whole observed signal is received with period  $T_r$ . The received signal is split into segments  $1:(N_w - N_e),...,(M-1)(N_w - N_e):M(N_w - N_e)$ , which overlap with the windows  $N_w$ . Then the correlation function  $r_{xy}$  of the received and reference signals is calculated and the maximum value is found. The threshold level depends on the maximum search method.
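The segmentation can be sketched as windows of N_w samples that advance by N_w - N_e samples, so that neighbouring windows overlap by the reference length. The rule used here for the number of channels, M = ceil((N_r - N_e) / (N_w - N_e)), is our assumption; it reproduces M = 3 for the sample counts reported in the experiment.

```python
import math


def split_with_overlap(n_r, n_w, n_e):
    """Return (start, stop) index pairs of windows of length n_w that advance
    by n_w - n_e samples, so that neighbours overlap by n_e samples."""
    step = n_w - n_e
    m = max(1, math.ceil((n_r - n_e) / step))
    return [(ch * step, min(ch * step + n_w, n_r)) for ch in range(m)]


windows = split_with_overlap(n_r=8651, n_w=3751, n_e=703)
print(windows)  # [(0, 3751), (3048, 6799), (6096, 8651)]
```

Because of the overlap, every possible position of the reference signal lies completely inside at least one window, so a code sequence that straddles a segment boundary is not missed.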



Figure 11. The multichannel structure with one input channel for CCF peak detection



Figure 12. The one channel structure for Barker code detection

Modules with M digital signal processors connected in parallel are more suitable for such a structure. The implementation of the one-channel structure for the detection of the maximum peak of the correlation function is presented in Figure 12. In this case, the signal is received with period  $T_w$  and consists of  $N_w$  samples.

The correlation function of received and reference signal is presented in Figure 13.



In this case, the structure is not redundant. Programmable logic devices are suitable for the implementation of such tasks.

## 7. Conclusions

The proposed DSP methodology is used for the design of heterogeneous systems with a multichannel structure and evaluates the parameters that affect the signal processing characteristics (the duration of task solving and the cost of task implementation).

It is expedient to use the Kahn network for processing data flows. Correlation analysis is a task of this type, and the proposed methodology was verified on such tasks.

In solving the correlation analysis problem, it was observed that increasing the number of channels reduces the processing time in proportion to the number of channels.

## References

- [1] J.G. Ackenhusen. Real time signal processing: design and implementation of signal processing systems. ISBN 0-13-631771-5, *Prentice-Hall*, 1999.
- [2] K. Audenaert, H. Peremans, Y. Kawahara, J. Van Campenhout. Accurate ranging of multiple objects using ultrasonic sensors. *Proceeding of the IEEE, Robotics and Automation*,1992.
- [3] J.S. Bendat, A.G. Piersol. Engineering Applications of Correlation and Spectral Analysis. *John Wiley & Sons*, ISSN 1099-1190, 1993.
- [4] E.B. Carne. A Professional's Guide To Data Communication in a TCP/IP World. Artech House, ISBN 1580539092, 2004.
- [5] B. Draper, W. Böhm, J. Hammes, B. Rinker, C. Ross, M.Chawathe, J.Bins. Compiling and optimizing image processing algorithms for FPGAs. Proc. of the IEEE International Workshop on Computer Architecture for Machine Perception (CAMP), September 2000.
- [6] E.C. Ifeachor, B.W. Jervis. Digital Signal Processing: A Practical Approach. ISBN 0-201-54413-X, Addison-Wesley, 1997.
- [7] M. Johanesson. SIMD Architectures for Range and Radar Imaging. *PhD Thesis*, ISBN 91-7871-609-8, ISSN 0345-7524, *University of Lund*, 1995.

- [8] A. Kalavade, E.A. Lee. Hardware/Software Codesign Methodology for DSP applications. *IEEE Design & Test of Computers, Vol.*10, *No.*4, *December* 1993, 16-28.
- [9] E. Kazanavičius, A. Mikuckas, I. Mikuckienė. Sliding windows in non-destructive testing systems. IEEE 2004: CCA, ISIC, CACSD: International Conference on Control Applications, International Symposium on Intelligent Control, International Symposium on Computer Aided Control Systems Design, September 2-4, 2004, Taipei, Taiwan. ISBN 0-7803-8634-5. Taipei, 2004, 448-453.
- [10] X. Lai, H. Torp. Interpolation methods for Time-Delay Estimation Using Cross Correlation Method for Blood Velocity Measurement. *IEEE transactions on Ultrasound, Ferroelectric, and Frequency Control*, ISSN 0885-3010, Vol.46, No.2, 1999.
- [11] P. Lieverse, P. Wolf, E. Deprettere, K. Vissers. A Methodology for architecture exploration of heterogeneous signal processing systems. *Journal of VLSI Signal Processing Systems*, ISSN 0922-5773, *Vol.*29, *Issue* 3, 2001, 197 – 207.

- [12] S. Periyayacheri, A. Nayak, A. Jones, N. Shenoy, A. Choudhary, P. Banerjee. Library Functions in Reconfigurable Hardware for Matrix and Signal Processing Operations in Matlab. Proc. 11th IASTED Parallel and Distributed Computing and Systems Conference (PDCS'99), Cambridge, MA, 1999.
- [13] S.M. Ross. Introduction to Probability and Statistics for Engineers and Scientists. *Wiley*, 1987.
- [14] R. Venteris. Exploration of DSP architectures in ultrasonic measurement applications. *Ultragarsas*, *Nr*.1(42), *KTU*, *Kaunas*, 2002.
- [15] M. Wan, Y. Ichikawa, D. Lidsky, J. Rabaey. An Energy Conscious Methodology for Early Design Exploration of Heterogeneous DSPS. *Proceedings of the Custom Integrated Circuit Conference, Santa Clara, CA, USA*, 1998.
- [16] A. Žvironas, E. Kazanavičius. Partitioning of DSP tasks to Kahn network. Ultragarsas Nr.2(43), ISSN 1392-2114 KTU, Kaunas, 2002, 34-37.
- [17] A. Žvironas, E. Kazanavičius, M. Šurnienė. Task Mapping to DSP Processor. *Information technology* and control, No.4(25), ISSN 1392-124X, Kaunas, Technologija, 2002, 49 - 53.

Received June 2006.