Combat Computing Systems (continued)


milstar: http://drops.dagstuhl.de/opus/volltexte/2006/732/pdf/06141.AthanasPeter.Paper.732.pdf Although an FPGA’s clock rate rarely exceeds one-tenth that of a PC, hardware-implemented digital filters can process data at many times the rate of software implementations [4]. Additional performance gains have been described for cryptography [5], network packet filtering [6], target recognition [7] and pattern matching [8], among other applications. A. Present Day Cost-Performance Comparison. Owing to the prevalence of IEEE standard floating-point in a wide range of applications, several researchers have designed IEEE 754 compliant floating-point accelerator cores constructed out of the Xilinx Virtex-II Pro FPGA’s configurable logic and dedicated integer multipliers [16-18]. Dou et al. published one of the highest performance benchmarks of 15.6 GFLOPS by placing 39 floating-point processing elements on a theoretical Xilinx XC2VP125 FPGA [19]. Interpolating their results for the largest production Xilinx Virtex-II Pro device, the XC2VP100, produces 12.4 GFLOPS, compared to the peak 6.4 GFLOPS achievable for a 3.2 GHz Intel Pentium processor. Assuming that the Pentium can sustain 50% of its peak, the FPGA outperforms the processor by a factor of four for matrix multiplication. One of the earlier projects demonstrated a 23x speedup on a 2-D FFT through the use of a custom 18-bit floating-point format [26]. More recent work has focused on parameterizable libraries of floating-point units that can be tailored to the task at hand [27-29]. By using a custom floating-point format sized to match the widths of the FPGA’s internal integer multipliers, a speedup of 44 was achieved for a hydrodynamics simulation [30] using four large FPGAs.
Nakasato and Hamada’s 38 GFLOPS of performance is impressive, even from a cost-performance standpoint. For the cost of their PROGRAPE-3 board, estimated at $15,000, it is likely that a 15-node processor cluster could be constructed producing 196 single-precision peak GFLOPS. Even in the unlikely scenario that this cluster could sustain the same 10% of peak performance obtained by Nakasato and Hamada for their software implementation, the PROGRAPE-3 design would still achieve a 2x speedup. As in many FPGA to CPU comparisons, it is likely that the analysis unfairly favors the FPGA solution. Hardware implementations require specialized skills in digital design and vendor-specific tool flows. Development time and costs are significantly higher than for software. Many comparisons in the literature spend significantly more time optimizing the hardware implementations than they do optimizing their software implementations. Previous research has demonstrated significant compiler inefficiency for common HPC functions [31]. For the DGEMM matrix multiplication function, a hand-coded version outperformed the compiler by more than eight times. A total of 39 PEs can be integrated into the xc2vp125-7 FPGA, reaching performance of, e.g., 15.6 GFLOPS with 1600 KB local memory and 400 MB/s external memory bandwidth. That is a single IC with 1,700 pins and a high price, on the order of $8,000 today. http://ce.et.tudelft.nl/~george/publications/Conf/FPGA05/FPGA05Dou.pd http://www.xilinx.com/publications/matrix/virtexmatrix.pd Xilinx Virtex FPGA
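
The two speedup figures quoted above can be rechecked with a few lines of arithmetic. This is only a sketch that reuses the numbers from the excerpt (12.4 and 6.4 GFLOPS, 50% sustained; 38 and 196 GFLOPS, 10% sustained); the function and variable names are made up for illustration.

```python
def sustained_gflops(peak_gflops, sustained_fraction):
    """Sustained throughput, given a peak figure and the fraction of peak actually achieved."""
    return peak_gflops * sustained_fraction

# XC2VP100 estimate versus a 3.2 GHz Pentium sustaining 50% of its 6.4 GFLOPS peak.
fpga_gflops = 12.4
pentium_sustained = sustained_gflops(6.4, 0.50)
fpga_vs_pentium = fpga_gflops / pentium_sustained

# PROGRAPE-3 (38 GFLOPS) versus a 15-node cluster sustaining 10% of its 196 GFLOPS peak.
progrape3_gflops = 38.0
cluster_sustained = sustained_gflops(196.0, 0.10)
progrape_vs_cluster = progrape3_gflops / cluster_sustained

print(f"FPGA vs Pentium:       {fpga_vs_pentium:.2f}x")
print(f"PROGRAPE-3 vs cluster: {progrape_vs_cluster:.2f}x")
```

Both ratios land close to the rounded "factor of four" and "2x" given in the text.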


milstar: Reducing Doppler Filtering Processing in STAP Implementations https://www.embedded.com/reducing-doppler-filtering-processing-in-stap-implementations/?utm_source=eetimes&utm_medium=relatedcontent

milstar: The simulation result is shown in Fig. 4. In this simulation, RF frequency f0 is 24 GHz with bandwidth B=50 MHz, and the sampling frequency fb of the ADC is 200 MHz. We choose 128 points per frame, and save 64 frames of data for 2-D FFT. An IF signal containing the Doppler-shifted received signals of 3 targets is generated. The position and velocity information of the targets are: https://ieeexplore.ieee.org/document/6174020 Conventional 2-D FFT gives good results for multi-target recognition when the detected objects have relatively small range variation and nearly constant velocity. Time-frequency analysis methods have advantages in analyzing objects with non-constant velocity.

milstar: The two-dimensional FFT process gives a 2D range-velocity image (FFT heatmap). Typically, detection of objects is done on this image. After detection, the range and relative speed of the objects are easily calculated. https://training.ti.com/sites/default/files/docs/Mmwave_webinar_Dec2017.pdf

milstar: The Doppler shift can be determined after performing the range Fourier transform (range FFT) first. For a target of interest, we can repeat the range FFT until we have enough data to perform the second level of FFT. The result of this second FFT is a two dimensional complex valued matrix, whose spectral peak corresponds to the Doppler shift of the moving target. This method is known as Doppler FFT.
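
The range FFT followed by the Doppler FFT can be sketched in NumPy as below. The frame size (128 samples per frame, 64 frames) is borrowed from the simulation quoted earlier in the thread; the single ideal target placed exactly on range bin 20 and Doppler bin 5 is a hypothetical test signal, not data from any of the linked papers.

```python
import numpy as np

def range_doppler_map(frames):
    """frames: complex array of shape (n_frames, n_samples), one row per chirp/frame."""
    range_fft = np.fft.fft(frames, axis=1)       # first FFT: along fast time (range)
    doppler_fft = np.fft.fft(range_fft, axis=0)  # second FFT: across frames (Doppler)
    return np.fft.fftshift(doppler_fft, axes=0)  # put zero Doppler in the middle row

n_frames, n_samples = 64, 128
true_range_bin, true_doppler_bin = 20, 5         # hypothetical target, chosen on-bin

n = np.arange(n_samples)                         # fast-time sample index
m = np.arange(n_frames)[:, None]                 # frame (slow-time) index
frames = np.exp(2j * np.pi * (true_range_bin * n / n_samples
                              + true_doppler_bin * m / n_frames))

rd_map = range_doppler_map(frames)
peak_row, peak_col = np.unravel_index(np.argmax(np.abs(rd_map)), rd_map.shape)
doppler_bin = int(peak_row) - n_frames // 2      # undo the fftshift offset
range_bin = int(peak_col)
print("range bin:", range_bin, "Doppler bin:", doppler_bin)
```

The peak of the 2D magnitude map recovers both bins; converting bins to metres and metres per second needs the chirp parameters, which are left out here.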

milstar: PRF tradeoffs. Different PRF frequencies have different advantages and disadvantages. The following discussion summarizes the trade-offs. Low PRF operation is generally used for maximum range detection. It usually requires high transmit power, in order to receive returns of sufficient power for detection at a long range. To get the highest power, long transmit pulses are sent, and correspondingly long matched filter processing (or pulse compression) is used. This mode is useful for precise range determination. Strong sidelobe returns can often be determined by their relatively close ranges (ground area near radar system) and filtered out. The disadvantage is that Doppler processing is relatively ineffective due to so many overlapping Doppler frequency ranges. This limits the ability to detect moving objects in the presence of heavy background clutter, such as moving objects on the ground. High PRF operation spreads out the frequency spectrum of the receive pulse, allowing a full Doppler spectrum without aliasing or ambiguous Doppler measurements. A high PRF can be used to determine Doppler frequency and therefore relative velocity for all targets. It can also be used when a moving object of interest is obscured by a stationary mass, such as the ground or a mountain, in the radar return. The unambiguous Doppler measurements will make a moving target stand out from a stationary background. This is called mainlobe clutter rejection or filtering. Another benefit is that since more pulses are transmitted in a given interval of time, higher average transmit power levels can be achieved. This can help improve the detection range of a radar system in high PRF mode. Medium PRF operation is a compromise. Both range and Doppler measurements are ambiguous, but each will not be aliased or folded as severely as the more extreme low or high PRF modes. This can provide a good overall capability for detecting both range and moving targets.
However, the folding of the ambiguous regions can also bring a lot of clutter into both range and Doppler measurements. Small shifts in PRFs can be used to resolve ambiguities, as has been discussed, but if there is too much clutter, the signals may be undetectable or obscured in both range and Doppler. https://www.eetimes.com/radar-basics-part-2-pulse-doppler-radar/
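
The range/Doppler trade-off described above follows from two standard relations: unambiguous range R_u = c / (2 PRF) and unambiguous radial speed |v_u| = lambda PRF / 4. A small sketch is given here; the 10 GHz carrier and the three PRF values are assumptions picked for illustration and do not come from the article.

```python
C = 299_792_458.0  # speed of light, m/s

def unambiguous_range_m(prf_hz):
    """Longest range whose echo returns before the next pulse is sent: c / (2 * PRF)."""
    return C / (2.0 * prf_hz)

def unambiguous_velocity_mps(prf_hz, carrier_hz):
    """Largest radial speed magnitude without Doppler aliasing: lambda * PRF / 4."""
    wavelength_m = C / carrier_hz
    return wavelength_m * prf_hz / 4.0

CARRIER_HZ = 10e9  # assumed X-band carrier (not from the article)
for label, prf_hz in [("low", 1e3), ("medium", 10e3), ("high", 100e3)]:  # assumed PRFs
    print(f"{label:>6} PRF {prf_hz:8.0f} Hz: "
          f"R_u = {unambiguous_range_m(prf_hz) / 1e3:7.1f} km, "
          f"|v_u| = {unambiguous_velocity_mps(prf_hz, CARRIER_HZ):6.1f} m/s")
```

Raising the PRF by a factor of ten cuts the unambiguous range by ten and widens the unambiguous velocity span by ten, which is why medium PRF ends up ambiguous in both.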

milstar: Digital beamforming can also be used in another capacity. In some systems, it is desired to receive and transmit separate signals in different directions simultaneously. This can be accomplished by using the FFT algorithm. Normally, FFTs are used to take a time domain signal and separate it into its different frequency components. In this case, the FFT will separate the incoming signal into its different spatial components or angle of arrival components. The input signals are sorted by the FFT into bins corresponding to different angles of arrival, as shown in Figure 3. Similarly, in the transmit direction, a signal fed into each FFT bin input will be transmitted in a specific direction, corresponding to a specific antenna lobe. If the input to a FFT bin is zero, no energy will be transmitted in that direction; the transmit lobe will be “missing”. https://www.eetimes.com/radar-basics-part-3-beamforming-and-radar-digital-processing/
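
A minimal receive-side sketch of the spatial FFT described above, assuming a 16-element uniform linear array with half-wavelength spacing and one noise-free plane wave (none of these values are from the article). The arrival angle is chosen so that it falls exactly on FFT bin 3.

```python
import numpy as np

def fft_beams(snapshot):
    """Spatial FFT across array elements; returns beam magnitudes with broadside centred."""
    return np.abs(np.fft.fftshift(np.fft.fft(snapshot)))

n_elements = 16          # assumed array size
d_over_lambda = 0.5      # assumed element spacing in wavelengths
true_bin = 3             # hypothetical: angle chosen to sit exactly on a bin
sin_theta = true_bin / (n_elements * d_over_lambda)

# One snapshot: the phase advances linearly from element to element.
element = np.arange(n_elements)
snapshot = np.exp(2j * np.pi * d_over_lambda * element * sin_theta)

beams = fft_beams(snapshot)
peak_bin = int(np.argmax(beams)) - n_elements // 2   # undo the fftshift offset
est_angle_deg = float(np.degrees(np.arcsin(peak_bin / (n_elements * d_over_lambda))))
print(f"peak bin {peak_bin}, estimated angle of arrival {est_angle_deg:.2f} deg")
```

Each FFT bin corresponds to one fixed value of sin(theta), so the bins are evenly spaced in sin(theta) rather than in angle.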

milstar: FPGAs can also offer a much lower processing latency than a GPU, even independent of I/O bottlenecks. It is well known that GPUs must operate on many thousands of threads to perform efficiently, due to the extremely long latencies to and from memory and even between the many processing cores of the GPU. In effect, the GPU must operate many, many tasks to keep the processing cores from stalling as they await data, which results in very long latency for any given task. The FPGA uses a “coarse-grained parallelism” architecture instead. It creates multiple optimized and parallel datapaths, each of which outputs one result per clock cycle. The number of instances of the datapath depends upon the FPGA resources, but is typically much less than the number of GPU cores. However, each datapath instance has a much higher throughput than a GPU core. The primary benefit of this approach is low latency, a critical performance advantage in many applications. Another advantage of FPGAs is their much lower power consumption, resulting in dramatically higher GFLOPs/W. FPGA power measurements using development boards show 5-6 GFLOPs/W for algorithms such as Cholesky and QRD, and about 10 GFLOPs/W for simpler algorithms such as FFTs. GPU energy efficiency measurements are much harder to find, but using the GPU performance of 50 GFLOPs for Cholesky and a typical power consumption of 200 W results in 0.25 GFLOPs/W, which is twenty times more power consumed per useful FLOP. https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/wp/wp-01197-radar-fpga-or-gpu.pdf
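
The "twenty times" figure in the white paper follows directly from its own numbers. This sketch only repeats that arithmetic; the variable names are illustrative.

```python
# GPU: 50 GFLOPs on Cholesky at a typical 200 W (figures from the white paper).
gpu_gflops = 50.0
gpu_watts = 200.0
gpu_gflops_per_watt = gpu_gflops / gpu_watts

# FPGA development boards: 5-6 GFLOPs/W for Cholesky and QRD (same source).
fpga_low, fpga_high = 5.0, 6.0
advantage_low = fpga_low / gpu_gflops_per_watt
advantage_high = fpga_high / gpu_gflops_per_watt

print(f"GPU: {gpu_gflops_per_watt:.2f} GFLOPs/W")
print(f"FPGA advantage: {advantage_low:.0f}x to {advantage_high:.0f}x")
```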

milstar: There are several resource-efficient, high-throughput implementations of 2D DFTs. Most FPGA based 2D FFT implementations rely upon repeated invocations of 1D FFTs by row and column decomposition (RCD) with efficient use of external memory [2][3][4]. Many of these achieve real-time or near real-time performance (≥23 frames per second for a standard 512×512 image). https://arxiv.org/pdf/1603.05154.pdf
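
Row and column decomposition can be checked in software in a few lines: run 1D FFTs along every row, then along every column, and compare against a direct 2D FFT. This is a NumPy sketch of the decomposition only, with a random 512×512 test image; it says nothing about how the cited FPGA designs schedule external memory.

```python
import numpy as np

def fft2_rcd(image):
    """2D DFT by row-column decomposition: 1D FFTs over rows, then over columns."""
    row_pass = np.fft.fft(image, axis=1)   # one 1D FFT per row
    return np.fft.fft(row_pass, axis=0)    # one 1D FFT per column of the intermediate

rng = np.random.default_rng(0)             # fixed seed so the run is repeatable
image = rng.standard_normal((512, 512))    # stand-in for a standard 512x512 image

matches = bool(np.allclose(fft2_rcd(image), np.fft.fft2(image)))
print("RCD result matches np.fft.fft2:", matches)
```

Between the two passes the data has to be accessed in transposed order, which is where the efficient use of external memory mentioned in the excerpt comes in.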

milstar: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ds/FFT-Beamforming-datasheet-Intel.pdf Beamforming in the frequency domain has less stringent requirements on the input data sample rate compared to the time domain technique. The input sampling frequency does not have an impact on the beam steering angle resolution. With FFT beamforming, there is a potential reduction in computation workload as the size of the phased array increases. This translates to better performance, which is an essential requirement for real-time systems. The benefit becomes more apparent with radar applications requiring large phased array systems. The multibeam beamformer involves a two-dimensional (2D) FFT, which can be decomposed into two separate one-dimensional (1D) FFT processes: temporal FFT, followed by spatial FFT. For planar arrays, the process can be decomposed further to 2D FFT in the x-axis, followed by another 2D FFT in the y-axis. In order to support wider bandwidth, the analog-to-digital converters (ADCs) for these phased array systems run at giga samples per second (GSPS) rates. Supporting fast Fourier transform (FFT) beamforming at GSPS data sample rates requires a super sample rate architecture that is able to process multiple phases in parallel. This demo highlights the highly parameterizable super sample rate FFT IP in DSP Builder Advanced Blockset. This allows the designer to select the number of phases, size of the FFT, and fixed or floating point implementation.

milstar: http://www.curtistech.co.uk/papers/beamform.pdf PRINCIPLES OF SONAR BEAMFORMING. This note outlines the techniques routinely used in sonar systems to implement time domain and frequency domain beamforming systems. It takes a very simplistic approach to the problem and should not be considered as definitive in any sense.

milstar: A comparison of FFT processor designs https://essay.utwente.nl/72179/1/FFT_Comparison_Simon_Dirlik.pdf

milstar: For example, an ASIC processor potentially has a 10-1,000X performance advantage over its FPGA and GPP counterparts, but it is expensive and inflexible. (MIT Lincoln Laboratory) https://apps.dtic.mil/dtic/tr/fulltext/u2/1032251.pdf

milstar: https://ieeexplore.ieee.org/document/8548582 A high-throughput programmable fast Fourier transform (FFT) processor is designed supporting 16- to 4096-point FFTs and 12- to 2400-point discrete Fourier transforms (DFTs) for 4G, wireless local area network, and future 5G. A 16-path data parallel memory-based architecture is selected as a tradeoff between throughput and cost.

milstar: FFT Size/Accuracy Considerations. The size of the FFT (number of data points) utilized determines the frequency resolution and the accuracy of the FFT. The table shows that taking more samples improves the accuracy of the FFT, however note that the square root relationship prevents a dramatic accuracy gain. The 4096-point FFT yields a 0.06% accuracy which is more suitable for 12-bit A/Ds. http://www.datel.com/data/ads/adc-an4.pdf

milstar: https://www.jhuapl.edu/Content/techdigest/pdf/V22-N03/22-03-Cole.pdf AM/FM Noise in the Target Illumination Signal for Semi-Active Missiles. Low-frequency (approximately 10 to 400 Hz) noise limits are established such that target energy spreading out of the fast Fourier transform (FFT) bin occupied by the target does not adversely affect the missile’s target coherency test. Mid-frequency (approximately ≥400 Hz to ≤5 kHz) noise should not allow clutter to mask a crossing or slow target. High-frequency (>5 kHz) noise should not permit maximum clutter or spillover to degrade target sensitivity. When specifying noise, a specification bandwidth is also required. An industry-standard term for quantifying phase noise, denoted by L(f), is defined as decibels relative to the carrier per hertz of bandwidth. (The terms phase noise and FM noise are used interchangeably in this article.) The noise specifications discussed in this article are given in various bandwidths as a function of frequency offset from the carrier. At the lower frequencies, a bandwidth that is 10 times smaller than the mid- and high-frequency ranges is typically used. We have some specifications where the high-frequency bandwidth is 100 times larger than the low-frequency bandwidth. The use of different bandwidths for different areas of the Doppler spectrum is a trade-off between two factors: (1) the need to detect narrowband signals in white Gaussian noise, which requires narrowband filters, and (2) the need to complete the measurement in a timely fashion, which requires a filter with a bandwidth that is at least 10 times wider than the low-frequency bandwidth.
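
Since L(f) is quoted per hertz, the noise seen in a given specification bandwidth scales with 10·log10(bandwidth), provided the noise density is roughly flat across that bandwidth. The sketch below makes that assumption explicit; the -100 dBc/Hz level and the 100 Hz and 1 kHz bandwidths are hypothetical numbers, not values from the APL article.

```python
import math

def noise_in_bandwidth_dbc(l_f_dbc_per_hz, bandwidth_hz):
    """Noise power relative to the carrier in a bandwidth, assuming flat density over it."""
    return l_f_dbc_per_hz + 10.0 * math.log10(bandwidth_hz)

L_F = -100.0  # hypothetical phase noise density, dBc/Hz
narrow = noise_in_bandwidth_dbc(L_F, 100.0)    # assumed low-frequency spec bandwidth
wide = noise_in_bandwidth_dbc(L_F, 1000.0)     # a bandwidth 10 times wider

print(f"100 Hz bandwidth: {narrow:.1f} dBc")
print(f"1 kHz bandwidth:  {wide:.1f} dBc (difference {wide - narrow:.1f} dB)")
```

A tenfold change in specification bandwidth therefore shifts the quoted noise level by 10 dB, so a limit is only meaningful together with its bandwidth.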

milstar: https://www.dataq.com/data-acquisition/general-education-tutorials/fft-fast-fourier-transform-waveform-analysis.html Figure 1 — The Fourier transform illustrated

milstar: https://www.sjsu.edu/people/burford.furman/docs/me120/FFT_tutorial_NI.pdf

milstar: The Fast Fourier Transform – Numerical Example. What frequencies make up the following signal? This signal has 16 samples in it so we are going to run a 16-point FFT to find out the answer. http://www.themobilestudio.net/the-fourier-transform-part-13
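
A comparable 16-point example can be run in NumPy. The signal from the linked page is not reproduced in this thread, so the one below is a hypothetical stand-in: two sine waves at 2 Hz and 5 Hz (amplitudes 1.0 and 0.5) sampled at an assumed 16 Hz so that both fall exactly on FFT bins.

```python
import numpy as np

n_points = 16
sample_rate_hz = 16.0                      # assumed, gives 1 Hz per bin
t = np.arange(n_points) / sample_rate_hz

# Hypothetical test signal: 2 Hz at amplitude 1.0 plus 5 Hz at amplitude 0.5.
signal = 1.0 * np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)

spectrum = np.fft.rfft(signal)             # 16-point FFT of a real signal (bins 0..8)
freqs_hz = np.fft.rfftfreq(n_points, d=1 / sample_rate_hz)
amplitudes = 2 * np.abs(spectrum) / n_points   # scaling valid for non-DC, non-Nyquist bins

present = [(float(f), round(float(a), 3))
           for f, a in zip(freqs_hz, amplitudes) if a > 1e-6]
print(present)
```

Only the 2 Hz and 5 Hz bins come out non-zero, at the amplitudes that were put in; a tone that does not sit exactly on a bin would leak into its neighbours instead.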

milstar: https://www.ti.com/lit/ds/symlink/tci6638k2k.pdf?ts=1608214468851&ref_url=https%253A%252F%252Fwww.ti.com%252Fproduct%252FTCI6638K2K

milstar: https://www.ti.com/lit/wp/spry294/spry294.pdf
