Single chip DSP implementation of MPEG sound coding
1 MPEG sound coding principle
MPEG sound coding is a subband coding algorithm based on the auditory characteristics of the human ear, and belongs to the family of perceptual sound coding methods. The basic structure of a perceptual sound coding algorithm is shown in Figure 1. Depending on whether the encoder emphasizes frequency resolution or time resolution, perceptual encoders can be divided into subband encoders and transform encoders. The MPEG Layer 2 sound coding algorithm divides the sound signal into 32 subbands in the frequency domain, so it is a subband encoder. In Figure 1, the time-frequency mapping, also called the filter bank, maps the input sound signal to subsampled frequency components. Depending on the nature of the filter bank used, that is, its resolution in the frequency domain, these frequency components are called either subband samples or frequency lines.
Figure 1 Block structure of perceptual sound *** (panels (a) and (b))
The output of the filter bank, or the output of a time-frequency transform running in parallel with the filter bank, is fed to the psychoacoustic model, which estimates the time-varying masking threshold. The psychoacoustic model uses the known simultaneous masking effects, including the masking characteristics of tonal and non-tonal components. If temporal masking (pre-masking and post-masking) is also exploited, the accuracy of the masking threshold estimate can be improved further. The subband samples or frequency lines are quantized and coded under the criterion that the spectrum of the quantization noise stays below the masking threshold, so that the noise introduced by quantization is as inaudible as possible to the human ear. Depending on the complexity allowed, block companding or analysis-by-synthesis with entropy coding can be used.
Frame packing combines the quantized and coded output with the relevant side information in the prescribed format for *** use.
2 Coding quality and DSP speed
The key to implementing MPEG sound coding on a single ADSP-2181 chip is to solve two problems: first, how to guarantee the quality of the coded sound; second, how to make full use of the computing speed of the DSP. These two requirements often conflict, and the best compromise between them has to be found.
Generally speaking, the quality of an MPEG audio encoder is determined by the quality of its psychoacoustic model. For a single-chip 16-bit fixed-point DSP, however, this conclusion no longer holds. Analysis shows that finite word length effects become the dominant factor limiting coding quality. In the analysis filter bank in particular, truncation introduces noise 33 times that of the quantization error of a 16-bit A/D conversion, and the finite word length of the window coefficients reduces the sidelobe attenuation of the filter response from as much as 96 dB to less than 70 dB. To guarantee coding quality, therefore, the analysis filter bank must be computed with extended precision.
Regarding speed, the first idea is to use a fast algorithm, and we did try a fast algorithm for subband filtering [4].
In practice, however, these fast algorithms did not perform well on the DSP, for the following reasons: (1) they count only additions and multiplications and ignore operations such as assignment and addressing, but on a DSP where every instruction takes a single cycle, the number of multiplications and additions is no more important than the number of other operations; (2) they do not take the hardware characteristics of the DSP into account, so they cannot fully exploit the parallel processing capability of its multiply-accumulator (MAC); (3) the ADSP-2181 is optimized for 16-bit arithmetic, and with precision extension the amount of computation increases by an order of magnitude.
Based on this analysis of the quality and speed requirements, we chose a polyphase filter bank implementation that suits the DSP's multiply-accumulate instructions and adopted a precision extension method based on the MAC structure, which resolves the conflict between coding quality and DSP speed reasonably well. In addition, the sample input method, the psychoacoustic model, and the scale factor coding were adapted to the ADSP-2181, reducing the amount of computation and ensuring real-time performance.
3 Algorithm software design
Software design is the core of the single chip DSP implementation of MPEG sound coding. The requirements of coding quality and speed can only be achieved by carefully designing DSP software.
(1) Precision extension based on the MAC structure. The analysis filter bank for MPEG sound coding can be implemented in many ways. The polyphase structure is the one recommended by the MPEG standard, and its mathematical expression is
(1)
(2)
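For orientation, the analysis filter bank in ISO/IEC 11172-3 is commonly described as windowing a 512-sample buffer, folding it into 64 partial sums, and matrixing those into 32 subband samples. The sketch below follows that description in floating point. It is not the authors' fixed-point code, and how its two steps map onto equations (1) and (2) is an assumption. The analysis window `c` is passed in by the caller (the 512 coefficients tabulated in the standard are not reproduced here), and the function name is illustrative.

```python
import math


def polyphase_analysis(x, c):
    """Compute 32 subband samples from a 512-sample buffer x (x[0] is newest).

    c is the 512-point analysis window, supplied by the caller.
    """
    if len(x) != 512 or len(c) != 512:
        raise ValueError("x and c must both hold 512 values")
    # Window the buffer and fold it into 64 partial sums Y_k.
    y = [sum(c[k + 64 * j] * x[k + 64 * j] for j in range(8)) for k in range(64)]
    # Matrixing: S_i = sum over k of cos((2i + 1)(k - 16) * pi / 64) * Y_k.
    return [
        sum(math.cos((2 * i + 1) * (k - 16) * math.pi / 64) * y[k] for k in range(64))
        for i in range(32)
    ]
```

Both steps are plain sums of products, which is why this structure maps naturally onto a multiply-accumulate unit.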
Analysis shows that extending Yk to a double word reduces the noise caused by truncation by a factor of 33. However, because the ADSP-2181 supports only 16-bit multiply-accumulate operations, equation (1) has to be transformed using the decomposition
$$Y_k = HY_k + 2^{-16}\,LY_k \qquad (3)$$
In this way the multiplier-accumulator structure of the DSP can still be used; the amount of computation only roughly doubles, and storage increases by only 64 words.
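The idea behind equation (3) can be illustrated with integer arithmetic. The sketch below assumes that HYk and LYk are the high and low 16-bit words of a 32-bit Yk and that the coefficients are 16-bit values; the text does not spell this out, so treat both the interpretation and the function names as assumptions. Each double-word product is replaced by two 16-bit multiply-accumulate chains whose results are combined at the end.

```python
def split_words(y32):
    """Split a signed 32-bit integer into (signed high word, unsigned low word)."""
    high = y32 >> 16      # arithmetic shift keeps the sign in the high word
    low = y32 & 0xFFFF    # low word is treated as unsigned
    return high, low


def mac_double_word(coeffs16, values32):
    """Sum of coeff * value using only 16-bit by 16-bit products.

    Two accumulators are run, one over the high words and one over the low
    words, mirroring Y = HY + 2**-16 * LY in integer form (Y = HY * 2**16 + LY).
    """
    acc_high = 0
    acc_low = 0
    for c, y in zip(coeffs16, values32):
        high, low = split_words(y)
        acc_high += c * high   # signed x signed product
        acc_low += c * low     # signed x unsigned product
    return (acc_high << 16) + acc_low
```

The loop body contains two multiply-accumulates instead of one, which is consistent with the roughly doubled operation count mentioned above.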
(2) Organization of input data. The organization of the input data must make it easy to acquire the raw sound data from the analog-to-digital converter, and must also store the input data in on-chip data RAM in a layout that suits both the polyphase filter bank and the FFT of the acoustic model. Each time, the polyphase filter bank shifts in 32 new sound samples and shifts out the 32 oldest samples, as follows:
$X_i = X_{i-32}, \quad i = 511, 510, \ldots, 32$
$X_i = \text{next input audio sample}, \quad i = 31, 30, \ldots, 0$
However, the ADSP-2181 is not well suited to moving data: each assignment takes two instructions, so the data movement for each analysis filtering operation costs 1024 instruction cycles. If the multichannel autobuffered serial port and the indirect addressing capability of the ADSP-2181 are used and the input sound data are organized appropriately, the data can instead be moved in and out with a sliding window, as shown in Figure 2.
Figure 2 Sliding window technology for polyphase filtering
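The difference between shifting the buffer and sliding a window over it can be sketched as follows. `shift_in_naive` performs the 512 assignments of the update rule above, while `SlidingWindow` only writes the 32 new samples and moves an index, in the way a circular-buffer pointer with modulo addressing would. The class and function names are illustrative, and the Python list merely stands in for on-chip data RAM.

```python
def shift_in_naive(x, samples):
    """Reference update: X_i = X_{i-32}, then 32 new samples into X_31..X_0."""
    x = list(x)
    for i in range(511, 31, -1):
        x[i] = x[i - 32]
    for i, s in zip(range(31, -1, -1), samples):
        x[i] = s
    return x


class SlidingWindow:
    """512-sample window kept in a circular buffer; no sample is ever moved."""

    def __init__(self, length=512):
        self.buf = [0] * length
        self.length = length
        self.newest = 0  # buffer position currently holding X_0

    def push_block(self, samples):
        # Each new sample is written just "before" the current X_0, so older
        # samples get larger logical indices without being copied.
        for s in samples:
            self.newest = (self.newest - 1) % self.length
            self.buf[self.newest] = s

    def x(self, i):
        """Return logical sample X_i (i = 0 is the newest)."""
        return self.buf[(self.newest + i) % self.length]
```

Both versions expose the same logical window, but the second one touches only 32 memory locations per update.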
To keep processing continuous across frame boundaries, the input data buffer is designed as a circular buffer long enough to hold two frames of input sound data. While the DSP processes one frame, the incoming data are buffered into the other, which saves the cost of moving data. This organization of the input data also suits the FFT of the acoustic model. The FFT needs the bit-reversed addressing mode of the ADSP-2181. Because the FFT computation and the input buffering proceed at the same time, the pointer used by the FFT must be bit-reversed while the pointer used for input buffering must not be, otherwise the input sound data would be stored out of order. The ADSP-2181 provides exactly this capability: its first group of address pointers (I0, I1, I2, I3) supports bit reversal, while its second group (I4, I5, I6, I7) is not affected by the bit-reversal mode. A pointer from the second group is therefore used for input buffering, and a pointer from the first group for the FFT computation.
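What reversing the address bits does to an index can be shown in a few lines. The sketch below is a software model of the idea, not of the ADSP-2181 address generator itself, and the function names are illustrative.

```python
def bit_reverse(index, bits):
    """Return index with its lowest `bits` bits written in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(samples):
    """Reorder a power-of-two-length list into bit-reversed index order."""
    bits = len(samples).bit_length() - 1
    if bits < 0 or len(samples) != 1 << bits:
        raise ValueError("length must be a power of two")
    return [samples[bit_reverse(i, bits)] for i in range(len(samples))]
```

For eight samples the read order becomes 0, 4, 2, 6, 1, 5, 3, 7. If the input buffering pointer were reversed in the same way, consecutive samples would land at similarly scattered addresses, which is why that pointer has to come from the group the mode does not affect.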
(3) Improvement of the acoustic model. One difficulty in implementing a psychoacoustic model on a DSP is the large number of logarithm operations. Polynomial approximation could be used to evaluate them, but its large computational cost makes it a poor choice. In the improved psychoacoustic model, the FFT output is not converted to the logarithmic domain immediately; instead, a piecewise linear curve approximates the masking curve in the linear domain. For simplicity, the segmentation is the same as in the standard. The approximation keeps only the first term of the polynomial expansion of the exponential. Although this is fairly crude, the acoustic model is, as analyzed above, not the limiting factor in a 16-bit fixed-point implementation, so it is still acceptable.
After the masking threshold is obtained, it still has to be converted from the linear domain to the logarithmic domain in order to calculate the signal-to-mask ratio used for bit allocation. Here we use an approximation based on the ADSP-2181 shifter. The EXP instruction extracts the exponent of a two's-complement fractional number, and for energy 1 bit corresponds to about 3 dB. The exponent is therefore multiplied by 3 to approximate the dB value of the number, and the contribution of the mantissa is ignored.
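The exponent-based conversion can be modelled in software as below. This is a sketch under stated assumptions: the energy is taken to be a positive Q1.15 fraction held as an integer, Python's `bit_length` stands in for the exponent detection of the shifter, and the function names are hypothetical.

```python
import math


def approx_db_q15(energy):
    """Approximate 10*log10(energy / 2**15) using only the exponent.

    energy is a positive Q1.15 value stored as an integer in 1..0x7FFF.
    """
    if not 0 < energy <= 0x7FFF:
        raise ValueError("energy must be a positive Q1.15 value")
    # Number of redundant sign bits, i.e. how far the value could be shifted left.
    redundant_sign_bits = 15 - energy.bit_length()
    # Each bit of energy is worth about 3 dB (10*log10(2) is roughly 3.01);
    # the mantissa is ignored.
    return -3 * redundant_sign_bits


def exact_db_q15(energy):
    """Floating-point reference for comparison."""
    return 10.0 * math.log10(energy / 32768.0)
```

Because the mantissa is dropped, the result can differ from the exact value by up to roughly 3 dB, which matches the coarse accuracy the text accepts for the acoustic model.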
(4) Coding of scale factors. The MPEG sound coding standard defines 63 scale factors, but not all of them can be represented exactly as 16-bit binary numbers. Extending them to double words would incur the heavy cost of double-word division during quantization. We therefore use only the subset that can be represented exactly as 16-bit two's-complement fractions, namely the scale factors whose index is a multiple of 3 and no greater than 45.
With this subset, the scale factor code no longer has to be found by comparison; it can be computed directly from the exponent of the maximum amplitude in the subband, which simplifies scale factor coding.
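A possible form of this direct calculation is sketched below. It assumes that the scale factor with table index n has the value 2^(1 - n/3), so that every third entry is a power of two, and that the maximum amplitude is a Q1.15 magnitude held as an integer. The clamp at 45 follows the subset described above; the function names and the handling of a zero amplitude are illustrative choices, not taken from the text.

```python
def scale_factor(index):
    """Scale factor value for a table index (assumed to be 2**(1 - index/3))."""
    return 2.0 ** (1 - index / 3)


def scale_factor_index(max_abs_q15):
    """Pick the subset index (a multiple of 3, at most 45) from the exponent.

    max_abs_q15 is the largest sample magnitude in the subband as a Q1.15
    integer in 0..0x7FFF.
    """
    if not 0 <= max_abs_q15 <= 0x7FFF:
        raise ValueError("magnitude must fit in Q1.15")
    # bit_length gives the exponent: the value lies below 2**(bit_length - 15).
    steps = 16 - max_abs_q15.bit_length()
    return min(3 * steps, 45)
```

No table search is involved: one exponent extraction and one multiplication by 3 select the smallest power-of-two scale factor that still exceeds the amplitude.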
(5) Software simulation results. Combining the algorithm improvements above with the characteristics of the ADSP-2181 and the MPEG standard, we carried out a software simulation using the development software from Analog Devices. Table 1 lists the estimated computation and storage requirements of each module obtained from the simulation. The simulation used a sampling rate of 48 kHz, stereo coding mode, a 1 kHz sine wave as input, and an output bit rate of 192 kbit/s. Table 1 shows that the performance of the ADSP-2181 is fully utilized. The simulation results show that under these conditions the signal-to-noise ratio of the decoded output reaches about 80 dB, which indicates that the algorithm improvements are effective.
Table 1 Calculation and storage requirements of each module
Module                            Operations            Program storage   Data storage
                                  (10^6 instructions/s) (10^3 words)      (10^3 words)
Subband filtering                 18                    3.0               6.5
Acoustic model                    10                    3.5               1.5
Bit allocation and quantization   2                     2.0               —
Formatted bit stream              1                     0.5               1.0
4 Hardware design
The block diagram of the hardware structure is shown in Figure 3. The basic functions of each module are as follows:
DSP core: besides executing the entire encoding algorithm, it performs the initial configuration of the analog-to-digital conversion circuit, selects the sampling clock through the auxiliary control circuit, and receives encoding parameters from the host through the interface circuit.
Auxiliary control circuit: implemented with an FPGA and supporting circuits; it handles clock generation, FIFO status monitoring, address decoding, and similar functions.
Output buffer: temporary storage for the code stream that also provides a fully asynchronous output interface. It is particularly useful in applications that require lip synchronization between picture and sound.
External memory: includes the BDMA space and the I/O space.
Analog-to-digital conversion circuit: digitizes the sound and connects directly to serial port 0 of the DSP. The sampling frequency is set by an externally supplied clock running at 256 times the sampling frequency, and the circuit must be initialized before normal operation.
Interface circuit: divided into two parts, the code stream output interface and the host interface. The host interface uses an RS232 interface chip to connect DSP serial port 1 to the host serial port, and the DSP uses interrupts and its internal timer to implement asynchronous serial communication.
The above scheme has been implemented in a "Ninth Five-Year Plan" scientific and technological research project, and the sound produced by the real-time codec has passed subjective testing.
* Supported by a National "Ninth Five-Year Plan" Key Science and Technology Research Project. Authors' affiliation: Lin Shengmen Aidong, School of Telecommunication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876. First author: male, 25 years old, doctoral student.
References
[1] ISO/IEC 11172-3:1993. Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s, Part 3: Audio.
[2] Brandenburg K, Dehry Y F, Johnston J D, et al. ISO-MPEG-1 audio: a generic standard for coding of high-quality digital audio. J Audio Eng Soc, 1994, 42(10): 780-791.
[3] Wang Jianxin, Dong Zaiwang, India and Japan Fangqiang. Research and real-time implementation of MPEG audio coding algorithm. Journal of Tsinghua University, 1997, 37(10): 45-48.
[4] Konstantinides K. Fast subband filtering in MPEG audio coding. IEEE Signal Processing Letters, 1994.