Multi-core and multi-threaded embedded system solutions
Building multi-core designs (homogeneous or heterogeneous) and multi-threading technology into embedded devices brings many benefits, the most significant being improved system performance.
RISC-based embedded technology faces a growing number of challenges, but these approaches offer a good solution: they preserve compatibility with existing embedded software, improve its future applicability, and effectively raise the performance of new systems.
The application decides between multi-core and multi-threading
Multi-core and multi-threading both help performance, but building in these technologies does not guarantee a matching gain in effectiveness. The main reason is that the outcome depends on the requirements of the application environment. Take the mobile phone as an example. The SoC integrated in a phone is a form of multi-core architecture, but the cores integrated in an application-processor SoC are not all of the same type. In practice, homogeneous multi-core designs are quite rare in embedded systems.
Multi-threaded processors play an important role in automotive electronics and embedded networking, and some manufacturers combine several multi-threaded cores into a multi-core chip that uses both architectures. In other words, the two are not a simple either/or choice. Many manufacturers have to decide, based on the needs of the actual application, whether to combine them or develop their own solution. This also means that, when choosing an embedded system architecture, the processor itself is only one aspect. How to get the most out of the application has to be weighed separately for each product.
More than a contest of technology and pride
A truly homogeneous multi-core architecture: ARM11 MPCore
In embedded multi-core processors, ARM is currently the technology leader. The company owns no fabs and simply licenses its processor architecture as IP, but its positioning proved correct: within a few years it won a strong market position, and the vast majority of the world's handheld devices embed ARM processor technology.
As for the development of its technology, the early ARM7 architecture could by itself handle a number of audio codecs. The ARM9 core, with added 16-bit saturating instructions and higher computing speed, not only handles audio codecs but can also encode MPEG-4 at QCIF resolution (one quarter of CIF) at 15 frames/s while running at about 80 MHz. The ARM11, with the v6 instruction set architecture and added SIMD instructions, can achieve VGA-resolution H.264 encoding. Going further, the latest Cortex-A8, with its 64-bit SIMD architecture working together with the NEON accelerator, can complete 30 frames/s MPEG-4 VGA encoding in only half the cycles the ARM11 needs, which in practice means running at about 300 MHz. To make these options more viable for users, ARM is developing a prototype parallelizing compiler that can extract data parallelism and map it onto the SIMD hardware.
Figure: Structure of the ARM11 MPCore.
ARM11 MPCore is built on the ARM11 core, whose instruction set belongs to the v6 architecture. Depending on the needs of the application, MPCore can be configured with one to four processors. According to official figures, peak performance reaches about 2600 Dhrystone MIPS. MPCore is a standard homogeneous multi-core processor made up of up to four ARM11-based cores. The advantage of a multi-core design is that performance improves significantly at the same clock frequency, so multi-tasking applications are expected to perform well, which suits the needs of future home consumer electronics. For example, a set-top box can record several TV channels while also playing on-demand digital video over the Internet, and an in-vehicle navigation system can provide navigation while still having headroom to stream video entertainment to rear-seat passengers.
In such applications, a multi-core embedded processor shows strong performance advantages. According to the manufacturer's data, MPCore supports up to four-way cache-coherent symmetric multiprocessing (SMP), four-way asymmetric multiprocessing (AMP), or a hybrid of the two. In theory, this design flexibility can meet widely varying demands for computing performance and ensure that the system has the responsiveness or data throughput it needs.
ARM11 MPCore was announced as early as 2004 and formally became available for licensing in 2005. So far, products using it are concentrated in home appliances and automotive electronics, and they are not numerous. Has industry demand for this much processor computing power not yet appeared? In automotive electronics, the demands placed on microprocessors keep rising. In the past a single core was basically enough for in-vehicle use, but as more and more electronic aids enter the car, the work to be handled grows more complicated and far exceeds what a traditional automotive microcontroller can handle. It can be expected that over the next few years more and more car manufacturers will adopt similar multi-core systems to obtain reasonable response speeds.
As for home appliances, few products actually need such a complex core. The largest application is audio-visual products, where most manufacturers use dedicated hardware or a DSP for encoding and decoding, so doing that work directly on a multi-core processor brings no obvious efficiency benefit. In mobile applications, power consumption remains what product manufacturers care about most. Even though ARM11 MPCore can run multiple cores at very low power, it still cannot match a single-core version, so its visibility in mobile applications is low. However, with Intel's push for the MID (Mobile Internet Device), this kind of product may become a great opportunity for the ARM11 MPCore architecture. Even Silverthorne, the next-generation 45nm successor to Stealey, still consumes more than five times the power of MPCore (counting the total power of the chipset), and it is only a single-core design, so its application flexibility is clearly inferior to MPCore. One point is worth noting, though: Silverthorne benefits from a large base of x86 software resources, and ARM and other RISC-based processors are at a visible disadvantage in this respect.
Among RISC architectures for MID-class products, one can also consider ARM's latest processor architecture, Cortex-A8. It is based on the newest ARMv7 architecture and integrates a 64-bit DSP processing unit that accelerates streaming applications very well, which makes it well suited to MID-type multimedia handheld devices and even gaming. Strictly speaking, Cortex-A8 can also be regarded as a kind of multi-core system, but unlike the homogeneous MPCore it pairs a general-purpose processor core with a DSP core in a more heterogeneous design. I believe ARM has drawn a lot of experience in this area from Texas Instruments' development of application processors.
Figure: Structure of the Cortex-A8.
In fact, Nokia's N770/N800 already has all the features of a MID and is even lighter, thinner, and smaller. Unfortunately, with the original 1500mAh rechargeable battery its continuous use time is only about 3.5 hours, which is on the same order as typical UMPC products and somewhat short of Intel's MID products. The power advantage of an ARM-based system (the N800 uses the TI OMAP2420 applications processor, built on an ARM1136 core) is therefore not highlighted here, although standby time should be slightly longer than that of a MID.
MIPS stays on the multi-threading course
Perhaps partly out of rivalry, MIPS and ARM pursue different technical strategies: ARM develops multi-processor designs (MP, multiple processor cores), while MIPS develops multi-threading (MT). Conceptually, MP and MT both aim to improve overall processor performance, and both can shorten the processing time of the threads in any application. But the two technologies use different hardware architectures to reduce processing time, so for any specific piece of software the performance improvement from MP and from MT will differ.
This outcome is closely tied to the design philosophies of the two IP vendors. MT focuses on efficient use of the processing units and the memory controller, saving transistors as far as possible and raising performance under that constraint. That is a completely different approach from MP, which replicates as many cores on the chip as the performance requirement calls for. MP can serve a broader range of applications but is somewhat extravagant. By contrast, MT strikes a far more clever balance between cost and performance.
Many people compare MP and MT side by side, but to some extent such a comparison is not really meaningful, because the basic design concepts are worlds apart and the architectures cannot be lumped together. Technically, getting the most out of hardware multiprocessing makes the software for either architecture much more complex than for a single core. To avoid conflicts in allocating the processing units and the memory controller, MT may be somewhat more complex, but MP faces the same problem to some degree (especially multi-core designs that share a cache and memory controller). Whether at the instruction level, the thread level, or the task level, the way programs are written and optimized differs greatly from the traditional single-core, single-threaded process.
The general idea behind MT starts from what happens inside a single processor core. Memory access often cannot keep pace with rising processor frequency, so cache misses leave the execution pipeline idle. In a system's storage hierarchy, the fastest storage is the processor's registers, followed by the L1 cache, the L2 cache, and finally main memory, and the speed difference between them can reach thousands of times. To obtain instructions or data, the processor first fetches them from the cache into registers for computation, then writes the results back to the cache, which in turn writes them back to main memory when it is free to do so. When the processor issues an access and finds that the required data is not in the cache, it must spend a large amount of time finding and reading it in main memory. The time wasted may reach several tens of clock cycles, during which the pipeline sits idle waiting for data.
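The effect of these stalls can be sketched with a toy model. The code below is only an illustration under stated assumptions: one issue cycle per instruction, a hypothetical miss rate of 1%, and a hypothetical miss penalty of 40 cycles (the text only says "several tens of cycles"). The function name and parameters are invented for this sketch and do not come from any MIPS or ARM document.

```python
# Toy model (an assumption for illustration, not a vendor specification):
# each instruction needs one issue cycle, and a fraction `miss_rate` of
# instructions stalls its thread for `miss_penalty` extra cycles.

def pipeline_utilization(threads, miss_rate, miss_penalty):
    """Return the fraction of cycles in which the pipeline does useful work.

    A single thread keeps the pipeline busy for 1 cycle out of every
    (1 + miss_rate * miss_penalty) cycles on average. With several
    hardware threads, another thread can issue while one is stalled, so
    the ideal utilization scales with the thread count, capped at 1.0.
    Context-switch cost and extra cache misses are deliberately ignored.
    """
    busy_fraction_one_thread = 1.0 / (1.0 + miss_rate * miss_penalty)
    return min(1.0, threads * busy_fraction_one_thread)


# Hypothetical workload: 1% of instructions miss, 40-cycle penalty.
single = pipeline_utilization(1, miss_rate=0.01, miss_penalty=40)
dual = pipeline_utilization(2, miss_rate=0.01, miss_penalty=40)

print(f"1 thread : {single:.2f} of cycles useful")
print(f"2 threads: {dual:.2f} of cycles useful")
```

Under these assumed numbers a single thread leaves a noticeable share of cycles idle, and a second hardware thread can in the ideal case absorb them. Real gains are smaller because the model ignores switching overhead and the extra cache pressure of running more threads.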
With multi-threading, another thread can be executed in time to fill the idle slots, and the speedup can be quite visible: not a doubling, but 20% to 40% is possible. Achieving this requires only about 15% more transistors. By comparison, going from a single-core to a dual-core processor of the same general architecture raises performance by about 40% to 70% while almost doubling the transistor count. This shows how efficient MIPS's MT technology is. MT does have one serious drawback, however: in multi-threaded operation, overly frequent context switches can cause a large loss of performance.
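The comparison above can be checked with simple arithmetic. The sketch below uses only the percentages quoted in the text and treats "almost doubled" as exactly 100% more transistors, which is a simplifying assumption. The helper name is invented for this illustration.

```python
# Performance per transistor, relative to a single-threaded single core (= 1.0).
# Inputs are the percentages quoted in the text; "almost doubled" is
# approximated as +100% transistors (an assumption).

def perf_per_transistor(perf_gain, transistor_gain):
    """Relative performance divided by relative transistor count."""
    return (1.0 + perf_gain) / (1.0 + transistor_gain)


# Multi-threading: +20% to +40% performance for about +15% transistors.
mt_low = perf_per_transistor(0.20, 0.15)
mt_high = perf_per_transistor(0.40, 0.15)

# Dual core: +40% to +70% performance for about +100% transistors.
dual_low = perf_per_transistor(0.40, 1.00)
dual_high = perf_per_transistor(0.70, 1.00)

print(f"MT        : {mt_low:.2f} to {mt_high:.2f}")
print(f"Dual core : {dual_low:.2f} to {dual_high:.2f}")
```

Any value above 1.0 means the design gets more performance out of each transistor than the baseline. With the quoted figures, even the pessimistic MT case stays above 1.0 while the optimistic dual-core case stays below it, although the dual core still delivers the higher absolute performance.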
Figure: Structure of the MIPS 74K processor.
MIPS has a broad product line that includes the single-threaded 24K and 74K series and the multi-threaded 34K series. The 74K was announced in June this year. On a 65nm process its operating frequency exceeds 1GHz, and it combines a general-purpose processor core with DSP capabilities, but its overall performance and power figures are slightly below those of the comparably structured ARM Cortex-A8. The 34K series is the protagonist of the multi-threaded line: the core can be configured with one or two virtual processing elements (VPEs) and up to five thread contexts (TCs), which provides ample configuration flexibility. Put plainly, using two VPEs makes a single core emulate two cores, so that a 34K core can run two independent operating systems or one two-way symmetric multiprocessing operating system.
The MIPS32 34Kc core, built on a 90nm process, runs at 500MHz under worst-case conditions. The core measures 2.1mm², and its power consumption is 0.56mW/MHz at 1.0V. The series currently includes the 34Kc, 34Kf, 34Kc Pro, and 34Kf Pro. The core's hardware floating-point unit is fully compliant with the IEEE 754 specification. The 34Kc Pro and 34Kf Pro cores add the CorExtend feature, which lets SoC developers extend the core with their own instructions.
Figure: Structure of the MIPS 34K processor.
According to MIPS's own estimates, compared with the 24K from the same product family, a 34K configured with two VPEs and two TCs delivers about 60% more performance while chip area increases by only 14%. The cache miss rate rises from 4.41% to 5.16% because of multi-threaded operation, which is within an acceptable range. Compared with the single-core 74K, however, the 34K is less suitable for compute-intensive networking or multimedia streaming environments, and adding VPEs and TCs also increases chip area. Although the limitations of MT make it unsuitable for multimedia codecs, manufacturers in automotive electronics have successfully built dual-core multi-threaded processors from two 34K cores and obtained outstanding performance. With that precedent, we can predict that more solutions combining MIPS multi-core and multi-threading will emerge. How to configure them so that a cost advantage remains is left for the solution vendors to worry about.
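The 34K versus 24K figures above can be put side by side in a few lines. This is a back-of-the-envelope sketch that uses only the numbers quoted in the text. The variable names are invented for the illustration, and the result says nothing about absolute performance on a given workload.

```python
# Figures quoted in the text for a 34K (2 VPEs, 2 TCs) relative to a 24K.
perf_ratio = 1.60        # about 60% more performance
area_ratio = 1.14        # about 14% more chip area
miss_rate_24k = 4.41     # cache miss rate in percent, single-threaded
miss_rate_34k = 5.16     # cache miss rate in percent, multi-threaded

# Performance delivered per unit of chip area, relative to the 24K.
perf_per_area = perf_ratio / area_ratio

# Cost paid in the cache: absolute and relative growth of the miss rate.
miss_increase_points = miss_rate_34k - miss_rate_24k
miss_increase_relative = miss_increase_points / miss_rate_24k

print(f"Performance per area vs 24K : {perf_per_area:.2f}x")
print(f"Miss rate increase          : {miss_increase_points:.2f} points "
      f"({miss_increase_relative:.0%} relative)")
```

The calculation makes the trade explicit: a large gain in performance per unit area is bought with a moderately higher miss rate, which is the balance the estimates describe as acceptable.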