Linux2.4 kernel scheduler and Linux2.6 Comparative Study
Linux kernel development is a long process, since November 2001 since 2.5.0 developed, Linux kernel development is very rapid, major improvements were made a lot of performance has been greatly improved. Improvement of the kernel scheduler is one of the most important advances in this paper a comparative study of the Linux2.4 and scheduler Linux2.6, comprehensive analysis of the scheduler Linux2.6 improvements.
A success of the basic requirements scheduler can be summarized as the following three points:
(1) reduce the time spent on scheduling, on the implementation of procedures to increase the time;
(2) In multi-processor system to maintain the load balancing processor;
(3) of interactive applications have a good response speed.
However, a successful scheduling is very difficult to design good, because a real system in operation by many factors. Vis-à-vis Linux2.6, Linux2.4 the scheduler a lot of the inadequacies of the Linux kernel version 2.6 the use of a new scheduling algorithm, called the 0 (1) algorithm, which in the case of high load implemented extremely well, and when there are many processors will be able to expand well.
O (n) algorithm, O on behalf of order, in parentheses under the figures represent the worst-case algorithm efficiency depends on the algorithm involves the upper limit of the number of elements, O (1) that is a constant, in this case, each the efficiency of meeting scheduling is the same, with the number of elements involved in has nothing to do, O (n) algorithm efficiency depends on the algorithm that the number of elements involved.
1 Linux2.4 scheduling mechanism
Linux2.4 scheduling mechanisms can be used to describe the following algorithm, as shown in Figure 1 diagram.
All the ready processes in place in the overall process of a queue, the queue without any sort of meaningful; time slice algorithm is re-count in the process have been exhausted all the time they are re-calculated after the film. The entire queue by a read / write spin lock (read / write spinlock) to protect, so that parallel access to multiple processors, but at the same time write operation to provide access to the mutex.
As can be seen by the algorithm, Linux2.4 the scheduling algorithm can be described as an O (n) algorithm, because the scheduler to select the cost of the implementation process is in place with the system of linear growth in the process of growth. At the same time, when there are multiple processors, the visit to ready queue becomes the bottleneck process, performance will decline significantly. Thus there are many shortcomings:
(1) for each scheduling, the scheduler must be a linear traversal of the queue to identify the most worthy of implementation of the running processes: When the system load high. Executable process would be long queues, linear search time is linear growth, this will be a very long time, when the time long enough, there may be multiple processors choose the same process, so that Some processors will find that he has chosen the process has been allocated to other processors, and had to choose, choose to run the process even longer than the actual implementation process of the case much longer time.
(2) process in place when most of the time slices are used up and also have a re-allocation of time for films, SMP systems in some processor is idle, it will affect the efficiency of SMP.
(3) When the processor idle time for the beginning of the implementation of those films have not been exhausted and in the process of waiting for the state (if their own processors busy) would result in the processor between the beginning of the process "jump", or occupation of real-time process the process of memory big jump between the processor will seriously affect the performance of the system.
(4) there are many processors in a system, when the process run out of their time waiting for films to be re-considered later, in order to obtain a new time slice, resulting in most of the processor is idle; this will have an impact on the SMP efficiency.
Therefore, when the system is not difficult to see a large number of executable processes, the choice to carry out a process may take a longer period of time, the system has multiple processors, the difficulty is even greater, the scheduling, in multi-processor or the system load is relatively high, the performance be affected.
2 Linux2.4 scheduling reasons for low performance
From the above analysis we can see that the result of scheduling Linux2.4 the main reason for the lower performance are as follows:
(1) system scheduling algorithm is O (n), spending growth is linear;
(2) there is only one global queue of ready processes, the flexibility of multi-processor support for good;
(3) processor affinity not easily lead to the process between the processors, "jump";
(4) time slice of the re-count cycle of constrained multi-processor efficiency.
Linux2.6 do a lot of improvement, it uses O (1) algorithm, which in the case of high load extremely well implemented, and when there are many processors can also be a good time to expand, not only greatly improved the SMP support, but also take into account the single-CPU or dual-CPU system.
3 Linux2.6 to improve the goal of the scheduler
In order to improve the above-mentioned Linux2.4 insufficient, Linux2.6 the scheduler can provide the following new features to improve the performance of scheduler:
(1) to provide a complete O (1) scheduling algorithm, that is to say, regardless of the number of systems in the process of how all of the scheduler algorithm have to be in constant time.
(2) should be a good SMP scalability, the ideal case, each processor should have an independent executable process queue and lock mechanism.
(3) of the processor should be more SMP-friendly, but at the same time there should be a time when the load imbalance in the inter-processor migration process.
4 Linux2.6 scheduling mechanism
The new scheduler are to achieve these goals, specific methods yes. Based on the CPU to the distribution of each time slice, and the abolition of the global cycle of synchronization and re-count.
There are two arrays for each process, activities and an array of ready process queue inactive ready queue array process. Each array has 140 ready process queue (runqueue), each queue 140 corresponds to a particular priority. From a bitmap to indicate which queue is empty, which is not empty, each queue is FIFO (FIFO). In this way, in the selection process, as long as the adoption of the first find_first_bit not find the queue is empty, and the first team in the process of taking it.
If a process consumes over its "time slice" is ready to enter the process of inactive queues corresponding array of teams end. When all the processes are "exhausted" its "time slice", the exchange of active and inactive ready process queue pointer array can, and do not need any other costs.
In this way, regardless of the number of queue processes in place, the speed of the selection process is in place must, therefore, referred to as 0 (1) algorithm, which may be described as follows, schematic diagram shown in Figure 2.
This algorithm has many advantages, summarized as follows:
(1) Each processor has its own queue of ready process, each processor can run in parallel to the selection process Scheduler program run on different processors, can process parallel to sleep, wake-up call and context switching.
(2) the process is only mapped to a processor queue of ready processes, the other processor will not be selected, and, therefore, will not jump between the different processors.
Of course, sometimes the needs of processors in the migration process between the processor, for example, when the load imbalance, each processor once every 200ms the other processor is not in the load imbalance, the ready process queue is empty processors will be inspected once every lms.
But this situation is not frequent, so the basic processor affinity can be guaranteed.
The new scheduler there is the greatly improved performance, a server among multiple processors in a large number of messages to send the test results as shown in table 1.
As can be seen from the table, use the new scheduler, at the same time, the system can do more.
5 Linux2.6 deficiencies Scheduler
New scheduling algorithm in the following areas could be improved.
First, although the processor speed in the development of fast, but the pace of development of the storage system is relatively slow, the operating time of the memory is often the bottleneck.
Scheduler to the processor when the allocation process should consider the relevance of the process. Consider such a situation: the two processes through frequent communication channels or shared memory, tests show that they are working on the same processor will be even better, because not related to the data processor from a copy in the caehe another processor in the cache. At present, the scheduler can not guarantee that this is closely related to the process of distribution to the same processor. The same problem also exists in the equipment related to each other.
Secondly, it is the process of migration, as in the inter-processor cost of the relocation process is different not the same, so the migration process, it should give due consideration to the characteristics of the process.
Migration process should be taken into account the size of the process (in this case refers to the size of share memory resources), the relocation process and the process has to take into account the size of memory, the process of migration to other large processors would be more severely affected performance of the system. Just imagine such a situation: it is the only processor A to the major process of moving to a processor B, and processor B are the major process of all processes, storage resources already strained, so that the process of processor A storage very rich in resources. And processor B is even more cake slot. At present, Linux2.6 scheduler in the migration process is also not taken into consideration when the size of the process.
Finally, when the system detects the need to migrate the process to balance the load when the load is not really non-equilibrium could he not? Load imbalance when the system is very minor and is not necessarily need to balance the load. Assume that such a situation: there are six at the same time completed the process of request, but only four processor system. In this way, there are two processes of two processors, while the other two processors, there is only one process. This is a problem, because the system is always unbalanced, resulting in total between the process of migration in the same processor, which also formed a jump.
Six pairs of points scheduler Linux2.6 recommendations for improvement
A task queue with the process and the process of the same family as far as possible are mapped to the same processor, because these processes require frequent communication between the possibilities is the largest; can also dynamically adjust the mapping process and the processor, when monitoring a two different processors in the process of frequent communications when every 200ms to check on the use of load-balancing plan to adjust them to the same processor.
Each process can be ready in the process of bitmaps stored in the process of some major signs of information, with the processor the process of accounting for the Chinese University of proportion to the process of moving out or moving into the major.
Regulation to set up a load balancing processor load threshold load_threshold, in the inspection system function load_balance For the processor load regulation of the actual load does not exceed the pre-given threshold, not on the processor for real-conditioning load balancing .
Embedded Systems Articles
- LPC2131-based CPLD CAN-Interface Design
- To reduce power consumption in portable applications
- FPGA-based iterative tomography reconstruction in the fractional approach to
- FPGA-based digital TV Signal Generator Design and Implementation of
- Embedded NVM to improve power management flexibility
- The advantages of FPGA co-processor
- The use of video on the video package to accelerate FPGA development
- ARM9-based embedded gateway Research
- Real-time operating system, μC / OS-II Improvement and Application of Research
- ARM-based remote control of intelligent home design
- Compact ARM-based image acquisition system
- Multi-core and multi-threaded embedded system solutions
- ARM high-speed flash memory-based MCU needs to deal with a wide range of embedded
- AT91 RM9200 used to build highly reliable embedded systems
- Space embedded image processing technology
- ARM7 and FPGA combination
- Embedded System LCD interactive menu design
- Design of embedded real-time operating system
- Linux2.4 kernel scheduler and Linux2.6 Comparative Study
- An improved version of UML in Embedded System
Can't Find What You're Looking For?
Rating: Not yet rated