IEEE Transactions on Cloud Computing, Vol. 2, No. 2, April-June 2014

Workload-Aware Credit Scheduler for Improving Network I/O Performance in Virtualization Environment

Haibing Guan, Member, IEEE, Ruhui Ma, and Jian Li, Member, IEEE

H. Guan and R. Ma are with the Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, Shanghai 200240, China. J. Li is with the School of Software, Shanghai Jiao Tong University, Shanghai 200240, China.

Manuscript received 15 Sept. 2013; revised 26 Feb. 2014; accepted 24 Mar. 2014. Date of publication 1 Apr. 2014; date of current version 30 July 2014. Recommended for acceptance by I. Bojanova, R.C.H. Hua, O. Rana, and M. Parashar. Digital Object Identifier no. 10.1109/TCC.2014.2314649.

Abstract—Single-root I/O virtualization (SR-IOV) has become the de facto standard of network virtualization in cloud infrastructure. Owing to the high interrupt frequency and the heavy cost per interrupt in high-speed network virtualization, the performance of network virtualization is highly correlated with the computing resource allocation policy in the Virtual Machine Manager (VMM). Therefore, more sophisticated methods are needed to process the irregularity and high frequency of network interrupts in a high-speed network virtualization environment. However, I/O-intensive and CPU-intensive applications in virtual machines are treated in the same manner, since application attributes are transparent to the scheduler in the hypervisor, and this unawareness of workload makes virtual systems unable to take full advantage of high-performance networks. In this paper, we discuss the SR-IOV networking solution and show by experiment that the current credit scheduler in Xen does not utilize high-performance networks efficiently. Hence we propose a novel workload-aware scheduling model with two optimizations to eliminate the bottleneck caused by the scheduler. In this model, guest domains are divided into I/O-intensive domains and CPU-intensive domains according to their monitored behaviour. I/O-intensive domains can obtain extra credits that CPU-intensive domains are willing to share. In addition, the total number of credits available is adjusted to accelerate I/O responsiveness. Our experimental evaluations show that the new scheduling model improves bandwidth and reduces response time, while keeping the fairness between I/O-intensive and CPU-intensive domains. This enables virtualization infrastructure to provide cloud computing services more efficiently and predictably.

Index Terms—Cloud computing, I/O virtualization, Xen, SR-IOV, hypervisor, scheduling

1 INTRODUCTION

WITH the rapid development of cloud computing, service providers must satisfy users' various requirements with improved business agility. Virtualization is one effective way of building cloud infrastructure for enterprises by utilizing their existing IT investments efficiently [1], [2], [3], [4], [5]. Moreover, it can increase the available resources and the flexibility of their management; consolidate diverse operating systems on one physical machine to reduce the use of disparate hardware; allow IT staff to deliver a better service at lower cost; and leverage a broad ecosystem with greater security and reliability.

Not only computing resources, but also network facilities are virtualized by the Virtual Machine Monitor (VMM), which allows multiple guest operating systems to access the network concurrently. Virtual network processing involves multiple trap-and-emulations, which makes network I/O virtualization a heavy consumer of computing resource [6], [7], [8], especially for high-speed networks such as a 10 Gbps network connection. Thus, network I/O processing correlates highly with the resource allocation policy of the VMM scheduler, owing to its heavy cost. In a compelling cloud infrastructure, the VMM should enable the network performance of VMs to make full use of physical line rates. Furthermore, virtual machine systems with different I/O or computing workload attributes should be scheduled differently, with workload attribute awareness.

A lot of research has been done on ways to make virtualization a desirable choice for network-intensive applications [2]. The Virtual Machine Device Queue (VMDq) is a component of Intel virtualization technology for connectivity [9]. VMDq classifies packets by network adapter, which reduces the burden on the VMM [10]. However, it still needs VMM intervention for memory and address translation. Direct I/O is a technique that allows VMs to access hardware directly, without VMM intervention [11]; however, the network device is assigned to a particular VM and cannot be shared by other VMs. Single Root I/O Virtualization (SR-IOV) [12] is a novel solution for network virtualization, which has become a de facto standard for cloud computing infrastructure. In this model, packets are sent from the Network Interface Card (NIC) to the VM without the traditional data copy in the para-virtualized driver. Meanwhile, an SR-IOV-capable device can create multiple virtual functions (VFs), and each VF can be assigned to one VM, while all VMs still share the physical resource [12].

SR-IOV offloads almost all of the driver domain's overhead to NIC hardware and achieves high I/O performance [12], [13], [14]. However, the current VMM schedules I/O-intensive domains and CPU-intensive domains simultaneously and equivalently, and as a result they can only achieve poor throughput with high latency, because the scheduling is less than optimal. We analyse two major factors that affect I/O performance. First, an I/O-intensive domain may not get enough CPU resources in the presence of a high-performance network, which costs a lot to process a large number of packets. Second, an I/O-intensive domain may not be scheduled in time. The default credit scheduler allocates credits with a fixed time slice of 30 ms. If the I/O-intensive domain happens to be inactive when the scheduler allocates credits, all the credits are allocated to other active domains, and network performance may be seriously affected.

The main contributions of this work are summarized as follows:

First, we analyse the network performance bottlenecks using a high system workload on the Xen VMM platform. Experimental results demonstrate that one of the defects is that an I/O-intensive domain normally lacks credits to perform I/O. The other is that an I/O-intensive domain is more likely to be inactive at credit allocation time, and may wait a relatively long time to handle its inputs. As tested, these bottlenecks in the network virtualization framework degrade application performance, which motivates our further development and implementation of network virtualization.

Second, we propose a workload-aware network virtualization model to monitor virtual machine behaviour, and propose two new optimizations based on it: Shared Scheduling and Agile Credit Allocation. Since an I/O-intensive domain often lacks credit in the presence of network traffic bursts, our Shared Scheduling provides I/O-intensive domains with extra credits that CPU-intensive domains are willing to share; this ensures that I/O-intensive domains acquire adequate resources. Agile Credit Allocation adjusts the total credits adaptively, according to the number of I/O-intensive domains, which shortens the waiting time for an I/O-intensive domain when it is removed from the active list during scheduling. A minimal sketch of the credit-sharing idea is given after this list.

Finally, we implemented the proposed model and optimizations, and obtained extensive performance data to validate and quantify the benefits of the proposed approaches on actual Xen virtualized systems. Note that our proposed shared scheduling and allocation schemes are generic enough to extend to all VMMs with proportional-share CPU scheduling algorithms, such as VMware ESXi [15] and KVM [16].
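To make the Shared Scheduling idea above concrete, the following is a minimal, hypothetical sketch of a credit-sharing pass; it is illustrative only and is not the implementation described in Section 4. The structure fields, the SHARE_RATIO constant and the workload classification flags are assumptions introduced here for the example.

/*
 * Illustrative sketch only: a simplified credit-sharing pass, not the
 * authors' Xen patch (Section 4 gives the actual design). Domains that a
 * behaviour monitor has classified as CPU-intensive donate a fraction of
 * their per-slice credits to I/O-intensive domains at each accounting period.
 */
#include <stddef.h>

enum workload { WL_CPU_INTENSIVE, WL_IO_INTENSIVE };

struct dom {                  /* hypothetical per-domain accounting record */
    enum workload type;       /* filled in by the behaviour monitor        */
    int credits;              /* credits granted for the current slice     */
};

#define SHARE_RATIO 4         /* donate 1/4 of a CPU-bound domain's credits */

void share_credits(struct dom *doms, size_t n)
{
    size_t io_doms = 0;
    int pool = 0;

    for (size_t i = 0; i < n; i++)
        if (doms[i].type == WL_IO_INTENSIVE)
            io_doms++;
    if (io_doms == 0)
        return;                         /* no one to boost; change nothing */

    for (size_t i = 0; i < n; i++) {    /* collect the shared credits */
        if (doms[i].type == WL_CPU_INTENSIVE) {
            int donated = doms[i].credits / SHARE_RATIO;
            doms[i].credits -= donated;
            pool += donated;
        }
    }
    for (size_t i = 0; i < n; i++)      /* hand them to I/O-intensive domains */
        if (doms[i].type == WL_IO_INTENSIVE)
            doms[i].credits += pool / (int)io_doms;
}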

The rest of this paper is organized as follows. We first introduce Xen's credit scheduler, its I/O model and the SR-IOV model in Section 2. Following this, in Section 3, we analyse the network performance under heavy workload. Section 4 presents the design of our workload-aware credit scheduling model and describes the two optimizations in detail. In Section 5, we evaluate the performance of our solutions with several testing scenarios. Section 6 introduces related work about I/O scheduling, and we give our conclusions in Section 7.

2 BACKGROUND

2.1 Credit-Based Scheduler

The default scheduler in the current version of the Xen VMM is the credit scheduler. The problem it solves is to determine which virtual CPU (vCPU) should run on each physical CPU (pCPU), where a pCPU is a real CPU in the machine and a vCPU is a simulated CPU in a VM. Each processor core can provide 300 credits in each scheduling slice (30 ms); the scheduler allocates these credits to domains according to each domain's weight. A vCPU of a domain can be in one of three states: under, over and boost. Under indicates that a vCPU still has credits to use, while over means that the vCPU has exhausted its credits. A vCPU with boost status has the highest priority: when an event is sent to an idle vCPU, that vCPU is awakened, and if its priority is boost, it preempts the currently running vCPU, achieving low I/O response latency. A run queue keeps vCPUs with boost on top, followed by vCPUs with under, while vCPUs with over are last. The scheduler picks the vCPU at the head of the run queue as the next vCPU to run, and vCPUs with the same priority are scheduled in First In First Out (FIFO) order. When a vCPU has enough credit, it is allowed to run for three tick periods (30 ms); the scheduler debits 100 credits every 10 ms, and credits are redistributed every 30 ms. On a multi-core architecture, if there is no vCPU with under priority in a pCPU's run queue, the pCPU steals a runnable vCPU from another pCPU. This load-balancing mechanism guarantees that no pCPU is idle while there is a runnable vCPU in the system. The priorities are implemented as different regions of the same queue, with one queue of tasks per physical processor. The front part of the queue holds boost priority, the next part holds under priority, followed by over priority. When a task enters the run queue, it is simply inserted at the end of the corresponding region of the queue, and the scheduler picks the task at the head of the queue to execute.
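The per-slice accounting described above can be summarized in a few lines of C. This is a deliberately simplified model for illustration; the real logic lives in Xen's xen/common/sched_credit.c and handles many cases omitted here (weight caps, boost wake-ups, SMP load balancing, the active domain list, and so on).

/*
 * Simplified model of the credit scheduler's accounting, for illustration
 * only; not the actual Xen source. Within each priority region of the run
 * queue, vCPUs are kept in FIFO order, so picking the next vCPU is simply
 * "pop the head of the queue".
 */
enum prio { PRIO_BOOST, PRIO_UNDER, PRIO_OVER };  /* queue regions, front to back */

struct vcpu_acct {
    int credits;          /* remaining credits in the current 30 ms slice */
    int weight;           /* domain weight used when credits are refilled */
    enum prio prio;
};

#define CREDITS_PER_CORE 300   /* handed out per core every 30 ms slice        */
#define TICK_DEBIT       100   /* debited from the running vCPU per 10 ms tick */

/* Called once per 10 ms tick for the vCPU that just ran on this core. */
void account_tick(struct vcpu_acct *v)
{
    v->credits -= TICK_DEBIT;
    if (v->credits <= 0)
        v->prio = PRIO_OVER;             /* exhausted: runs only after boost/under */
    else if (v->prio != PRIO_BOOST)
        v->prio = PRIO_UNDER;
}

/* Called every 30 ms: redistribute one core's credits in proportion to weight. */
void refill_credits(struct vcpu_acct *v, int total_weight)
{
    v->credits += CREDITS_PER_CORE * v->weight / total_weight;
    if (v->credits > 0 && v->prio == PRIO_OVER)
        v->prio = PRIO_UNDER;
}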

2.2 I/O Virtualization Model of Xen

In order to guarantee the safety of the system, the Xen virtual machine makes use of Isolated Driver Domains (IDD) to perform I/O operations, while the user domains (domainU) use virtual device drivers for transparent I/O access. In this model, a device driver is divided into three parts: the frontend driver, the backend driver and the native driver. The frontend driver is included in domainU; the backend driver and the native driver run in domain0. The frontend driver transfers domainU's I/O requests to the backend driver, which interprets each request and maps it onto a physical device. The native device driver then controls the hardware to finish the I/O operation. The I/O requests between the frontend and backend drivers are passed through an I/O ring, implemented in memory shared between domainU and domain0. The I/O ring contains only the descriptive information for I/O requests; the I/O data itself is not put in the I/O ring.

A VM transmits packets as follows: the frontend driver writes request information into the I/O ring through a hypercall, and the backend driver handles the I/O request when domain0 is scheduled. It forwards the I/O request to the native driver, and after the I/O access is completed by the hardware, the backend sends a response to the frontend to finish the I/O operation. A VM receives packets using the opposite process. Xen provides a grant table to deliver DMA data efficiently between domainU and domain0. Each domain maintains a grant table to indicate which of its pages can be accessed by which domains, while Xen keeps an active grant table to record the content of each domain's grant table. When domainU needs to perform DMA, the frontend driver puts a grant reference (GR) and an I/O request into the I/O ring. After domain0 receives the request, it asks Xen to lock the page according to the GR; Xen checks the grant reference and responds to domain0. Then domain0 sends a DMA request to the actual hardware. The architecture of Xen is shown in Fig. 1.

Fig. 1. Xen I/O virtualization architecture.
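The key point of this split-driver design is that only small descriptors travel on the shared ring, never the packet payload itself. The sketch below illustrates what such a descriptor might look like; the field names and layout are simplified for this example and are not Xen's actual ring format, which is defined in Xen's public headers (io/ring.h, io/netif.h).

#include <stdint.h>

/* Index into the granting domain's grant table (see the grant-table text above). */
typedef uint32_t grant_ref_t;

/* Hypothetical transmit request descriptor placed on the shared I/O ring. */
struct net_tx_request {
    grant_ref_t gref;    /* granted page holding the packet payload  */
    uint16_t    offset;  /* payload offset within that granted page  */
    uint16_t    size;    /* payload length in bytes                  */
    uint16_t    id;      /* echoed back in the matching response     */
};

/* Hypothetical response descriptor written by the backend in domain0. */
struct net_tx_response {
    uint16_t id;         /* matches the request id                   */
    int16_t  status;     /* 0 on success, negative on error          */
};

/*
 * The frontend produces net_tx_request entries and the backend consumes them
 * and produces net_tx_response entries; both sides exchange only these small
 * descriptors through the shared ring pages, while the payload stays in the
 * granted page referenced by gref.
 */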

2.3 Single-Root I/O Virtualization

Traditional I/O virtualization, such as VMDq or direct I/O, does not give high I/O performance while supporting device sharing at the same time. SR-IOV is a novel solution to this problem, which improves the performance of data transmission by eliminating the VMM intervention. An SR-IOV-capable device is a PCIe device which has one or more physical functions (PFs). A PF is a complete PCIe device that manages and configures all the I/O resources. A virtual function (VF) is derived from a PF; it is a simplified PCIe device to implement I/O, but cannot manage a physical NIC. The PF driver runs in domain0 to manage the PF, which can access all PF resources directly, to configure and manage VFs. The VF driver runs in a guest OS as a normal PCIe device driver, and accesses its corresponding VF directly. SR-IOV technology eliminates the bottleneck from the driver domain by avoiding an extra data copy, so as to greatly improve the throughput of the VM with lower CPU utilization. The SR-IOV virtualization architecture is shown in Fig. 2. SR-IOV offloads the network I/O processing from the hypervisor to specialized hardware, and the device driver's VF interrupt is left as the major overhead that is still intervened by the VMM, to remap it from host to guest [12].

Fig. 2. SR-IOV architecture.
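In practice, the PF driver in domain0 exposes the number of available VFs through the standard Linux PCI sysfs interface; the sketch below shows one way a host administrator might instantiate them before assigning each VF to a guest. The PCI address is a placeholder, and this is generic Linux SR-IOV administration rather than anything specific to this paper.

/*
 * Minimal sketch: enabling VFs on an SR-IOV-capable PF from domain0 via the
 * standard Linux PCI sysfs files sriov_totalvfs and sriov_numvfs.
 */
#include <stdio.h>

int main(void)
{
    const char *pf = "0000:03:00.0";   /* placeholder PCI address of the PF */
    char path[128];
    int total = 0;

    /* Ask the PF driver how many VFs the device can expose. */
    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/sriov_totalvfs", pf);
    FILE *f = fopen(path, "r");
    if (!f || fscanf(f, "%d", &total) != 1) {
        perror("sriov_totalvfs");
        return 1;
    }
    fclose(f);

    /* Instantiate the VFs; each one can then be passed through to one guest. */
    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/sriov_numvfs", pf);
    f = fopen(path, "w");
    if (!f) {
        perror("sriov_numvfs");
        return 1;
    }
    fprintf(f, "%d\n", total);
    fclose(f);
    return 0;
}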

3 PROBLEM ILLUSTRATION BY EXPERIMENT

Generally speaking, timely processing of I/O tasks is necessary for a better user experience. However, the VMM is not aware of the behaviour of guest domains because of the hierarchical scheduling architecture, so the credit scheduler schedules vCPUs equally, according to their weights. In other words, I/O tasks do not get any special care from the scheduler, unless the VMM is able to predict the behaviour of guest domains [17], [18]. The system's computing completion time for virtual network processing affects I/O performance significantly in a virtualized environment, so there is potential to improve it. We set up an experiment to analyse the performance of a high-speed network under high workload. The detailed hardware configuration of the experimental platform is described in Section 5.

3.1 Inadequate Credit

A high-performance network is different from a low-speed network, especially in a virtualized environment. Specifically, a low-speed network does not require too much CPU resource, and I/O events just need a timely response. However, in a high-performance network such as 10-Gigabit Ethernet, CPU resource is another important factor in network performance, because a large number of packets has to be processed in a short time. Under the SR-IOV architecture, packets are transmitted from the NIC to the VM directly, without the extra data copy in domain0; the VMM just needs to handle some interrupts, whose overhead is mitigated, and domain0 is no longer a bottleneck for network I/O.

Fig. 3 shows how network performance changes as the number of CPU-intensive domains grows on our quad-core testbed, as detailed in Section 5.1. In this experiment, only one VM is used as a server to receive packets from a remote native Linux host, and the Netperf benchmark is used. The other VMs execute CPU-intensive tasks, which are launched one by one to illustrate the performance change. An infinite loop is used here to burn credits.
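The credit-burning workload can be as simple as the busy loop sketched below; the paper does not list its exact code, so this is a plausible stand-in. On the measurement side, the I/O-intensive guest would run the Netperf receiver (netserver) while the remote native Linux host drives the TCP and UDP stream tests.

/*
 * A plausible stand-in for the "infinite loop used to burn credits" in each
 * CPU-intensive guest. The volatile counter keeps the compiler from removing
 * the loop, so the vCPU never blocks and consumes its full credit allocation
 * in every scheduling slice.
 */
int main(void)
{
    volatile unsigned long counter = 0;
    for (;;)
        counter++;     /* never yields the vCPU */
}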

Fig. 3. I/O performance with heavy load.

Fig. 3a shows that when there is only one domain performing network I/O, it can achieve 5.5 Gbps. It cannot achieve higher bandwidth because only one VF is configured for the I/O-intensive domain, and it could not use more physical cores for I/O. When there are another one or two guests running CPU-intensive applications, the network performance is not much affected, because the CPU-intensive domains use other idle physical cores without harming the I/O-intensive domain. However, when there are more than three domains running CPU-intensive applications, network I/O performance declines significantly, because the I/O-intensive domain has to compete for CPU resource with the CPU-intensive domains and domain0. Comparing Figs. 3a and 3c, the TCP test shows better performance with 0-2 CPU-intensive domains. This is mainly because TCP is a reliable protocol with congestion avoidance and flow control algorithms to control the transmission rate intelligently, while the UDP protocol stack is not as well controlled as the TCP protocol stack: when the network is congested, the Ethernet flow control mechanism in the Data Link layer is triggered to stop the transmission. This is the reason why UDP achieves lower throughput than TCP. On the other hand, since UDP is more l...
