Big Data Framework Interference In Restricted Private Cloud Settings

Stratos Dimopoulos, Chandra Krintz, Rich Wolski
Dept. of Computer Science, Univ. of California, Santa Barbara
{stratos, ckrintz, rich}

Abstract—In this paper, we characterize the behavior of "big" and "fast" data analysis frameworks in multi-tenant, shared settings for which computing resources (CPU and memory) are limited, an increasingly common scenario used to increase utilization and lower cost. We study how popular analytics frameworks behave and interfere with each other under such constraints. We empirically evaluate Hadoop, Spark, and Storm multi-tenant workloads managed by Mesos. Our results show that in constrained environments, there is significant performance interference that manifests in failed fair sharing, performance variability, and deadlock of resources.

Keywords—Big Data Infrastructures and Frameworks, Private Cloud Multi-tenancy, Performance Interference

I. INTRODUCTION

Recent technological advances have spurred production and collection of vast amounts of data about individuals, systems, and the environment. As a result, there is significant demand by software engineers, data scientists, and analysts with a variety of backgrounds and expertise, for extracting actionable insights from this data. Such data has the potential for facilitating beneficial decision support for nearly every aspect of our society and economy, including social networking, health care, business operations, the automotive industry, agriculture, Information Technology, education, and many others.

To service this need, a number of open source technologies have emerged that make effective, large-scale data analytics accessible to the masses.
These include "big data" and "fast data" analysis systems such as Hadoop, Spark, and Storm from the Apache foundation, which are used by analysts to implement a variety of applications for query support, data mining, machine learning, real-time stream analysis, statistical analysis, and image processing [3, 13]. As complex software systems, with many installation, configuration, and tuning parameters, these frameworks are often deployed under the control of a distributed resource management system [16, 4] to decouple resource management from job scheduling and monitoring, and to facilitate resource sharing between multiple frameworks.

Each of these analytics frameworks tends to work best (e.g. most scalable, with the lowest turn-around time, etc.) for different classes of applications, data sets, and data types. For this reason, users are increasingly tempted to use multiple frameworks, each implementing a different aspect of their analysis needs. This new form of multi-tenancy (i.e. multi-analytics) gives users the most choice in terms of extracting potential insights, enables them to fully utilize their compute resources and, when using public clouds, manage their fee-for-use monetary costs.

Multi-analytics frameworks have also become part of the software infrastructure available in many private data centers and, as such, must function when deployed on a private cloud. With private clouds, resources are restricted by physical limitations. As a result, these technologies are commonly employed in shared settings in which more resources (CPU, memory, local disk) cannot simply be added on-demand in exchange for an additional charge (as they can in a public cloud setting).

Because of this trend, in this paper, we investigate and characterize the performance and behavior of big/fast data systems in shared (multi-tenant), moderately resource constrained, private cloud settings.
While these technologies are typically designed for very large scale deployments such as those maintained by Google, Facebook, and Twitter, they are also common and useful at smaller scales [1, 11, 10].

We empirically evaluate the use of the Hadoop, Spark, and Storm frameworks in combination, with Mesos [4] to mediate resource demands and to manage sharing across these big data tenants. Our goal is to understand how these frameworks interfere with each other in terms of performance when under resource pressure, and how Mesos behaves and achieves fair sharing [2] when demand for resources exceeds resource availability.

From our experiments and analyses, we find that even though Spark outperforms Hadoop when executed in isolation for a set of popular benchmarks, in a multi-tenant system, their performance varies significantly depending on their respective scheduling policies and the timing of Mesos resource offers. Moreover, for some combinations of frameworks, Mesos is unable to provide fair sharing of resources and/or avoid deadlocks. In addition, we quantify the framework startup overhead and the degree to which it affects short-running jobs.

II. BACKGROUND

In private cloud settings, where users must contend for a fixed set of data center resources, users commonly employ the same resources to execute multiple analytics systems to make the most of the limited set of resources to which they have been granted access. To understand how these frameworks interfere in such settings, we investigate the use of Mesos to manage them and to facilitate fair sharing. Mesos is a cluster manager that can support a variety of distributed systems including Hadoop, Spark, Storm, Kafka, and others [4]. The goal of our work is to investigate the performance implications associated with Mesos management of multi-tenancy for medium and small scale data analytics on private clouds.

Fig. 1: Mesos Architecture.

Mesos provides two-level, offer-based, resource scheduling for frameworks as depicted in Figure 1.
The Mesos Master is a daemon process that manages a distributed set of Mesos Slaves. The Master also makes offers containing available Slave resources (e.g. CPUs, memory) to registered frameworks. Frameworks accept or reject offers based on their own, local scheduling policies and control execution of their own tasks on the Mesos Slaves that correspond to the offers they accept. When a framework accepts an offer, it passes a description of its tasks and the resources it will consume to the Mesos Master. The Master (acting as a single contact point for all framework schedulers) passes task descriptions to the Mesos Slaves. Resources are allocated on the selected Slaves via a Linux container (the Mesos executor). Offers correspond to generic Mesos tasks, each of which consumes the CPU and memory allocation specified in the offer. Each framework uses a Mesos Task to launch one or more framework-specific tasks, which use the resources in the accepted offer to execute an analytics application.

Each framework can choose to employ a single Mesos task for each framework task, or use a single Mesos task to run multiple framework tasks. We refer to the former as "fine-grained mode" (FG mode) and the latter as "coarse-grained mode" (CG mode). CG mode amortizes the cost of starting a Mesos Task across multiple framework tasks. FG mode facilitates finer-grained sharing of physical resources.

The Mesos Master is configured so that it executes on its own physical node and with high availability via shadow Masters. The Master makes offers to frameworks using a pluggable resource allocation policy (e.g. fair sharing, priority, or other). The default policy is Dominant Resource Fairness (DRF) [2]. DRF attempts to fairly allocate combinations of resources by prioritizing the framework with the minimum dominant share of resources. The dominant resource of a framework is the resource for which the framework holds the largest fraction of the total amount of that resource in the system.
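The trade-off between the two modes can be made concrete with a toy cost model. This is illustrative only: the startup and task durations below are assumed values, not measurements from this study.

```python
# Toy model of Mesos Task startup amortization (illustrative values).
# CG mode pays one Mesos Task startup per executor; FG mode pays one
# startup per framework task.

def total_runtime(n_tasks, task_secs, startup_secs, mode):
    """Rough serial-cost model for one executor's worth of tasks."""
    if mode == "CG":
        return startup_secs + n_tasks * task_secs    # one startup, amortized
    elif mode == "FG":
        return n_tasks * (startup_secs + task_secs)  # startup paid per task
    raise ValueError(mode)

# With an assumed 5s startup and 10s tasks, CG's advantage grows with
# the number of tasks per job.
print(total_runtime(8, 10, 5, "CG"))  # 85
print(total_runtime(8, 10, 5, "FG"))  # 120
```

The gap widens linearly with the task count, which is why FG mode's sharing benefit comes at a per-task overhead cost.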
For example, if a framework has been allocated 2 CPUs out of 10 and 512MB out of 1GB of memory, its dominant resource is memory (2/10 CPUs < 512/1024 memory). The dominant share of a framework is the fraction of the dominant resource that it has been allocated (512/1024, or 1/2, in this example). The Mesos Master makes offers to the framework with the smallest dominant share of resources, which results in a fair share policy with a set of attractive properties (share guarantee, strategy-proofness, Pareto efficiency, and others) [2]. We employ the default DRF scheduler in Mesos for this study. We use Mesos to manage the Hadoop, Spark, and Storm data analytics frameworks from the Apache Foundation in a multi-tenant setting.

Each framework creates one Mesos executor and one or more Mesos Tasks on each Slave in an accepted Mesos offer. In CG mode, frameworks release resources back to Mesos when all tasks complete or when the application is terminated. In FG mode, frameworks execute one application task per Mesos task. When a framework task completes, the framework scheduler releases the resources associated with the task back to Mesos. The framework then waits until it receives a new offer with sufficient resources from Mesos to execute its next application task. Spark supports both FG and CG modes; Hadoop and Storm implement the CG mode only.

III. EXPERIMENTAL METHODOLOGY

We next describe the experimental setup that we use for this study. We detail our hardware and software stack, overview our applications and data sets, and present the framework configurations that we consider.

Our private cloud is a resource-constrained, Eucalyptus [9] v3.4.1 private cluster with nine virtual servers (nodes) and a Gigabit Ethernet switch. We use three nodes for Mesos Masters that run in high availability mode (similar to typical fault-tolerant settings of most real systems) and six for Mesos Slaves in each cloud.
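The worked DRF example above can be checked with a short sketch of the selection rule (a simplification of the full Mesos allocator; framework "B" and its allocation are hypothetical values added for contrast):

```python
# Minimal sketch of DRF's dominant-share computation, simplified from [2].
TOTALS = {"cpu": 10, "mem_mb": 1024}  # cluster capacity from the example

def dominant_share(allocated):
    """Return the framework's largest per-resource fraction of the total."""
    return max(allocated[r] / TOTALS[r] for r in TOTALS)

frameworks = {
    "A": {"cpu": 2, "mem_mb": 512},  # dominant resource: memory (512/1024)
    "B": {"cpu": 3, "mem_mb": 128},  # dominant resource: CPU (3/10)
}

print(dominant_share(frameworks["A"]))  # 0.5
# DRF offers next to the framework with the minimum dominant share.
next_offer = min(frameworks, key=lambda f: dominant_share(frameworks[f]))
print(next_offer)  # B
```

Framework A's dominant share is 1/2 (memory), exceeding B's 3/10 (CPU), so the next offer goes to B.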
The Slave nodes each have 2x2.5GHz CPUs, 4GB of RAM, and 60GB of disk space. Our nodes run Ubuntu v12.04 Linux with Java 1.7, Mesos 0.21.1 (which uses Linux containers by default for isolation), the CDH 5.1.2 MRv1 Hadoop stack (HDFS, Zookeeper, MapReduce, etc.), Spark v1.2.1, and Storm v0.9.2. We configure the Mesos Masters (3), HDFS Namenodes, and Hadoop JobTrackers to run with High Availability via three Zookeeper nodes co-located with the Mesos Masters. HDFS uses a replication factor of three and a 128MB block size.

Our batch processing workloads and data sets come from the BigDataBench and Mahout projects [17]. We have made minor modifications to update the algorithms to have similar implementations across frameworks (e.g. when they read/write data, perform sorts, etc.). These modifications are available at [8]. In this study, we employ the WordCount, Grep, and Naive Bayes applications for Hadoop and Spark and a WordCount streaming topology for Storm. WordCount computes the number of occurrences of each word in a given input data set, Grep produces a count of the number of times a specified string occurs in a given input data set, and Naive Bayes performs text classification using a trained model to classify sentences of an input data set into categories.

We execute each application 10 times after three warmup runs to eliminate variation due to dynamic compilation by the Java Virtual Machine and disk caching artifacts. We report the average and standard deviation of the 10 runs. We keep the data in place in HDFS across the system for all runs and frameworks to avoid variation due to changes in data locality.
We measure performance and interrogate the behavior of the applications using a number of different tools, including Ganglia [7], ifstat, iostat, and vmstat available in Linux, and log files available from the individual frameworks.

TABLE I: CPU and memory availability, minimum framework requirements to run 1 Mesos Task, and maximum utilized resources per Slave and in total.

              Available            Min Required
              Slave    Total       Hadoop   Spark   Storm
  CPU         2        12          1        2       2
  Mem (MB)    2931     17586       980      896     2000

              Max Used per Slave           Max Used Total
              Hadoop   Spark   Storm       Hadoop   Spark   Storm
  CPU         2        2       2           12       12      6
  Mem (MB)    2816     896     2000        16896    5376    6000

Table I shows the available resources in our private cloud deployment, the minimum required resources that should be available on a Slave for a framework to run at least one task on Mesos, and the maximum resources that can be utilized when the framework is the only tenant. We configure the Hadoop TaskTracker with 0.5 CPUs and 512MB of memory and each slot with 0.5 CPUs, 768MB of memory, and 1GB of disk space. We set the minimum and maximum map/reduce slots to 0 and 50, respectively. We configure Spark tasks to use 1 CPU and 512MB of memory, which also requires an additional 1 CPU and 384MB of memory for each Mesos executor container. We enable compression for event logs in Spark and use the default MEMORY_ONLY caching policy. Finally, we configure Storm to use 1 CPU and 1GB of memory for the Mesos executor (a Storm Supervisor) and 1 CPU and 1GB of memory for each Storm worker.

This configuration allows Hadoop to run 3 tasks per Mesos executor. Hadoop spawns one Mesos executor per Mesos Slave and Hadoop tasks can be employed as either mapper or reducer slots. Spark in FG mode runs 1 Mesos/Spark task per executor. In CG mode, Spark allocates its resources to a single Mesos task per executor that runs all Spark tasks within it. In both modes, Spark runs one executor per Mesos Slave. We configure the Storm topology to use 3 workers. We run one worker per Supervisor (Mesos executor), so three Slaves are needed in total.
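The "Max Used per Slave" column of Table I follows from these per-framework configurations; a quick consistency check (all numbers are taken from the configuration above):

```python
# Sanity-check Table I's per-Slave maxima from the framework configs.
SLAVE = {"cpu": 2.0, "mem_mb": 2931}   # available per Slave (Table I)

# Hadoop: one TaskTracker (0.5 CPU, 512MB) plus as many 0.5-CPU/768MB
# slots as fit within the Slave's capacity.
slots = 0
while (0.5 + (slots + 1) * 0.5 <= SLAVE["cpu"]
       and 512 + (slots + 1) * 768 <= SLAVE["mem_mb"]):
    slots += 1
print(slots)                                 # 3 slots fit per Slave
print(0.5 + slots * 0.5, 512 + slots * 768)  # 2.0 CPUs, 2816 MB

# Spark FG: one executor (1 CPU, 384MB) + one task (1 CPU, 512MB).
print(1 + 1, 384 + 512)                      # 2 CPUs, 896 MB

# Storm: one Supervisor (1 CPU, 1000MB) + one worker (1 CPU, 1000MB),
# taking 1GB as 1000MB as in Table I.
print(1 + 1, 1000 + 1000)                    # 2 CPUs, 2000 MB
```

Each result matches the corresponding "Max Used per Slave" entry, confirming that Hadoop is CPU-bound at 3 slots per Slave.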
We consider three different input sizes for the applications to test for small, medium, and long running jobs. As the number of tasks per job is determined by the HDFS block size (which is 128MB), the 1GB input size corresponds to 8 tasks, the 5GB input size to 40 tasks, and the 15GB input size to 120 tasks.

IV. RESULTS

We use this experimental setup to first measure the performance of Hadoop and Spark when they run in isolation (single tenancy) on our Mesos-managed private cloud. Throughout the remainder of this paper, we refer to Spark when configured to use FG mode as SparkFG and when configured to use CG mode as SparkCG.

Figure 2 presents the execution time for the three applications for different data set sizes (1GB, 5GB, and 15GB). These results serve as a reference for the performance of the applications when there is no resource contention (no sharing) across frameworks in our configuration.

Fig. 2: Single Tenant Performance: Benchmark execution time in seconds for Hadoop and Spark on Mesos for different input sizes.

Fig. 3: Multi-tenant Performance: Benchmark execution time in seconds for Hadoop and SparkCG using different input sizes, with SparkCG receiving its offers first.

The performance differences across frameworks are similar to those reported in other studies, in which Spark outperforms Hadoop (by more than 2x in our case) [12, 6]. One interesting aspect of this data is the performance difference between SparkCG and SparkFG. SparkCG outperforms SparkFG in all cases, by more than 1.5x in some cases. The reason for this is that SparkFG starts a Mesos Task for each new Spark task to facilitate sharing. Because SparkFG is unable to amortize the overhead of starting Mesos Tasks across Spark tasks as is done for coarse-grained frameworks, overall performance is significantly degraded. SparkCG outperforms SparkFG in all cases and Hadoop outperforms SparkFG in multiple cases.

A. Multi-tenant Performance

We next evaluate the performance impact of multi-tenancy in a resource constrained setting. For this study, we execute the same application in Hadoop and SparkCG and start them together on Mesos. In this configuration, Hadoop and SparkCG share the available Mesos Slaves and access the same data sets stored on HDFS. Figure 3 shows the application execution time in seconds (using different input sizes) for Hadoop and SparkCG in this multi-tenant scenario. As in the previous set of results, SparkCG outperforms Hadoop for all benchmarks and input sizes.

Fig. 4: Multi-tenant Performance: Benchmark execution time in seconds for Hadoop and SparkCG using different input sizes, with Hadoop receiving its offers first.

We observe in the logs from these experiments that SparkCG is able to set up its application faster than Hadoop. As a result, SparkCG wins the race to acquire resources from Mesos first. To evaluate the impact of such sequencing, we next investigate what happens when Hadoop receives its offers from Mesos ahead of SparkCG. To control the timing of offers in this way, we delay the Spark job submission by 10 seconds. We present these results in Figure 4. In this case, SparkCG outperforms Hadoop for only the 1GB input size.

The data shows in this case that even though Spark is more than 355 seconds faster than Hadoop in single-tenant mode, it is more than 600 seconds slower than Hadoop when the Hadoop job starts ahead of the Spark job. Whichever framework starts first executes with time similar to that of the single tenancy deployment. This behavior results from the way that Mesos allocates resources. Mesos offers all of the available resources to the first framework that registers with it, since it is unable to know whether or not a future framework will register. Mesos is unable to change its system-wide allocation decisions when a new framework arrives, since it does not implement resource revocation.
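The registration-order sensitivity described above can be illustrated with a toy model of the offer cycle. This is a deliberate simplification: real Mesos makes per-Slave offers that frameworks may decline, but without revocation the first CG-mode registrant effectively holds everything for the duration of its job.

```python
# Toy model of registration-order sensitivity in an offer-based scheduler:
# without revocation, the first CG-mode framework to register is offered
# (and holds) all free resources until its job completes.
class Cluster:
    def __init__(self, cpus, mem_mb):
        self.free = {"cpu": cpus, "mem_mb": mem_mb}
        self.held = {}

    def register(self, framework):
        # Offer ALL currently free resources to the new framework; a
        # CG-mode framework accepts them and keeps them job-long.
        self.held[framework] = dict(self.free)
        self.free = {"cpu": 0, "mem_mb": 0}

cluster = Cluster(cpus=12, mem_mb=17586)  # cluster totals from Table I
cluster.register("SparkCG")
cluster.register("Hadoop")                # registers second: gets nothing
print(cluster.held["SparkCG"])  # {'cpu': 12, 'mem_mb': 17586}
print(cluster.held["Hadoop"])   # {'cpu': 0, 'mem_mb': 0}
```

Swapping the registration order swaps the outcome, which is exactly the variability observed between Figures 3 and 4.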
SparkCG and Hadoop will block all other frameworks until they complete execution of a job. In Hadoop, such starvation can extend beyond a single job, since Hadoop jobs are submitted to the same Hadoop JobTracker instance. That is, a Hadoop instance will retain Mesos resources until its job queue (potentially holding multiple jobs) empties.

These experiments show that when an application requires resources that exceed those available in the cloud (input sizes 5GB and above in our experiments), and when frameworks use CG mode, Mesos fails to share cloud resources fairly among multiple tenants. In such cases, Mesos serializes application execution, limiting both parallelism and utilization significantly. Moreover, application performance in such cases becomes dependent upon framework registration order and, as a result, is highly variable and unpredictable.

B. Fine-Grained Resource Sharing

We next investigate the operation of the Mesos scheduler for frameworks that employ fine-grained scheduling. For such frameworks (SparkFG in our study), the framework scheduler can release and acquire resources throughout the lifetime of an application. For these experiments, we measure the impact of interference between Hadoop and SparkFG. As in the previous section, we consider the case when Hadoop starts first and when SparkFG starts first.

We find (as expected) that when Hadoop receives offers from Mesos first, it acquires all of the available resources, blocks SparkFG from executing, and outperforms SparkFG. Similarly, when SparkFG receives its offers ahead of Hadoop, we expect it to block Hadoop. However, from the performance comparison, this starvation does not occur. That is, Hadoop outperforms SparkFG even when SparkFG starts first and can acquire all of the available resources. (We omit the figure with execution times for each framework from this text due to space limitations and provide it at [14].)

We further investigate this behavior in Figure 5.
In this set of graphs, we present a timeline of multi-tenant activities over the lifetime of two WordCount/5GB applications (one over Hadoop, the other over SparkFG). In the top graph, we present the number of Mesos Tasks allocated by each framework. Mesos Tasks encapsulate the execution of one (SparkFG) or many (Hadoop) framework tasks. The middle graph shows the memory consumption by each framework and the bottom graph shows the CPU resources consumed by each framework.

In this experiment, SparkFG receives the offers from Mesos first and acquires all the available resources of the cloud (all resources across the six Mesos Slaves are allocated to SparkFG). SparkFG uses these resources to execute the application and Hadoop is blocked waiting on SparkFG to finish. Because SparkFG employs a fine-grained resource use policy, it releases the resources allocated to it for a framework task back to Mesos when each task completes. Doing so enables Mesos to employ its fair sharing resource allocation policy (DRF) and allocate these released resources to other frameworks (Hadoop in this case), and the system achieves true multi-tenancy.

However, such sharing is short lived. As we can observe in the graphs, over time, as SparkFG Mesos Tasks are released, they are allocated to Hadoop until only Hadoop is executing (SparkFG is eventually starved). The reason for this is that even though SparkFG releases its task resources back to Mesos, it does not release all of its resources back; in particular, it does not release the resources allocated to it for its Mesos executors (one per Mesos Slave).

In our configuration, SparkFG executors require 768MB of memory and 1 CPU per Slave. Mesos DRF considers these resources part of the SparkFG dominant share and thus gives Hadoop preference until all resources in the system are once again consumed.
This results in SparkFG holding onto memory and CPU (for its Mesos executors) that it is unable to use, because there are insufficient resources for its tasks to execute, but for which Mesos is charging under DRF. Thus, SparkFG induces a deadlock and all resources being held by SparkFG executors in the system are wasted (system resources are underutilized until Hadoop completes and releases its resources).

In our experiments, we find that this scenario occurs for all but the shortest lived jobs (1GB input sizes). The 1GB jobs include only 8 tasks, and so SparkFG will execute 6 of its 8 tasks after getting all the resources in the first round of offers. Moreover, Hadoop does not require all the Slaves to run its 8 tasks for this job, as explained in Section III, leaving sufficient space for Spark to continue executing the remaining two tasks uninterrupted.

Fig. 5: Multi-tenancy and Resource Utilization: The timelines show (a) the number of active (staging or running) Mesos Tasks, (b) memory allocation per framework, and (c) CPU cores allocation per framework.

Deadlock in Mesos in resource constrained settings is not limited to the SparkFG scheduler. The fundamental reason behind this type of deadlock is a combination of (i) frameworks "hanging on" to resources and (ii) the way Mesos accounts for resource use under its DRF policy. In particular, any framework scheduler that retains resources across tasks, to amortize the startup overhead of its support services (like Spark executors), will be charged for them by DRF, and thus may deadlock. Moreover, any Mesos system for which resource demand exceeds capacity can deadlock if there is at least one framework with a fine-grained scheduler and at least one framework with a coarse-grained scheduler.

C. Batch and Streaming Tenant Interference

We next evaluate the impact of performance interference in Mesos under resource constraints, for batch and streaming analytics frameworks.
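The deadlock condition can be expressed as a simple predicate on per-Slave capacity. This is a sketch: the executor figures (1 CPU, 768MB) and task requirement come from the configuration above, while the Hadoop allocation shown is one plausible fill (a TaskTracker plus one slot) rather than a measured state.

```python
# Sketch of the SparkFG deadlock: executors hold resources that DRF
# charges to SparkFG, yet no Slave retains enough free capacity to
# launch a SparkFG task, so the held resources are wasted.
SLAVE = {"cpu": 2.0, "mem_mb": 2931}
EXECUTOR = {"cpu": 1.0, "mem_mb": 768}   # held per Slave by SparkFG
TASK = {"cpu": 1.0, "mem_mb": 512}       # needed for SparkFG to progress

def slave_free(hadoop_use):
    """Free capacity on a Slave hosting a SparkFG executor plus Hadoop."""
    return {r: SLAVE[r] - EXECUTOR[r] - hadoop_use[r] for r in SLAVE}

# Once DRF-preferred Hadoop fills the remaining capacity (here, a
# hypothetical TaskTracker at 0.5 CPU/512MB plus one 0.5 CPU/768MB slot):
hadoop = {"cpu": 1.0, "mem_mb": 1280}
free = slave_free(hadoop)
deadlocked = any(free[r] < TASK[r] for r in TASK)
print(free, deadlocked)  # {'cpu': 0.0, 'mem_mb': 883} True
```

The executor's CPU is charged to SparkFG's dominant share but can never host a task, which is precisely condition (i) plus (ii) above.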
This combination of frameworks is increasingly common given the popularity of the lambda architecture [5], in which organizations combine batch processing to compute views from a constantly growing dataset and stream processing to compensate for the high latency between subsequent iterations of batch jobs and to complement the batch results with newly arrived, unprocessed data.

For this experiment, we execute a streaming application using a Storm topology continuously, while we introduce batch applications. Our measurements show (detailed performance graphs can be found at [14]) that the performance degradation introduced by the Storm tenant varies between 25% and 80% across frameworks and input sizes, and is insignificant for the 1GB input.

The reason for this variation is that Storm accepts offers from Mesos for three Mesos Slaves to run its job. This leaves three Slaves for Hadoop, SparkFG, and SparkCG to share. The degradation is limited because fewer Slaves impose less startup overhead on the framework executors per Slave. The overhead of staging new Mesos Tasks and spawning executors is so significant that it is not amortized by the additional parallelization that results from additional Mesos Slaves.

When we submit batch and streaming jobs simultaneously to Mesos, we find no perceivable interference from batch systems on Storm throughput and latency when Storm receives its offers ahead of those for the batch frameworks. When Storm receives its offers after a batch framework, it blocks in the same way that fine-grained frameworks do (as we describe in the previous section) and fair sharing is violated.

D. Startup Overhead of Mesos Tasks

We next investigate Mesos Task startup overhead for the batch frameworks under study.
We define the startup delay of a Mesos Task as the elapsed time between when a framework issues the command to start running an application and when the Mesos Task appears as running in the Mesos user interface. As part of startup, the frameworks interact with Mesos via messaging and access HDFS to retrieve their executor code. This time includes that for setting up a Hadoop or Spark job, for launching the Mesos executor (and the respective framework implementation, e.g. TaskTracker, Executor), and for launching the first framework task.

We find that as new tasks are launched (each on a new Mesos Slave), the startup delay increases and each successive Slave takes longer to complete the startup process as a result of network and disk contention. Slaves that start earlier complete the startup process earlier and initiate application execution (execution of tasks). Task execution consumes significant network and disk resources (for HDFS access), which slows down the startup process of later Slaves. We also find that this interference grows with the size of the application input. (We provide the performance graph at [14].) Our measurements indicate that this overhead contributes to a significant degradation in the