Simulating Resource Management across the Cloud-to-Thing Continuum: A Survey and Future Directions

In recent years, there has been significant advancement in resource management mechanisms for cloud computing infrastructure performance in terms of cost, quality of service (QoS) and energy consumption. The emergence of the Internet of Things has led to the development of infrastructure that extends beyond centralised data centers from the cloud to the edge, the so-called cloud-to-thing continuum (C2T). This infrastructure is characterised by extreme heterogeneity, geographic distribution, and complexity, where the key performance indicators (KPIs) for the traditional model of cloud computing may no longer apply in the same way. Existing resource management mechanisms may not be suitable for such complex environments and therefore require thorough testing, validation and evaluation before even being considered for live system implementation. Similarly, previously discounted resource management proposals may be more relevant and worthy of revisiting. Simulation is a widely used technique in the development and evaluation of resource management mechanisms for cloud computing but is a relatively nascent research area for new C2T computing paradigms such as fog and edge computing. We present a methodical literature analysis of C2T resource management research using simulation software tools to assist researchers in identifying suitable methods, algorithms, and simulation approaches for future research. We analyse 35 research articles from a total collection of 317 journal articles published from January 2009 to March 2019. We present our descriptive and synthetic analysis from a variety of perspectives including resource management, C2T layer, and simulation.


Introduction and Motivation
Today, we are in the midst of a new evolution of the Internet, the Internet of Things (IoT), "a global network and service infrastructure of variable density and connectivity with self-configuring capabilities based on standard and interoperable protocols and formats. IoT consists of heterogeneous things that have identities, physical and virtual attributes, and are seamlessly and securely integrated into the Internet" [1]. Forecasts suggest that the number of connected things will continue to explode over the next five years reaching 42 billion and generating 79.4 zettabytes of data [2]. This explosion of data is changing the characteristics of Internet traffic fundamentally. of resource management topics investigated and simulation tools and methods used. The article concludes in Section 6 with a brief summary of findings and future directions for research. A glossary of acronyms used in this article can be found at the end of the article.

The Cloud-to-Thing Continuum
Traditional cloud computing aims to provide a virtually unlimited pool of resources available for on-demand provisioning. Consequently, by design, cloud data centers are hyperscale systems characterised by hardware homogeneity, a common management layer, single organisation control, and a strong focus on cost efficiency [16]. They largely assume centralised compute and storage and are often located outside of major urban areas due to the cost and size of these operations.
The IoT revolves around a number of key concepts and enabling technologies including object (thing) identification, information sensing, communications technologies for data exchange, and network integration technologies [17]. Due to the scale, heterogeneity, distributed geographies, mobility, and variation in use and context inherent in IoT, traditional cloud computing and telecommunication networks can struggle to meet QoS requirements. As a result, new paradigms in computing have emerged to locate processing and storage along a C2T continuum: in the cloud, locally (at the edge), or somewhere in between (the fog). For the purposes of this study, the following NIST definitions will be used:
• Fog Computing is a layered model for enabling ubiquitous access to a shared continuum of scalable computing resources. The model facilitates the deployment of distributed, latency-aware applications and services, and consists of fog nodes (physical or virtual), residing between smart end-devices and centralised (cloud) services [3].
• Edge Computing is the network layer encompassing the end devices and their users, to provide, for example, local computing capability on a sensor, metering or some other devices that are network-accessible [3].
Edge devices and fog nodes often have limited capabilities due to their form factor and limited power. Their geographic position may change regularly and unexpectedly, e.g., smart vehicles, and they may have intermittent connectivity to the Internet or other latency issues. Resource management in fog and edge computing environments directly impacts both quality of service (QoS) and quality of experience (QoE). Note that QoS is network-centric, whereas QoE is user-centric [18]. In addition, poor resource management may result in degradation of the useful working life of the end point device.

Simulation Software: From the Cloud to the Edge
The majority of real-world systems are too complex to be modelled using analytical approaches, hence such systems are studied using simulation [19]. Cloud computing is such a complex system comprising different interacting components, the behaviour of which can be modelled using a simulation approach. The core elements of the cloud system model describe data center hardware resources such as CPUs, memory, storage and network capabilities, the virtual environment where services are deployed, and user interaction with the services in the form of requests, jobs or tasks.
In cloud computing, simulation is typically used to evaluate system performance by modelling the relationships between hardware resource consumption and user request execution. Performance is typically measured by resource utilisation, task completion time, and/or service costs.
The most common simulation method for cloud systems is Discrete Event Simulation (DES) [20,21] where every system change is modelled as an event. Each event occurs in a sequential order and is capable of changing the state of the entire system, hence a strict, orderly queue of events must be observed. Maintaining such a queue of events can become a limiting factor for modelling large scale systems, particularly warehouse scale and beyond. Such models require a lot of memory and are hard to parallelise. New approaches, such as Discrete Time Simulation (DTS) [22], are emerging to overcome these limitations. DTS eliminates the need for a queue and calculates the system state in parallel at each time step.
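The contrast between the two methods can be illustrated with a minimal Python sketch. It is not taken from any of the simulators cited here: the single first-in first-out server, the fixed service time, and the names `run_des` and `run_dts` are assumptions made purely for illustration. The DES version keeps an ordered event queue, while the DTS version has no queue and simply recomputes the state at every fixed time step.

```python
import heapq


def run_des(arrivals, service_time):
    """Discrete event simulation of one FIFO server (illustrative only).

    arrivals: task arrival times in ascending order.
    Returns the completion time of each task, indexed by task id.
    """
    # Events are (time, sequence number, kind, task id); the heap keeps
    # them in strict time order, which is the queue described in the text.
    queue = [(t, i, "arrive", i) for i, t in enumerate(arrivals)]
    heapq.heapify(queue)
    seq = len(arrivals)
    server_free_at = 0
    done = {}
    while queue:
        now, _, kind, task = heapq.heappop(queue)
        if kind == "arrive":
            # The task starts when both it and the server are available.
            start = max(now, server_free_at)
            server_free_at = start + service_time
            heapq.heappush(queue, (server_free_at, seq, "finish", task))
            seq += 1
        else:
            done[task] = now
    return [done[i] for i in range(len(arrivals))]


def run_dts(arrivals, service_time, dt, horizon):
    """Discrete time simulation of the same server (illustrative only).

    The clock advances in fixed steps of dt; no event queue is kept.
    """
    remaining = [service_time] * len(arrivals)
    done = [None] * len(arrivals)
    for k in range(int(round(horizon / dt))):
        now = k * dt
        # Serve the earliest arrived, unfinished task during this step.
        for i, t in enumerate(arrivals):
            if t <= now and done[i] is None:
                remaining[i] -= dt
                if remaining[i] <= 1e-9:
                    done[i] = now + dt
                break
    return done


if __name__ == "__main__":
    print(run_des([0, 1, 2], 2))
    print(run_dts([0, 1, 2], 2, dt=1, horizon=10))
```

In this toy case both functions agree when the step size divides the service time. The sketch runs the per-step update sequentially; in a model with many nodes, each node's update within a step does not depend on a global event order, which is the property that makes the per-step work a candidate for parallel execution.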
Previous research [20,21] suggests that CloudSim is the most cited simulation framework used for cloud computing. In our review, this is also the case. CloudSim [6] uses DES and is open source. As such, it is widely used in scholarly research in its original form and has been extended by different researchers with additional features [23][24][25], to support more complex models and use cases [26][27][28][29], as well as enabling the evaluation of a wider set of KPIs [30,31]. Other DES simulation frameworks used for cloud research include OMNeT++ [32], CloudSched [13], DCSim [33], GreenCloud [34,35] and SimIC [36]. A more comprehensive overview of simulation software can be found in Byrne et al. [20,21].
Given the scale and complexity of modern cloud computing, let alone fog and edge computing, recent articles [37,38] have noted the need for simulator software to be designed (i) to simulate parallel-scheduling models, (ii) to support the scale of modern hyperscale infrastructure, (iii) for easy replication and extension, and (iv) for validation against a trustworthy source. It is questionable whether this is possible at hyperscale with a pure DES approach. While the use of DTS simulation software is in its nascency, its advantages in terms of scalability and speed are promising for fog and edge computing research [22].

Resource Management
Resource management is an umbrella term that encompasses all the characteristics and usage of resources in the cloud. It includes two main steps: (i) resource provisioning (resource detection and resource selection), and (ii) resource scheduling (resource mapping, resource allocation, resource monitoring and load balancing) [10]. These two steps face similar challenges, namely the dispersion, uncertainty and heterogeneity of resources. As such, research has focused on new resource provisioning methods and resource scheduling algorithms so that users receive cloud services within the time, cost and QoS parameters set out in the service level agreement (SLA) [8,9]. Resource management in joint edge-fog computing environments faces similar challenges to traditional cloud computing, but with significantly greater resource constraints, heterogeneity, multi-tenancy, and dynamism [5,11].
While a detailed discussion of resource management is beyond the scope of this paper, a number of surveys on resource management for cloud computing and, to a lesser extent, fog and edge computing are worth referencing. Reference [9] is one of the most widely cited survey papers on resource provisioning. It categorises literature from 2007 to 2014, identifying different types of study, and classifying resource provisioning mechanisms and their sub-types. In its analysis of 105 papers, only eight different simulators were identified for validating resource provisioning in the cloud: CloudSim, CloudAnalyst, GreenCloud, NetworkCloudSim, EMUSIM, SPECI, GroundSim and DCSim [9]. Chana and Singh (2016) [8] present a similarly widely cited survey on resource scheduling based on works published from 2005 to mid-2015. They classify papers by study type, resource type, scheduling algorithms (and subtypes), QoS parameters, resource distribution policies and resource scheduling tools. Again, the same eight simulators as in [9] were identified. Unsurprisingly given the period covered, neither fog nor edge computing appears in either survey. More recently, Kumar et al. (2019) [10] published a survey of 209 papers published from 2010 to 2018 on scheduling techniques in cloud computing. While they note advancement in the state of the art, they also highlight that the same underlying resource management issues persist. In the context of this paper, it is worth noting that Kumar et al. (2019) [10] analyse the evaluation methods and simulation tools used, and note that simulation is used in 81% of the papers; however, they only identify six simulators: CloudSim, CloudAnalyst, EMUSIM, GroundSim, GreenCloud, and NetworkCloudSim. Fog computing is referenced as an opportunity for future research and edge computing is not referenced at all.
Hong [12] recently published a survey of 100 papers on resource management in fog and edge computing; the analysis identified the preponderance of iFogSim and CloudSim in 49% of studies reviewed but also noted use of OMNeT++, Java JMT as well as other mathematical simulation techniques. Duc et al. (2019) [11] recently published a survey of 73 papers on resource provisioning for edge-cloud environments. Of the 38 papers identified as using simulation, the majority used mathematical simulation; simulation software was only identified in two studies, both of which used CloudSim or extensions thereof.
As can be seen from this brief overview, resource management is a key research topic in cloud, fog and edge computing. Simulation software is widely used for evaluating resource management methods and algorithms; however, the coverage and detail of simulation analysis in these surveys are left wanting from a simulation research perspective. To the best of our knowledge, this is the first methodical review of its kind on the evaluation of resource management mechanisms across the C2T continuum that focuses exclusively on studies using simulation software.

Methods and Review Framework
For the purposes of this review, we follow a methodical literature survey technique with three phases of activities (planning, conducting, and reporting) as per [39], complemented by reference to the widely accepted guidelines and process outlined by [39,40]. The remainder of this section details the research questions, the process for the identification of research, and the data extraction process.

Research Questions
The following research questions have been identified for this survey:
• What resource management mechanisms were simulated in the research sample?
• In which layer of the cloud-to-thing continuum were studies conducted?
• What variable or key performance indicator (KPI) was evaluated in the studies?
• What simulation tools were used in the studies?
• What simulation methods were used in the studies?
• How has research on the evaluation of resource management across the C2T continuum evolved over time?

Identification of Research
Research was identified from the following electronic databases: SpringerLink, ScienceDirect, IEEE Xplore, ACM Digital Library, Wiley Interscience, and Taylor & Francis Online. The search string "('resource management' OR 'resource scheduling' OR 'resource provisioning', 'QoS-based resource scheduling', 'QoS-based resource provisioning') AND cloud AND (fog OR edge) AND (IoT OR 'internet of things') AND ('simulation framework' OR simulator OR 'simulation tools' OR 'simulation engine')" was used to query these databases. To keep a focus on the quality of works, we limited the search to journal publications only, excluding symposiums, workshops, book chapters, and conference papers. In the case of ScienceDirect, the query did not produce sufficiently accurate results. As such, the query for ScienceDirect was rerun to include the keywords "simulation framework", "simulator", "simulation tools", and "simulation engine".
The initial search returned a total of 317 research articles related to the research topic. In line with Kitchenham (2004) [39], two researchers independently screened titles, abstracts and conclusions, meeting regularly to resolve differences. Title-based exclusion reduced the number of papers to 82; subsequent exclusion based on abstract and conclusions reduced the number of works to 60. In the next pilot phase, we started reading papers fully.
Following the pilot phase, a number of works were removed because they were outside the scope of the C2T continuum, e.g., grid and cluster computing. Additionally, we removed works that used mathematical analytical models referred to as simulations. As a number of quality issues were identified, it was decided to limit the survey to Q1 and Q2 journals only. The final inclusion and exclusion criteria are described in Table 1. Following the application of these criteria, 35 research articles were included in the final review. These are listed in Table 2.
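The screening counts reported above can be tabulated with a short Python sketch. The stage labels and the function name `screening_funnel` are illustrative assumptions; only the four counts (317, 82, 60 and 35) come from the text.

```python
def screening_funnel(stages):
    """Return (stage, retained, excluded at this stage, % of initial) rows."""
    initial = stages[0][1]
    previous = initial
    rows = []
    for name, retained in stages:
        share = round(100 * retained / initial, 1)
        rows.append((name, retained, previous - retained, share))
        previous = retained
    return rows


# Counts as reported in the text; labels are illustrative.
STAGES = [
    ("initial search", 317),
    ("after title screening", 82),
    ("after abstract and conclusion screening", 60),
    ("final review", 35),
]

if __name__ == "__main__":
    for row in screening_funnel(STAGES):
        print(row)
```

Read this way, roughly one in nine of the articles returned by the initial search was retained for the final review.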
While our search was across all years, the selected articles were all published in the period from January 2015 to March 2019 as shown in Table 2. Note that data were only collected for the first quarter of 2019; thus, only seven publications are reviewed for that year. An analysis of journals suggests that Future Generation Computer Systems is the leading journal publishing on the topic with six publications followed by Computers & Electrical Engineering, and Simulation Modelling Practice and Theory, each with three publications. Notwithstanding this, the most cited article (508 citations), Gupta et al. (2017) [28] (iFogSim), features in Software: Practice and Experience, a journal edited by one of the leading authorities in cloud simulation and the originator of CloudSim, followed by Samimi et al. (2016) [42] in Information Sciences (178 citations).

Data Extraction
The following data were extracted for each of the 35 articles: bibliographic data, quality indicators, the resource management context, the variable or KPI under study, the C2T context (cloud, fog, edge or some combination thereof), the simulation software, and the simulation methodology. Data were extracted by two authors independently and compared. A third author cross-checked on random samples. Again, where conflict arose, a compromise meeting was convened and issues resolved. Extracted data were stored in a shared spreadsheet and entered using a form designed for this purpose.

Summary of Selected Research
This section summarises the research articles identified in the survey. Articles are presented by layer (cloud, fog, and edge) and within each layer in ascending order based on their year of publication.

Cloud Computing
Malawski et al. (2015) [41] examine resource scheduling and provisioning for a scientific workflow ensemble on an Infrastructure as a Service (IaaS) cloud. The goal of the study was to maximise the number of prioritised workflows that can be completed given budget and deadline constraints. Malawski et al. develop three algorithms to solve this problem: (1) a dynamic provisioning dynamic scheduling (DPDS) algorithm, (2) a workflow-aware DPDS (WA-DPDS) algorithm that extends DPDS by introducing a workflow admission procedure, and (3) a static provisioning static scheduling (SPSS) algorithm that creates a provisioning and scheduling plan before running any workflow tasks. This enables SPSS to start only those workflows that it knows can be completed given the deadline and budget constraints, and eliminates any waste that may be allowed by the dynamic algorithms. The algorithms are evaluated using CloudSim on a range of synthetic workflows, which were generated based on statistics from real scientific applications. The results of the simulation studies indicate that the two algorithms that take into account the structure of the workflow and task run-time estimates, WA-DPDS and SPSS, yield better results than the simple priority-based scheduling strategy (DPDS), which makes provisioning decisions based purely on resource utilisation.
Arianyan et al. (2015) [44] explore consolidation in virtualised cloud data centers to tackle the problem of energy consumption and related increases in carbon emissions. They propose an Enhanced Optimisation (EO) policy as a novel resource management procedure for cloud data centers, together with novel multi-criteria algorithms for both phases, namely resource allocation and the determination of under-loaded physical machines (PMs). More precisely, they propose TPSA (TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) Power and SLA-aware Allocation) as a novel policy for resource allocation as well as MDL (Migration Delay), AC (Available Capacity), and TACND (TOPSIS Available Capacity, Number of VMs, and Migration Delay) for determination of under-loaded PMs. Arianyan et al. used CloudSim to test and evaluate the proposed policies and algorithms using synthetic workloads. They compare the simulation results using different parameters. Performance is measured by comparing energy consumption, number of VM migrations and SLA violations.
Kecskemeti (2015) [35] proposes a new IaaS simulation framework called DISSECT-CF for simulating discrete event-based energy consumption for clouds and federations of clouds. It is designed for easy extensibility, energy evaluation, and rapid evaluation of multiple scheduling and IaaS internal behaviours. The simulator has five main components: Event System (time reference), Unified Resource Sharing (resource consumption, resource scheduler), Energy Modelling (power state, consumption model), Infrastructure Simulation (VMs, PM, Network node), and Infrastructure Management (VM scheduling, PM scheduling). DISSECT-CF offers two major benefits: (i) a unified resource-sharing model, and (ii) a more complete IaaS stack simulation including, for example, VM image repositories, storage, and in-data-centre networking. DISSECT-CF is analysed by first validating it against the behaviour of a small-scale infrastructure cloud at the University of Innsbruck. The simulated system behaviour matched real-life experiments with negligible error (in terms of application execution time, larger-scale network transfers, and energy consumption). For larger-scale experiments, DISSECT-CF was validated by comparing it to CloudSim [6] and GroundSim [71] using both real-world and synthetic traces. The comparison results reveal that DISSECT-CF is scalable and produces more accurate results.
Bux et al. (2015) [43] propose an extension of CloudSim, DynamicCloudSim, for evaluating resource allocation and scheduling strategies for cloud computing. The extended features of DynamicCloudSim include the ability to model instability in the cloud including heterogeneity and dynamic changes of performance at run-time, as well as failures during task execution. DynamicCloudSim allows the simulation of different kinds of tasks (e.g., CPU, I/O, bandwidth-bound, etc.) on VMs with different performance characteristics. To inject greater verisimilitude, DynamicCloudSim supports heterogeneity, randomisation of individual performance of a VM, dynamic changes at runtime due to external loads, and fault tolerant approaches (through the introduction of straggler VMs and failures during task execution). In their paper, Bux et al. assess the impact of instability on the performance of four schedulers (Round Robin, Heterogeneous Earliest Finish Time (HEFT), Greedy Queue and Longest Approximate Time to End (LATE)) using DynamicCloudSim and compare results with results from real cloud infrastructure (Amazon EC2).
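Why performance heterogeneity matters for scheduler comparison can be seen in a small Python sketch. It does not use DynamicCloudSim or its API, and it does not reproduce the experiments of Bux et al.; the functions `round_robin` and `greedy_queue`, the fixed VM speeds, and the omission of run-time performance changes and task failures are all simplifying assumptions made for illustration. A straggler VM is modelled here simply as a VM with a low speed factor.

```python
def round_robin(task_sizes, speeds):
    """Assign tasks to VMs in fixed rotation and return the makespan."""
    free_at = [0.0] * len(speeds)
    for i, size in enumerate(task_sizes):
        vm = i % len(speeds)
        free_at[vm] += size / speeds[vm]
    return max(free_at)


def greedy_queue(task_sizes, speeds):
    """Give each task to the VM that becomes idle first; return the makespan."""
    free_at = [0.0] * len(speeds)
    for size in task_sizes:
        # Ties are broken in favour of the lowest VM index.
        vm = min(range(len(speeds)), key=free_at.__getitem__)
        free_at[vm] += size / speeds[vm]
    return max(free_at)


if __name__ == "__main__":
    tasks = [1.0] * 8
    speeds = [1.0, 0.25]  # second VM is a straggler (hypothetical values)
    print("round robin :", round_robin(tasks, speeds))
    print("greedy queue:", greedy_queue(tasks, speeds))
```

With identical VMs the two policies give the same makespan in this toy setting. Once one VM is a straggler, the fixed rotation keeps sending it work, whereas the queue-based policy sends most tasks to the faster VM.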
Magalhaes et al. (2015) [45] propose an extension of CloudSim in the form of a web application model to capture the behavioural workload patterns of different user profiles, and to support analysis and simulation of resource utilisation in cloud environments. The workload patterns are modelled in the form of statistical distributions, thus the patterns fluctuate based on realistic parameters to represent dynamic environments. The simulation results suggest that web application modelling can produce data to accurately represent different user profiles. A validation model is provided through graphical and analytical methods to show that the simulator effectively represents the observed patterns.
Higashino et al. (2016) [27] propose an extension to CloudSim called CEPSim for modelling and simulating both complex event processing (CEP) and stream processing (SP) in the cloud. It can be used to study the scalability and performance of CEP queries and to easily compare the effects of different query processing strategies. The simulator can model different CEP systems by transforming user queries into a directed acyclic graph (DAG). The simulator supports resource placement and scheduling algorithms. CEPSim is highly customisable and can be used to analyse the performance and scalability of user-defined queries and to evaluate the effects of various query processing strategies. The CEPSim simulation results are compared to real CEP/SP system results. Experimental results suggest that CEPSim can simulate existing systems in large Big Data scenarios with accuracy and precision.
Castro et al. (2016) [47] first evaluate the impact of RAM and CPU usage on energy consumption in a cloud data center; then, they propose two joint CPU-RAM approaches for dynamic VM placement in cloud data centers. The former is concerned with the detection of under-loaded servers, and the latter with VM placement. The proposed approaches are tested in CloudSim and the results show that they can reduce energy consumption and SLA violations. However, the approach only examines consolidation on the data center servers and not in the network and storage devices. The technique is also computationally heavy.
Da Silva et al. (2016) [46] propose a topology-aware VM placement algorithm. The goal of the proposed algorithm is to place a set of VMs in data centers in order to minimise the areas of a hierarchical data center network for a set of VM placements, so that fewer switches are needed to serve the network flows. This should lead to consolidated network flows from the VMs. In order to test their algorithm, Da Silva et al. extend CloudSim by adding a data center network topology (fat-tree topology) and an energy consumption model. Simulation results show that the proposed algorithm prevents the formation of network bottlenecks, therefore accepting more VM allocation requests without compromising energy efficiency. The energy consumption of servers and switches is taken into account, and these are switched off whenever idle. However, the simulation addresses only one network topology and considers just one traffic pattern.
Samimi et al. (2016) [42] propose a new model called Combinatorial Double Auction Resource Allocation (CDARA) to address biased market-based resource allocation by the cloud provider. CDARA focuses on the principle of equitable fairness to both providers and buyers. The scenario of multiple users and multiple cloud providers was simulated using four entities (user, broker, cloud provider, and cloud marketplace). CDARA has seven phases: (1) advertising and resource discovery where the provider proposes its services (resources), the user proposes their requirements, and the Cloud Information Service (CIS) creates the list of resources that matches the user's requirements; (2) creating the auction and the bundle; (3) informing the end user of an auction; (4) determining the winner; (5) resource allocation; (6) pricing model; and (7) task allocation and payment. Samimi et al. use CloudSim to evaluate the performance and efficiency of CDARA based on an economic perspective and incentive compatibility. The simulation results show the effectiveness of the CDARA model. In CDARA, the winners are determined by a mediator; determining an optimal winner that maximises total profit for the cloud providers would seem to remain an open challenge.
Cai et al. (2017) [48] propose a Delay-based Dynamic Scheduling (DDS) algorithm for dynamic resource provisioning and scheduling to minimise resource renting costs while meeting workflow deadlines. New VMs are dynamically rented by the DDS according to the practical execution state and estimated task execution times, to fulfil the workflow deadline. Bag-based deadline division and delay scheduling strategies consider the bag structure to decrease the total renting cost. Cai et al. use a simulator called ElasticSim, an extension of CloudSim, to evaluate the performance of their proposed algorithm. The simulation results suggest that the dynamic algorithm decreases the resource renting cost while guaranteeing the workflow deadline compared to existing algorithms.
Lin et al. (2017) [50] propose an extension for CloudSim, MultiRE-CloudSim. They add a multi-resource scheduling and power consumption model which allows for a more accurate evaluation of power consumption in dynamic multi-resource scheduling for cloud computing. MultiRE-CloudSim offers the following capabilities to CloudSim: (1) the ability to change the configuration of host and power models of resources easily, and to test the effect of the algorithm under different parameters; (2) simulation of tasks that demand multiple types of resources and definition of different resource allocation algorithms with fine-grained evaluation; (3) seamless switching between static load and dynamic load experiments, which enables simulation of more scenarios; (4) power simulation of multi-resource scenarios, which is more accurate than single-resource CPU power simulation. Lin (4) placement of migrating VMs on PMs. 
They propose novel fuzzy multi criteria and an objective resource allocation algorithm that enables resource administrators to weight different criteria in the resource management solution by importance using fuzzy weights. CloudSim is used to evaluate the performance of the proposed algorithms. Experimental results suggest that the combination of the proposed resource management policies leads to a notable reduction in both energy consumption and SLA violations compared to benchmarks.
Ranjbari et al. (2018) [52] not only evaluate CPU usage but propose an algorithm using learning automata overload detection (LAOD) to predict CPU usage of PMs based on user resource usage history in order to improve resource utilisation and reduce energy consumption. The technique is tested using CloudSim and compared to benchmark algorithms. The results suggest that the approach outperforms comparators in reducing data center energy consumption. In addition, this study illustrates the benefit of using learning algorithms to predict resource usage for efficient resource management in the cloud.
Mishra et al. (2018) [56] propose Energy-aware Task-based Virtual Machine Consolidation (ETVMC), a new task-based VM placement algorithm for mapping tasks to a VM, and a VM to a PM, in the cloud. The goal of ETVMC is to efficiently allocate tasks to VMs and then VMs to hosts so that the allocation minimises energy consumption, makespan and task rejection rate. The proposed approach is evaluated and tested using CloudSim, and the simulation results demonstrate the effectiveness of the proposed algorithm compared to FCFS (First-Come First-Served), Round-Robin and EERACC (Energy Efficient Resource Allocation in Cloud Computing).
Gawali et al. (2018) [53] propose a heuristic approach for optimising task scheduling and resource allocation in the cloud. The heuristic is a combination of four methods: (i) the modified analytic hierarchy process (MAHP), (ii) bandwidth aware divisible scheduling (BATS) + BAR (Balance-Reduce) optimisation, (iii) longest expected processing time preemption (LEPT), and (iv) divide-and-conquer methods. The MAHP is used to process the task before its allocation in the cloud while BATS + BAR optimisation is used to allocate the resources for that given task. The optimisation takes into consideration the bandwidth and cloud resources as constraints. The LEPT is used to preempt resource-intensive tasks. The divide-and-conquer method is applied to aggregate the results after task preemption (scheduling and allocation). Gawali et al. use CloudSim to test and evaluate their proposed approaches. Experimental results suggest that the proposed approach outperforms the existing BATS algorithm in terms of turnaround time and response time metrics.
Filelis-Papadopoulos et al. (2018) [22] propose a highly parallel MPI- and OpenMP-based simulation framework for hyperscale cloud computing, the CloudLightning Simulator. The simulation implements a sequential search and deploy algorithm that iterates through available resources to deploy VMs and tasks. The validation of the framework is based on the number of tasks deployed, task execution time and resource utilisation parameters. Results suggest that the proposed simulator can cater for extremely large numbers of cloud nodes, the execution of extremely large numbers of tasks, and can support CPU, memory and network over-commitment.
Kumar et al. (2018) [57] propose a resource allocation model for processing applications efficiently in the cloud, PSO-COGENT. PSO-COGENT uses the meta-heuristic based particle swarm optimisation (PSO) approach to allocate tasks to resources in an optimal way. PSO-COGENT optimises execution time and cost, and reduces energy consumption in cloud data centers, considering deadline as a constraint. The problem is formulated as a multi-objective scheduling problem in the form of a mathematical model with defined objective and fitness functions. Kumar [37] examine the performance of a novel dynamic cloud resource allocation framework based on the principles of self-organisation and self-management (SOSM). The evaluation is performed using a warehouse scale model of a cloud data center and a parallel time-advancing loop-based simulation framework. Simulation experiments take into account resource heterogeneity including hardware accelerators such as Graphics Processing Units (GPU), Many Integrated Cores (MIC) and Field Programmable Gate Arrays (FPGA). The SOSM framework evaluation is conducted by analysing resource utilisation parameters such as CPU, memory, network and energy consumption. Results suggest that the proposed approach enables more accurate decisions on the placement of VMs over resources compared to the traditional centralised management approach.
Fernández-Cerero et al. (2018) [58] propose SCORE, a hybrid discrete-event multi-agent scheduling and resource utilisation simulator to reduce energy consumption in cloud computing environments. SCORE comprises two parts: (a) an energy-aware independent batch scheduler; and (b) a set of energy-efficiency policies for the hibernation of idle VMs. Four scheduling policies have been proposed for the control of energy consumption and the makespan during the assignment of tasks to VMs. The proposed scheduler takes into account the security demands of each task and the trust levels of the VMs that are computing those tasks. Additionally, the proposed simulator enables the computation of energy consumption for the whole system including the energy spent on performing security operations. Fernández-Cerero et al. test and validate their proposed simulator using different policies and parameters. Results suggest a reduction of up to 45% of cloud system energy consumption costs. Madni et al. (2019) [68] propose a Multi-objective Cuckoo Search Optimisation (MOCSO) algorithm for dealing with the resource scheduling problem in cloud computing. The algorithm reduces costs for cloud users and enhances performance by minimising task execution time and maximising resource utilisation. Improved resource utilisation enables increased revenue for cloud providers as more user requests can be processed with existing hardware. The MOCSO algorithm is evaluated using CloudSim and its performance compared against other multi-objective algorithms, namely ant colony, genetic, multiple order model migration and particle swarm. Simulation results suggest that the proposed MOCSO algorithm performs better than the existing algorithms, and that it balances multiple objectives for expected time to completion and expected cost to completion matrices for resource scheduling in the IaaS cloud computing environment. 
The problem was modelled as a multi-objective optimisation problem that optimises the resource scheduling and load balancing while minimising the resource utilisation and processing time in the cloud environment. Simulation analysis suggests that F-MRSQN achieves better performance in terms of average success rate, resource scheduling efficiency and response time. It also suggests that F-MRSQN improves resource scheduling efficiency by 7% and reduces response time by 35.5% when compared to the benchmarks.
Moghaddam et al. (2019) [63] propose an extension of CloudSim by embedding new functionalities such as dealing with anomaly detection and elasticity of the resources (VMs) in the cloud environment. The proposed Anomaly and Cause Aware auto-Scaling (ACAS) framework consists of three main modules, monitoring, a data analyser (to detect anomalies), and a resource auto-scaler, which exploits two types of the resource adjustment policies, horizontal and vertical scaling. Horizontal policies address resource configuration strategies that change the number of active VMs in the system, whereas vertical policies are defined at finer grains of control (Elastic VMs) and adjust the number of allocated resources based on the new demands of the VM. An extensive experimental evaluation of the proposed system is done under various loads. Results demonstrate the ability of ACAS to maintain better quality CPU and response time performance compared to existing resource management solutions. Gupta et al. (2017) [28] propose an extension to CloudSim, iFogSim, which is capable of modelling fog computing and IoT scenarios. It enables simulation of resource management and application scheduling policies across edge and cloud resources under different scenarios and conditions. Additionally, iFogSim allows application designers to validate the design of their application with regards to cost, network use, and latency. Resource management is the core component of the iFogSim architecture and consists of components that coherently manage resources of the Fog device layer in such a way that application-level QoS constraints are met and resource wastage is minimised. The paper also presents a simple IoT simulation recipe and two case studies to demonstrate how one can model an IoT environment and plug in and compare resource management policies. Sood (2018) [54] propose a social network analysis (SNA) technique to detect resource deadlock at the fog layer. 
SNA is a rule-based approach for priority resource allocation. If a deadlock is detected, SNA tries to find free resources from jobs already allocated, and if the deadlock is not solved, the job is sent to the cloud. The allocation is based on rules and priorities. Sood uses CloudSim to evaluate the performance of their proposed technique and examine resource utilisation and request processing metrics. Simulation results suggest that SNA provides an effective solution to detect and solve deadlock in the fog and enables efficient resource utilisation in both the cloud and fog layers. Notwithstanding this, the study does not compare SNA to other state-of-the-art techniques. Mahmoud et al. (2018) [60] propose an energy-aware allocation policy for placing application modules in fog devices. It is an energy-efficient strategy that allocates the incoming tasks to fog devices based on the remaining CPU capacity and energy consumption. The aim of the proposed allocation strategy is to increase energy efficiency at fog devices by allocating application tasks based on improved Round Robin and DVFS algorithms. Mahmoud et al. use iFogSim to evaluate the performance of the proposed approach. Remote patient monitoring was used as the use case to evaluate the energy efficiency of the proposed placement strategy. The performance of the proposed strategy is evaluated by comparing it with the default allocation and cloud-only policies. Simulation results suggest that the proposed solution is more energy-efficient with approximately 2.72% in energy savings compared to cloud-only policies and approximately 1.6% in energy savings compared to the fog-default. Qayyum et al. (2018) [61] propose FogNetSim++, a novel simulation framework that provides detailed configuration options to simulate large fog network scenarios. The framework is based on OMNeT++ and incorporates customised mobility models, fog node scheduling algorithms, and handover management mechanisms. 
FogNetSim++ can simulate network traffic at a packet level including packet drop, delay, network errors, handovers, latency, and re-transmissions. To evaluate the performance of the simulator, a communication network scenario is simulated based on a smart traffic management system. The simulation analysis demonstrates the scalability and effectiveness of FogNetSim++ in terms of CPU and memory usage. The network parameters, such as execution delay, packet error rate, handovers, and latency are also benchmarked to demonstrate FogNetSim++'s ability to simulate network functionality. Naranjo et al. (2018) [62] present a study of popular applications used in the Fog/IoT paradigm, followed by an architectural design for resource management using a container middleware layer. The proposed architecture allows dynamic scaling of compute and network resources to meet the demands of IoT and the required QoS. In addition, the proposed management architecture is extended by a bin packing heuristic that works with scaling options to jointly reduce energy consumption. The approach is evaluated using two scenarios in iFogSim, a static and a mobile application, and compared against Modified Best-Fit-Decreasing (MBFD) and Maximum Density Consolidation (MDC) algorithms. Results suggest that the average energy consumption of the proposed heuristic is 20.36% and 33.84% less than the benchmark algorithms.
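For readers unfamiliar with the bin-packing family of consolidation heuristics referred to above, the following is a minimal sketch of the classic best-fit-decreasing rule. It is illustrative only: it is neither the heuristic of Naranjo et al. nor the MBFD benchmark, it assumes identical hosts with a single scalar capacity, and the names `Host` and `best_fit_decreasing` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A hypothetical host with one scalar resource dimension (e.g., CPU units)."""
    capacity: int
    used: int = 0
    vms: list = field(default_factory=list)

    @property
    def free(self) -> int:
        return self.capacity - self.used


def best_fit_decreasing(demands, capacity):
    """Place each demand on the feasible host with the least free capacity.

    Demands are processed from largest to smallest; a new host is opened
    only when no existing host can accommodate the current demand.
    """
    hosts = []
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds host capacity {capacity}")
        # Hosts that still have room for this demand.
        candidates = [h for h in hosts if h.free >= demand]
        if candidates:
            # "Best fit": the tightest host that still fits the demand.
            target = min(candidates, key=lambda h: h.free)
        else:
            target = Host(capacity)
            hosts.append(target)
        target.used += demand
        target.vms.append(demand)
    return hosts


if __name__ == "__main__":
    for i, host in enumerate(best_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)):
        print(f"host {i}: placed={host.vms} free={host.free}")
```

Packing workloads onto fewer hosts is what allows idle hosts to be powered down, which is why this family of heuristics is commonly used as a baseline in energy-oriented studies. Energy-aware variants typically replace the "least free capacity" criterion with an estimate of the power increase caused by each candidate placement.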

Fog Computing
AbdElhalim et al. (2019) [70] use iFogSim to explore the latency of user nodes offloading tasks in a hyper-dense fog-to-cloud computing network. AbdElhalim et al. introduce adaptive low latency networks, irrespective of the number of user nodes, based on game theory. Game theory was applied to reduce energy consumption on the fog nodes while maintaining an acceptable latency. They also develop automated distributed fog computing for computational offloading using the theory of minority games. Resource allocation was performed in a distributed manner for better network management. Simulation results suggest that the proposed algorithm achieves the user satisfaction latency deadline as well as the required QoE. Moreover, it guarantees an adaptive equilibrium level of the Fog-to-Cloud computing system, which is suitable for heterogeneous wireless networks. Talaat et al. (2019) [67] propose an Effective Load Balancing Strategy (ELBS) for fog computing environments using data from a medical sensor application use case. Their solution includes a Priority Assigning Strategy (PAS), a Data Scheduling Algorithm (DSA), an External Data Requesting Algorithm (EDRA), a Server Requesting Algorithm (SRA), and a Probabilistic Neural Network based Matching Algorithm (PMA). PAS is used to classify incoming tasks based on their priority. DSA is used to locate the data on a local file system for task execution, and EDRA is used to fetch data from a remote file system if it is not present at the task processing location. The SRA algorithm allocates resources to process tasks by liaising between the Fog Master Server and Master Server Manager. Finally, PMA is used to match tasks with the server in a master priority queue. The efficiency of the proposed approach is demonstrated by modelling the system and medical sensor data use case in iFogSim. Simulation results suggest that ELBS outperforms benchmarks, achieving the lowest average turnaround time and failure rate. 
Accordingly, ELBS may be a suitable strategy for load balancing in the fog as it supports reliable execution for real-time applications.
Guerrero et al. (2019) [69] focus on resource provisioning in fog environments. They propose a lightweight decentralised service placement policy for fog computing that aims to reduce latency and network usage. iFogSim is used for evaluation, and the results are compared to the existing (Edgewards) placement policy; they support the efficiency claims of the proposed approach.  [66] propose a discrete-time parallel simulation framework, based on the CloudLightning DTS simulator [22], that can handle large vCDN networks. The proposed simulation framework can update, in parallel, the state of sites and their resource utilisation with respect to incoming requests at hyperscale significantly faster than existing cloud simulators. The framework is capable of reducing memory requirements while enhancing performance. It can be used for studying and optimising multi-content large scale vCDN networks under different distributions of VMs, hardware and input requests. Results support the claims of the improved effectiveness of the proposed simulation framework for studying vCDNs and optimisation of network infrastructure.

Resource Management Mechanisms
As we can see in Table 3, there is a significant imbalance in the evaluation of resource management across the C2T continuum. At a high level, every paper included resource scheduling. More specifically, 28 of the 35 papers (80%) focus on resource allocation; it was the primary focus of all papers regardless of C2T layer (see Table 3). This is not wholly surprising given that resource allocation is the central task in resource scheduling. With respect to other resource scheduling tasks, only two papers focus on resource monitoring [35,63] and three others specifically on load balancing [65][66][67]. Five papers focus on resource mapping [41,48,50,58,68], all in relation to cloud computing rather than fog and edge computing. This suggests a significant opportunity for research in evaluating resource mapping, resource monitoring and load balancing using simulation, in all layers of the C2T continuum. With regard to resource provisioning, only nine papers [41,[48][49][50]62,[64][65][66]69] specifically addressed both detection and selection tasks; with the exception of [62,69] in fog computing and [66] in edge computing, the remainder of the articles focus on the cloud layer.
There is much more variation in the variables investigated. As well as a variety of performance metrics (e.g., VM placement, scalability, run-time, latency, accuracy, failure rates, and resource utilisation), three major themes that emerge are energy consumption/efficiency, cost, and SLA adherence (see Table 3). These are largely consistent across research on all layers of the C2T continuum although the number of articles on fog and edge computing is low. Fog and edge computing have obvious idiosyncrasies that do not feature in the research on centralised cloud computing. These relate to the different resources that feature in architectures built on these paradigms, for example, network usage and utilisation of fog nodes, and to the factors of particular importance in them, for example, latency, device energy constraints, and reliability.
It is worth noting a significant difference between cloud computing and fog and edge computing in terms of focus. Resource management research in cloud computing overwhelmingly focuses on IaaS and as such the research focuses on the data center and infrastructure deployed in that environment for both input and output variables. In one sense, cloud computing models used in the simulation of resource management are much simpler. In contrast, fog and edge computing need to consider a significantly expanded universe of variables, not least the network infrastructure connecting the cloud to the fog and the edge, the configuration and location of the fog nodes and the edge devices, the connectivity of those edge devices, the capability and limitations of the form factors of the edge devices, and the use of context both in terms of user behaviour and situational context. All these factors complicate the time, effort and resources needed to undertake research across the cloud-to-thing continuum. Extant surveys identified more than 215 papers (of all types) on resource management in cloud computing, 105 on aspects of resource provisioning and 110 papers on aspects of resource scheduling. The research identified in this survey uses very similar QoS parameters to those reported in [8,9]; however, there is significant variance in resource management mechanisms used. There would seem to be significant opportunities for researchers to revisit extant research in these areas to evaluate the efficacy of these mechanisms across the C2T continuum, and not only in centralised clouds, and their impact on QoS parameters and particularly C2T-specific KPIs, e.g., latency, fog utilisation, etc. In particular, given the extreme complexity of the C2T continuum, nature and bio-inspired approaches (including self-managing and self-learning approaches) may provide particularly fruitful avenues for research. Fog and edge computing are emerging paradigms in computing. 
NIST published its definitions and reference architectures for cloud computing in 2011 [72,73], whereas they only presented their recommendations for fog computing in 2017 and 2018 [3,74], and the definition of edge computing only featured in an Annex to these reports. Indeed, there is a plethora of proposed edge and fog architectures reflecting the nascency of the domain [75].
It is therefore unsurprising that the majority of the selected research articles reviewed (25) focuses on the cloud layer, followed by fog computing (8) and then edge computing (2). One of the issues in this domain is the difficulty in separating the boundaries of fog and edge computing and the conflation of the term IoT with both fog and edge computing.

Simulation Tools and Methods
Six discrete simulators were identified in the survey: CloudSim [6], DISSECT-CF [35], SCORE [58], ONE [76], FogNetSim++ [61] and CloudLightning Simulator [22]. Despite the small range of underlying platforms, this is significantly more than reported in recent resource management surveys. In line with [20], CloudSim was by far the most popular cloud simulation framework used in resource management research surveyed in this study (see Table 3). Seventeen articles used CloudSim and a further ten articles used variations or extensions of CloudSim including ElasticSim [48], CEPSim [27], DynamicCloudSim [56], MultiRE-CloudSIm [50] and iFogSim [28]. The popularity of CloudSim can be attributed to the maturity of the simulator, the support community, and the modelling design approach which supports brokering of workloads between multiple data centres; a design that can be re-purposed for a distributed edge infrastructure. After the original CloudSim, the most popular extension of CloudSim was iFogSim, which was used in six studies, making it the most popular simulator for fog computing research. Only eight articles used simulators other than CloudSim or its variants. While the CloudLightning Simulator was used in three articles, it was used only by the originators of the software. Only two simulators were designed for fog computing specifically, iFogSim and FogNetSim++; both ONE and the variation of the CloudLightning Simulator in [66] were designed or adapted for edge computing scenarios.
As CloudSim is overwhelmingly the most popular base simulation platform in the survey, by default, DES is the most popular modelling technique. Only the CloudLightning Simulator supports DTS although the SCORE simulator implements a hybridisation of discrete-event modelling and multi-agent systems to achieve scalability. Given the complexity and scale of computing in the C2T continuum and IoT scenarios, the use of DES-based simulators calls into question the generalisability of results from simulation research on resource management based on DES.
It should also be noted that 19 articles presented simulation frameworks or extensions to simulation frameworks, four of which related to fog or edge computing [28,59,61,66].
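To make the distinction between DES and DTS discussed above concrete, the following toy sketch simulates the same single FIFO server in both styles. It is an illustration under simplifying assumptions (integer times, one server, a unit time step) and does not reflect the internals of CloudSim, the CloudLightning Simulator, or any other surveyed tool; the function names `simulate_des` and `simulate_dts` are hypothetical.

```python
import heapq
from collections import deque


def simulate_des(tasks):
    """Discrete-event style: the clock jumps directly to the next event.

    tasks is a list of (arrival_time, duration) pairs.
    Returns (finish_times_by_task_index, loop_iterations).
    """
    events = [(arrival, idx) for idx, (arrival, _) in enumerate(tasks)]
    heapq.heapify(events)
    busy_until = 0
    finish = {}
    iterations = 0
    while events:
        clock, idx = heapq.heappop(events)  # advance to the next arrival
        iterations += 1
        start = max(clock, busy_until)      # wait if the server is busy
        busy_until = start + tasks[idx][1]
        finish[idx] = busy_until
    return finish, iterations


def simulate_dts(tasks, step=1):
    """Discrete-time style: the clock advances in fixed steps and the
    system state is updated at every step, whether or not anything happens."""
    pending = deque(sorted(range(len(tasks)), key=lambda i: tasks[i][0]))
    waiting = deque()
    current, remaining = None, 0
    clock, iterations = 0, 0
    finish = {}
    while len(finish) < len(tasks):
        # Admit every task that has arrived by the current time.
        while pending and tasks[pending[0]][0] <= clock:
            waiting.append(pending.popleft())
        if current is None and waiting:
            current = waiting.popleft()
            remaining = tasks[current][1]
        clock += step
        iterations += 1
        if current is not None:
            remaining -= step
            if remaining <= 0:
                finish[current] = clock
                current = None
    return finish, iterations


if __name__ == "__main__":
    workload = [(0, 3), (1, 2), (10, 1)]
    print("DES:", simulate_des(workload))
    print("DTS:", simulate_dts(workload))
```

In this toy setting both loops produce the same finish times, but the event-driven loop iterates once per event while the time-stepped loop iterates once per tick. The trade-off is that a time-stepped update over many independent entities is straightforward to partition across workers, whereas a single global event queue imposes an ordering that is harder to parallelise.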

Evolution over Time
As can be seen in Table 3, most of the articles found are in the cloud layer (25), followed by fog (8) and edge (2). It is also worth noting that while all articles present research on the evaluation of resource management across the C2T continuum, the survey also includes 19 articles proposing either simulators or extensions to simulators. In total, some 20 journals have published on the topic in the period under study. It is also worth noting that the number of Q1 and Q2 journal publications on the evaluation of resource management across the C2T continuum is increasing year on year. In particular, the introduction of iFogSim [28] in 2017 is directly responsible for stimulating research in subsequent years, a trend that is expected to continue into the future based on the maturity of the underlying CloudSim simulator framework and the established base of users. All other articles identified for both fog and edge computing were published in 2018 and early 2019. While there is a clear transition of research from cloud computing to the edge, it is more correct to say that 'all boats are rising'; there is more research being published on this topic both in cloud computing, and in fog and edge computing. Given the significant multiplier of papers that have been published in the last two years in conferences and as reported in recent surveys [5,10,11], one would expect a significant increase in journal publications in the near term. Table 3 illustrates the dominance of resource allocation as a theme from 2015 to 2019. While this trend is likely to continue, not only are more journal articles being published on a wider range of resource management mechanisms, but there is also a greater focus on resource provisioning and load balancing in 2019. Resource mapping and monitoring remain at relatively low levels.

Conclusions and Future Directions
This paper provides a methodical survey of top quartile journal articles that evaluate resource management mechanisms across the C2T continuum using simulation software tools. While Computer Science research traditionally places greater emphasis on conference publications and those publications can garner comparatively favourable citations to journals, inconsistent acceptance and rejection rates and regional bias can skew citations [77]. Rightly or wrongly, in the wider academic community, peer-reviewed journal publication continues to be perceived to be of higher academic and review rigour. As such, a decision was made to limit this review to publications in the two top quartiles (Q1 and Q2) of journals. To this end, from an initial pool of 317 journal articles across all time, 35 relevant articles on the survey topic from January 2015 to March 2019 were reviewed.
Resource management evaluation using simulation software emerged in top quartile journal publications for cloud computing in 2015 with the adoption of CloudSim, but publications on fog computing and edge computing are a recent phenomenon. They can be directly traced back to the release and first publications on iFogSim in 2017. This is consistent with efforts to agree on definitions and reference architectures on fog computing and edge computing by NIST and others in 2017 and 2018 [3,74]. Given the significant interest in the IoT and the conference proceedings pipeline, the trend towards increased publications on resource management across the cloud-to-thing continuum is likely to continue.
From a resource management perspective, the review reveals a heavier research interest in resource scheduling over resource provisioning, and specifically in resource allocation; 28 of the 35 papers reviewed dealt with resource allocation. Efforts in resource provisioning (resource detection and resource selection) are not proportionate, and the wider resource scheduling tasks, e.g., resource mapping, resource monitoring, and load balancing, are significantly under-represented, even allowing for the small sample size. Similarly, the evaluation of resource management mechanisms for cloud computing is the dominant theme. There were only ten fog and edge computing articles in total. While this is relatively low compared to cloud computing, it reflects the nascent stage of research on these topics. This represents an opportunity to revisit extant studies, policies, algorithms and other techniques on resource management and investigate their applicability in new fog and edge computing contexts. In particular, we encourage researchers to explore how nature or bio-inspired approaches to resource management mechanisms, including self-organising, self-managing and self-learning approaches, might be used to address the complexity of resource management across the C2T continuum. Likewise, there is a need to explore whether the variables and optimisation functions under examination are the most important and appropriate for C2T continuum contexts. The KPIs for each layer, cloud, fog, edge, the endpoint, and indeed the network infrastructure supporting these, are likely to be significantly different and contingent on context much more than in a centralised (albeit distributed) cloud computing model, which is the basis for much of the extant research.
Given the particular idiosyncrasies and scale of the cloud-fog-edge environments and the criticality of many use cases, not least e-health and autonomous transport, and the importance of QoE in highly competitive markets, more complex modelling and simulation will be needed. Highly contextualised infrastructure, user, workload distribution, workload propagation, QoS and QoE models are required. Future research needs to explore, simulate, and evaluate mechanisms not only for each of these in isolation but how they impact each other. Similarly, more fine-grained data are required to inform these models for high-demand applications in steady state over long time horizons to account for seasonality, but also to allow for anomalies. While a number of studies explore consolidation [47,51,56], these typically focus on consolidation in cloud data centers, and not in the network or at the edge. This would seem worthy of exploration particularly given the energy constraints at the edge. Finally, only one of the articles reviewed dealt with the issue of security [58]; data privacy and protection are two of the main barriers to adoption of cloud, fog and edge computing; this is likely to only intensify as more and more things are connected and the chain of service provision and multi-tenancy across the C2T continuum fragments and increases in complexity. Different workloads with different security requirements, and different resource mechanisms have different levels of vulnerability. It is entirely foreseeable that service providers and their clients may impose constraints on resource allocation mechanisms and access to various infrastructure. The mechanisms to impose these constraints while meeting SLA requirements will require evaluation.
As discussed earlier, 19 of the 35 articles presented simulation frameworks or extensions to simulation frameworks; however, only four of these related to fog or edge computing. It is clear from this review that significant effort needs to be invested in the development of simulation frameworks and underlying models, such as those discussed above, capable of supporting research within a domain with the complexity, heterogeneity, and scale of the IoT. While there are benefits to mature simulation frameworks such as CloudSim and techniques such as DES, they may not be fit for purpose with regard to resource management in the C2T continuum. Recent studies using new simulators, such as SCORE and CloudLightning Simulator, are a positive step forward in this respect; however, they are not widely adopted. A significant part of the value of simulation is associated with its level of verisimilitude. This is essential for credibility in the wider cloud, fog, and edge community and for successful translation to industry.
5G and beyond 5G networks are significant foundational technologies of the future IoT. We specifically do not address such works in detail in this survey. While such networks mandate operators move from bare metal to a more cloud-based approach and they share similar problems, e.g., heterogeneous resources, both resource management and simulation in communication networks vs. cloud computing have evolved as discrete fields of research. Due to the complexity of developing and operating a unified framework of a complete 5G infrastructure, telecommunications network researchers typically focus on a subset of aspects related to physical layer transmission, access control, dynamic network configuration, new bandwidth demanding services, and radio network planning procedures [78]. In contrast, cloud researchers traditionally remain distant from the physical layer. Similarly, communication network researchers have developed specific network simulation software designed to meet their purposes [78]. As these two fields become more dependent on each other, greater integration of resource management thinking and simulation will be required. For example, proactive or anticipatory resource management is worthy of exploration. Furthermore, in 5G, cell size and density vary, and in proactive optimisation, different levels of context granularity will need to be modelled and simulated for both cells and cell clusters. For applications such as virtual content distribution networks or smart mobility, integrated models and simulations will be required whether to allocate network resources or applications across the edge, fog or cloud, in order to optimise micro data center placements. As the time-frames for proactive optimisation are typically extremely short, it is unclear whether simulation software, particularly DES, would be appropriate without a step change in the state of the art. DTS such as CloudLightning or the hybrid ONE simulator may be more promising.
This paper provides two key contributions to the research community. First, to the best of our knowledge, this is the first methodical review of its kind on the evaluation of resource management mechanisms across the C2T continuum that focuses exclusively on studies using simulation software. Second, it categorises extant research and highlights areas where future research contributions in (i) resource management, (ii) cloud, fog and edge computing and their interaction, and (iii) simulation, are required. Our suggestions for research contributions are based on the idea that every key task in resource provisioning and resource scheduling is important for investigation for the C2T continuum and the future IoT. This review also provides an insight into one part of the simulation, resource management, and cloud, fog, and edge computing research community, and how it is evolving. Finally, it is worth noting that the Internet of Things has the potential to change how society interacts and operates. Therefore, it will be useful for future research efforts to combine insights from other disciplines into these research communities and research agendas.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: C2T