Tutorial for Clover

Overview

Running Clover requires one server for the Metadata Server (MS), at least one server for a Computing Node (CN), and at least one server for a Memory Node (MN). For best performance, we recommend dedicating a physical machine to each Clover node. In theory, Clover nodes can also be consolidated using VMs and SR-IOV-enabled RNICs.

We also run a memcached server instance that acts as a central RDMA-metadata store. Clover nodes publish their QP/MR information to memcached, and Clover uses that information to build all-to-all connections automatically.

Run

Quickstart

Change directory to pDPM/clover/.

Suppose we use three servers (S0, S1, and S2) to run a 1 MS, 1 CN, 1 MN setting. We run both the MS and memcached on S0, a single CN on S1, and a single MN on S2. To start, run the following commands on each server, one by one in the order listed:

  • Server-0: memcached -u root -I 128m -m 2048
  • Server-0: ./run_ms.sh 0
  • Server-1: ./run_clients.sh 1
  • Server-2: ./run_memory.sh 2

The parameter passed to each script is a static, unique Clover node ID. The MS usually uses 0, and the CN and MN IDs span a contiguous range (1 and 2 in the example above).

Once the experiment finishes, run local_kill.sh to terminate all relevant processes.
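The ID convention can be sketched as a small helper that prints one launch command per node. This is only an illustration: the function name `clover_launch_plan` is hypothetical and not part of Clover, and it assumes, as in the quickstart, that CN IDs directly follow the MS and MN IDs directly follow the CNs.

```shell
# Hypothetical helper, not part of Clover: print one launch command per node.
# Assumption: the MS takes ID 0, CNs take 1..NR_CN, MNs take the next NR_MN IDs.
clover_launch_plan() {
    nr_cn=$1
    nr_mn=$2
    echo "./run_ms.sh 0"
    id=1
    # CNs occupy the IDs right after the MS.
    while [ "$id" -le "$nr_cn" ]; do
        echo "./run_clients.sh $id"
        id=$((id + 1))
    done
    # MNs continue the same contiguous range.
    last=$((nr_cn + nr_mn))
    while [ "$id" -le "$last" ]; do
        echo "./run_memory.sh $id"
        id=$((id + 1))
    done
}
```

Calling `clover_launch_plan 1 1` reproduces the three script invocations from the quickstart; each printed command still has to be run on the server chosen for that node.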

Configurations

General

Most runtime configuration is controlled by the bash scripts. Since the script files are very similar, we use run_ms.sh to demonstrate.

First, specify the RNIC device via a unique device ID and port index. The device ID starts from 0 and follows the sequence reported by ibv_devinfo. The port index is per-device and starts from 1. We can configure them via the following two variables in the script:

# We use the first port of the first device reported by ibv_devinfo
ibdev_id=0
ibdev_base_port=1
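To find the right value for ibdev_id, it can help to number the devices in the order ibv_devinfo prints them. The filter below is a minimal sketch with a hypothetical name (`list_ibdev_ids`); it assumes each device section in the ibv_devinfo output begins with an `hca_id:` line followed by the device name.

```shell
# Hypothetical helper, not part of Clover: read ibv_devinfo output on stdin
# and print "<zero-based index> <device name>" for each device found.
# Assumption: every device section starts with a line of the form "hca_id: <name>".
list_ibdev_ids() {
    awk '$1 == "hca_id:" { printf "%d %s\n", n++, $2 }'
}
```

A typical use would be `ibv_devinfo | list_ibdev_ids`, then copying the index of the desired device into ibdev_id.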

Second, specify the number of CNs and MNs used in a particular run. We can configure them via the following two variables in the script:

NR_CN=1
NR_MN=1

Third, specify the IP address of the memcached server instance via the following variable:

MEMCACHED_SERVER_IP="192.168.0.1"

Fourth, choose between InfiniBand and RoCE mode by enabling exactly one of the following lines in ibsetup.h:

#define RSEC_NETWORK_MODE    RSEC_NETWORK_IB
#define RSEC_NETWORK_MODE    RSEC_NETWORK_ROCE

All three scripts (run_ms.sh, run_memory.sh, and run_clients.sh) must have the same configuration on every machine. Otherwise, the experiment will not start.
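Because a mismatch keeps the experiment from starting, a quick comparison before launching can save time. The sketch below uses hypothetical helper names and assumes the shared settings are plain `NAME=value` assignments at the start of a line, as in the snippets above. It compares only NR_CN, NR_MN, and MEMCACHED_SERVER_IP; whether other variables must also match is not checked here.

```shell
# Hypothetical helpers, not part of Clover.
# Print the assignments of the shared settings found in one run script.
clover_settings() {
    grep -E '^(NR_CN|NR_MN|MEMCACHED_SERVER_IP)=' "$1" | sort
}

# Compare every script given as an argument against the first one.
# Returns 0 if all agree, 1 (with a message on stderr) on the first mismatch.
clover_configs_match() {
    ref="$(clover_settings "$1")"
    shift
    for f in "$@"; do
        if [ "$(clover_settings "$f")" != "$ref" ]; then
            echo "configuration mismatch in $f" >&2
            return 1
        fi
    done
    return 0
}
```

For example, `clover_configs_match run_ms.sh run_memory.sh run_clients.sh` checks the three scripts on one machine; copies fetched from the other servers can be passed in the same way.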

Number of CN threads

This is controlled by the following macro in clover/mitsume_benchmark.h.

#define MITSUME_BENCHMARK_THREAD_NUM            8

Replication Factor

The replication factor is set by the following macro. The default value is 1. A larger value requires more Clover MNs to be running, and the system may report a failure if the value exceeds the number of MNs.

#define MITSUME_BENCHMARK_REPLICATION           1
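The constraint between the replication factor and the number of MNs can be checked before building. The function below is a hedged sketch: `replication_fits` is a hypothetical name, and the header that defines the macro is passed in as an argument rather than assumed.

```shell
# Hypothetical helper, not part of Clover.
# Usage: replication_fits <header file> <number of MNs>
# Succeeds only if MITSUME_BENCHMARK_REPLICATION is defined in the header
# and its value does not exceed the number of MNs.
replication_fits() {
    header=$1
    nr_mn=$2
    repl="$(awk '$1 == "#define" && $2 == "MITSUME_BENCHMARK_REPLICATION" { print $3; exit }' "$header")"
    [ -n "$repl" ] && [ "$repl" -le "$nr_mn" ]
}
```

Passing the NR_MN value from the run scripts as the second argument ties the two settings together.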

Code Internal

  • All instances share the entry point init.cc:main() and behave differently depending on the parameters passed.
  • The client testing code's entry point is mitsume_benchmark.cc:mitsume_benchmark().
    • Several benchmarks are available. mitsume_benchmark_latency is a single-threaded benchmark. mitsume_benchmark_ycsb is a multithreaded benchmark that uses YCSB workloads. Make sure the desired benchmark is the one enabled.

Run YCSB Benchmark

Clover does not invoke the YCSB benchmark directly. Instead, several trace files have been pre-generated by YCSB, and Clover parses those files to mimic YCSB behavior.

To run, enable this line in mitsume_benchmark.cc:

mitsume_benchmark_thread(MITSUME_BENCHMARK_THREAD_NUM, local_ctx_clt,
                           &mitsume_benchmark_ycsb);

For each run, specify which YCSB mode (A, B, or C) to use in mitsume_benchmark.h:

#define MITSUME_YCSB_OP_MODE MITSUME_YCSB_MODE_B

The path to the trace files is specified by a macro in the Clover source. If you don't want to change it, you can directly clone Dataset-pDPM and copy Dataset-pDPM/workload/ to pDPM/clover/workload/.