Skip to content

Latest commit

 

History

History
152 lines (133 loc) · 9.09 KB

README.md

File metadata and controls

152 lines (133 loc) · 9.09 KB

Artifact Description

Artifact Author: Adones Rukundo Contact: [email protected], [email protected], [email protected]

The artifact design is based on the adapted version of ASCYLIB framework. Memory is managed using ssmem library, a simple object-based memory allocator with epoch-based garbage collection.

Related Publication

TSLQueue: An Efficient Lock-free Design for Priority Queues. https://link.springer.com/chapter/10.1007/978-3-030-85665-6_24

Data structure implementations

The artifact contains three implantations of a priority queue abstract data structure:

Getting Started Guide

Prerequisites

The following are the hardware and software requirements to run the artifact

  • Computer Architecture:
    • The artifact is designed and tested for x86_64 computer architectures
  • OS Platform:
    • Linux OS distribution
  • Compiler:
    • gcc compiler is required to compile the artifact tests
  • Memory:
    • Test machine should have at least 30 GB of RAM.
  • Software:
    • To automatically generate throughput graphs, the test machine must have python 2.7.5 and above installed together with matplot and numpy modules. The graphs are generated using matplot and data is read using numpy.

Artifact Installation:

  • Step 1: Unzip the artifact to any directory e.g Europar2021.
  • Step 2: Navigate to the home directory of the artifact where the script directory is located
  • Step 3: Run chmod -R 774 "scripts" This grants you permission to run execution commands on the script files

Artifact directories:

  1. benchmark: In this directory you will find all the necessary files (source code) related to the implemented data structures and tests.
    • benchmark/src: Contains the data structures' source code
    • benchmark/includes: Contains support files
    • benchmark/common: Contains make definitions file
    • benchmark/external: Contains external libraries e.g the ssmem library
    • benchmark/bin: Contains the binary files for the compiled data structures
  2. scripts: In this directory you find some scripts that can be used to run batch tests, generate data tables and graphs.
  3. data: The data generated by the scripts in the scripts directory is saved in this (data) directory.
  4. graphs: The graphs generated by the scripts in the scripts directory are saved in this (graphs) directory.

Simple Compilation

  • Step 1: Navigate to the benchmark directory
  • Step 2: Run make This compiles all the implementations using the default make definitions

Binary files will be saved in benchmark/bin/

Simple test run

  • Step 1: Navigate to the bin directory benchmark/bin/
  • Step 2: Run ./pq-tsl_wcas -n 2 or ./pq-linden -n 2 or ./pq-lotan -n 2 or ./pq-tsl_cas -n 2

An output of the test results will be displayed.

Step by Step Instructions

Compiling data structure tests:

  • Step 1: Navigate to the benchmark directory
  • Step 2: Run make Without arguments, the make command compiles all data structure tests with the default settings. make can take arguments to define specific execution settings e.g make VERSION=O3 GC=1 INIT=all
    • make settings:
      • VERSION defines the optimisation levels e.g VERSION=O3. It takes one of the following five values:
        • DEBUG compile with optimisation zero and debug flags
        • SYMBOL compile with optimisation level three and -g
        • O0 compile with optimisation level zero
        • O1 compile with optimisation level one
        • O2 compile with optimisation level two
        • O3 compile with optimisation level three (Default)
      • GC defines if deleted nodes should be recycled GC=1 (Default) or not GC=0.
      • INIT defines if data structure initialisation should be performed by all active threads INIT=all (Default) or one of the threads INIT=one
      • AFFINITY defines the thread to core mapping, AFFINITY=dense defines dense pinning while AFFINITY=sparse defines sparse pinning and AFFINITY=none lets the system decide on which cores the threads should run.
    • More information about available make settings options can be found in the make file benchmark/common/Makefile.common

Defining machine specific thread pinning:

Different machines have different thread id layouts especially if the machine has more than one socket. For each affinity, dense or sparse, the machine thread id order must be defined e.g 0,2,4 means that pin the first thread to hardware thread id 0, second thread to hardware thread id 2 and third thread to hardware thread id 4. Machine affinity definitions are in file benchmark/includes/utils.h.

  • Run lscpu to determine the order of hardware thread ids.
  • Run cat /proc/cpuinfo to determine how the hardware thread ids are distributed to the core ids.
  • Use this information to define the order in which threads are pinned for either dense or sparse mapping. dense mapping should be defined to pin one thread per core close together. sparse pinning should be defined to pin one thread per core far apart e.g different sockets/tiles.
  • In the file benchmark/includes/utils.h, the DEFAULT definition can be replaced with the specific mapping/affinity. Or the specific machine definition can be added.

Running individual tests:

Binary files for each data structure implementation are in the directory benchmark/bin/.

  • pq-tsl_wcas TSLQueue compiled with double word CAS
  • pq-tsl_cas TSLQueue compiled with single word CAS
  • pq-linden Linden priority queue
  • pq-lotan Lotan priority queue

To run individual tests for each compiled data structure implementation navigate to the bin directory benchmark/bin/.

  • Run e.g ./pq-tsl_wcas -h to get the information about the different parameters. An output similar to the following will be displayed

    Help information:
    
    Usage:
    ./pq-tsl_wcas [options...]
    
    Options:
      -h, --help
            Print this message
      -d, --duration <int>
            Test duration in milliseconds
      -i, --initial-size <int>
            Number of elements to insert before test
      -n, --num-threads <int>
            Number of threads
      -r, --range <int>
            Range of integer keys inserted in set
      -p, --put-rate <int>
            Percentage of put (insert) operations
  • Run e.g ./pq-tsl_wcas -n 6 to run test with six threads. An output similar to the following will be displayed:

    Initial, 12000 
    Range, 1073741824 
    Algorithm, TSLQueue 
    Size of node padded, 64 B
    Size of initial, 750.00 KB is 0.73 MB
    [ALLOC] initializing allocator with fs size, 10000 objects
    BEFORE size is, 12000
    AFTER  size is, 42293
    Inserting_count_total , 3497542    
    Inserting_count_total_succ , 3497444    
    Inserting_perc_succ ,      100.0 
    Inserting_perc ,       50.2 
    Inserting_effective ,       50.2 
    DeleteMin_count_total , 3467151    
    DeleteMin_count_total_succ , 3467151    
    DeleteMin_perc_succ ,      100.0 
    DeleteMin_perc ,       49.8 
    DeleteMin_effective ,       49.8 
    Number of Threads , 6 
    Million Operations per Second , 6.965
    Operations per Second , 6964693.00

The default parameters are defined in the file benchmark/includes/common.h

Running batch tests:

The script file script/pq-experiments.sh is used to run batch test executions . In this file, execution settings such as number of repeats can be set. Follow the following steps to run batch tests and generate graphs automatically.

  • Step 1: Navigate to the scripts directory
  • Step 2: Run nohup ./pq-experiments.sh 1,4,8,12,16,20,24,28,32,36 &> log.out&, where 1,4,8,..,36 is a list of number of threads per test. The list of threads can be modified to fit the tests and the machine on which the tests are being run. The longer the thread list the more time the batch execution test will take. The execution events are logged in log.out and saved in the scripts directory Data is saved in the data directory as .csv files, while the graphs will be saved in the graphs directory as png files. The file name includes the put rate (-p), key range (-r), queue initial size (-i), affinity (dense) and experiment timestamp.

NOTE: The results in the paper "TSLQueue: An Efficient Lock-free Design for Priority Queues" are for dense mapping affinity (pinning threads one per core, filling one socket at a time.