Statistics in Medicine

Cloud‐based simulation studies in R ‐ A tutorial on using doRedis with Amazon spot fleets

Journal Article

Simulation studies are helpful in testing novel statistical methods. From a computational perspective, they constitute embarrassingly parallel tasks, because each replication can be run independently of all others. We describe parallelization techniques in the programming language R that can be used on Amazon's cloud‐based infrastructure. After a short conceptual overview of the parallelization techniques in R, we provide a hands‐on tutorial on how the doRedis package, in conjunction with a Redis server, can be used on Amazon Web Services, specifically with spot fleets. The tutorial proceeds in seven steps: (1) starting up an EC2 instance, (2) installing a Redis server, (3) using doRedis with a local worker, (4) using doRedis with a remote worker, (5) setting up instances that automatically fetch tasks from a specific master, (6) using spot fleets, and (7) shutting down the instances. As a basic example, we show how these techniques can be used to assess the effects of heteroscedasticity on the equal‐variance t‐test. Furthermore, we address several advanced issues, such as multiple conditions, cost management, and chunking.
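To make the basic example concrete, the following is a minimal sketch of such a simulation, written in Python with only the standard library rather than in R with doRedis as in the tutorial itself. The sample sizes (10 and 20), the standard deviations, the number of replications, and all function names are illustrative assumptions and are not taken from the article. Each replication receives its own seed and shares no state with the others, which is what makes the study embarrassingly parallel.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 28 degrees of freedom
# (n1 + n2 - 2 = 10 + 20 - 2), taken from a standard t table.
# The sample sizes are illustrative assumptions, not values from the article.
T_CRIT_DF28 = 2.048
N1, N2 = 10, 20


def pooled_t(x, y):
    """Equal-variance (pooled) two-sample t statistic."""
    n1, n2 = len(x), len(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    std_err = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / std_err


def one_replication(task):
    """One independent task: draw both groups under the null hypothesis
    (equal means) and report whether the test rejects at the 5% level."""
    seed, sd1, sd2 = task
    rng = random.Random(seed)  # per-task seed, so results do not depend on execution order
    x = [rng.gauss(0.0, sd1) for _ in range(N1)]
    y = [rng.gauss(0.0, sd2) for _ in range(N2)]
    return abs(pooled_t(x, y)) > T_CRIT_DF28


def rejection_rate(sd1, sd2, n_rep=4000):
    """Empirical type I error rate over n_rep independent replications."""
    tasks = [(seed, sd1, sd2) for seed in range(n_rep)]
    # A plain sequential map is used here. Because the tasks share no state,
    # the same list could instead be distributed to a pool of workers.
    return sum(map(one_replication, tasks)) / n_rep


if __name__ == "__main__":
    print("equal variances:  ", rejection_rate(1.0, 1.0))
    print("unequal variances:", rejection_rate(3.0, 1.0))
```

With equal variances the rejection rate should stay near the nominal 5%. When the smaller group has the larger variance, as in the second call, the equal‐variance t‐test is known to reject too often. In the tutorial's setting, the list of tasks would be placed on a Redis queue and fetched by workers instead of being processed sequentially.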
