Joint Generation of Binary, Ordinal, Count, and Normal Data with Specified Marginal and Association Structures in Monte-Carlo Simulations


Demirtas H., Allozi R., Hu Y., İnan G., Özbek L.

MONTE-CARLO SIMULATION-BASED STATISTICAL MODELING, pp. 3-15, 2017 (SCI-Expanded)

Abstract

This chapter is concerned with building a unified framework for concurrently generating data sets that include all four major kinds of variables (i.e., binary, ordinal, count, and normal) when the marginal distributions and a feasible association structure are specified for simulation purposes. The simulation paradigm has been commonly employed in a wide spectrum of research fields, including the physical, medical, social, and managerial sciences. A central aspect of every simulation study is the quantification of the model components and parameters that jointly define a scientific process. When this quantification cannot be performed via deterministic tools, researchers resort to random number generation (RNG) to find simulation-based answers that address the stochastic nature of the problem. Although many RNG algorithms have appeared in the literature, a major limitation is that they were not designed to concurrently accommodate all of the variable types mentioned above. These algorithms therefore provide only an incomplete solution, as real data sets include variables of different kinds. This work represents an important augmentation of the existing methods, as it is a systematic attempt at, and a comprehensive investigation of, mixed data generation. We provide an algorithm that is designed for generating data with mixed marginals, illustrate its logistical, operational, and computational details, and present ideas on how it can be extended to span more complicated distributional settings in terms of a broader range of marginals and associational quantities.
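To make the idea of mixed data generation concrete, the following is a minimal Python sketch, not the chapter's algorithm. It assumes a simplified latent-normal approach: draw correlated standard normal variables, then map each one to its target marginal through the corresponding quantile function. All names (`generate_mixed`, `latent_corr`, `poisson_rate`, and so on) are hypothetical, and the count variable is assumed to be Poisson. Note that the correlations supplied here apply to the latent normals; the correlations among the final binary, ordinal, and count variables will generally differ, and the adjustment needed to hit a specified association structure is deliberately omitted.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the CDF transform


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def poisson_quantile(u, rate):
    """Smallest k with P(Poisson(rate) <= k) >= u."""
    u = min(u, 1.0 - 1e-12)  # guard against u rounding to exactly 1.0
    k = 0
    pmf = math.exp(-rate)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= rate / k
        cdf += pmf
    return k


def ordinal_quantile(u, probs):
    """0-based category whose cumulative probability first reaches u."""
    cumulative = 0.0
    for category, p in enumerate(probs):
        cumulative += p
        if u <= cumulative:
            return category
    return len(probs) - 1


def generate_mixed(n, latent_corr, p_binary, ordinal_probs,
                   poisson_rate, mean, sd, seed=0):
    """Return n rows of (binary, ordinal, count, normal).

    ASSUMPTION: latent_corr is the 4x4 correlation matrix of the latent
    normals, not of the generated variables.
    """
    rng = random.Random(seed)
    lower = cholesky(latent_corr)
    rows = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(4)]
        # Correlate the independent draws: z = L e
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(4)]
        u = [STD_NORMAL.cdf(v) for v in z]
        binary = 1 if u[0] > 1.0 - p_binary else 0   # P(binary = 1) = p_binary
        ordinal = ordinal_quantile(u[1], ordinal_probs)
        count = poisson_quantile(u[2], poisson_rate)
        normal = mean + sd * z[3]
        rows.append((binary, ordinal, count, normal))
    return rows


latent_corr = [
    [1.00, 0.30, 0.20, 0.30],
    [0.30, 1.00, 0.25, 0.30],
    [0.20, 0.25, 1.00, 0.30],
    [0.30, 0.30, 0.30, 1.00],
]
rows = generate_mixed(2000, latent_corr, p_binary=0.3,
                      ordinal_probs=[0.2, 0.5, 0.3],
                      poisson_rate=3.0, mean=10.0, sd=2.0)
```

Each row holds one binary, one ordinal, one count, and one normal value. The marginals match their targets by construction, whereas matching a target correlation matrix for the generated variables would require an additional step that this sketch does not attempt.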