We investigate a computational approach to optimal multivariate survey designs that addresses stratification and allocation jointly. The approach assumes a fixed total allocation, a known number of strata, and the availability of administrative data correlated with the variables of interest, and it is subject to coefficient-of-variation constraints. It uses a penalized objective function that is optimized by simulated annealing, exchanging sampling units and sample allocations among strata. Computational speed is improved by using a computationally efficient machine learning method such as K-means to create an initial stratification close to the optimal one. The numerical stability of the algorithm has been investigated, and parallel processing is employed where appropriate. Results are presented for both simulated data and USDA's June Agricultural Survey. An R package is also available for evaluation.
Optimal stratification and allocation for the June Agricultural Survey
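
As an illustration of the kind of procedure the abstract describes, the following is a minimal Python sketch, not the authors' R implementation. Everything specific in it is an assumption made for the example: the function names (`stratum_cvs`, `objective`, `kmeans_labels`, `feasible`, `anneal`), the form of the penalty (sum of CVs plus a quadratic penalty on CV targets that are exceeded), the use of stratified simple random sampling variance formulas, the geometric cooling schedule, and all default parameter values.

```python
import math
import random

def stratum_cvs(x, labels, alloc, H):
    """CV of the estimated total of each variable under stratified simple
    random sampling (assumed design). Assumes nonzero population totals."""
    p = len(x[0])
    totals = [sum(row[j] for row in x) for j in range(p)]
    var = [0.0] * p
    for h in range(H):
        rows = [x[i] for i in range(len(x)) if labels[i] == h]
        N_h, n_h = len(rows), alloc[h]
        for j in range(p):
            mean = sum(r[j] for r in rows) / N_h
            s2 = sum((r[j] - mean) ** 2 for r in rows) / (N_h - 1)
            var[j] += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    return [math.sqrt(var[j]) / totals[j] for j in range(p)]

def objective(x, labels, alloc, H, targets, penalty):
    """Assumed penalized objective: sum of CVs plus a quadratic penalty
    for every CV that exceeds its target."""
    cvs = stratum_cvs(x, labels, alloc, H)
    excess = sum(max(0.0, c - t) ** 2 for c, t in zip(cvs, targets))
    return sum(cvs) + penalty * excess

def kmeans_labels(x, H, iters=20, seed=0):
    """Plain Lloyd's K-means, used only to propose an initial stratification."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(x, H)]
    labels = [0] * len(x)
    for _ in range(iters):
        labels = [min(range(H), key=lambda h: sum((a - b) ** 2
                      for a, b in zip(row, centers[h]))) for row in x]
        for h in range(H):
            members = [x[i] for i in range(len(x)) if labels[i] == h]
            if members:  # keep the old center if a cluster becomes empty
                centers[h] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def feasible(labels, alloc, H):
    """Every stratum needs N_h >= 2 and 2 <= n_h <= N_h for the variances to exist."""
    sizes = [labels.count(h) for h in range(H)]
    return all(N >= 2 and 2 <= n <= N for N, n in zip(sizes, alloc))

def anneal(x, labels, alloc, targets, penalty=100.0, steps=2000,
           t0=0.01, cooling=0.999, seed=0):
    """Simulated annealing over stratum membership and allocation.
    The total allocation sum(alloc) stays fixed. Requires at least two strata."""
    rng = random.Random(seed)
    H = len(alloc)
    if not feasible(labels, alloc, H):
        raise ValueError("initial design needs N_h >= 2 and 2 <= n_h <= N_h")
    labels, alloc = list(labels), list(alloc)
    cur = objective(x, labels, alloc, H, targets, penalty)
    best = (cur, list(labels), list(alloc))
    temp = t0
    for _ in range(steps):
        new_labels, new_alloc = list(labels), list(alloc)
        a, b = rng.sample(range(H), 2)
        if rng.random() < 0.5:
            # Exchange move 1: move one sampling unit from stratum a to b.
            i = rng.choice([k for k, lab in enumerate(labels) if lab == a])
            new_labels[i] = b
        else:
            # Exchange move 2: shift one unit of sample allocation from a to b.
            new_alloc[a] -= 1
            new_alloc[b] += 1
        if feasible(new_labels, new_alloc, H):
            new = objective(x, new_labels, new_alloc, H, targets, penalty)
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if new <= cur or rng.random() < math.exp((cur - new) / temp):
                labels, alloc, cur = new_labels, new_alloc, new
                if cur < best[0]:
                    best = (cur, list(labels), list(alloc))
        temp *= cooling
    return best  # (objective value, stratum labels, allocation)
```

A typical call would pass the output of `kmeans_labels(x, H)` as `labels` together with any allocation that sums to the fixed total, after confirming with `feasible` that the K-means strata are large enough. The sketch recomputes the objective from scratch at every step for clarity; a practical implementation would update stratum means and variances incrementally after each exchange.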