
Enjoy R: Stratified sampling and its application using dplyr

author: Davide Passaretti

Simple random sampling is the most common practice when dealing with data sets that are large enough to be split into training and test sets for predictive purposes. Think of classification models. You randomly extract, say, 75% of the rows, and that’s a fair technique, at least until you are quite sure that …
