# Category Archives: rstats

## Enjoy R: Do two consecutive seeds behave independently?

I’ve always wondered whether two random seeds in R provide independent results, whatever they are. In particular, I wanted to check whether repeating a sampling operation with two consecutive seeds, say set.seed(20) and set.seed(21), would produce unrelated outputs as expected. Pseudo-randomness in R is based on algorithms I honestly have read nothing about, and …
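One quick way to probe the question is to draw the same kind of sample under each seed and look at the correlation between the two draws. This is only a minimal sketch, not necessarily the check used in the post: the sample size of 1000 and the choice of `runif()` are assumptions made for illustration.

```r
# Draw uniform samples under two consecutive seeds
# (sample size and distribution are illustrative assumptions)
set.seed(20)
draws_a <- runif(1000)

set.seed(21)
draws_b <- runif(1000)

# A sample correlation near zero is consistent with unrelated streams,
# although it does not prove independence
seed_correlation <- cor(draws_a, draws_b)
print(seed_correlation)
```

A single correlation is weak evidence, so repeating this over many seed pairs gives a more convincing picture.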

## Enjoy R: install packages “on the fly”

It is annoying when you load a package and you find out you don’t have it installed. So you need to install it first, and finally load it. As I am used to installing LaTeX packages on the fly, I thought of a simple script to do the same in R. The following is a …
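The idea can be sketched in a few lines of base R. The helper name `load_or_install` is hypothetical and this is not necessarily the script from the post; it assumes a CRAN mirror is configured whenever an installation is actually needed.

```r
# Hypothetical helper: load a package, installing it first if it is missing
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call loads it without installing anything
load_or_install("stats")
```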

## Enjoy R: Looping in R

Loops are used in most applications and are supported by nearly all programming languages. In general, there is more than one way to execute the same task via looping, and the efficiency of each choice varies among languages. This post is not intended to demonstrate any general truth about R loops, but aims to provide some insights into some …

## Enjoy R: how to automatically give readable names to variables in a loop

To perform the same operations on each element of a collection (e.g. a vector, matrix, or list), we generally use loops. Sometimes, we want to save the results of each iteration in variables which are related only to the current iteration. To do that properly, we should give each variable a name that simultaneously refers …
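One common base R way to get such names is to build them with `paste0()` and create the variables with `assign()`. The sketch below is an assumption about the general technique, not necessarily the approach taken in the post; the `measurements` list and the `mean_` prefix are made up for illustration.

```r
# Illustrative data: one numeric vector per named element
measurements <- list(height = c(1, 2, 3), weight = c(4, 5, 6))

for (element_name in names(measurements)) {
  # Build a readable variable name such as "mean_height"
  variable_name <- paste0("mean_", element_name)
  # Create a variable with that name in the current environment
  assign(variable_name, mean(measurements[[element_name]]))
}

mean_height  # 2
mean_weight  # 5
```

In many cases, storing the results in a named list is easier to manage than creating separate variables, but `assign()` does the job when separate variables are really wanted.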

## Enjoy R: is my x included in these bounds?

How many times have you written code like the following? `if(x > lower_bound & x < upper_bound) return(T)` followed by `return(F)`. Throughout my coding experience so far, I’ve faced that a lot. And every time this happened, I started thinking that I was not really writing it in the same way as I would have written it using basic …
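Since the comparison already evaluates to a logical value, the check can be wrapped in a small function that returns it directly. The name `is_between` is hypothetical, and this is only a sketch with strict (exclusive) bounds, not necessarily the solution proposed in the post.

```r
# Hypothetical helper: TRUE where x lies strictly between the two bounds
is_between <- function(x, lower_bound, upper_bound) {
  # The comparison is already logical, so no if/return(T)/return(F) is needed
  x > lower_bound & x < upper_bound
}

is_between(5, 1, 10)            # TRUE
is_between(c(0, 5, 10), 1, 10)  # FALSE TRUE FALSE
```

Because `&` is vectorised, the same function works on a single value or on a whole vector.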

## Enjoy R: A useful function for clearing the workspace

Today I was coding with my supervisor, and we had a bunch of things saved in our workspace which we wanted to get rid of. The annoying part was that we wanted to delete almost the entire workspace and keep only a couple of functions. R has a fast way to clear the whole workspace, which is rm(list …
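A selective clean-up of this kind can be sketched with `ls()`, `setdiff()` and `rm()`. The function name `clear_except` and its arguments are hypothetical, and this is not necessarily the function presented in the post.

```r
# Hypothetical helper: remove every object except the ones named in `keep`
clear_except <- function(keep, envir = globalenv()) {
  all_objects <- ls(envir = envir)
  # setdiff() leaves only the names that are NOT in `keep`
  rm(list = setdiff(all_objects, keep), envir = envir)
}

# Demonstration on a scratch environment, so the real workspace is untouched
scratch <- new.env()
assign("useful_function", function(x) x + 1, envir = scratch)
assign("temporary_data", 1:10, envir = scratch)

clear_except("useful_function", envir = scratch)
ls(scratch)  # "useful_function"
```

Note that `ls()` skips objects whose names start with a dot by default, so those are left alone in this sketch.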

## Enjoy R: Compare ROC curves of different multinomial classification models

When we think of a ROC curve, we usually have a binary classification problem in mind. In the multiclass case it is used less often, partly because it loses most of its explanatory power. However, it would be good to use it in this scenario too, in order to have one more weapon for diagnostics. …

## Enjoy R: Stratified sampling and its application using dplyr

Author: Davide Passaretti

Simple random sampling is the most common practice when dealing with data sets which are large enough to be split into training and test sets for predictive purposes. Think of classification models. You randomly extract, say, 75% of the rows, and that’s a fair technique, at least until you are quite sure that …

## Enjoy R: How to make a Pareto Chart using ggplot2 (and dplyr)

Hi all. The well-known choice of pushing ggplot2 users towards a cleaner and more correct way of plotting data has meant that a secondary axis is not implemented. This is what makes it difficult to plot a Pareto Chart using this smart R package. In this post, I suggest a way to overcome this hurdle, by …

## Enjoy R: Simulations of the Monty Hall problem

The Monty Hall problem is a recurring and charming game in which probability theory plays an essential role. A very simple and clear explanation may be found here: http://www.youtube.com/watch?v=7u6kFlWZOWg. Anyway, let’s just summarize how it goes. There are three shut doors: behind two of them there is a goat, whereas the remaining door hides …
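A simulation of the game can be sketched very compactly. The sketch assumes the standard rules (the host always opens a goat door among the two the player did not pick), and the function name `simulate_monty_hall` is hypothetical; it is not necessarily the code used in the post.

```r
# Hypothetical simulation of the standard Monty Hall game.
# Returns the proportion of games won under the chosen strategy.
simulate_monty_hall <- function(n_games, switch_door) {
  prize_door   <- sample(1:3, n_games, replace = TRUE)
  first_choice <- sample(1:3, n_games, replace = TRUE)
  # Under the standard rules, staying wins exactly when the first choice
  # was the prize door, and switching wins exactly when it was not
  wins <- if (switch_door) first_choice != prize_door else first_choice == prize_door
  mean(wins)
}

set.seed(1)
simulate_monty_hall(10000, switch_door = TRUE)   # should be close to 2/3
simulate_monty_hall(10000, switch_door = FALSE)  # should be close to 1/3
```

The shortcut works because, once the host has removed a goat, switching simply flips the outcome of the first choice.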
