Sunday, August 19, 2012

R - I'm a Nerd

I have recently discovered R.  R is a free software environment for statistical computing and graphics. I started playing around in it. I should probably do a few tutorials because I don't really know anything yet.  I probably know less than 1% of the program and yet I still have fun with it.  When I started plotting, I emailed people to tell them about my plots. I'm pretty sure they all think I'm crazy.

Histograms look way better in R than in Excel.
I did graph a few things.  Most of what I have done you can do in Excel, but R takes it a step further and lets you do a little more.  Note the regression line. It doesn't really fit the points that well.  Before creating these plots, I assumed that the further I ran, the slower I ran. I thought the trend would be obvious. As you can see, it is not.
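For anyone who wants to try the same kind of plot, here is a rough sketch in base R of a histogram and a pace-versus-distance scatter plot with a regression line. The data frame and its column names (`distance_miles`, `pace_min_per_mile`) are made up for illustration and are not the data from this post. The R-squared at the end is one way to put a number on how well (or badly) the line fits.

```r
# Made-up example runs: placeholder numbers, NOT the data from this post.
runs <- data.frame(
  distance_miles    = c(3.1, 4.5, 5.0, 6.2, 8.0, 10.0),
  pace_min_per_mile = c(9.5, 9.8, 9.4, 10.1, 10.0, 10.6)
)

# Histogram of how far each run was
hist(runs$distance_miles,
     main = "Run distances",
     xlab = "Distance (miles)")

# Scatter plot of pace against distance
plot(pace_min_per_mile ~ distance_miles, data = runs,
     xlab = "Distance (miles)",
     ylab = "Pace (minutes per mile)")

# Fit a straight line by least squares and draw it on the scatter plot
fit <- lm(pace_min_per_mile ~ distance_miles, data = runs)
abline(fit)

# R-squared: values near 0 mean the line explains little of the variation
r_squared <- summary(fit)$r.squared
print(r_squared)
```

A low R-squared would match what the plot shows: distance alone does not say much about pace.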

Then I realized that sometimes I run with other people and might run at their pace (slower or faster), so I figured out how to make some plots that show who I ran with too.

I like these plots a little more.  You can sort of see that the further we ran, the slower we ran.  I'm not quite sure why my runs with Jen increased so much. I think it might be because at the beginning we went slower and did 4.5 miles, but as we got better we started going further, and by then we were also in better shape.
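A plot like this can be sketched in base R by coloring the points by running partner and fitting one line per partner. The numbers below are made up, and the `partner` column (with "Solo" for runs alone) is an assumed layout, not necessarily how my file is set up.

```r
# Made-up runs: placeholder numbers, NOT the data from this post.
# "partner" is whoever was listed for that run; "Solo" means a run alone.
runs <- data.frame(
  distance_miles    = c(3.0, 4.5, 6.0, 4.5, 6.0, 8.0, 3.1, 5.0, 7.0),
  pace_min_per_mile = c(9.2, 9.6, 9.9, 10.4, 10.6, 11.0, 9.0, 9.1, 9.5),
  partner           = rep(c("Jen", "Jill", "Solo"), each = 3)
)

partners <- as.character(unique(runs$partner))
colors   <- seq_along(partners)   # one color number per partner

# All runs on one scatter plot, colored by partner
plot(runs$distance_miles, runs$pace_min_per_mile,
     col = colors[match(runs$partner, partners)], pch = 19,
     xlab = "Distance (miles)",
     ylab = "Pace (minutes per mile)")

# One least-squares line per partner, fit only on that partner's runs
fits <- lapply(partners, function(p) {
  lm(pace_min_per_mile ~ distance_miles, data = runs[runs$partner == p, ])
})
names(fits) <- partners

for (i in seq_along(partners)) {
  abline(fits[[i]], col = colors[i])
}

legend("topleft", legend = partners, col = colors, pch = 19)
```

With only a handful of runs per person, each line is very sensitive to one or two runs, so the slopes are a hint rather than a finding.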


A few notes on the data:

  1. In my base data, if I ran with more than one person, I listed only one person.
  2. If I ran 1/3 of the run with someone else and 2/3 on my own, I listed the other person.
  3. Most of the times that I ran/walked with Dave, I did not put those runs in my base data.  If I ran/walked with him and then ran on my own, I put it in the file.
  4. I would have graphed the mile pace versus mph but couldn't figure it out.
  5. I have run with more people, but if I only ran with them once or twice, I removed them. I kept Jill in because I liked that regression line.
  6. I had to go back and fill out half of March, April, and May last night because I was slacking off back then. So I might have missed running with someone.
  7. Update: Someone pointed out that the lack of trend could also be because I've run when injured.
 I need to make a file with only my runs with Dave, leaving out the parts I ran on my own before/after.
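On the pace versus mph graph from note 4: speed in miles per hour is just 60 divided by pace in minutes per mile, so a 10:00 mile is 6 mph. A minimal sketch, with a helper name (`pace_to_mph`) and a range of paces that are made up for illustration:

```r
# Convert pace (minutes per mile) to speed (miles per hour).
# There are 60 minutes in an hour, so mph = 60 / pace.
pace_to_mph <- function(pace_min_per_mile) {
  60 / pace_min_per_mile
}

# An illustrative range of paces, from 7:00 to 12:00 per mile
pace <- seq(7, 12, by = 0.5)
mph  <- pace_to_mph(pace)

# Plot the relationship: a curve, not a straight line
plot(pace, mph, type = "b",
     xlab = "Pace (minutes per mile)",
     ylab = "Speed (miles per hour)")
```

Because the relationship is a curve, shaving a minute off a fast pace changes mph more than shaving a minute off a slow one.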

On a side note, when I was going through the data, I had all the distances written in my file but not the total time. I was using my Garmin tabs to fill that in, for the runs where I had filled those out. A few weren't on there so I made a guess. Then I realized that since I ran it on the treadmill, it might be in my running notebook. It was. And I was 1 second off from my guess!  Then later I had another one I didn't know and guessed. Then I realized it might be on the Garmin site. It was. I was 1 second off again. Both of my guesses were 1 second too slow.  Maybe I don't even need all this data since apparently I can guess. Or maybe all the data helped me guess the missing times.  I based my guess on where I ran and the time of year, so I knew what shape I was in. I also know that treadmill runs tend to be faster because they aren't as hilly.

Time to look into this program some more to see what else I can do.  I have a lot of files of data but they don't mean that much, so it's hard to play with the program and see what cool things it can do. I just need more data.  I need to run more to get more data. I need to run with Hilary more often so she can be included.  I need to do other things (I just don't know what) so I can get more data.  Any suggestions?

Nerd x2 post
Nerd x3 post


  1. Oh that's pretty cool. Is the mileage increasing for us because of the long runs on the weekends? That was as we were ramping up the half marathon training, right?

    1. Looking at the data, most of the long runs were in Jan/Feb before your injury, on days I ran some with you and some without, but based on the spreadsheet it counts as all with you.