Tuesday, December 17, 2013
R - POSIXct vs. POSIXlt
As I continue to learn about managing date and time data, I'm working to understand the difference between POSIXct and POSIXlt.
POSIXct is a date-time data object that stores a single number: the number of seconds since the start of January 1, 1970 (UTC).
POSIXlt stores a named list of components: second, minute, hour, day of the month, month, year, and so on.
I'm still learning when best to use each type in different contexts. More to come on that.
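To make the difference concrete, here is a minimal sketch in base R. The timestamp is made up for illustration, and the time zone is fixed to UTC so the result does not depend on the machine it runs on.

```r
# A made-up timestamp, pinned to UTC so the output is reproducible
ct <- as.POSIXct("2013-12-17 08:30:00", tz = "UTC")
lt <- as.POSIXlt(ct, tz = "UTC")

# POSIXct: one number underneath (seconds since 1970-01-01 00:00:00 UTC)
secs <- as.numeric(ct)

# POSIXlt: a list of components that can be picked out by name
lt$hour          # hour of the day
lt$min           # minute of the hour
lt$mday          # day of the month
lt$mon + 1       # months are stored as 0-11, so add 1
lt$year + 1900   # years are stored as an offset from 1900

# Arithmetic on POSIXct is in seconds: this is one hour later
later <- ct + 3600
```

As a rough rule of thumb, the single-number form tends to be convenient for data frame columns and arithmetic, while the list form is handy when one component (such as the hour) is needed.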
R - strftime() vs. strptime()
I'm learning how to handle date and time data in R.
There are two functions that are similar when converting date/time related data.
strptime() takes a character vector (string) and converts it to a POSIXlt data object (wrap the result in as.POSIXct() if you need POSIXct).
strftime() takes a POSIXlt or POSIXct data object and converts it to a character vector (string).
Here is documentation on use including formatting characters: http://www.inside-r.org/r-doc/base/strftime
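A short round trip may help keep the two apart ("p" for parse, "f" for format). This is only a sketch: the input string and both format strings are made up for illustration, and the time zone is set to UTC explicitly.

```r
# Parse: character -> POSIXlt
x <- strptime("12-17-2013 08:30", format = "%m-%d-%Y %H:%M", tz = "UTC")
class(x)   # the first class listed is "POSIXlt"

# Format: date-time -> character, in a different layout
s <- strftime(x, format = "%Y-%m-%d %H:%M", tz = "UTC")

# strftime() also accepts POSIXct input
s2 <- strftime(as.POSIXct(x), format = "%Y-%m-%d %H:%M", tz = "UTC")
```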
Sunday, December 15, 2013
R - Comment Lines in Data File
Today I learned that you can have comment lines in data files. You need to include comment.char="#" as an argument to read.csv(), because read.csv() turns comment handling off by default. You can use whatever single character you want, though a # (octothorpe) is consistent with commenting in R scripts.
Example:
data <- read.csv("../datasets/heatmaps_in_r.csv", comment.char="#")
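Here is a small self-contained sketch of the same idea. Instead of a file, it reads made-up CSV content from a string through the text argument of read.csv(), so the column names and values are purely illustrative.

```r
# Made-up CSV content with comment lines before and between the data rows
csv_text <- "# sleep log, illustrative values only
Date,Duration
12-01-2013,6.5
# this line is skipped as well
12-02-2013,7.0"

# comment.char = "#" tells read.csv() to ignore everything after a #
data <- read.csv(text = csv_text, comment.char = "#")

names(data)   # column names taken from the first non-comment line
nrow(data)    # only the two real data rows remain
```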
Saturday, December 14, 2013
R - Lubridate - Reading Date/Time Data
I'm reading in date and time data from my sleep project using the parse_date_time() function from the lubridate package.
sleep$StartDateTime <- parse_date_time(paste(sleep$Date, sleep$StartTime), "%m-%d-%Y %H.%M", truncated = 2)
The date and time data were in two separate columns, so I needed to use the paste() function. The time data is in the format HH.MM.
Also, I was getting NA values at first when the minutes were 00 (at the top of the hour). I added the truncated argument so that parsing still succeeds when the trailing parts of the format (here, the minutes) are missing.
Here is a good summary of the parse_date_time() function: http://www.inside-r.org/packages/cran/lubridate/docs/parse_date_time
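One possible explanation for the NA values, offered as an assumption rather than something confirmed from the data: if the time column is read in as a number, 23.00 is pasted as "23", so the minutes part of the format has nothing to match. The sketch below uses base R only and a made-up two-row data frame to show that effect, along with an alternative that pads the time back to HH.MM before parsing.

```r
# Hypothetical data: times stored as numbers in HH.MM form
sleep <- data.frame(Date = c("12-01-2013", "12-02-2013"),
                    StartTime = c(23.00, 22.30))

# Pasting the raw numbers drops trailing zeros: "23" and "22.3"
raw_txt <- paste(sleep$Date, sleep$StartTime)

# Pad to two decimals (and two hour digits) so every value reads HH.MM
time_txt <- sprintf("%05.2f", sleep$StartTime)

# Parse the padded text; the time zone is fixed to UTC for reproducibility
sleep$StartDateTime <- as.POSIXct(
  strptime(paste(sleep$Date, time_txt), format = "%m-%d-%Y %H.%M", tz = "UTC")
)
```

With the padded text, a value such as 22.30 is read as 30 minutes past the hour rather than as "22.3".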
Monday, December 9, 2013
Sleep Segments
This is my first graph of the number of sleep segments per night. Together with sleep duration, this is one of the two most significant measurements of sleep quality. So far, it appears my sleep is improving: I am waking less often during the night.
R - Converting Column to Date
In my sleep data, I have a date column that was originally a character type. When I plotted it with ggplot(), every date was listed on the x axis, making the labels very crowded.
Then I added the following code:
dailytotals$EndDate <- as.Date(dailytotals$EndDate, "%m-%d-%Y")
which converted it to a date type. After that, ggplot automatically created tick marks and labels every 15 days (for data that spans two months).
Note that %Y (capitalized) recognizes a four-digit year, whereas %y (lower case) recognizes a two-digit year.
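A quick sketch of the two year codes, using made-up date strings. It also shows what happens when the format does not match the text, which is a common source of unexpected NA values.

```r
# Four-digit year needs %Y
d4 <- as.Date("12-09-2013", format = "%m-%d-%Y")

# Two-digit year needs %y (13 is read as 2013)
d2 <- as.Date("12-09-13", format = "%m-%d-%y")

# A format that does not match the separators gives NA instead of an error
bad <- as.Date("12/09/2013", format = "%m-%d-%Y")

class(d4)   # "Date", so plotting code can treat it as a continuous date axis
```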
Saturday, December 7, 2013
Initial Sleep Analysis
This is my first post sharing my initial analysis of my sleep data. I've collected data for two months now using Tasker on my Android phone. It collects the start and end date/time of each sleep period as well as why I woke up.
I've just started using R and love it. Here is the first graph showing the total number of hours slept each night. Weekends are colored red because I wanted to know if I slept better on the weekends (it doesn't seem to matter).
Here is the graph:
I had a few days in November where my Tasker action didn't save all the sleep intervals. On those days, I added estimated data. I may remove those dates altogether. There may be other dates that are incomplete. I have a couple of days with less than four hours and I can't remember if that is correct or not. That probably has an impact on the dip in sleep duration in November.
Generally, I am getting more sleep than I expected, averaging around 6.5 to 7 hours per night.
Next I need to factor in the number of sleep segments to come up with a measurement of sleep quality (greater duration and fewer segments represents better sleep). I also need to consider why I woke up. If I was interrupted by an alarm, for example, and that cut my sleep shorter than it would have been naturally, it's not necessarily fair to treat that as a dependent variable when considering factors that affect sleep quality.
Here is the R Script I used (via RStudio):
library(methods)
library(lubridate)
library(ggplot2) #needed for ggplot() below
sleep <- read.csv("C:/Users/xxxxxxx/Documents/R/Sleep/sleepdata.csv", header=TRUE)
sleep["EndDate"] <- NA
sleep$EndDate <- ifelse(sleep$EndTime > 12, format(mdy(as.character(sleep$Date)) + days(1), format="%m-%d-%Y"), format(mdy(as.character(sleep$Date)), format="%m-%d-%Y")) #fill column with date of the morning of each sleep period
sleep["Weekday"] <- NA
sleep$Weekday <- weekdays(mdy(sleep$EndDate)) #fill column with weekday of the sleep period
sleep #display data in case I want to review
dailytotals <- unique(within(sleep, {
Duration <- ave(Duration, EndDate, FUN=sum) #total hours for each EndDate
rm(Date,StartTime,EndTime,Status) #drop per-segment columns so unique() leaves one row per day
})) #create frame, one record per day with total number of hours slept
dailytotals #display in case I want to review
ggplot(dailytotals, aes(x=EndDate, y=Duration, group=1, colour=Weekday)) +
geom_point() + #plot points on graph
stat_smooth(level=.99) + #smoothed trend line with a 99% confidence band
theme(axis.text.x = element_text(angle = 90), axis.title.x = element_text(angle = 0), axis.title.y = element_text(angle = 0)) + #labels, turn x axis vertically
scale_color_manual(values=c(Saturday="red", Sunday="red", Monday="blue", Tuesday="blue", Wednesday="blue", Thursday="blue", Friday="blue")) #color code weekend days
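The core of the script above (assign each sleep segment to the morning it ends on, then total the hours per day) can also be sketched in base R without lubridate. This is an alternative under stated assumptions, not the script I ran: the data frame below is made up, EndTime is assumed to be in decimal hours, and aggregate() stands in for the unique(within(...)) approach.

```r
# Made-up segments: Date is the calendar date the segment was logged on,
# EndTime is the hour it ended (assumed decimal hours), Duration is in hours
sleep <- data.frame(
  Date     = c("12-06-2013", "12-07-2013", "12-07-2013", "12-08-2013", "12-09-2013"),
  EndTime  = c(23.5, 3.0, 7.0, 23.0, 6.5),
  Duration = c(1.5, 2.0, 3.5, 0.5, 6.0)
)

# Segments ending after noon belong to the night that finishes the next morning
logged <- as.Date(sleep$Date, format = "%m-%d-%Y")
sleep$EndDate <- logged + ifelse(sleep$EndTime > 12, 1, 0)

# One row per morning with the total hours slept
dailytotals <- aggregate(Duration ~ EndDate, data = sleep, FUN = sum)

# Weekend flag without relying on locale-specific day names
# (wday is 0 for Sunday and 6 for Saturday)
dailytotals$Weekend <- as.POSIXlt(dailytotals$EndDate)$wday %in% c(0, 6)
```

Keeping EndDate as a Date from the start also avoids the crowded x axis labels that a character column produces in ggplot.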