##Sat Oct  3 10:33:05 PDT 2009
## A more relaxed week: review of R for those who would rather
## nail down the basics and postpone the fancy bits about proportional
## hazard for another time.
##
## PLEASE TALK TO ME before you decide to do this exercise instead of the
## proportional hazard exercise.
##
## This file has a bunch of R exercises with the answers.  This is purely
## a review of material that we have covered earlier.  The intention is
## to provide some drills in the hopes of developing some intuition
## and thereby making R code less mysterious.
##
## Read this file into your ~/213/Week7/exercise7.r buffer by telling emacs:
## C-x i to "include a file" and then editing the suggested path to:
## ~carlm/213/PropHazII/reviewR.r
##
## The answers are in:
## http://www.demog.berkeley.edu/213/PropHazII/reviewRanswers.r
##
########################################################################
########################################################################
## Question
########################################################################
## 1) create a vector called "vector1" of the integers between 7 and 32
##    can you think of two ways of doing this?

## 2) create a vector called "vector2" consisting of all of the
##    elements of vector1 divided by 3

## 3) what will this expression produce and why?
vector1 * c(1,0)

## 4) create a vector cs.vector1 whose elements are the cumulative sum
##    of the elements of vector1.  In other words, the ith element of
##    cs.vector1 contains the sum of elements 1..i of vector1.

## 5) print vector1 to the screen in reverse order:

########################################################################
## 6) "modes" of vectors
########################################################################
## Vectors in R all have a "mode"; the most common modes are "numeric",
## "character" and "logical".
## Identify the mode of each of the following expressions:
1:7
(1:7)*T
(1:7) > T
(1:7)
(-7:T)
T:F
"1":"7"
(1:7) > "T"

########################################################################
## 7) Using square brackets [ ] to select elements from objects:
########################################################################
## First create rv2 as below, then write the expressions asked for below
rv2 <- rnorm(100)  ## creates a vector rv2 which has 100 random normals

## a) the last 5 elements of rv2
## b) the largest 5 elements of rv2
## c) the elements of rv2 which are greater than 2
## d) the index (that is, the position in the vector) of the 5 largest
##    elements of rv2

########################################################################
## 8) Arrays/Matrices:
########################################################################
## A vector with a dimension "attribute" is an
## array.  If the dimension attribute has length 2, it is also a
## "matrix".

## write three different expressions that will create a 20X5 matrix from a
## vector such as rv2 above
## (1) using the matrix() function:
## (2) using the array() function
## (3) by setting the dimension attribute

########################################################################
## 9) Using square bracket selection with matrices:
########################################################################
## use mat1 to do what is asked for below:
mat1 <- array(NA, c(10,8))

## change the 8th column of mat1 to be all 8s

## change the 2nd row of mat1 to be 2s BUT do not change the element
## in the 8th column

## change the 10th row of mat1 to 2,1,2,1 ...
## including the 8th column

## change every element whose row index is greater than its column
## index to 5

########################################################################
## 10) Loops
########################################################################
## write a for loop that creates a vector containing the numbers
## between 1 and 1000 which are divisible by 27 (HINT: remember the
## modulo operator "%%")

## write a while loop that does the same thing

## create a 10X9 matrix of NA and then use two for loops to populate
## each element of the matrix with the product of the row and the column
## number

########################################################################
## 11) Using apply() to operate on rows and columns
########################################################################
## Do the following pointless tasks using the matrix that you produced
## in the previous exercise (a 10X9 matrix where element i,j = i*j)

## use apply to find the column sums

## use apply (and selection with [ ]) to find the column sums of rows
## 5-9

## Divide each element of mat by its column mean.  In other words,
## write an expression that evaluates to a matrix with the same
## dimensions as mat and with each element being mat[i,j]/mean(mat[,j])

## write an expression that evaluates to a matrix with each element
## divided by the mean of the elements in its COLUMN (Why is this so
## much trickier than it looks?)

########################################################################
## 12) factors and logical expressions, the %in% operator and tapply()
########################################################################
## read in the ACS05 data from a few weeks ago
library(foreign)
acs <- read.dta(file='/data/commons/carlm/ACS05/ipumsACS05.dta')
acs.small <- acs[acs$serial %in% sort(unique(acs$serial))[1:1000],]

## What does this mean?
is.numeric(acs$age)
## and this:
levels(acs$age)

## Would this work to create a numeric age variable?
acs.small$Age <- as.numeric((acs.small$age))
## Hint: nope.  What's wrong with it?
acs.small$Age <- as.numeric((acs.small$age)) - 1

## write a logical expression that evaluates to TRUE if the
## observation (in acs.small) is under 18 and not a Child of the Head.

## use the expression you just wrote to find the number of people in
## the small sample who are under 18 and NOT listed as Child of Head

## use the %in% operator to find the intersection of two vectors
## EXAMPLE: How many households include both a parent and a child of the Head
sum(unique(acs$serial[acs$relate == "Child"]) %in%
    unique(acs$serial[acs$relate == "Parent"]))

## How many Children (of head) live in households that also contain a
## parent of the head?

## How many households contain at least 2 children of the Head?

## How many households include at least 2 children
## and a parent of the Head?

## use logical expressions and clever selection with [ ] to find the
## important demographic information asked for below
## Example: find the mean age difference between Head and Spouse in
## each household that has both
sp.serial <- unique(acs.small$serial[acs.small$relate == "Spouse"])
sel <- acs.small$serial %in% sp.serial
ageOfHead <- acs.small$Age[sel & acs.small$relate == "Head/Householder"]
ageOfSpouse <- acs.small$Age[sel & acs.small$relate == "Spouse"]
mean(ageOfHead - ageOfSpouse)

## How many same sex couples are there in the full acs sample?

## What is the average number of children in a female headed household?

## Assuming that we have a numeric variable called Age:
## use tapply() to find the mean age of males and females in acs.small

## use tapply() to find the mean age of spouses

## Use tapply() in addition to clever selection tricks and logical
## expressions to find these interesting demographic fun facts:

## What is the average size of a household that includes both Head and
## Spouse?

## Find the median age of the oldest children (of the head) in each hh in the
## acs.small sample
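
########################################################################
## A minimal sketch of the %in% / tapply() pattern used in section 12.
## This is an illustration only, not one of the official answers.
## ASSUMPTION: the data frame "toy" and every value in it are invented
## here; only the column names (serial, relate, Age) and the relate
## labels mimic acs.small, so the sketch runs without the ACS05 file.
########################################################################
toy <- data.frame(
    serial = c(1, 1, 1, 2, 2, 3),
    relate = c("Head/Householder", "Spouse", "Child",
               "Head/Householder", "Child", "Head/Householder"),
    Age    = c(40, 38, 10, 55, 20, 30),
    stringsAsFactors = FALSE)

## serial numbers of the households that contain a Spouse
sp.serial <- unique(toy$serial[toy$relate == "Spouse"])

## logical selector: TRUE for every person living in such a household
sel <- toy$serial %in% sp.serial

## tapply(X, INDEX, FUN) splits X by INDEX and applies FUN to each piece;
## here it counts the people in each selected household
hh.size <- tapply(toy$Age[sel], toy$serial[sel], length)

## the same idea, grouping by relationship to the Head instead
mean.age <- tapply(toy$Age, toy$relate, mean)

## Both results are named vectors, so mean(hh.size) would give an
## average household size and mean.age["Spouse"] picks out one group.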