Commit 402ac30c authored by Weigert, Andreas

added solution for 2nd tutorial

parent 5dad1056
@@ -54,34 +54,41 @@ summary(Shower_clean) # all rows having NA values in one or many columns are now
```{r Simple Statistics}
# Task 5
mean(Shower$ShowerTime) # returns NA because this vector contains NA values
mean(Shower_clean$ShowerTime) # the cleaned data frame has no NA values, so mean() returns a number
mean(Shower$ShowerTime, na.rm = TRUE) # alternatively, drop the NA values during the computation
# Task 6
var(Shower_clean$ShowerTime) # sample variance (denominator n - 1)
# Task 7
median(Shower_clean$ShowerTime)
# Task 8
sd(Shower_clean$ShowerTime) # sample standard deviation (square root of the sample variance)
# Task 9
# compare the variance with the square of the standard deviation
(sd(Shower_clean$ShowerTime)^2) == var(Shower_clean$ShowerTime) # exact comparison; may be FALSE due to floating-point rounding
# floating-point numbers should be compared with a tolerance:
all.equal((sd(Shower_clean$ShowerTime)^2), var(Shower_clean$ShowerTime))
# Task 10
max(Shower_clean$ShowerTime)
min(Shower_clean$ShowerTime)
# Task 11
quantile(Shower_clean$ShowerTime) # by default: minimum, quartiles and maximum (0%, 25%, 50%, 75%, 100%)
```
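The comparison in Task 9 can be illustrated on a small vector. The following is only a sketch: `shower_time` and its values are made up for illustration and are not taken from the shower data set. It checks that `var()` matches the sample formula with denominator n - 1, and shows why `all.equal()` is preferred over `==` for floating-point numbers.

```r
# Toy vector standing in for Shower_clean$ShowerTime (hypothetical values)
shower_time <- c(312, 458, 275, 390, 501, 344)

# Sample variance computed by hand (denominator n - 1)
n <- length(shower_time)
manual_var <- sum((shower_time - mean(shower_time))^2) / (n - 1)

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() when a single logical value is needed
same_as_var <- isTRUE(all.equal(manual_var, var(shower_time)))
same_as_sd2 <- isTRUE(all.equal(sd(shower_time)^2, var(shower_time)))

# Classic case where == fails but all.equal() succeeds
exact_equal <- (0.1 + 0.2) == 0.3
tolerant_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))
```

Using `isTRUE(all.equal(...))` is the safe form inside `if` conditions, because `all.equal()` alone does not return `FALSE` when the values differ.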
```{r Write and filter data}
# Task 12
# write.csv2() uses ";" as the field separator and "," as the decimal mark
write.csv2(x = Shower[Shower$Hh_ID == 8899, ], file = "../output/problematic_shower_data.csv")
# Task 13
write.csv2(x = Shower[Shower$Hh_ID != 8899, ], file = "../output/cleaned_shower_data.csv")
```
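After cleaning, the data is written to the "output" folder: the rows of household 8899 go to problematic_shower_data.csv and all remaining rows go to cleaned_shower_data.csv.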
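The filter-and-write step can be tried on a toy data frame. This is a minimal sketch under stated assumptions: the data frame `shower` and its values are hypothetical, a temporary file replaces the `../output` folder, and `row.names = FALSE` is added here (the tutorial solution keeps the default row names).

```r
# Toy data frame standing in for Shower (hypothetical values)
shower <- data.frame(
  Hh_ID = c(8899, 1001, 8899, 1002),
  ShowerTime = c(12.5, 300.2, 9.1, 412.7)
)

# Split the rows by household ID, as in Tasks 12 and 13
problematic <- shower[shower$Hh_ID == 8899, ]
cleaned <- shower[shower$Hh_ID != 8899, ]

# Write to a temporary file so the sketch does not depend on ../output existing
out_file <- tempfile(fileext = ".csv")
# write.csv2() uses ";" as the field separator and "," as the decimal mark
write.csv2(x = cleaned, file = out_file, row.names = FALSE)

# read.csv2() uses the same conventions, so the values survive the round trip
restored <- read.csv2(out_file)
```

Reading the file back with `read.csv()` instead would parse it incorrectly, since that function expects "," as the separator and "." as the decimal mark.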