Commit 65c9d0fe authored by Weigert, Andreas

added first part of tutorial 9

parent d479b63b
---
title: 'Tutorial 9: Classification'
output: html_notebook
editor_options:
  chunk_output_type: inline
---
This file is part of the lecture Business Intelligence & Analytics (EESYS-BIA-M), Information Systems and Energy Efficient Systems, University of Bamberg.
```{r Load libraries}
library(FSelector) #for feature selection
library(party) #for classification algorithm decision trees
library(class) #for classification algorithm kNN
library(e1071) #for classification algorithm SVM
library(randomForest) #for classification algorithm random forest
```
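The chunk above assumes that all five packages are already installed. A minimal sketch of a check that can be run beforehand, using only base R; the variable names `required` and `missing_pkgs` are illustrative and not part of the tutorial.

```{r Sketch check packages}
# Packages loaded in the chunk above
required <- c("FSelector", "party", "class", "e1071", "randomForest")

# requireNamespace() returns FALSE instead of raising an error
# when a package cannot be loaded
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
          ". Install them with install.packages() before continuing.")
}
```

If the message lists any packages, install them first; otherwise the `library()` calls stop with an error at the first missing package.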
```{r Load and prepare data}
# Load data
load("../data/classification.RData")

# Derive and investigate the dependent variable "number of residents".
# The survey top-codes counts as "5 oder mehr" (German for "5 or more"); map that to 5.
# as.character() guards against factor columns, for which ifelse() would
# otherwise return the internal level codes instead of the labels.
adults <- as.integer(ifelse(customers$residents.numAdult == "5 oder mehr",
                            "5", as.character(customers$residents.numAdult)))
children <- as.integer(ifelse(customers$residents.numChildren == "5 oder mehr",
                              "5", as.character(customers$residents.numChildren)))

# A missing child count is treated as "no children"
table(ifelse(is.na(children), adults, adults + children))

# Think in classes: household sizes above 5 are very rare,
# so group the resident counts into four classes
customers$pNumResidents <- sapply(ifelse(is.na(children), adults, adults + children),
                                  function(a) {
                                    # Check is.na() first so the comparison never sees NA
                                    if (is.na(a) || a == 0) {
                                      return(NA_character_)
                                    } else if (a == 1) {
                                      return("1 person")
                                    } else if (a == 2) {
                                      return("2 persons")
                                    } else if (a <= 5) {
                                      return("3-5 persons")
                                    } else {
                                      return(">5 persons")
                                    }
                                  })
customers$pNumResidents <- ordered(customers$pNumResidents,
                                   levels = c("1 person", "2 persons",
                                              "3-5 persons", ">5 persons"))
table(customers$pNumResidents)
```
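The class assignment above can also be written without an explicit `if`/`else` chain. The following is a minimal sketch using base R's `cut()`, which yields an ordered factor directly. It runs on a hypothetical toy vector (`toy_residents`, not taken from `classification.RData`), so the counts it prints say nothing about the real data.

```{r Sketch binning with cut}
# Hypothetical household sizes for illustration only
toy_residents <- c(0, 1, 2, 3, 5, 6, 9, NA)

# Right-closed intervals: (0,1], (1,2], (2,5], (5,Inf].
# A value of 0 falls outside the first interval and becomes NA,
# which matches the treatment of 0 in the chunk above.
toy_classes <- cut(toy_residents,
                   breaks = c(0, 1, 2, 5, Inf),
                   labels = c("1 person", "2 persons",
                              "3-5 persons", ">5 persons"),
                   ordered_result = TRUE)

table(toy_classes, useNA = "ifany")
```

Because `cut()` is vectorised, it avoids calling a function once per customer, and the class boundaries are stated in one place.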