Implement preprocessing of data
- Handle the original (raw) data before it is parsed as XML or CSV
- Nonterminals need to be creatable from labelling functions
- CORE::PARSEXML (~) required
- Example: messed-up Pergamon data
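
A minimal sketch of the first point, assuming preprocessing means a text-level cleanup step that runs before any parser sees the input. The names `preprocess_raw` and `parse_xml` are hypothetical and not part of the existing code base, and the sample string is invented for illustration; the actual defects in the Pergamon data are not described in this issue.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that are not allowed in XML 1.0 documents
# (everything below U+0020 except tab, line feed and carriage return).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def preprocess_raw(text: str) -> str:
    """Hypothetical cleanup applied to raw input before parsing."""
    text = text.lstrip("\ufeff")             # drop a leading byte-order mark
    text = _ILLEGAL_XML_CHARS.sub("", text)  # remove illegal control characters
    return text


def parse_xml(text: str) -> ET.Element:
    """Parse XML only after the raw text has been preprocessed."""
    return ET.fromstring(preprocess_raw(text))


# Invented sample input: a BOM and a stray vertical tab would make
# a strict XML parser reject the unprocessed string.
raw = "\ufeff<record><title>Perga\x0bmon</title></record>"
root = parse_xml(raw)
title = root.find("title").text
```

Keeping the cleanup as a separate function operating on plain text would allow the same step to run ahead of a CSV parser as well.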
(from redmine: issue id 751, created on 2018-01-22 by tgradl)