Protein Data Analysis
DKFZ coding task

- data normalization, data imputation
- confounding factor identification
- feature selection for predictive model
- pathway-based enrichment analysis
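
As a rough illustration of the first item (data normalization and data imputation), here is a minimal sketch in plain Python. The outline does not say which methods the task uses, so everything here is an assumption made for illustration: the log2 transform, per-sample median centering, row-minimum imputation, the function names `log2_median_normalize` and `impute_row_min`, and the toy intensity values.

```python
import math
from statistics import median


def log2_median_normalize(matrix):
    """Log2-transform intensities and median-center each sample (column).

    `matrix` is a list of rows (one per protein); each row holds one
    intensity per sample, with None marking a missing value.
    Assumption: every sample has at least one observed value.
    """
    logged = [[math.log2(v) if v is not None else None for v in row]
              for row in matrix]
    n_samples = len(logged[0])
    for j in range(n_samples):
        observed = [row[j] for row in logged if row[j] is not None]
        shift = median(observed)  # per-sample median of observed values
        for row in logged:
            if row[j] is not None:
                row[j] -= shift
    return logged


def impute_row_min(matrix):
    """Fill missing values with the smallest observed value of that protein.

    Assumption: values are missing because of low abundance, so a low
    fill value is reasonable. Rows with no observed value are left as-is.
    """
    imputed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        if not observed:
            imputed.append(list(row))
            continue
        fill = min(observed)
        imputed.append([fill if v is None else v for v in row])
    return imputed


# Hypothetical toy data: 3 proteins (rows) x 3 samples (columns).
raw = [
    [8.0, 16.0, None],
    [4.0, 8.0, 2.0],
    [2.0, 4.0, 1.0],
]
normalized = log2_median_normalize(raw)
complete = impute_row_min(normalized)
```

Normalizing before imputing keeps the fill values on the same scale as the centered data. Other choices (quantile normalization, k-nearest-neighbour imputation) may suit a real dataset better, depending on why values are missing.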