
Using domain knowledge as a source of additional information for machine learning algorithms is not a new idea. It is of interest mainly because humans sometimes outperform learning algorithms. This is especially true in areas such as medicine, where multiple complex patterns need to be interpreted. Integrated domain knowledge can be indispensable, especially when the data at hand are noisy or incomplete, when working with rare events, or when the domain problem is hard to define and solve. Efforts are therefore invested in making machine learning algorithms more effective and efficient ([11] Holzinger, 2016). One can alter the goal function of a learning algorithm in order to solve the problem directly, i.e., instead of minimizing a penalty function, one can minimize a cost in which different errors carry different costs. Another way to include domain knowledge is to impose constraints on the learning model, i.e., to include regularization ([12] Kim & Xian, 2010). Domain knowledge is also used to obtain new data through so-called virtual examples ([10] Vukicevic et al., 2016). However, domain knowledge is mostly used in the form of heuristics or exact rules for feature extraction or feature selection.

Enhancing logistic regression with medical domain knowledge in the form of a hierarchy has already given better results than plain logistic regression. One approach extracted and selected attributes using heuristics ([13] Ristoski & Paulheim, 2014; [14] Radovanovic et al., 2015). These papers showed that the extracted attributes not only improve predictive performance but also provide better stability, because attribute selection is performed on more general attributes that represent a broader population. Patterns in the data therefore have higher support and confidence. Additionally, more general attributes are interpretable to medical experts.
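
The idea of minimizing a cost in which different errors carry different costs can be illustrated with a small sketch. It is not taken from any of the cited papers: the function name `cost_weighted_log_loss` and the cost values are hypothetical, and a log loss is assumed as the underlying penalty purely for illustration.

```python
import math


def cost_weighted_log_loss(y_true, p_pred, cost_fn=5.0, cost_fp=1.0):
    """Average log loss in which each error type carries its own cost.

    cost_fn scales the loss on positive cases (penalizing missed positives),
    cost_fp scales the loss on negative cases (penalizing false alarms).
    Both values are hypothetical and would come from domain experts.
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() is defined
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -cost_fn * math.log(p)
        else:
            total += -cost_fp * math.log(1.0 - p)
    return total / len(y_true)


# With equal costs this reduces to the ordinary log loss; with cost_fn > cost_fp
# an uncertain prediction on a positive case is penalized more heavily.
plain = cost_weighted_log_loss([1, 0], [0.5, 0.5], cost_fn=1.0, cost_fp=1.0)
weighted = cost_weighted_log_loss([1, 0], [0.5, 0.5], cost_fn=5.0, cost_fp=1.0)
```

In a medical setting, one might set the cost of a missed positive higher than the cost of a false alarm, but the right ratio is a domain decision rather than something the algorithm can infer.
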
It is also worth noting that the attribute-extraction approaches achieved better predictive performance than traditional and modern attribute selection techniques. The main reason was the use of more general attributes that represent general medical concepts. These attributes were extracted in an unsupervised manner, namely with the logical OR operator. Another approach was to utilize the domain hierarchy in the form of regularization ([15] Kamkar et al., 2015; [16] Jovanovic et al., 2016). Besides improving the generalizability of the predictive model, this approach leads to improved interpretability of the logistic regression model. Additionally, these models can be used to gain further insight into the causes of readmission and as risk indicators.
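
As a rough illustration of extracting more general attributes with a logical OR, the sketch below rolls binary attributes up to their parent concept in a hierarchy. The cited papers may do this differently. The hierarchy, the attribute names, and the function `aggregate_by_parent` are hypothetical, and a single level of roll-up is assumed.

```python
def aggregate_by_parent(records, parent_of):
    """Roll specific binary attributes up to their parent concept.

    records   : list of dicts mapping attribute name -> 0/1
    parent_of : dict mapping a specific attribute -> its more general parent
                (hypothetical hierarchy; attributes without a parent are kept)
    A parent attribute is 1 if any of its children is 1 (logical OR).
    """
    aggregated = []
    for rec in records:
        row = {}
        for name, present in rec.items():
            parent = parent_of.get(name, name)
            row[parent] = int(bool(row.get(parent, 0)) or bool(present))
        aggregated.append(row)
    return aggregated


# Hypothetical two-level hierarchy of diagnoses.
parent_of = {
    "asthma_mild": "asthma",
    "asthma_severe": "asthma",
    "influenza_a": "influenza",
}
patients = [
    {"asthma_mild": 0, "asthma_severe": 1, "influenza_a": 0},
    {"asthma_mild": 0, "asthma_severe": 0, "influenza_a": 1},
]
general = aggregate_by_parent(patients, parent_of)
```

Because each general attribute covers every patient who has any of its specific children, it is present in a larger share of the population, which is consistent with the higher support and confidence described above.
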

Post Author: admin

