Background and Objective: Measuring health literacy has become increasingly important because of its association with health status and healthcare outcomes. At least 133 tools for measuring health literacy have been developed. The HLS-EU-Q47 is a questionnaire of 47 questions covering 12 sub-dimensions, developed by the European Health Literacy Consortium. Many countries in Europe and Asia have used the HLS-EU-Q47 to measure health literacy in the general public. Indonesia has conducted a general health literacy survey using the HLS-EU-Q47 but encountered difficulties because the interviews were time-consuming. A shorter version of the HLS-EU-Q47 is therefore needed for health literacy research in Indonesia. This paper reports the results of feature reduction to develop a short Indonesian version of the HLS-EU questionnaire and compares the accuracy of the resulting model with that of other short forms, such as the HLS-EU-SQ16 and HLS-SF12. Method: The analysis was performed on a population-based dataset from the Indonesia-Semarang Health Literacy Survey, which included specific target variables for the classification of health literacy level. All attributes were assessed as potential targets in the models derived from the full dataset and its subsets. Feature selection with a genetic algorithm was used as the filter, together with cross-validation for validation and k-nearest neighbor (k-NN) for classification. The predictive accuracy for health literacy level and the complexity of the models based on the reduced datasets were compared among the methods and with other short versions such as the HLS-EU-SQ16 and HLS-SF12. Result: The accuracy of the existing short-form models was 90.64% for the HLS-EU-SQ16 and 88.67% for the HLS-SF12. This study proposes a model with 10 features as the construct of a short Indonesian version (proposed as the HLS-EU-SQ10-IDN), since the model had higher accuracy than the HLS-SF12 while using fewer features to measure the general health literacy index. The short version, however, covered only some of the 12 dimensions of the full questionnaire. Conclusion: A data mining technique combining genetic-algorithm feature selection with the k-NN algorithm was applied to develop a short version of the questionnaire and proved to have better accuracy than the short version developed with traditional statistical techniques.
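The combination of genetic-algorithm feature selection, cross-validation, and k-NN classification described in the Method can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the genetic algorithm's parameters or the software used, so the population size, number of generations, mutation rate, fixed-size chromosome, and all function names (`knn_predict`, `cv_accuracy`, `select_items`, `make_synthetic`) are assumptions made for illustration. The data are synthetic and stand in for the survey responses, so the resulting accuracy says nothing about the reported results.

```python
import random

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def cv_accuracy(X, y, items, folds=5, k=5):
    """Cross-validated k-NN accuracy using only the selected item columns."""
    sub = [[row[j] for j in items] for row in X]
    correct = 0
    for f in range(folds):
        train_idx = [i for i in range(len(sub)) if i % folds != f]
        tx = [sub[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        # Rows whose index is congruent to f (mod folds) form the test fold.
        correct += sum(knn_predict(tx, ty, sub[i], k) == y[i]
                       for i in range(f, len(sub), folds))
    return correct / len(sub)

def select_items(X, y, n_keep=10, pop_size=12, generations=6, seed=0):
    """Genetic search over fixed-size item subsets; fitness is CV accuracy."""
    rng = random.Random(seed)
    n_items = len(X[0])
    pop = [tuple(sorted(rng.sample(range(n_items), n_keep))) for _ in range(pop_size)]
    cache = {}

    def fitness(chrom):
        if chrom not in cache:
            cache[chrom] = cv_accuracy(X, y, chrom)
        return cache[chrom]

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]              # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, n_keep))  # crossover: draw from both parents
            if rng.random() < 0.3:                 # mutation: swap one item
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([j for j in range(n_items) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

def make_synthetic(n_resp=100, n_items=47, seed=1):
    """Hypothetical 4-point responses driven by one latent trait (not survey data)."""
    rng = random.Random(seed)
    X = []
    for _ in range(n_resp):
        ability = rng.uniform(1, 4)
        X.append([min(4, max(1, round(rng.gauss(ability, 0.8)))) for _ in range(n_items)])
    # Assumed two-level target: mean score over all 47 items at or above 2.5.
    y = [int(sum(row) / n_items >= 2.5) for row in X]
    return X, y

if __name__ == "__main__":
    X, y = make_synthetic()
    best_items, best_acc = select_items(X, y)
    print("selected items:", best_items)
    print("cross-validated accuracy:", round(best_acc, 3))
```

Fixing the chromosome to exactly 10 items mirrors the size of the proposed short form; a bit-mask chromosome with a penalty on the number of selected items would be an equally plausible design when the target length is not known in advance.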
- Data mining
- Design Questionnaire
- Health Literacy
- Multifactor dimensional reduction
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics