# Evaluation of Biases in Self-reported Demographic and Psychometric Information: Traditional versus Facebook-based Surveys

## Abstract

Social media in scientific research offer a unique digital observatory of human behaviours and hence great opportunities to conduct research at large scale, answering complex sociodemographic questions. We focus on the identification and assessment of biases in surveys administered through social media. This study aims to shed light on population, self-selection, and behavioural biases by empirically comparing the consistency of self-reported information, including demographic and psychometric attributes, collected through traditionally administered versus social media administered questionnaires. We engaged a demographically representative cohort of young adults in Italy (approximately 4,000 participants) in taking a traditionally administered online survey and then, after one year, we invited them to use our ad hoc Facebook application (988 accepted), where they filled in part of the initial survey. We assess the statistically significant differences indicating population, self-selection, and behavioural biases due to the different context in which the questionnaire is administered. Our findings suggest that surveys administered on Facebook do not exhibit major biases with respect to traditionally administered surveys, either in terms of demographics or personality traits. Loyalty, authority, and social binding values were higher on the Facebook platform, probably due to the platform's intrinsic social character. We conclude that Facebook apps are valid research tools for administering demographic and psychometric surveys, provided that the entailed biases are taken into consideration. We contribute to the characterisation of Facebook apps as a valid scientific tool to administer demographic and psychometric surveys, and to the assessment of population, self-selection, and behavioural biases in the collected data.

### Prerequisites

- Python 2.7
- pandas
- scipy
- scikit-learn (sklearn)

### Codes

The file Statistics-population-self-selection-biases.ipynb generates the statistics reported in the paper for Tables 1 and 2. It also makes the comparisons reported in the first two columns of Table 3.

The file Behavioural-Biases-and-Bootstrapping.ipynb generates the results reported in the third column of Table 3.

### Data

- Survey_data_osf.csv contains the data of the original traditional survey.
- ID_list.csv maps the IDs of the users in the original study to their respective IDs in the Facebook study.
- FB_MFT.csv contains the self-reported answers of the participants in the Facebook study regarding the moral foundations theory.
- FB_Big5.csv contains the self-reported answers of the participants in the Facebook study regarding the personality traits test.

## Authors

* **Kyriaki Kalimeri** - [GitHub](https://github.com/kkyriaki/)

## License

This project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details.

## Acknowledgments

Please cite:

Kalimeri, K., Beiró, M. G., Bonanomi, A., Rosina, A., & Cattuto, C. (2019). Evaluation of Biases in Self-reported Demographic and Psychometric Information: Traditional versus Facebook-based Surveys. arXiv preprint arXiv:1901.07876.

    @misc{kalimeri2019evaluation,
      title={Evaluation of Biases in Self-reported Demographic and Psychometric Information: Traditional versus Facebook-based Surveys},
      author={Kyriaki Kalimeri and Mariano G. Beiro and Andrea Bonanomi and Alessandro Rosina and Ciro Cattuto},
      year={2019},
      eprint={1901.07876},
      archivePrefix={arXiv},
      primaryClass={cs.CY}
    }
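
## Example: linking the two studies

The Data section describes ID_list.csv as a map between the IDs of the original survey and those of the Facebook study. The sketch below illustrates how such a map could be used to pair each participant's two sets of answers and to bootstrap the mean paired difference. It is not taken from the notebooks and may differ from the procedure they use. The column names (`survey_id`, `fb_id`, `loyalty`) and the toy values are assumptions made for illustration only; check the actual CSV headers before adapting it.

```python
import numpy as np
import pandas as pd


def link_studies(survey, id_map, fb, survey_id="survey_id", fb_id="fb_id"):
    """Pair each participant's traditional-survey row with their Facebook row.

    `survey_id` and `fb_id` are placeholder column names (assumptions),
    not the real headers of the CSV files.
    """
    linked = survey.merge(id_map, on=survey_id, how="inner")
    # Columns present in both tables get the suffixes _survey and _fb.
    return linked.merge(fb, on=fb_id, how="inner", suffixes=("_survey", "_fb"))


def paired_bootstrap_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of the paired differences b - a."""
    diff = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    rng = np.random.RandomState(seed)
    # Resample participants with replacement, n_boot times.
    idx = rng.randint(0, len(diff), size=(n_boot, len(diff)))
    means = diff[idx].mean(axis=1)
    low, high = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return diff.mean(), low, high


# Toy frames standing in for Survey_data_osf.csv, ID_list.csv and FB_MFT.csv.
# All values below are made up.
survey = pd.DataFrame({"survey_id": [1, 2, 3, 4], "loyalty": [2.0, 3.0, 2.5, 4.0]})
id_map = pd.DataFrame({"survey_id": [1, 2, 4], "fb_id": ["a", "b", "d"]})
fb = pd.DataFrame({"fb_id": ["a", "b", "d"], "loyalty": [2.5, 3.5, 4.0]})

linked = link_studies(survey, id_map, fb)
mean_diff, low, high = paired_bootstrap_ci(linked["loyalty_survey"], linked["loyalty_fb"])
print("mean difference: {:.3f}, 95% interval: [{:.3f}, {:.3f}]".format(mean_diff, low, high))
```

With the real files, the toy frames would be replaced by `pd.read_csv(...)` calls on the three CSVs. The inner joins keep only participants who appear in both studies, which is what a within-subject comparison requires.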