C18 - Methodological Issues: General

Results 1 to 3 of 3:

Pitfalls of Quantitative Surveys Online

Iva Pecáková

Acta Oeconomica Pragensia 2016, 24(6):3-15 | DOI: 10.18267/j.aop.560

With the development of the Internet over the last two decades, its use in all phases of field surveys has grown very quickly. It reduces costs, allows relatively large data sets to be explored, and enables effective use of a variety of research tools. Academic research is more reserved about online surveys. The main reason is the demand for data quality: Internet surveys do not meet it and thus do not allow objective conclusions to be drawn about the populations surveyed.
Unqualified use of the Internet may significantly affect the data and the information obtained from their analysis. A problematic definition of the population under investigation may result in coverage error. Its existence can be shown, for example, by comparing the total and the Internet population of the Czech Republic, the total and the Internet population of Czech households, etc. Representing the population through an online panel may cause bias, depending on how the panel is created. A relatively new source of error in online surveys is the existence of "professional" respondents.
The method of sampling from a population or an online panel can produce a sample that is not representative and allows inference to the population only in a very limited way, or not at all. Even probability sampling, however, can be problematic if it is affected by a high non-response rate. The aim of this paper is to summarise the possible sources of bias associated with any sample survey, and also to draw attention to those that are relatively new and are specific to quantitative surveys conducted online.

Data representativeness problem in credit scoring

Josef Ditrich

Acta Oeconomica Pragensia 2015, 23(3):3-17 | DOI: 10.18267/j.aop.472

When building models, it is common to split the whole dataset into a development and a validation sample. In some cases, using random sampling instead of stratified sampling can lead to a loss of representativeness in the resulting samples. In such cases, a model built on these data gives different or unexpected results when its performance is measured on the validation sample. In business, a lack of representativeness can cause interpretative problems and can have a huge financial impact when a biased model is involved in the credit granting process. The aim of this paper is to examine and understand why representativeness should be checked before modelling starts. The paper deals with methods of identifying selection bias over time. It recommends using three tests as a standard part of the data preparation process.
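The difference between a stratified split and a simple random split can be illustrated with a minimal sketch. The portfolio below (1000 loans with a 5% default rate), the function names `stratified_split` and `bad_rate`, and the 30% validation fraction are all hypothetical choices made for illustration; the sketch does not reproduce the three tests recommended in the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, valid_fraction=0.3, seed=0):
    """Split indices into development and validation sets,
    sampling within each label group so label shares are preserved."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    dev, val = [], []
    for idx in by_label.values():
        rng.shuffle(idx)
        n_val = round(len(idx) * valid_fraction)
        val.extend(idx[:n_val])
        dev.extend(idx[n_val:])
    return sorted(dev), sorted(val)


def bad_rate(labels, idx):
    """Share of defaults (label 1) among the selected indices."""
    return sum(labels[i] for i in idx) / len(idx)


# Hypothetical portfolio: 50 defaults (1) and 950 good loans (0).
labels = [1] * 50 + [0] * 950

dev_idx, val_idx = stratified_split(labels)

# Simple random split of the same size, for comparison.
rng = random.Random(1)
shuffled = list(range(len(labels)))
rng.shuffle(shuffled)
rand_val_idx = shuffled[:len(val_idx)]

print("stratified validation bad rate:", bad_rate(labels, val_idx))
print("random validation bad rate:    ", bad_rate(labels, rand_val_idx))
```

With stratification, the default rate in both samples equals the overall 5% by construction, whereas the simple random split only matches it on average and can drift in any single draw, especially when defaults are rare.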

Problem of Missing Data in Questionnaire Surveys

Iva Pecáková

Acta Oeconomica Pragensia 2014, 22(6):66-78 | DOI: 10.18267/j.aop.459

Almost any data set can encounter the problem of missing data; it is well known for phenomena relating to human populations that are studied in sample surveys. In recent decades, the issue of missing data has received considerable attention, because simply omitting units with missing data from the analysis may lead to erroneous conclusions. One approach accepts the existence of missing data and adjusts the selection probabilities of units by the probabilities of obtaining data on them, which leads to the construction and use of weights. A different solution lies in filling in the missing data. Using the arithmetic mean or a regression function, previously recommended for this purpose, leads at least to an underestimation of the variability of the relevant variables; furthermore, it is applicable only to measurable variables. Alternative approaches to missing data are based on the likelihood of the collected data under an assumed model. Again, two directions of development can be distinguished: estimating population parameters without imputing missing data (EM algorithm) on the one hand, and multiple imputation methods on the other.
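The underestimation of variability caused by mean imputation can be seen in a small sketch. The observed values and the number of missing units below are made up for illustration and are not taken from the paper.

```python
import statistics

# Hypothetical variable: four observed values, four missing.
observed = [2.0, 4.0, 6.0, 8.0]
n_missing = 4

# Mean imputation: fill every missing value with the observed mean.
mean_obs = statistics.mean(observed)
imputed = observed + [mean_obs] * n_missing

# Sample variances before and after imputation.
var_obs = statistics.variance(observed)
var_imp = statistics.variance(imputed)

print("mean before / after:    ", mean_obs, statistics.mean(imputed))
print("variance before / after:", var_obs, var_imp)
```

The mean is unchanged, but the imputed values add nothing to the sum of squared deviations while increasing the divisor, so with n observed and m imputed values the sample variance shrinks by the factor (n - 1) / (n + m - 1), here 3/7.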