Person:
Alcalá Quintana, Rocío

First Name

Rocío

Last Name

Alcalá Quintana

URI

https://hdl.handle.net/20.500.14352/80039

Affiliation

Universidad Complutense de Madrid

Faculty / Institute

Psicología

Department

Psicobiología y Metodología en Ciencias del Comportamiento

Area

Metodología de las Ciencias del Comportamiento

Identifiers

Full item page

Search Results

Now showing 1 - 10 of 22

Testing equivalence with repeated measures: tests of the difference model of two-alternative forced-choice performance.
(The Spanish journal of psychology, 2011) García Pérez, Miguel Angel; Alcalá Quintana, Rocío
Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope in case that regression is forced through the origin). This paper shows that this approach renders highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and in subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment with which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradict the conclusions raised through inadequate application of regression analyses.
Visual and Auditory Components in the Perception of Asynchronous Audiovisual Speech
(i-Perception, 2015) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío
Research on asynchronous audiovisual speech perception manipulates experimental conditions to observe their effects on synchrony judgments. Probabilistic models establish a link between the sensory and decisional processes underlying such judgments and the observed data, via interpretable parameters that allow testing hypotheses and making inferences about how experimental manipulations affect such processes. Two models of this type have recently been proposed, one based on independent channels and the other using a Bayesian approach. Both models are fitted here to a common data set, with a subsequent analysis of the interpretation they provide about how experimental manipulations affected the processes underlying perceived synchrony. The data consist of synchrony judgments as a function of audiovisual offset in a speech stimulus, under four within-subjects manipulations of the quality of the visual component. The Bayesian model could not accommodate asymmetric data, was rejected by goodness-of-fit statistics for 8/16 observers, and was found to be nonidentifiable, which renders uninterpretable parameter estimates. The independent-channels model captured asymmetric data, was rejected for only 1/16 observers, and identified how sensory and decisional processes mediating asynchronous audiovisual speech perception are affected by manipulations that only alter the quality of the visual component of the speech signal.
Bayesian adaptive estimation of arbitrary points on a psychometric function
(The British journal of mathematical and statistical psychology, 2007) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío
Bayesian adaptive methods have been extensively used in psychophysics to estimate the point at which performance on a task attains arbitrary percentage levels, although the statistical properties of these estimators have never been assessed. We used simulation techniques to determine the small-sample properties of Bayesian estimators of arbitrary performance points, specifically addressing the issues of bias and precision as a function of the target percentage level. The study covered three major types of psychophysical task (yes-no detection, 2AFC discrimination and 2AFC detection) and explored the entire range of target performance levels allowed for by each task. Other factors included in the study were the form and parameters of the actual psychometric function Psi, the form and parameters of the model function M assumed in the Bayesian method, and the location of Psi within the parameter space. Our results indicate that Bayesian adaptive methods render unbiased estimators of any arbitrary point on psi only when M=Psi, and otherwise they yield bias whose magnitude can be considerable as the target level moves away from the midpoint of the range of Psi. The standard error of the estimator also increases as the target level approaches extreme values whether or not M=Psi. Contrary to widespread belief, neither the performance level at which bias is null nor that at which standard error is minimal can be predicted by the sweat factor. A closed-form expression nevertheless gives a reasonable fit to data describing the dependence of standard error on number of trials and target level, which allows determination of the number of trials that must be administered to obtain estimates with prescribed precision.
Stopping rules in Bayesian adaptive threshold estimation
(Spatial Vision, 2005) Alcalá Quintana, Rocío; García Pérez, Miguel Ángel
Threshold estimation with sequential procedures is justifiable on the surmise that the index used in the so-called dynamic stopping rule has diagnostic value for identifying when an accurate estimate has been obtained. The performance of five types of Bayesian sequential procedure was compared here to that of an analogous fixed-length procedure. Indices for use in sequential procedures were: (1) the width of the Bayesian probability interval, (2) the posterior standard deviation, (3) the absolute change, (4) the average change, and (5) the number of sign fluctuations. A simulation study was carried out to evaluate which index renders estimates with less bias and smaller standard error at lower cost (i.e. lower average number of trials to completion), in both yes–no and two-alternative forced-choice (2AFC) tasks. We also considered the effect of the form and parameters of the psychometric function and its similarity with themodel function assumed in the procedure. Our results show that sequential procedures do not outperform fixed-length procedures in yes–no tasks. However, in 2AFC tasks, sequential procedures not based on sign fluctuations all yield minimally better estimates than fixed-length procedures, although most of the improvement occurs with short runs that render undependable estimates and the differences vanish when the procedures run for a number of trials (around 70) that ensures dependability. Thus, none of the indices considered here (some of which are widespread) has the diagnostic value that would justify its use. In addition, difficulties of implementation make sequential procedures unfit as alternatives to fixed-length procedures.
Interval bias in 2AFC detection tasks: sorting out the artifacts
(Attention, Perception, & Psychophysics, 2011) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío
Proportion correct in two-alternative forcedchoice (2AFC) detection tasks often varies when the stimulus is presented in the first or in the second interval.Reanalysis of published data reveals that these order effects (or interval bias) are strong and prevalent, refuting the standard difference model of signal detection theory. Order effects are commonly regarded as evidence that observers use an off-center criterion under the difference model with bias. We consider an alternative difference model with indecision whereby observers are occasionally undecided and guess with some bias toward one of the response options. Whether or not the data show order effects, the two models fit 2AFC data indistinguishably, but they yield meaningfully different estimates of sensory parameters. Under indeterminacy as to which model governs 2AFC performance, parameter estimates are suspect and potentially misleading. The indeterminacy can be circumvented by modifying the response format so that observers can express indecision when needed. Reanalysis of published data collected in this way lends support to the indecision model. We illustrate alternative approaches to fitting psychometric functions under the indecision model and discuss designs for 2AFC experiments that improve the accuracy of parameter estimates, whether or not order effects are apparent in the data.
Fixed vs. variable noise in 2AFC contrast discrimination: lessons from psychometric functions.
(Spatial vision, 2009) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío
Recent discussion regarding whether the noise that limits 2AFC discrimination performance is fixed or variable has focused either on describing experimental methods that presumably dissociate the effects of response mean and variance or on reanalyzing a published data set with the aim of determining how to solve the question through goodness-of-fit statistics. This paper illustrates that the question cannot be solved by fitting models to data and assessing goodness-of-fit because data on detection and discrimination performance can be indistinguishably fitted by models that assume either type of noise when each is coupled with a convenient form for the transducer function. Thus, success or failure at fitting a transducer model merely illustrates the capability (or lack thereof) of some particular combination of transducer function and variance function to account for the data, but it cannot disclose the nature of the noise. We also comment on some of the issues that have been raised in recent exchange on the topic, namely, the existence of additional constraints for the models, the presence of asymmetric asymptotes, the likelihood of history-dependent noise, and the potential of certain experimental methods to dissociate the effects of response mean and variance.
Reminder and 2AFC tasks provide similar estimates of the difference limen: a reanalysis of data from Lapid, Ulrich, and Rammsayer (2008) and a discussion of Ulrich and Vorberg (2009)
(Attention, perception & psychophysics, 2010) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío
Lapid, Ulrich, and Rammsayer (2008) reported that estimates of the difference limen (DL) from a two-alternative forced choice (2AFC) task are higher than those obtained from a reminder task. This article reanalyzes their data in order to correct an error in their estimates of the DL from 2AFC data. We also extend the psychometric functions fitted to data from both tasks to incorporate an extra parameter that has been shown to allow obtaining accurate estimates of the DL that are unaffected by lapses. Contrary to Lapid et al.'s conclusion, our reanalysis shows that DLs estimated with the 2AFC task are only minimally (and not always significantly) larger than those estimated with the reminder task. We also show that their data are contaminated by response bias, and that the small remaining difference between DLs estimated with 2AFC and reminder tasks can be reasonably attributed to the differential effects that response bias has in either task as they were defined in Lapid et al.'s experiments. Finally, we discuss a novel approach presented by Ulrich and Vorberg (2009) for fitting psychometric functions to 2AFC discrimination data.
A Comparison of Anchor-Item Designs for the Concurrent Calibration of Large Banks of Likert-Type Items
(Applied Psychological Measurement, 2010) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío; García Cueto, Eduardo
Current interest in measuring quality of life is generating interest in the construction of computerized adaptive tests (CATs) with Likert-type items. Calibration of an item bank for use in CAT requires collecting responses to a large number of candidate items. However, the number is usually too large to administer to each subject in the calibration sample. The concurrent anchor-item design solves this problem by splitting the items into separate subtests, with some common items across subtests; then administering each subtest to a different sample; and finally running estimation algorithms once on the aggregated data array, from which a substantial number of responses are then missing. Although the use of anchor-item designs is widespread, the consequences of several configuration decisions on the accuracy of parameter estimates have never been studied in the polytomous case. The present study addresses this question by simulation, comparing the outcomes of several alternatives on the configuration of the anchor-item design. The factors defining variants of the anchor-item design are (a) subtest size, (b) balance of common and unique items per subtest, (c) characteristics of the common items, and (d) criteria for the distribution of unique items across subtests. The results of this study indicate that maximizing accuracy in item parameter recovery requires subtests of the largest possible number of items and the smallest possible number of common items; the characteristics of the common items and the criterion for distribution of unique items do not affect accuracy.
Fitting model-based psychometric functions to simultaneity and temporal-order judgment data: MATLAB and R routines
(Behavior research methods, 2013) Alcalá Quintana, Rocío; García Pérez, Miguel Ángel
Research on temporal-order perception uses temporal-order judgment (TOJ) tasks or synchrony judgment (SJ) tasks in their binary SJ2 or ternary SJ3 variants. In all cases, two stimuli are presented with some temporal delay, and observers judge the order of presentation. Arbitrary psychometric functions are typically fitted to obtain performance measures such as sensitivity or the point of subjective simultaneity, but the parameters of these functions are uninterpretable. We describe routines in MATLAB and R that fit model-based functions whose parameters are interpretable in terms of the processes underlying temporal-order and simultaneity judgments and responses. These functions arise from an independent-channels model assuming arrival latencies with exponential distributions and a trichotomous decision space. Different routines fit data separately for SJ2, SJ3, and TOJ tasks, jointly for any two tasks, or also jointly for the three tasks (for common cases in which two or even the three tasks were used with the same stimuli and participants). Additional routines provide bootstrap p-values and confidence intervals for estimated parameters. A further routine is included that obtains performance measures from the fitted functions. An R package for Windows and source code of the MATLAB and R routines are available as Supplementary Files.
Psychometric functions for detection and discrimination with and without flankers.
(Attention, perception & psychophysics, 2011) García Pérez, Miguel Ángel; Alcalá Quintana, Rocío; Woods, Russell L; Peli, Eli
Recent studies have reported that flanking stimuli broaden the psychometric function and lower detection thresholds. In the present study, we measured psychometric functions for detection and discrimination with and without flankers to investigate whether these effects occur throughout the contrast continuum. Our results confirm that lower detection thresholds with flankers are accompanied by broader psychometric functions. Psychometric functions for discrimination reveal that discrimination thresholds with and without flankers are similar across standard levels, and that the broadening of psychometric functions with flankers disappears as standard contrast increases, to the point that psychometric functions at high standard levels are virtually identical with or without flankers. Threshold-versus-contrast (TvC) curves with flankers only differ from TvC curves without flankers in occasional shallower dippers and lower branches on the left of the dipper, but they run virtually superimposed at high standard levels. We discuss differences between our results and other results in the literature, and how they are likely attributed to the differential vulnerability of alternative psychophysical procedures to the effects of presentation order. We show that different models of flanker facilitation can fit the data equally well, which stresses that succeeding at fitting a model does not validate it in any sense.