RECENCY EFFECTS IN PRIMARY-AGE CHILDREN AND COLLEGE STUDENTS

We investigate the evolution of probabilistic reasoning with age and some related biases, such as the negative/positive recency effects. Primary school children and college students were presented with probability tasks in which they were asked to estimate the likelihood of the next occurring event after a sequence of independent outcomes. Results indicate that older children perform better than younger children and college students. Concerning biases, the positive recency effect decreases with age whereas no age-related differences are found for the negative recency effect. Theoretical and educational implications of results are discussed.


Cognitive heuristics
Probabilistic thinking has been found to be related to cognitive heuristics. Heuristics sometimes lead to the desired result but their activation might produce systematic errors (Tversky & Kahneman 1974). The representativeness heuristic is well-known to affect probabilistic reasoning: people estimate the likelihood of an event by judging how well it represents its parent population; this heuristic can cause some predictable errors in certain situations.
The negative recency effect (or gambler's fallacy) is another example. For instance, if a coin is flipped four times and there is a consecutive sequence of four tails, people commonly believe that heads is more likely to occur next. That is, the equal probability of heads and tails is not considered. Rather, under the representativeness heuristic, the consecutive sequence of four tails is judged not to be representative of the expected 50 : 50 distribution. That is why, according to this heuristic, heads should follow so as to rebalance the proportion. People fall victim to the fallacy because they believe that the likelihood of an event depends on the outcomes of previous tosses, i.e. the fact that single tosses are independent events is not taken into account.
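The independence argument can be checked by brute force. The following is a minimal sketch, assuming a fair coin; the function name is illustrative and not part of the study. It enumerates all equally likely sequences of five tosses and keeps those that begin with four tails.

```python
from fractions import Fraction
from itertools import product


def prob_heads_after_streak(streak_len=4):
    """Exact P(heads on the next toss | the first `streak_len` tosses were tails)."""
    # All sequences of streak_len + 1 fair tosses are equally likely.
    sequences = list(product("HT", repeat=streak_len + 1))
    # Keep only the sequences that start with the observed run of tails.
    after_streak = [s for s in sequences if all(t == "T" for t in s[:streak_len])]
    # Among those, count the ones whose final toss is heads.
    heads_next = [s for s in after_streak if s[-1] == "H"]
    return Fraction(len(heads_next), len(after_streak))


print(prob_heads_after_streak(4))  # 1/2
```

Exactly half of the sequences that start with four tails end in heads, so the streak carries no information about the fifth toss.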

Research on recency biases
Several studies have investigated negative recency effects in college students and children (Kahneman & Tversky 1972, Green 1982, Batanero, et al 1994, Konold 1995, Afantiti-Lamprianou & Williams 2003). However, the effect of age has not been completely clarified. Fischbein & Schnarch's study (1997) found that the negative recency effect decreases with age. In contrast, a recent study (Chiesi, et al 2007) reveals a negative recency bias in the performance of young children and adults, whereas older children are found to be almost immune to it.
Another bias in probabilistic situations is the positive recency effect. For example, if a coin is flipped four times and a consecutive sequence of four tails is obtained, a person might revise the estimated probability in the light of the sequence and believe that the coin is not fair, i.e., p(T) > p(H). In other words, past outcomes of random events are used as indicators of future events and, learning from experience, a particular outcome is considered to be more likely. It has been found that from age ten positive recency is almost absent (Fischbein & Schnarch 1997), whereas younger children show this bias (Chiesi, et al 2007).
Several criticisms have been raised concerning the notions of heuristics and biases. In the main, the adaptive value of heuristic modes of reasoning has been widely documented (Gigerenzer & Todd 1999, Evans & Over 1996). Whereas the relevance of these claims for human decision making cannot be dismissed, those who teach probability have to deal with the following: students have intuitions on probability with a cognitive foundation (i.e., intuitions are not irrational). Intuitions may coincide with scientifically accepted statements, but may sometimes contradict them (Fischbein & Gazit 1984, Fischbein, et al 1991). Specifically, some probability rules are counter-intuitive, and consequently students are likely to have intuitions that are at variance with the commonly accepted reasoning; as a result probability is a hard subject to learn and teach (Kapadia & Borovcnik 1991, Shaughnessy 1992). Starting from the assumption that intuitions on probability could impede learning probability, it becomes relevant to investigate students' heuristic reasoning and biases.
Italian national curricular programs (see Brengio, n. d.) include probability at primary school level starting from the third grade. Specifically, they include statistical surveys and their representations, some linguistic/conceptual issues related to possible, impossible, improbable events, and the development of judgment and estimation of odds under uncertainty through games of chance, in the context of the classical definition of probability. This content was first included in the 1985 reform, and more recently the importance of teaching these topics was stressed in the revised government program (Moratti 2004).

The present study
Our study is designed to describe primary school children's ways of handling probability concepts such as the base-rate computation and the independence notion. To evaluate the impact of teaching probability on intuitions, we compare third graders (i.e. students before starting to learn probability) and fifth graders (i.e. students after learning probability). In line with previous studies (Jacobs & Potenza 1991, Davidson 1995, Fischbein & Schnarch 1997, Morsanyi & Handley 2007, Chiesi, et al 2008), these topics are also investigated by comparing children's and adults' performance. Participants are asked to estimate the likelihood of the next result after a sequence of independent outcomes, in order to investigate both normative (i.e. correct responding) and heuristic reasoning (i.e. recency effects).
In more detail, we present a marble bag game using different base-rates in combination with two different sequences of outcomes. Compared to tossing a regular coin, marble bags have the advantage of allowing us to test the response to proportions different from 50 : 50, and to compare the answers between equally likely and not equally likely proportions within the same task. Combining these base-rates with sequences of previous outcomes yields different trials that are suitable to highlight whether respondents take into account base-rates and the independence notion, or whether they neglect them, relying on representativeness.
Indeed, traditionally recency effects have been studied using the 50 : 50 proportions for the generating process in games like tossing a coin, or drawing from a bag with the same number of marbles of different colours. It may be argued that within a game situation, respondents interpret the task as a forced-choice, i.e. they have to choose one of the possibilities, and the answers cannot be clearly interpreted in terms of correct responding. When base-rate information does not help in choosing between the two options, the previous sequence of outcomes might offer information to the respondent: on one hand, the choice might be oriented to making the sequence more representative of the parent population; on the other hand, the characteristic of the sequence of previous outcomes might be considered as the key to predict the future outcome.
Both represent a departure from probability rules, i.e. notions such as base-rate information and independence are ignored, but it is not correct to interpret answers as errors since they might originate from a different perception of the task by the respondent (for a detailed discussion, see Borovcnik & Bentz 1991). Starting from this premise, we create conditions in which base-rates could help in making the decision, i.e. in which the more likely option was clearly triggered by the numerical information, and answers can be interpreted with less ambiguity in terms of correct/incorrect responding.
In the present study primary school and college students are asked to estimate the likelihood of the next event after a sequence of independent outcomes. The task is created to highlight normative reasoning and the recency effects, and to explore age-related differences.

Design of the present study
The scope of our investigation is whether answers due to recency biases are related to gender or age. The use of classes allows the selection of gender in representative proportions; there should be no effects on the relation between abilities and gender, i.e. the influence factor gender would be represented well by the sampling procedure. For the influence of age, we would favour a longitudinal design, i.e. following up a well-selected cohort for a longer period, but this did not prove to be practical. We opt for a cross-age design; we deliberately choose three age groups and analyze the differences in age as if they were a development of the same group.
The participants were 83 primary school children and college students. The primary school children were enrolled in Italian public schools that serve families from the lower-middle to middle socio-economic classes. All students were invited to participate; their parents were given information about the study. No one refused consent. The college students were enrolled in various degree programs (psychology, educational sciences, biology, and engineering) at the University of Florence. All were volunteers and did not receive any reward for participation in the study. The students, too, may be viewed as a sample of their age group studying at university. The three age groups intentionally included children before they were formally taught probability (third graders), children who had already been taught probability (fifth graders), and college students who had encountered mathematical issues related to probability during their school years. This should also allow the effects of formal education in probability to be investigated.

Description of the experiment
We use a game with marble bags that is suitable for both children and college students.
Compared to the classic test involving the tossing of a coin, it allows base-rates different from equiprobability. The task is created by adapting teaching material presented in a book designed to introduce probability to primary school children [...]. Outcomes are independent because the number of marbles always remains the same: the marbles are replaced after each draw. In this way, the likelihood of the next occurring event has to be estimated from the initial base-rate. Then the experimenter, always showing a PowerPoint slide (Figure 1 b), explains that the protagonist of the story has repeated the game with the same bag but has this time obtained a different sequence of outcomes, namely four blue marbles.
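The role of replacement can be made concrete with a short sketch. It uses the three bag compositions reported in the results (15 green and 15 blue, 21 green and 9 blue, 3 green and 27 blue); the function and its parameters are illustrative assumptions, not part of the original material. The variant without replacement is shown only as a contrast.

```python
from fractions import Fraction


def prob_next_green(green, blue, drawn_green=0, drawn_blue=0, replace=True):
    """P(next marble is green) after some marbles have already been drawn."""
    if replace:
        # Marbles go back into the bag, so earlier draws change nothing.
        return Fraction(green, green + blue)
    # Without replacement, the drawn marbles are missing from the bag.
    remaining = green + blue - drawn_green - drawn_blue
    return Fraction(green - drawn_green, remaining)


# After four blue marbles in a row, with replacement:
for green, blue in [(15, 15), (21, 9), (3, 27)]:
    print(green, blue, prob_next_green(green, blue, drawn_blue=4))
# prints 1/2, 7/10 and 1/10 for the three bags
```

With replacement the answer equals the base-rate whatever the previous sequence was. Without replacement, four blue draws from the 21 green and 9 blue bag would raise the probability of green to 21/26, which is why replacement matters for the independence of the outcomes.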

RESULTS
To analyze the extent to which the various biases are operating, different scores have been constructed. To measure the degree of a given type of behaviour, we add up the number of times a student answers in accordance with it. For answers in line with normative behaviour this means: the higher the score, the stronger the person's tendency to act normatively.

Normative reasoning
Answers are coded as correct when consistent with the normative rule. This means:
• answer "Equally likely" (G/B) when the bag has the same number of blue and green marbles,
• answer "Green" (G) when the bag has a prevalence of green marbles,
• answer "Blue" (B) when the bag has a prevalence of blue marbles.
For each trial, the number of correct answers is counted inside each age group (Table 2).
Results indicate that normative answers are less frequent in the equally likely trials (15B & 15G) than in the not equally likely trials (21G & 9B; 3G & 27B) across age groups. This supports the hypothesis that respondents interpret the task as a forced-choice, i.e. they think that they have to choose only one of the two possibilities, and this is especially true for younger children.
Concerning trials different from equiprobability, older children give correct answers throughout the four trials. Younger children answer correctly when the normative answer coincides with the positive recency answer (i.e. given the bag with 21 green and 9 blue marbles and the four green outcomes, they choose the green option; given the bag with 3 green and 27 blue marbles and the four blue outcomes, they choose the blue option).
The number of correct answers decreases significantly when the normative and the positive recency answers diverge. Taking into account the entire pattern of answers across trials, we can argue that by and large only 40% of the younger students answer correctly by taking the base-rates into account. An opposite pattern is found for college students; they give more correct answers when the normative options match the negative recency options, i.e. there seems to be a tendency to re-establish the representativeness of the sequence in college students' answers across trials.

Recency effects
To examine the extent of positive recency bias, wrong answers consistent with the positive recency effect are counted. In the same way, wrong answers consistent with the negative recency effect are used to measure participants' negative recency bias (Figures 3a and b). In equally likely trials, for younger children, positive recency is predominant, for older children the two recency effects are equally frequent, whereas for college students answers due to negative recency are predominant. In not equally likely trials, for older children, the few incorrect answers are attributable to both recency effects, younger children answer consistently with recency effects, and college students give more negative recency answers (Tables 3a and b).
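The coding of answers can be sketched as follows. This is a minimal illustration under stated assumptions: the six trials are taken to be the three bags crossed with a sequence of four green or four blue outcomes, answers are coded "G", "B" or "G/B", and the trial order and all names are hypothetical rather than taken from the study materials.

```python
# Each trial: (green marbles, blue marbles, colour of the four previous draws).
# The order of the trials is an assumption made for this sketch.
TRIALS = [
    (15, 15, "G"), (15, 15, "B"),
    (21, 9, "G"), (21, 9, "B"),
    (3, 27, "G"), (3, 27, "B"),
]


def normative_answer(green, blue):
    """Answer that follows the base-rate and ignores the previous sequence."""
    if green > blue:
        return "G"
    if blue > green:
        return "B"
    return "G/B"


def score(answers):
    """Count correct answers and wrong answers consistent with each recency effect."""
    scores = {"normative": 0, "positive_recency": 0, "negative_recency": 0}
    for (green, blue, streak), answer in zip(TRIALS, answers):
        opposite = "B" if streak == "G" else "G"
        if answer == normative_answer(green, blue):
            scores["normative"] += 1
        elif answer == streak:
            # Wrong answer that continues the streak.
            scores["positive_recency"] += 1
        elif answer == opposite:
            # Wrong answer that interrupts the streak.
            scores["negative_recency"] += 1
    return scores


# A respondent who always continues the streak:
print(score(["G", "B", "G", "B", "G", "B"]))
```

A respondent who always continues the streak is correct in the two trials where the streak colour is also the majority colour and collects four positive recency points. A wrong "G/B" answer in a not equally likely trial adds to neither recency count.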

Equiprobability bias
Beyond recency effects, some college students show a marked tendency towards equiprobability answers. According to such an equiprobability bias, respondents tend to consider the two possible outcomes offered as equally likely. Regardless of the actual green/blue proportion in the bags, they prefer the answer B/G (Table 4). That is why college students show a poorer performance than older children: whereas the two groups do not differ in the predominance of positive and negative recency biases, the equiprobability bias weakens the performance of the college students.

Statistical analysis of gender- and age-related effects
To examine the levels of normative and heuristic reasoning, composite scores are formed by summing correct and recency answers (Klaczynski 2001). One point is given for each correct answer. The scores range from 0 to 6, higher scores indicating a higher level of normative reasoning. Concerning heuristic reasoning, one point is given to the participants for each wrong answer consistent with the positive recency effect. In the same way, one point is attributed for each wrong answer consistent with negative recency. Thus, two scores (Negative and Positive Recency) are obtained ranging from 0 to 4. Higher scores indicate a stronger bias in the direction the score is measuring. To investigate the influence of gender, we control each score for gender; we find no significant differences related to gender. Hence, we reduce our preliminary plan of a two-way analysis of variance to the sole remaining factor of age.
• ANalysis Of VAriance (ANOVA) is a powerful and common statistical procedure in the social sciences. One-way ANOVA is used to compare the means of several groups, which are defined by the different values of one grouping variable (factor).

• This analysis is called Analysis of Variance rather than multi-group means analysis (or something like that) because it compares group means by comparing variance estimates.
• The comparison is made between the variance between the different groups (i.e. groups defined on the independent variable, such as treatment, age, gender) and the variance amongst all the individuals within those groups (i.e. not due to group membership):
- Variance between groups substantially larger than variance within groups (a large F ratio) indicates significant differences between groups.
- Variance between groups smaller than, or similar to, variance within groups indicates non-significant differences between groups.
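The comparison of the two variance estimates can be written out directly. The sketch below computes the F ratio of a one-way ANOVA from scratch; the function name and the scores in the usage example are made up for illustration and are not data from the study. Judging significance additionally requires comparing F with the critical value of the F distribution for the two degrees of freedom, which is not done here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical scores for three groups (illustration only).
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_ratio, df_b, df_w)
```

In this made-up example the variance estimate between groups is 13 times the estimate within groups, on 2 and 6 degrees of freedom.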

Tamhane's post-hoc test
If more than two groups are analyzed, the one-way ANOVA does not specifically indicate which pairs of groups are significantly different.
Post-hoc tests are applied to determine such pairs. There are many post-hoc tests. The choice depends on characteristics of the data.
Tamhane's test is also suitable when group sizes and observed variances are unequal.
Fig. 4 a. ANOVA scheme. Fig. 4 b. Tamhane post-hoc test.
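The ingredients of such a pairwise comparison can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the results: it assumes the T2 variant of Tamhane's test, which to our understanding combines an unequal-variance (Welch) t statistic for each pair with a Šidák-adjusted significance level. The function name and the example scores are hypothetical, and the final comparison with the t distribution is left out.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_pairwise(groups, alpha=0.05):
    """Welch t statistic and degrees of freedom for every pair of groups."""
    n_pairs = len(groups) * (len(groups) - 1) // 2
    # Sidak-adjusted level per comparison, keeping the family-wise level at alpha.
    alpha_per_pair = 1 - (1 - alpha) ** (1 / n_pairs)
    results = {}
    for (i, a), (j, b) in combinations(enumerate(groups), 2):
        # Squared standard errors of the two group means (sample variances).
        va, vb = variance(a) / len(a), variance(b) / len(b)
        t = (mean(a) - mean(b)) / sqrt(va + vb)
        # Welch-Satterthwaite approximation of the degrees of freedom.
        df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
        results[(i, j)] = (t, df)
    return alpha_per_pair, results


# Hypothetical scores for three groups (illustration only).
alpha_pair, pairs = welch_pairwise([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(alpha_pair, pairs[(0, 1)])
```

Each pair is judged significant if its t statistic, referred to a t distribution with the computed degrees of freedom, is significant at the adjusted level; neither group sizes nor variances need to be equal.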
Results show a significant effect of age on normative reasoning and on positive recency.
No significant effect of age is found for negative recency; see the details in Table 6 and Figures 5a-c. Tamhane post-hoc comparisons have been applied; significant differences between age groups are marked: ** (p < 0.01) or * (p < 0.05).

DISCUSSION AND CONCLUSIONS
Primary school and college students show both normative and heuristic reasoning when asked to estimate the likelihood of the next event after a sequence of independent outcomes.

Age- and gender-related effects
As hypothesized, gaming situations involving drawing from a bag containing the same quantity of two colours of marbles encourage the respondent to choose one of the possibilities.
When base-rate information does not help in choosing between the two options, the previous sequence of outcomes might offer relevant information. While we do not find any effects of gender, age considerably influences the answering behaviour: by and large, younger children intuitively base their estimate of the probability of an event on previous outcomes (i.e. they continue the streak), older children are equally distributed across the two types of recency effects, and college students show negative recency biases (i.e. they interrupt the streak).
Concerning gaming situations with bags containing different proportions of the two colours, i.e. in which the more likely option is clearly triggered by the numerical information, the entire pattern of answers across the trials indicates that older children give correct answers, suggesting that they mainly rely on base-rates, taking into account the independence notion and disregarding the sequence of previous independent outcomes. Younger children tend to give positive recency answers, whereas college students, aiming to re-establish the representativeness of the sequence, show negative recency biases.
In line with previous studies (Jacobs & Potenza 1991, Davidson 1995, Fischbein & Schnarch 1997, Morsanyi & Handley 2007), our results present a view of the development of probabilistic reasoning that runs contrary to what is described by traditional developmental theories (Piaget & Inhelder 1975): whereas an increase in ability is found from younger to older children, aspects of college students' performance seem to get worse with age.
Younger children, who have not formally been taught probability, show a poor performance. They rarely take base-rates into account; instead, they largely use the sequence of previous (independent) events as a cue to estimate the likelihood of future outcomes. On the one hand, it can be claimed that their heuristic mode of reasoning can be explained by the fact that they ignore the independence rule. On the other hand, it may be argued that the positive recency relies on a sound intuition about subjective probability, i.e. previous experiments can be used to estimate the probability of an event. This is an intuition that works for large samples, in accordance with the law of large numbers, but may lead to incorrect answers when applied to small samples or when base-rate information is given, as in the situations presented in this study.
Older children, who have already been taught probability, perform well. They take the base-rates into account, and use them to make normatively correct decisions. As a consequence, older children use heuristic modes of reasoning less often than the other two groups.

Ambiguous effects of teaching
One possible explanation for the better achievement of the older pupils in grade 5 might be the fact that they possess the skills needed to solve probability problems. As documented for different arithmetic skills (e.g. performance on equivalence problems, McNeil 2007), children seem to apply rules inflexibly while they are learning them and acquiring expertise. In this case, older children's level of proficiency leads them to the correct answer, and the numerical information on base-rates overrides any other information (e.g. the sequence of independent previous outcomes). Nonetheless, they sometimes show positive recency, like younger children, and negative recency, like college students.
However, the effect of teaching is not unambiguous. College students show not only negative recency, but also the equiprobability bias, which may be traced back to an incomplete acquisition of the concepts they were taught as children. This equiprobability bias has been documented using different types of tasks, such as the probability of outcomes obtained by rolling two dice (Green 1982, Fischbein, et al 1991, William & Amir 1995, Batanero, et al 1996, Canizares & Batanero 1998), and it refers to the application of the notion of equal probability (each of the possible outcomes of an experiment has the same probability of occurring) in situations where the outcomes are not equally likely. This bias is based on a misunderstanding of the concept of randomness, and people focus entirely on the uncertainty and unpredictability aspect of probabilistic events.
According to Lecoutre (1992), the equiprobability bias is a tendency of individuals to think of random events as "equiprobable" by nature, and to also judge outcomes that occur with different probabilities as equally likely. An increase in equiprobability bias in the course of education has been found in the case of both secondary school and university students (see e.g. Batanero, et al 1996). Thus this bias is not only present after formal education in probability, but could in fact be a consequence of that formal education.
The present investigation supports the idea that the connection between age and probabilistic reasoning performance is mediated by the progressive acquisition of both probabilistic competencies and heuristic issues linked to the notion of probability. Young children's heuristic reasoning seems to be caused by the lack of normative competence. Older children perform well applying learned rules. College students could be wrong since they acquire explicit theories of chance and probability, which lead to error, and it could be argued that these wrong theories arise from education. In sum, educational experiences succeed in establishing probabilistic thinking but formal conceptions sometimes have no lasting effect, and some biases could even arise from them.
For this reason, probability is a hard subject to learn and to teach; secondary school and college teachers have to know that biases related to probability still remain after students have been taught probability and, moreover, that some biases could arise from probability concepts that were learned but not fully acquired. That is, they need to assess students' conceptions about randomness and to discuss with them both the normative and the intuitive aspects, comparing correct and wrong intuitions, correct and incorrect rule applications.

Desiderata
Future research should develop didactical activities and materials suitable for both younger and older students (even at college level) that could be used to both assess and discuss these topics. The aim is to enable students to reflect and acquire correct knowledge, and to deal with their intuitive ideas that could interfere with normative reasoning. In more detail, materials similar to those used in the present study could be prepared and tested, arranging situations with different proportions and different sequences of previous outcomes, in order to discuss the concepts of randomness, representativeness, and independence.
Future research directions should also address the role of primary students' mathematical skills (e.g. ratio and fraction ability) in order to evaluate the role of mathematical competencies in probabilistic reasoning related to this kind of task. That is, children should be asked to solve simple event probabilities in order to develop arithmetical proficiency, and to explore the extent to which the ability to deal with fractions and ratios is missing and influences performance in this kind of task. In this way, it could be possible to ascertain whether correct answers depend on mathematical skills, as well as whether incorrect answers depend on the lack of these skills. In line with this position, achievement in mathematics should be taken into account in order to compare mathematical performance and probabilistic reasoning performance.
The design of the present experiment, including different base-rates and sequences of previous outcomes, enables reasoning processes to be explored without asking respondents for explanations in order to compare performance of children and adults, avoiding possible biases related to different levels of verbal and cognitive abilities. In particular, when children are asked to explain their answers, they show a tendency to take into account irrelevant aspects in order to embellish or make an explanation more interesting, and this might hide their actual reasoning.
Nevertheless, for didactical purposes it could be interesting to ask respondents to justify their answers in order to discuss their intuitions and normative reasoning. One other area to explore is whether the order of tackling the tasks makes a difference in the responses obtained.
Finally, there are some precautions about the interpretation of the data:
• The small sample size allows for a high variability in the results, which may hinder their all-too-easy generalization.
• The four marbles presented were always of the same colour; this gives the experiment an artificial touch. Such results would not occur if we drew randomly from the bags. The effect of such a restriction of randomness on the answering behaviour of the children may only be hypothesized.
• The age groups investigated serve only as hotspots. They are chosen deliberately to lie before the first elements of formal education in probability, after those first elements, and after formal education as a whole; still, it would be advisable to investigate the development related to age more continuously.
• Perhaps only a long-term project following up a carefully selected cohort of young children for 10 years may lead to insights into the actual development of children.
In this way, the present work can be seen as an exploratory study. The results are convincing per se and yield new insights into the development of probability concepts with age.
They also serve as a reminder to be careful about how to teach the basic concepts, as teaching that is not fully understood might bias the further acquisition of concepts by learners. Our results await corroboration by further research.