Why Johnny Can't Apply Multiplication? Revisiting the Choice of Operations with Fractions

The inadequate choice of operations and misleading interpretations of operations are major obstacles that keep students from mathematizing and solving word problems successfully. Based on theoretical considerations on conceptual change concerning interpretations, this article reports on an empirical study conducted with 830 students. It extends existing investigations on the issue in three ways: 1. by focusing on fractions rather than on decimals, 2. by using an enriched test design, including several aspects of competence and various models for multiplication, and 3. by a deeper exploratory analysis of 197 reasons for choices given in an operation choice item format with open responses. The article extends existing theoretical approaches to the problem by reconstructing four main strategies for choosing operations.

"One of the greatest difficulties that students encounter in mathematics is solving verbal problems. They do not know how to translate the verbal information into mathematical form. Under the usual presentations in the traditional and modern mathematics curricula this difficulty is to be expected. … On the other hand, if the mathematics is drawn from real problems, the difficulty of translation is automatically disposed of." (Kline, 1973, p. 153f.)
This quotation from the book "Why Johnny Can't Add" (Kline, 1973) is nearly 40 years old and often cited. Meanwhile, in many countries the criticized "new mathematics" has been displaced by mathematics that is often drawn from real problems, as Kline suggested. However, the difficulty of translation has not at all disappeared "automatically".
Various empirical studies have shed light on different obstacles that prevent students from solving word problems successfully. This article reports on a study that investigates students' performance in mathematizing word problems with multiplication of fractions as their common mathematical core. It is guided by the following research question: How do students choose operations when mathematizing multiplicative word problems with fractions, and how can these choices be explained?

State of Research and Theoretical Background

Existing Theoretical Approaches and Empirical Findings
Thirty years of international research on students' difficulties with word problems have shed light on various obstacles, challenges and possible explanations for difficulties; among them are reading difficulties and difficulties in constructing adequate situation models, suspension of sense-making and a limited disposition for validating (Verschaffel et al., 2000; Verschaffel et al., 2009). This article focuses on students' choices of operations and their possible backgrounds as one of the crucial aspects of mathematizing (vom Hofe, Kleine, Blum, & Pekrun, 2006). The following summary of research makes clear that this aspect is worth investigating for each mathematical operation and number domain separately.
sense that new knowledge is simply added to the prior (as a process of enrichment). Instead, learning often necessitates the discontinuous reconstruction of prior knowledge when confronted with new experiences and challenges. Problems of conceptual change can appear when learners' prior knowledge is incompatible with new necessary conceptualisations. In the conceptual change approach, the discrepancies between intended mathematical conceptions and real individual conceptions are not seen as individual deficits but as necessary stages of transition in the process of reconstructing knowledge, as epistemological obstacles, in Brousseau's terms (1997).
Mental models - Grundvorstellungen. Up to 2006, this discussion on conceptual change was held almost separately from a second influential theoretical approach that emphasized the importance of underlying mental models (Fischbein et al., 1985; Greer, 1994) or "Grundvorstellungen" (GVs, see vom Hofe et al., 2006) for explaining students' difficulties. The notion of (mental) model is used here as synonymous with Grundvorstellung. It starts from Fischbein's use of model as a "meaningful interpretation of a phenomenon or concept" (Fischbein, 1989, p. 129), which is more specific than the often cited construct of mental model as used by cognitive scientists like Johnson-Laird (1983). Within the theoretical approaches to which this article refers (Fischbein, 1989; vom Hofe et al., 2006; Usiskin, 1991), the formation of models is considered to be of special importance for mathematical concept acquisition (and the processes of interpreting) and especially for solving word problems.
The cognitive process of solving word problems can be modelled in a modelling cycle as illustrated in Figure 1. This process model makes it possible to locate the role of GVs as the main conceptual tool for the process step of mathematizing, the step in the modelling cycle in which a real-world situation is transformed into a mathematical problem.
Figure 1. Role of mental models (GVs) in the modelling cycle (vom Hofe et al., 2006)
Models constitute the meanings of mathematical concepts based on familiar contexts and experiences. They create mental representations of the concept and are crucial for the ability to apply a concept to reality by recognizing the respective structure in real life contexts or by modelling a real life situation with the aid of mathematical structures. Similarly, Usiskin (2008, p. 15) describes the role of models for mathematizing.

Mental models for the multiplication of fractions. Problems with models for the multiplication of fractions and decimal numbers were theoretically discussed by Fischbein et al. (1985) and Greer (1994), but their empirical studies focus on decimal numbers. Models for the multiplication of fractions have only recently come into the view of empirical studies (Usiskin, 2008; vom Hofe et al., 2006; de Castro, 2008; Prediger, 2008a); hence, further research is needed. Whereas some approaches focus on one model considered to be central (e.g., Taber, 2007; de Castro, 2008), others emphasize the need for a plurality of models. As problems are structured differently, the variety of situations or word problems necessitates different models for mathematizing them (Bell et al., 1984; Anghileri & Johnson, 1992; Usiskin, 2008; vom Hofe et al., 2006; Prediger, 2008a). When students are expected to mathematize different situations, they hence need to be able to activate a plurality of mental models. The following models for the multiplication of fractions are often mentioned in the literature (partly with different names or distinctions):
• multiplicative comparison (e.g., half as much (Greer, 1994));
• scaling up and down (e.g., 2/3 x 5/2 means 5/2 cm compressed to 2/3 of it (Bell et al., 1984; Anghileri & Johnson, 1992; Taber, 2007; Usiskin, 2008));
• part-of interpretation (e.g., 2/3 x b means 2/3 of b, for b being a fraction or a whole; if b is a fraction, it is a portion of a portion (Wartha, 2007));
• acting across quantities (e.g., distance x speed or quantity x unit price; 2/3 kg x 5/2 €/kg (Bell et al., 1984; Usiskin, 2008));
• array, for example area of a rectangle (e.g., 2/3 x 5/4 is the area of a 2/3 cm x 5/4 cm rectangle (Anghileri & Johnson, 1992; Greer, 1994; Usiskin, 2008)).
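The point that several differently structured situations share one mathematical core can be made concrete with a small sketch. The following Python fragment is only an illustration, not part of the study: the function names (part_of, scale, cost, rectangle_area) are hypothetical, and the numbers are the examples from the list above plus the "2/3 of 36" situation used later in the test.

```python
from fractions import Fraction


def part_of(part, whole):
    """Part-of interpretation: take `part` of `whole`, e.g. 2/3 of 36."""
    part, whole = Fraction(part), Fraction(whole)
    # Cut the whole into `denominator` equal pieces, then take `numerator` of them.
    return whole / part.denominator * part.numerator


def scale(factor, length_cm):
    """Scaling up and down: stretch or compress a length by `factor`."""
    return Fraction(factor) * Fraction(length_cm)


def cost(quantity_kg, euro_per_kg):
    """Acting across quantities: quantity x unit price gives a price in euros."""
    return Fraction(quantity_kg) * Fraction(euro_per_kg)


def rectangle_area(width_cm, height_cm):
    """Array model: area of a rectangle with fractional side lengths."""
    return Fraction(width_cm) * Fraction(height_cm)


two_thirds = Fraction(2, 3)
print(part_of(two_thirds, 36))                         # 2/3 of a whole number
print(part_of(two_thirds, Fraction(5, 2)))             # a portion of a portion
print(scale(two_thirds, Fraction(5, 2)))               # 5/2 cm compressed to 2/3 of it
print(cost(two_thirds, Fraction(5, 2)))                # 2/3 kg at 5/2 euros per kg
print(rectangle_area(two_thirds, Fraction(5, 4)))      # 2/3 cm x 5/4 cm rectangle
```

All four functions reduce to the same multiplication; what differs is the situation that a student has to recognize before choosing it.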
Like all categorizations of a complex field, this categorization is not absolute but contingent, with some connecting points. For example, other researchers subsume both the part-of interpretation and scaling down under the common "operator aspect" (e.g., Hallett, 2008), although the part-of interpretation usually refers to a part that leaves a remainder, not to a process of scaling down. It is their difference with reference to discontinuity (see Figure 2) that justifies the distinction here. Furthermore, the categorization taken here is the most suitable for the German curriculum.
For investigating student thinking with respect to models, a further terminological distinction is useful: When precise distinctions between prescriptive and descriptive modes of analysis are needed, the term "GVs" is used in a prescriptive mode for mathematically intended models, and the term "individual models" is used in a descriptive mode for correct or inappropriate individual (partly idiosyncratic) interpretations (Prediger, 2008a).

Integrating Theoretical Approaches and Findings
In Prediger (2008a), these two theoretical approaches of conceptual change and of models were integrated into a conceptual tool for describing the different mathematical and epistemological qualities of students' difficulties with discontinuities in the transition from natural to fractional numbers. For this integration, the distinction of different layers on the intuitive level of understanding is crucial; the intuitive level is characterized as the type of partly implicit knowledge that we tend to accept directly and confidently as being obvious (Fischbein, 1989). The layer of conceptions about concrete mathematical laws or properties, here called intuitive rules (like "multiplication makes bigger"), should be separated from the layers of meaning, comprising models for operations (like the interpretation "multiplication means repeated addition") and models for fractions (like "3/4 is always 3 parts of a whole of 4 parts"). Although solving word problems also has procedural aspects, the layer of mathematizing is assigned to the intuitive level since it necessitates interpretations of mathematical concepts (Prediger, 2008a).

Mental Models and their Model Type (Continuous/Discontinuous)
The distinction of layers makes it possible to re-locate the exact place of discontinuities in the process of conceptual change from natural to fractional numbers (Prediger, 2008a; 2012). Although the studies dealing with conceptual change for fractions have considered knowledge on the intuitive level, they were mostly restricted to intuitive rules (Lehtinen et al., 1997; Stafylidou & Vosniadou, 2004; Vosniadou & Verschaffel, 2004). Such a focus tends to conceptualize the transfer of rules from natural numbers to fractions as a problem of hasty generalization.

Natural numbers | Fractions
repeated addition (3x5 as 5+5+5, e.g., 3 wands of 5 cm length, arranged successively) | repeated addition for natural x fraction; ??? for fraction x fraction
??? | part-of interpretation (2/3 x b means 2/3 of b, for b being a fraction or a whole)
multiplicative comparison (twice as much) | multiplicative comparison (half as much)
scaling up (3x5 means 5 cm is stretched three times as much) | scaling up and down (2/3 x 5/2 means 5/2 cm compressed to 2/3 of it)
acting across quantities (e.g., distance x speed or quantity x unit price) | acting across quantities (e.g., distance x speed or quantity x unit price)
area of a rectangle (3x5 as area of a 3 cm x 5 cm rectangle) | area of a rectangle (2/3 x 5/4 is the area of a 2/3 cm x 5/4 cm rectangle)
combinatorial interpretation (3x5 as the number of combinations of 3 trousers and 5 shirts) | ???

In contrast, some researchers (e.g., Fischbein et al., 1985; Prediger, 2008a, 2008b; Greer, 1994) showed that the underlying layer of meaning is more important for locating discontinuities. As early as 1985, Fischbein et al. showed how many students adhered to "repeated addition" as the dominant model for the multiplication of natural numbers even when they worked with decimal numbers. Greer (1994) discussed more models for the multiplication of decimals; and in Prediger (2008a), the picture was widened to all relevant models for fractions.
A mathematical analysis of the type of mental models is summarized in Figure 2. The overview makes clear that not all models for multiplication have to be changed in the transition from natural to fractional or decimal numbers. Acting across quantities and the interpretation as an area of a rectangle or as scaling up can be continued for fractions and decimals, as can the multiplicative comparison. For short, they are called "models of continuous model type" or "continuous models" here. In contrast, the basic model of repeated addition is only continuous for at least one natural factor, and the combinatorial interpretation only for two natural factors. Vice versa, the basic model of the multiplication of fractions, namely the part-of interpretation, has no direct correspondence for natural numbers. These models are here called "discontinuous models." This distinction between continuous and discontinuous models forms the central part of the theoretical framework for the empirical study presented in the following sections.
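The discontinuity of the repeated addition model can be illustrated with a minimal sketch. The function name repeated_addition is hypothetical and the code is not taken from the study; it only assumes that "repeated addition" means adding the multiplicand a natural number of times, so it works for 15 x 2/10 but has no meaning for 2/3 x 5/2.

```python
from fractions import Fraction


def repeated_addition(multiplier, multiplicand):
    """Model a x b as b + b + ... + b (a times); defined only for natural a."""
    multiplier = Fraction(multiplier)
    if multiplier.denominator != 1 or multiplier < 0:
        # "Add 5/2 to itself 2/3 times" has no meaning in this model.
        raise ValueError("repeated addition needs a natural-number multiplier")
    total = Fraction(0)
    for _ in range(int(multiplier)):
        total += Fraction(multiplicand)
    return total


# Natural x fraction: the model carries over from natural numbers.
print(repeated_addition(15, Fraction(2, 10)))

# Fraction x fraction: the model breaks down and another one is needed.
try:
    repeated_addition(Fraction(2, 3), Fraction(5, 2))
except ValueError as error:
    print("not applicable:", error)
```

A continuous model such as the area of a rectangle would need no such guard, because it accepts natural and fractional factors alike.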

Main Research Strategies for Explaining Backgrounds of Choices
Roughly summarized, existing studies have adopted three main research strategies for specifying factors that influence students' choice of operations: (1) studying effects of factors in word problems by comparing difficulties under controlled variation of operation-choice test items (e.g., Bell et al., 1989); (2) searching for statistical associations in a written test that covers different aspects of competence (e.g., vom Hofe et al., 2006; Bell et al., 1981; Prediger, 2008a); (3) qualitative in-depth analysis by clinical interviews (e.g., Bell et al., 1981; Wartha, 2007). The first two research strategies, although offering insightful findings, can only give statistical associations (by comparing, in contingency tables or with correlations), but no direct account of causal associations between performance and possible reasons. That is why some quantitative studies have been complemented by qualitative in-depth studies in clinical interviews. But case studies only allow a small number of participants.
The study reported here takes an intermediate way and combines some advantages of qualitative and quantitative strategies by a deeper analysis of written responses to open items.
(As these open responses contain the students' explanations of their operation choices, they are called "open explanations" for short in this article.) On the one hand, analyzing students' open explanations allows an exploratory analysis that can generate more than presupposed results. On the other hand, the quantification of constructed codes for a larger number of answers can offer more generalizable results than a case study alone. Although open items or multiple choice items with open responses have often been used in other areas of mathematics education research, their use seems rare so far with respect to explanations of operation choice.

Research Questions and Hypotheses
On the basis of the presented theoretical background, the main research question was split into the following subquestions that guided the design of items and the data analysis:
Q1. What choices of operations do students (who have completed their fraction curriculum) make when mathematizing multiplicative word problems?
Q2. How is this performance embedded in performances on other layers of dealing with the multiplication of fractions?
Q3. Which (individual) models for multiplication do these students activate while interpreting multiplications or while mathematizing multiplicative word problems?
Q4. How can the wrong choices of operations be explained?
Whereas the questions Q1 and Q2 can be answered by quantitative analysis of the correctness of answers, Q3 needs a deeper analysis of answers to exploratory items. Question Q4 asks for underlying patterns and associations between different items and explicitly articulated reasons, and connects results of Q2 and Q3 with the performance in Q1. Its treatment is operationalized by the following hypotheses, which are derived from the literature review and the theoretical background:
H1. Word problems that demand discontinuous models are more difficult to solve than those that demand continuous models.
H2. Students' choice of operations is influenced by their intuitive rule on multiplication making bigger or not.
H3. Students' choice of operations is influenced by their ability to give correct interpretations for given operations.

Test Design
The study was based on a paper and pencil test, conducted in regular classes and taking 35 to 70 minutes (without time restriction). The students were not specifically prepared for the test and could not use their books.
The design of the items was guided by the four research questions Q1-Q4 and the presented theoretical background (overview in Figure 3). The algorithmic level is only briefly touched on by the purely skill-oriented Item 1. Similarly, the formal level is only reflected by Item 12 concerning the commutativity of multiplication.
As the research questions focus on the intuitive level and especially the mathematizing competences, all other items refer to this level. Item 2 asks in a multiple-choice format to confirm or reject the widely studied intuitive rule "multiplication makes bigger" (Bell et al., 1981; 1984). Items 3 to 6 refer to meanings. Item 3 asks for assigning suitable multiplicative terms to a rectangle with 3x6 points. Its underlying GV is the area of a rectangle for natural numbers. Items 4, 5 and 6 are designed in an open item format in order not to impose a presupposed mental model. They demand interpretations of a given fraction (Item 4), of a given addition of fractions (Item 5) and of a given multiplication of fractions (Item 6). They are intended to enable the researcher to gain, in an exploratory way, a great variety of actually existing individual mental models (see Prediger, 2006, for a justification of the format).
Items 7 to 11 refer to the competence of finding multiplicative terms for given word problems in situations with varying GVs. The different models for the respective items are explicitly named in the headlines in Figure 2 (which were not visible to the students in the original test). As many students solve the word problems by multi-step procedures (e.g., via the rule of three), but cannot identify a suitable single-step term, this single-step term was a core aspect when evaluating the items. Items 7-9 follow the choice of operation methodology, in which a word problem is presented and the task is to decide which operation would be appropriate to find the answer without having to carry out the calculation (as used for example in Fischbein et al., 1985; Bell et al., 1989, and many others). The three multiple choice items are complemented by asking students for the reasons for their choice. Items 10 and 11 also ask for the choice of operation, but pose some preparatory questions instead of using a multiple choice format since more difficulties were expected.

Participants
The sample consisted of 33 whole classes in grades 7 and 9, 830 students in total, 376 of them in grade 7 (age 12-13 years) and 454 in grade 9 (age 14-15 years), all of them in North Rhine-Westphalia (Nordrhein-Westfalen), the federal state with the largest population in Germany. The school system in North Rhine-Westphalia assigns students in grade 5, according to their achievement levels, to three types of schools: higher streamed schools ("Gymnasium," with 33% of the students of the whole age group), middle level streamed schools ("Realschule," 27%), and lower streamed schools ("Hauptschule," 19%). Some comprehensive schools ("Gesamtschule," 16%; private schools, 6%) collect students of all achievement levels (with a focus on "Hauptschule" and "Realschule") and stream them in differentiated courses. To ensure that the sample was representative with respect to general achievement levels, it was composed with a distribution across school types that corresponded to the average distribution in the federal state (sample: 32%, 35%, 19%, 13%), with a slight trend toward higher streamed students (more "Realschule" than "Hauptschule" and "Gesamtschule").

Data Analysis
In a first step of data analysis, students' answers to the test were evaluated quantitatively in a point-based scoring scheme with respect to their correctness. For all items, reached scores (between 0 and 1), means of reached scores, and frequencies of complete solutions were calculated. This step of analysis allowed answers to research questions Q1 to Q3. For treating question Q4, statistical associations between some items and layers were examined and hypotheses tested. As most variables in the test have an ordinal level of measurement, Spearman's rank correlation coefficients and chi-square statistics were chosen for calculating statistical associations.
The second step of data analysis focused on research questions Q3 and Q4 and was dedicated to a deeper analysis of answers to selected open items, especially the self-constructed word problems in Items 5 and 6 and the reasons given for operation choice in Items 7 to 9. Both were analysed intensively by coding the manifested individual conceptions and explanations for the choice of operations.
The answers to Items 5 and 6 were each coded by two well-trained coders with a pre-existing coding scheme (taken from Prediger, 2008a, presented in the next section). For the explanations in Items 7, 8 and 9, the coding scheme first had to be constructed from the data by the author and other researchers through a comparative analysis. Codes were built close to the data and then classified by categories which hold for all three items. Some categories could be anticipated from the existing literature (like the pertinacity of the intuitive rule "multiplication makes bigger and division makes smaller," see Bell et al., 1981), but other interesting, unforeseen codes and categories (e.g., restructure strategy, see Figures 7 and 8) had to be constructed in the exploratory process. The two steps of data analysis offered compatible, but nevertheless diverging results. In order to make this effect as visible as possible, the article presents the results of the two steps separately in the last two main sections.

Accounts for Limits, Reliability and Validity of the Study
Although the analysis in two steps allows deeper insights into students' thinking than simply dealing with quantitative measurements, this article explicitly avoids calling this a qualitative analysis. For a future study, we have started to collect additional qualitative data through clinical interviews with a sample of the tested students. This would support an even richer set of conclusions on the basis of the existing quantitative results. Although this is an important limitation of the study presented here, the test analysis still offers interesting results.
As suggested by the American Educational Research Association (AERA) Standards for Educational and Psychological Testing (AERA et al., 1999), the test was validated with respect to the degree to which evidence and theory support the interpretations of test results. To account for content validity, the test items were critically examined by four external experts (mathematics education researchers) who attested their content validity. Further empirical evidence for validity was given in two pre-studies (one of them published in Prediger, 2008a), in which students' written responses were triangulated with more extensive answers in clinical interviews.
To guarantee reliability, major emphasis was placed on the most critical step in the data analysis, namely the coding of open answers in Items 5-9. Therefore, all interrater reliabilities of the coding process were checked and calculated by Cohen's kappa (Cohen, 1960). The usage of the pre-existing coding scheme for Items 5 and 6 reached a high interrater agreement with Cohen's kappa of 0.92 and 0.94. The coding scheme for Items 7-9 was constructed in a comparative analysis and a careful, consensual process. To account for reliability, the finalized coding scheme was applied by a third, independent coder. It reached an interrater agreement of Cohen's kappa 0.83 (0.79-0.86 for the single items). For all other items, scoring and coding was less ambiguous (as for right/false ratings) and reached a Cohen's kappa of more than 0.95.
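For readers unfamiliar with the statistic, the following sketch shows how Cohen's kappa corrects the raw agreement of two coders for the agreement expected by chance. It is a generic illustration, not the study's analysis script: the function name cohens_kappa and the ten code labels are hypothetical (the letters merely echo the code names P, A, S and C used for Items 5 and 6).

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per answer."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of answers")
    n = len(coder_a)
    # Observed agreement: share of answers that received identical codes.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten answers by two coders (invented for illustration).
first_coder = ["P", "A", "A", "S", "C", "A", "S", "P", "A", "A"]
second_coder = ["P", "A", "A", "S", "A", "A", "S", "P", "A", "C"]
print(round(cohens_kappa(first_coder, second_coder), 2))
```

In this invented example the coders agree on 8 of 10 answers, but kappa is lower than 0.8 because part of that agreement would be expected by chance alone.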

Findings of the First Statistical Analysis
Table 1 shows the means of reached normed scores and the frequencies of completely correct solutions (in short, complete solutions) of all items. They will be presented along the research questions Q2, Q1 and Q4 (results for Q3 are presented in the next section).

Comparing the Performances in Mathematizing to those of other Layers (Q2)
With 56% and 49%, respectively, of complete solutions, Items 1 and 12 were the least difficult for the participants, compared to other items on fractions. Only Item 3 (identifying a natural multiplication in a rectangle, 63%) reached a higher frequency of complete solutions. This item, as well as Item 4 (explain the meaning of a given fraction, 41%) and Item 5 (pose a word problem for an equation with addition, 40%), did not directly concern the multiplication of fractions.
All items concerned with the multiplication of fractions on the intuitive level reached less than 33% complete solutions (most less than 9%), and means of reached scores under 0.39, except for Item 8 (mathematize situation allowing repeated addition because of natural multiplier, 0.59).

Students' Choice of Operations (Q1)
Table 1 shows all means of normed scores and frequencies of complete solutions for the operation choice items (Item 7-11).The discrepancy between the very low frequencies of complete solutions and partly better means of normed scores for the operation choice items can be traced back to low success in explaining the choices (Item part 7b, 8b, 9b).The diagram in Figure 4 shows only the decreasing frequency of correct choices (without explanations) in respective parts of the items (Item 7a, 8a, 9a, 10d and 11b).

Backgrounds for Difficulties in Mathematizing
The diagram in Figure 4 gives first empirical support for Hypothesis H1. It shows that the difficulty of mathematizing word problems with one term varies with the specific word problem. The easiest was Item 8a (mathematize situation allowing repeated addition because of natural multiplier) with 86% correct choices of multiplication. More precisely, 46% chose one of the terms 15 x 2/10 or 2/10 x 15 and 40% chose both terms. It was followed by Item 7a (mathematize situation acting across quantities, 3/4 kg x 1.50 €/kg) with 35% correct choices and Item 11b (mathematize situation of scaling down) with 19% correct terms. Even more difficult were Item 9a (mathematize situation with part of whole number, 2/3 of 36) with 14% correct choices (note that the random guess probability was 25%) and Item 10d (mathematize part of a fraction) with 3% correct terms.
The data in Table 2 support hypothesis H2 only with some restrictions. For finding associations between choice of operations (in Items 7a, 8a and 9a) and possible reasons in Item 2, the item pairs were considered by two different statistics that fitted the data's level of measurement, chi-square and Spearman's rank correlation coefficient. As documented in Table 2, none of Spearman's rank correlation coefficients was higher than 0.13, which can partly be ascribed to the asymmetry of the data (i.e., low values in most items for most participants). In contrast, the chi-square statistics show highly significant associations for all item pairs (see Table 2). Hence, these results are contradictory. Hypothesis H3 is supported by calculating Spearman's rank correlation coefficient between layers. The correlation between the layer of mathematizing (Items 7-11) and the layer of meaning of operations (Items 5 and 6) reached 0.91 (Prediger & Matull, 2008, p. 20).
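How a small rank correlation and a highly significant chi-square value can occur together in a sample of this size may be easier to see in a worked sketch. Everything in it is an assumption made for illustration: the 2x2 table is invented (only its total of 830 matches the sample size), the item pairs in the study need not be dichotomous, and the function name chi_square_and_phi is hypothetical. The sketch uses the standard identity that, for a 2x2 table without continuity correction, chi-square equals n times the squared phi coefficient, and that phi coincides with Spearman's coefficient when both variables are binary.

```python
import math


def chi_square_and_phi(table):
    """Return (chi-square, phi) for a 2x2 contingency table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Phi equals Pearson's r of the two 0/1 variables; with only two distinct
    # values per variable, tied ranks give the same Spearman coefficient.
    phi = (a * d - b * c) / math.sqrt(row1 * row2 * col1 * col2)
    # Pearson chi-square without continuity correction.
    chi_square = n * phi ** 2
    return chi_square, phi


# Hypothetical counts for two dichotomous items answered by 830 students.
invented_table = [[100, 190], [120, 420]]
chi_square, phi = chi_square_and_phi(invented_table)
critical_value = 10.83  # chi-square with 1 degree of freedom, p = 0.001
print(f"phi = {phi:.2f}, chi-square = {chi_square:.1f}, "
      f"significant at 0.001: {chi_square > critical_value}")
```

Under these assumptions a coefficient of about 0.13 already exceeds the critical value, so significance here speaks to the existence of an association and the coefficient to its strength.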

Discussion
The results presented so far show the difficulty of grasping meanings of operations of fractions.Whereas most of the students succeeded in applying their algorithmic skills, they had great difficulties with interpreting multiplication of fractions and choosing the right operation for multiplicative word problems with fractions.
Only 4% of all participants were able to formulate an appropriate word problem for a symbolically given multiplication of fractions. In order to exclude the possibility that the difficulties with Item 6 might simply be traced back to an unknown item format, it is worth comparing its score to that of Item 5, where the number was much higher (40%). Hence, at least 36% of the students had difficulties with the interpretation of multiplication, but not with the item format itself. A deeper analysis of the answers is possible by coding them in detail, as presented in the next section.
The success rate in Item 6 is even lower than in an analogous item for decimal numbers in a test conducted with 12-year-old Flemish students. 22 out of 107 students (21%) were able to formulate an appropriate word problem for 0.7 x 0.2 (de Corte & Verschaffel, 1996, p. 229). Although the data of two different samples and tests cannot simply be compared, the big difference between 21% and 4% does not seem to affirm Harel et al.'s (1994, p. 381) assumption about operation choice for fractions being less difficult than for decimals. At least for German students, who are often taught by a curriculum without enough emphasis on the meaning of operations, we can formulate the opposite observation. Multiplicative problems with fractions seem to be more difficult to mathematize than those with decimals (in contrast to Harel et al., 1994, p. 381). This observation seems to hold (at least for the tested German students) also for the operation choice items (Items 8 to 11) where they could not assign a suitable single-step term. In contrast, they succeeded quite well in Item 7, which comprised a decimal number. More elaborate empirical evidence can be given for the hypotheses H1-H3, which offer different accounts for the difficulties in operation choice, as detailed below.
H1 - Word problems that demand discontinuous models are more difficult to solve than those that demand continuous models. Figure 5 shows how the frequency of correct choices varies with the variable model type. The word problems with more correct choices can be structured by continuous models, the word problems with fewer correct choices by discontinuous models. This is true for Items 7a, 8a and 9a in multiple choice format as well as for the more difficult Items 10d and 11b in their open format. These patterns support H1. Although the effects shown in Figure 5 are statistically significant and compatible with the theoretical framework, their validity should not be overrated, as the test items differed not only with respect to the variable model type, but also in format and number types. Hence, a future test should consistently control variables with respect to this hypothesis.

[Figure 5: frequencies of correct choices by model type. Recoverable panel: word problems that can be structured by continuous models, Item 8a (repeated addition) 86%.]

H2 - Students' choice of operation is influenced by their intuitive rule on multiplication making bigger or not (as shown for decimals in Bell et al., 1981). This hypothesis can be tested by considering associations between Item 2 and Items 7, 8 and 9, measured by chi-square values and Spearman's rank correlation coefficient as printed in Table 2. At least the chi-square tests for all item pairs were (highly) significant. This means that the data might affirm hypothesis H2, formulated by Bell et al. (1981): one possible obstacle for correct choices of multiplication lies in wrong applications of the rule that multiplication always makes bigger.
H3 - Students' choice of operation is influenced by their ability to give correct interpretations for given operations. Even stronger is the association between layers. The overall competence of mathematizing word problems by multiplications (Items 7-11) is significantly associated (with a Spearman's rank correlation coefficient of 0.91) with the competence of formulating appropriate word problems for given calculations (Items 5 and 6). This effect might be interpreted in the following way: Those students who are able to interpret the multiplication properly can also use it for mathematizing.
Although these statistical results are significant and relevant, they are not yet prioritized according to their relevance for the problems. Additionally, statistical correlations cannot give valid accounts of causal associations. For understanding reasons, a deeper analysis is needed, as provided in the next section.

Results and Discussion Concerning Students' Interpretation of Given Equations
To give a more detailed answer to research question Q3, the self-constructed word problems for Items 5 and 6 were coded and classified with respect to the articulated individual models. The diagram in Figure 6 shows the frequencies of answers and gives four examples. It shows that interpreting additions was much easier for the participants than interpreting multiplications.
30 students (4%) formulated a word problem with a correct part-of interpretation (Code P); 8 found others such as geometrical interpretations by rectangular areas (Code R, 1%) or proportional reasoning (Code PR, 0.1%). It is interesting to see which GVs (mentioned in Figure 2) did not appear. Not a single student successfully activated an individual model of multiplicative comparison, of scaling up and down, or of acting across quantities. Among the partly correct models were 1% part-of interpretations with wrong questions (Code Q) and 3% that used "times" as an operator in the story, which might be meant in the sense of multiplicative comparison (Code C). 7% of the students used fractions in a senseless way (as in the second example in Figure 6, Code S), and 154 students (19%) formulated additive word problems instead of multiplicative ones, as in the third example (Code A). Moser Opitz traces additive interpretations of multiplicative equations back to students' attempts to transfer the repeated addition model to the multiplicative equation (Moser Opitz, 2007, p. 206f).
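The rounded percentages in this paragraph can be checked against the sample size. The following is a minimal sketch, assuming that the shares refer to the full sample of 830 students named in the abstract (the paragraph itself does not state the base); the helper name `share_percent` is chosen here for illustration.

```python
N_STUDENTS = 830  # sample size stated in the abstract; assumed base for all shares


def share_percent(count, total=N_STUDENTS):
    """Share of `count` in `total`, in percent."""
    return 100 * count / total


# Absolute counts quoted in the paragraph above.
reported_counts = {
    "Code P (correct part-of interpretation)": 30,
    "Code A (additive word problem)": 154,
}

for label, count in reported_counts.items():
    print(f"{label}: {count}/{N_STUDENTS} = {share_percent(count):.1f}%")
```

Under this assumption, both counts round to the percentages given in the text (4% and 19%).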
There is no simple answer to the question whether these results support hypothesis H1. It is evident that interpreting an additive equation is much easier than interpreting a multiplicative equation. This supports the hypothesis insofar as the models for addition are all continuous, and mistakes made while formulating word problems mostly refer to difficulties with fractions, especially the problem of different referent wholes (at 9%, the major part of wrong and partly wrong answers), as in "2/3 of the class participate at a math contest, and 1/6 of another class. How many are they together? Be careful with reducing!".
But the analysis of articulated individual models for Item 6 shows that this does not mean that all continuous models are equally easily articulated by the students. Moreover, 5% of the students (4% correctly and 1% partly correctly) articulated the part-of interpretation of multiplication, although this is a discontinuous model. Evidently, the choice of model is influenced not only by the model type but also by prior learning opportunities: in most German textbooks, the part-of interpretation is used for introducing the multiplication of fractions.

Results Concerning Students' Strategies for their Choice of Operation
Complementary to the search for statistical associations between factors and items in the first step, the exemplary analysis of students' explanations for Items 7 and 9 offers insights into students' reasons and strategies for choosing operations. The term "strategies" was chosen in order to signify a more unconscious idea or mechanism for choosing than the term "reasons for choice" would indicate.
The analysis was restricted to Items 7 and 9, since for Item 8 many answers did not give the interpreters access to students' thinking while choosing an operation. Many students expressed the impression that the choice is evident, like Kim: "Because you just have to take this" or "Here, it must become more because they are more children." The analysis was conducted for 197 written answers given to Items 7b and 9b. This subgroup of answers was carefully chosen with respect to explanatory power: answers that allowed no secure access to students' strategies were excluded. Although representativeness for the whole sample is not completely guaranteed, this selection already offers interesting insights. Item 7 is the classical item used for supporting hypothesis H2, the relevance of wrong order rules ("division makes smaller"). The analysis of the explanations given by students offers the opportunity to examine this hypothesis through causal rather than purely statistical connections.
Figure 7 shows all codes and categories assigned to explanations in Item 7b, illustrated with examples and ordered by the choice of operation in Item 7a. It is remarkable that the majority of interpretable explanations for correct choices of multiplication are mathematically problematic (50% guessing strategy and 25% other wrong explanations). For those students who chose division for calculating 2/3 of 36, the order strategy was the most frequently reconstructed strategy (43%). In contrast, the explanations for choosing subtraction could be interpreted by the restructuring strategy in most cases (78%). When students calculate 1.5 - 3/4 or 1.5 - 1/4, they deal with changing referent wholes for the two numbers.
Figure 8 shows the analogous analysis for Item 9 and those answers that could be interpreted with respect to the underlying choosing strategy. 50% of the interpretable explanations for choosing multiplication referred to a keyword strategy: "of is always times". Some answers showed that understanding is not the only warrant for this declarative knowledge; others referred, for example, to the warrant of authority: "our teacher has said". For explaining the choice of division, two conceptions were dominant: 23% of the students restructured the situation and considered 2/3 as being a part of 36 or asked how often 2/3 fits into 36, and 51% of the answers referred to a keyword strategy. Only 6% referred to an order strategy; this finding contradicts H2 concerning the order. In sum, there are not many mathematically sustainable explanations for correct choices of multiplication. This supports the finding already presented that many students have a limited understanding of the meaning of multiplication, even if they can choose a suitable operation for given word problems.
The order strategy that is most prominently discussed in the literature seems to be relevant, but not to the extent attributed to it in the literature. In Item 7, it covered 43% of the interpretable explanations, but in Item 9 (for which the chi-square test between Item 2 and Item 9 was also significant), only 6% of the interpretable answers referred to an order strategy for their incorrect choice of division. This finding shows that the importance of the order strategy must be put into perspective. Some students do switch to the layer of laws and properties when deciding on the choice of operation, but not the majority.
Most interesting were the many different ways in which students restructured the situation models of the word problems with their idiosyncratic conceptions of parts. Although diverse, these explanations could be subsumed under the restructure strategy. They include the choice of division when 3/4 kg is understood as a part of 1.50 €, or when 2/3 of 36 is transformed into the question "how often does 2/3 fit into 36?".
For subtraction, the restructure strategy was mostly connected to problems with fractions referring to different referent wholes; for example, when calculating 1.5 - 1/4, reference is made to the unit 1 € and to 1 kg. 78% of the wrong choices of subtraction for Item 7 were explained in such a way. The examples make clear that the restructure strategy is rooted in problems on the most basic layer, the meaning of fractions.
The third important strategy was the keyword strategy. 28% of all interpretable answers referred to a focus on verbal cues (Nesher & Teubal, 1975). In Item 9, students showed three diverging explanations for referring to the keyword "of":
- of-tasks are minus-tasks (in 33% of explanations for choosing subtraction in Item 9): "2/3 of 36, thus minus"
- of-tasks are times-tasks (in 60% of explanations for choosing multiplication in Item 9): "Because of-tasks are times-tasks (our teacher has said)."
- of-tasks are dividing-tasks (in 50% of explanations for choosing division in Item 9): "You want to know, how much 2/3 of 36 is. Then you have to calculate 'division', and you get the result."
For the guessing strategy, we consider the schools on different achievement levels separately. Whereas no student of the higher-streamed schools (Gymnasium) wrote "I have guessed", 14% of the lower-streamed students did. Apparently, students of the Gymnasium know that guessing is not a legitimate answer in mathematics classrooms, even when they might guess.
Summing up, the reconstruction of strategies shows that there is no uni-dimensional account for students' choices of operations. None of the hypotheses H1 to H3 can be supported exclusively by these findings, but all of them seem to be true for some students. The four reconstructed strategies can be located on different layers of the intuitive level. Hence they mainly support the result that reasons for wrong choices of operations (which are difficulties on the layer of mathematizing) can lie on different deeper levels.
Whereas the order strategy (covering 14% of all interpretable answers in Items 7 and 9 together) can unambiguously be assigned to the layer of laws and attributes, the restructure strategy (covering 26% of all interpretable answers in Items 7 and 9 together) can be assigned to the layer of meaning of fractions.
The guessing strategy and the keyword strategy at first sight indicate problems simply on the layer of mathematizing. Guessing is necessary when no other reference point is available, and searching for keywords implies that the person has not referred to other layers of the intuitive level. Although the analysis of reasons alone cannot give insights into students' conceptions on other layers, their choosing strategies showed that they did not refer to them.
At this point, it is instructive to reconsider the raw data and draw connections to the results of other items: none of the participants who referred to a keyword strategy for choosing division in Item 9 gained any points in Item 6. This shows that wrong keyword strategies might be associated with inappropriate individual models for the multiplication of fractions.
In sum, these results offer findings about connections between layers of the intuitive level that go well beyond statistical contingencies. They put into perspective the connections drawn by the more exclusive hypotheses H2 and H3, which are too uni-dimensional if considered in isolation.

Summary and Educational Consequences
Why can't Johnny apply multiplication of fractions to word problems with different models?
Overall, the question can be embedded in a general picture of disconcerting findings. Students perform quite poorly whenever intuitive knowledge is needed. Performance on the intuitive level is low for all items that refer to the multiplication of fractions (≤33% complete solutions). In particular, the results of the operation choice items (Items 7 to 11) show that "Johnny" is not an exception, since many students cannot assign a multiplicative term to the posed word problems. As shown in Figure 4, only Item 8a (mathematizing a situation that allows repeated addition because of a natural multiplier) could be mathematized by 86% of the participants, whereas all others had less than 35% correct choices.
Item 6 allows the difficulties to be located on the layer of individual meanings of operations. Only 5% of the students could formulate an appropriate word problem for a given multiplication. Morris Kline (1973, p. 12) already emphasized the schools' failure to present meanings. But it is not meaning in general but meaning for discontinuous models that poses the crucial problem. The results for the parallel item for addition show that students are more successful in extending their individual models from natural numbers to fractions for addition than for multiplication. These findings can be explained by the model types. Models for addition with natural numbers can be continued to fractions, whereas the dominant multiplication model, repeated addition, is of the discontinuous model type. Apparently, most students have insufficiently extended their repeated addition model for multiplication of natural numbers by adequate models for multiplication with fractions. This offers an answer to the title question: "Johnny" can't apply multiplication since he has not successfully mastered the necessary conceptual change for the models of multiplication. Most students (95%) could not articulate even one adequate model for multiplication. The association between the layer of meaning of operations and the layer of operation choice could be empirically substantiated by a correlation coefficient of 0.91. The fact that the effect is stronger for Items 7a and 9a than for Item 8a again supports the hypothesis of a difference between discontinuous and continuous models.
The test of different alternative hypotheses has provided statistical evidence for further possible backgrounds on other layers of intuitive knowledge about multiplication of fractions. Associations between the layer of laws and attributes and the layer of operation choice were found in the form of statistically significant associations between Item 2 and Items 7, 8 and 9. But the detailed analysis of explanations given by the students puts this hypothesis into perspective, since only 14% of the 197 explanations (for which the underlying choosing strategies are reconstructable) refer to this order strategy.
More dominant are keyword strategies and restructure strategies, which can be assigned to deeper layers of intuitive knowledge. The restructure strategy is rooted in various difficulties on the layer of meaning of fractions and was reconstructable in 26% of interpretable cases. The guessing strategy (9%) and the keyword strategy (28%) can be traced back to the problem that students (although they might have adequate conceptions on deeper layers) did not make use of them for choosing an operation.
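To give a sense of scale, the reported shares can be converted into approximate numbers of explanations. This is a rough sketch, assuming that all four percentages refer to the same base of 197 interpretable explanations; the article reports only rounded percentages, so the counts below are approximations and not figures from the study.

```python
N_EXPLANATIONS = 197  # interpretable explanations for Items 7b and 9b

# Shares reported in the text, in percent (assumed to share the same base).
reported_shares = {"order": 14, "restructure": 26, "keyword": 28, "guessing": 9}


def approx_count(percent, total=N_EXPLANATIONS):
    """Approximate count implied by a rounded percentage."""
    return round(total * percent / 100)


approx_counts = {name: approx_count(p) for name, p in reported_shares.items()}

# Share of explanations not covered by the four named strategies.
unassigned_share = 100 - sum(reported_shares.values())

for name, count in approx_counts.items():
    print(f"{name}: about {count} of {N_EXPLANATIONS} explanations")
print(f"not assigned to one of the four strategies: about {unassigned_share}%")
```

Because each percentage is rounded, every count carries an uncertainty of roughly one explanation in either direction.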
Overall, the distinction of layers of knowledge provided a useful basis for reconstructing reasons for difficulties with operation choice. The findings make clear that there is no unidimensional account for the phenomenon; instead, distinctly different sources could be found and applied to different students and different word problems.
Three hypotheses on possible backgrounds could be tested and yielded statistically significant results with large effects. The causal relevance of each of them could be explored by the detailed analysis of explanations. In this way, hypothesis-testing parts and explorative parts can be triangulated and support each other.

How can Johnny Learn to Apply Multiplication? - Outlook on Educational Consequences
In typical German classrooms, students acquire formal and algorithmic knowledge on operations with fractions, whereas their knowledge on the intuitive level is often restricted to meanings of fractions as parts of wholes. This hinders applying the learned algorithmic knowledge to word problems, since this requires not only meanings of fractions but also meanings of operations with fractions. The presented study can contribute in three ways to helping "Johnny" learn to construct meanings and then apply multiplication.
1. Easy-to-use diagnostic tools. Items 5-11 of the study offer useful diagnostic tools for teachers to learn about students' conceptions of operations. In particular, the item format "Give your own word problem for a given multiplication" (Item 6) can be used easily in the classroom and offers important insights into individual models for multiplication. The operation choice item with open explanation is another format that helps to find starting points for classroom or individual discussions on students' conceptions.
2. Importance of learning opportunities for constructing multiple models. Of course, assessing deficits alone is not enough to change classrooms. The study confirms the often formulated claim that more emphasis must be put on giving learning opportunities for constructing meaning (e.g., Kline, 1973; Usiskin, 1991; vom Hofe et al., 2006; de Castro, 2008; Taber, 2007). Beyond that, the study contributes to a heightened sensitivity to the fact that one suitable mental model is not enough, since different situations necessitate different models. Further design research is needed to develop adequate learning pathways for constructing multiple models for multiplication (first results of our design research study will be presented soon).

3. Reflect on continuities and discontinuities of models with students. As the study has confirmed the importance of model types (continuous or discontinuous) for explaining the difficulty of operation choice, the next step in a design research process will concern students' explicit reflection on the conceptual change of models. Prediger (2008a) has already proposed some activities for initiating reflection on the discontinuity of selected models. The importance of treating those obstacles as opportunities for reflection is supported by conceptual change researchers who emphasise meta-conceptual awareness as an important condition for successful processes of change (Vosniadou, 1999). Further research should show this effect empirically.

Figure 2. (Dis-)Continuities of models for multiplication in the transition from natural to fractional numbers 3

Figure 3. Overview on all test items (original test in German and without headlines)

Figure 4. Frequency of correct operation choices

Figure 5. Decreasing correct choices in different items: association with model type

Figure 6. Individual models articulated in Items 5 and 6

Table 1
Scores of items, ordered by level and difficulty

Table 2
Associations between Item 2 and Items 7, 8, 9, measured by values of chi-square χ² (with level of significance p) and Spearman's rank correlation coefficient