Investigating Students’ Processes of Noticing and Interpreting Syntactic Language Features in Word Problem Solving through Eye-Tracking


INTRODUCTION
In the reading and understanding of word problems, different kinds of obstacles have been identified in earlier studies (for overviews, see Reusser, 1997; Verschaffel, Greer, & de Corte, 2000). Among these, language obstacles have been shown to play a crucial part: many assessment studies have identified potential language obstacles and shown their negative impact on students' solution rates (Abedi & Lord, 2001; Dyrvold, Bergqvist, & Österholm, 2015; Haag, Heppt, Stanat, Kuhl, & Pant, 2013; further insights will be provided in the section Syntactic Features as Potential Obstacles in Reading and Understanding Word Problems). This holds especially for unexpected syntactic language features, which can change the mathematization of a word problem (Boonen, van der Schoot, van Wesel, de Vries, & Jolles, 2013).
However, little is known about students' processes of noticing and interpreting syntactic language features (Solano-Flores, 2010). Existing qualitative video studies (e.g., Dröse, 2019; Dröse & Prediger, 2020) have been able to capture how students interpret the syntactic features, but have achieved only limited access to whether students even notice subtle differences, for instance, between an expected and a less expected syntactic structure in the word problem.
In order to contribute to closing this research gap, we used eye-tracking methodology as a promising tool for capturing students' noticing processes in more detail. Eye-tracking methodology captures a person's eye movements using the reflection of infrared light on the cornea to show where a person is looking at a particular moment in time. According to the so-called eye-mind hypothesis (Just & Carpenter, 1980), eye movements are closely related to the cognitive focus and thus provide insights into the viewer's cognitive processes. Even though the eye-mind hypothesis also has limitations (see Anderson, Bothell, & Douglass, 2004), it has been very influential in educational research, especially in tasks related to reading. Thus, eye-tracking is a well-established method in general education research (Lai, Tsai, Yang, Hsu, Liu, Lee, Lee, Chiou, Liang, & Tsai, 2013) and is becoming increasingly popular in mathematics education research (Schindler, Lilienthal, Chadalavada, & Ögren, 2016). Some studies have investigated students' comprehension of word problems, but mainly with respect to comprehension strategies (Hegarty, Mayer, & Green, 1992; Hegarty, Mayer, & Monk, 1995; Strohmaier, Lehner, Beitlich, & Reiss, 2019; Verschaffel, de Corte, & Pauwels, 1992) rather than syntactic language obstacles.
In this paper, we present a study that explores the use of eye-tracking for investigating fifth graders' processes of noticing syntactic language features. The study was conducted by an interdisciplinary group of mathematics education researchers and psycholinguists working together to investigate (1) students' processes of noticing and interpreting syntactic features during word problem solving, as reflected by their eye movements and verbal answers, and (2) changes in students' eye movement patterns after an intervention.
In the first section, the theoretical background is presented. It focuses on syntactic features as potential obstacles in reading and understanding word problems and embeds the focus on the processes of noticing and interpreting in the theoretical framework of syntactic language awareness. It then outlines how the processing of syntactic features can be examined through eye-tracking. Afterwards, the methods and design of the eye-tracking intervention study are presented. Thereafter, empirical findings are provided, which are then discussed with regard to their limitations and future research.

THEORETICAL BACKGROUND

Syntactic Features as Potential Obstacles in Reading and Understanding Word Problems
For the current study, we focus on specific word problems: not rich open-modelling problems but mathematical tasks addressing real-life situations in a closed format in text form with concise problem questions. They can contain one step, two steps, or multiple steps of combining information (Dröse & Prediger, 2020). Solving arithmetic word problems requires first reading the text to form a textbase and then grasping the situation and constructing an adequate situation model, a kind of mental model that represents the reader's interpretation of the word problem text and corresponds to the author's interpretation. The situation model is afterwards reduced to a problem model and abstracted into a mathematical model. The connective structures and solution equation derived from the mathematical model can be solved by calculation to get the numerical answer. In the end, the numerical answer needs to be interpreted to get an answer to the problem question (Cummins, Kintsch, Reusser, & Weimer, 1988; Reusser, 1989, 1993; Thevenot, 2010).
For these kinds of arithmetic word problems as well as other kinds of word problems, students' challenges in reading, understanding, and solving word problems have often been documented empirically (Daroczy, Wolska, Meurers, & Nuerk, 2015; Verschaffel et al., 2000). These obstacles can be categorized as follows (for overviews, see Daroczy et al., 2015; Reusser, 1997; Verschaffel et al., 2000):
(1) language obstacles in reading and understanding either lexical or syntactical features of the text;
(2) strategic obstacles in using sustainable subject-specific reading and understanding strategies instead of strategies to process surface level characteristics of the text into a situation model;
(3) knowledge obstacles in applying content specific general knowledge of real-world phenomena for generating a situation or problem model;
(4) conceptual obstacles in transferring the problem model into the correct mathematical model;
(5) habitual and meta-cognitive obstacles in having no adequate solution plan as a meta-cognitive strategy or lacking motivation for starting working on the word problem.
While Obstacles 2-5 have already been investigated in detail, this study focuses on one aspect of Obstacle 1, syntactical language obstacles. Many assessment studies and interview studies have identified language obstacles in word problems in lexical and syntactic features (Abedi & Lord, 2001; Dyrvold et al., 2015; Haag et al., 2013; Haag, Heppt, Roppelt, & Stanat, 2015; Martiniello, 2009; Prediger, Wilhelm, Büchter, Gürsoy, & Benholz, 2018; Wolf & Leon, 2009):
• Lexical features that may possibly be challenging in word problems: kind of words (academic words, unknown words), prepositions, pronouns (including agreement in number and grammatical gender), and comparative or composites
• Syntactic features that may possibly be challenging in word problems: noun phrases, relative and conditional clauses, passive voice, adverbial structures, models, and subject-object structure
Those potential obstacles might negatively influence the process of reading and understanding word problems (Haag et al., 2015; Prediger et al., 2018). The intervention focuses on different syntactic features. On the one hand, it addresses elliptic comparisons, meaning comparisons where the comparison word is left out (Schleppegrell, 2007; Halliday, 2004; Boonen et al., 2013). For instance, in the second sentence of the second version below, "than each penguin chick" is left out after the comparison with "more":
• Each of the 5 penguin chicks eats 4 fishes per day. Each of the 12 grown-up penguins eats 5 fishes.
• Each of the 5 penguin chicks eats 4 fishes per day. Each of the 12 grown-up penguins eats 5 fishes more.
On the other hand, the main focus of the current paper is a typical syntactic feature that makes an important mathematical difference in word problems: passive voice that involves changes from subtraction to addition:
• Lina has 5 sweets. She gives 2 to her brother. How many does she have now? → 5 − 2 = 3
• Lina has 5 sweets. She is given 2 by her brother. How many does she have now? → 5 + 2 = 7
One syntactic feature specific to the German language is the order of subject and object (Beese & Gürsoy, 2012; see Figure 1).

Figure 1. Less expected language structure (left) and expected language structure (right) in the petting zoo problem: syntactic cases determine subject-object structure (German original from Dröse et al., 2018)

The German language allows two types of subject-object orders, the "expected structure," subject-verb-object, and the "less expected structure," object-verb-subject. The type can be identified by subtle syntactical details, namely the grammatical cases of the pronouns as shown in the examples in Figure 1. These subtle differences have a significant impact on the mental model of the situation and the mathematical model as they provoke using either an operation or the inverse operation (Dröse, 2019). In Figure 1, the change of "[To] her brother Jonas, gives she € 1" to "Her brother Jonas gives her € 1" implies the opposite exchange of money in the mental model of the situation and therefore implies either adding or subtracting € 1 for Paula or Jonas (see possible expression or multi-step calculation) in the mathematical model.
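The dependence of the operation on who acts as giver and who as receiver can be illustrated with a minimal sketch. It uses the numbers of the Lina example above; the function name `apply_transfer`, the dictionary keys, and the brother's starting amount are hypothetical and serve only to make the inversion of the operation explicit.

```python
# Illustrative sketch: names and the brother's starting amount are hypothetical.

def apply_transfer(balances, giver, receiver, amount):
    """Return new balances after `giver` hands `amount` to `receiver`."""
    updated = dict(balances)
    updated[giver] -= amount     # the giver's amount decreases (subtraction)
    updated[receiver] += amount  # the receiver's amount increases (addition)
    return updated

# Lina has 5 sweets (from the text); the brother's 2 sweets are an assumption.
start = {"Lina": 5, "brother": 2}

# "She gives 2 to her brother": Lina is the giver.
active = apply_transfer(start, giver="Lina", receiver="brother", amount=2)

# "She is given 2 by her brother": the brother is the giver.
passive = apply_transfer(start, giver="brother", receiver="Lina", amount=2)

print(active["Lina"])   # 5 - 2 = 3
print(passive["Lina"])  # 5 + 2 = 7
```

The same numbers and the same verb lead to inverse operations for the same person, depending only on the syntactic role that person takes in the sentence.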
As presented above, the potentially challenging syntactic language features in word problems are well documented from the text side (see Bergqvist, Dyrvold, & Österholm, 2012; Haag et al., 2013). Apart from task-related attributes (such as language features of the task), Leiss, Plath, and Schwippert (2019) also identified process attributes (such as reading strategies) and personal attributes (such as reading performance) as being important for constructing an adequate situation model. The students' side in particular still requires further research: Leiss, Plath, and Schwippert (2019) call for developing teaching-learning arrangements that foster the understanding of word problems and for further research about students' cognitive processes for overcoming obstacles.
In terms of language obstacles, unfamiliar syntactic features are a kind of linguistically encoded information that is typically overlooked or not fully understood by some students (Haag et al., 2015; Gürsoy, Benholz, Renk, Prediger, & Büchter, 2013; Prediger et al., 2018), for instance, by not paying attention to linguistically encoded information: "The little details are important … because if some details are omitted, you might work with a simplified and incorrect model of the mathematical situation embedded in the word problem" (Nortvedt, 2011, p. 265).
The process of focusing one's awareness on meaning-related syntactic features and potential obstacles within this process can influence the process of reading and understanding word problems (Gürsoy et al., 2013). Boonen et al. (2013) have further shown that students' ability to deal with less expected syntactic features is a strong predictor for their word problem solving performance as a whole. This sets the focus of the current study, and with this paper we intend to contribute to reducing the research gap in examining students' processes in overcoming language obstacles by investigating two cognitive sub-processes necessary for students to deal with unexpected syntactic features. They are introduced in the next section.

Noticing and interpreting as characteristic processes in the context of language awareness
In linguistics and language education research, students' attentiveness for linguistically encoded detailed information (e.g., for less expected syntactic language features) is widely understood as a part of language awareness. The Association for Language Awareness (ALA) defines language awareness as "explicit knowledge about language, and conscious perception and sensitivity in language learning, language teaching and language use" (García, 2017, p. 264, quoted on ALA homepage). Dröse and Prediger (2020) narrowed this wide definition to syntactic awareness, in other words, the language awareness of syntactic features. The adoption of the linguistic conceptualizations has allowed the authors to draw upon linguistic results about two typical processes involved in enacting language awareness (by adapting and integrating conceptualizations from Bialystok, 1986; Fandrych, 2005a, 2005b; Portmann-Tselikas, 2001): noticing and interpreting. Applying these terms to word problems is novel to our approach, and benefits of this adaptation have already been described in Dröse and Prediger (2020):
(1) Noticing describes the process of identifying relevant syntactic language features. In the context of this study, noticing means focusing one's attention on specific syntactic features of a word problem that are crucial for constructing a mental model of the situation. Obstacles in noticing have been documented, as students might overlook syntactic language features and not read them in detail as they do not consider them relevant (Boonen et al., 2013; Dröse, 2019; Kaulvers, Schlager, Isselbächer-Giese, & Klein, 2016).
(2) Interpreting describes the process of revealing the implication of a syntactic language feature for the mental model of the situation. The interpretation of those features can be based on explicit or implicit syntactic knowledge (Brimo, Apel, & Fountain, 2017). Obstacles in interpreting are traced back to missing syntactic knowledge, for instance, of second language learners (Beese & Gürsoy, 2012).
In line with Smith (2008) and Wildemann, Akbulut, and Bien-Miller (2016), we assume that general language awareness and the more specific construct of syntactic awareness can be modelled as a continuum "ranging from either no awareness at all … to a heightened, intense awareness" (Smith, 2008, p. 181). Modelling awareness as a continuum means that it is likely that awareness can be enhanced by interventions (Dröse & Prediger, 2020; Wildemann et al., 2016). The following section presents an intervention designed for this purpose that helps to investigate whether changes in noticing and interpreting can be captured by eye-tracking.
Although the specific language structures in view do not exist in English, the investigation of students dealing with these structures is still relevant for an international audience as it is only exemplary for other linguistically encoded detailed information in any language.

Fostering students' processes of noticing and interpreting through the variation principle
In order to investigate how students' noticing and interpreting of syntactic structures is reflected in eye-tracking data (the changeability of the eye movement pattern; see below), an intervention was developed and iteratively elaborated for fostering fifth graders' processes of noticing and interpreting (Dröse & Prediger, 2020). Alongside strategic scaffolding (see Prediger & Krägeloh, 2015; for this study the principle of strategic scaffolding is described in Dröse & Prediger, 2021), the core design principle of the intervention is the variation principle.
The principle of strategic scaffolding is used to foster reading and understanding strategies. It is based on the idea that a more skilled person supports a learner to reach the zone of proximal development, a state that the student would not be able to reach at first without help but will be able to after a sufficient period of time and support (Wood, Bruner, & Ross, 1976, p. 90). Apart from a skilled person to support a learner, in different research traditions other approaches have also been investigated, such as scaffolding through specific tools named scaffolds. Each kind of scaffold should be faded out in the end (Hannafin, Land, & Oliver, 1999).
The variation principle has been widely used in mathematics education, for instance, in the Chinese tradition of Bianshi teaching (Pang, Bao, & Ki, 2017) and the European tradition of Variation Theory (Marton & Pang, 2006). Its aim is to focus students' attention on "essential features of a concept by differentiating them from non-essential features" (Gu, Huang, & Gu, 2017, p. 14). According to the variation principle, tasks or examples are varied and compared so that students start to pay "attention to the objects of learning of a particular content systematically" (Huang, Barlow, & Prince, 2016, p. 154). This attention can be drawn to the learning content either by keeping it invariant while other features of the task vary (Pang, Bao, & Ki, 2017) or by varying features of the learning content in view while keeping other task characteristics invariant (Marton & Pang, 2006). In language education, a similar approach of contrasting language structures has been used to focus students' attention on syntactic features (Melzer, 2013).
Based on these existing approaches in mathematics and language education, the designed intervention for fifth graders (Dröse, Prediger, & Marcus, 2018) applies the variation principle to word problem text features in order to promote students' abilities to notice and interpret the slightly varied syntactic features. Following the variation principle, three main characteristics are taken into consideration for the intervention (see also Dröse & Prediger, 2021):
• Characteristic 1: Systematically comparable word problems: This assumes that there is a difference in relevant language features between given tasks that students can notice and identify by systematically contrasting the given word problems (Pang et al., 2017). Which language features are most relevant depends on the tasks, as we focus only on those features that need to be noticed and interpreted to form an adequate mental model of the situation.
• Characteristic 2: Variation of language features: In order to identify relevant language features, they are systematically varied, which means that two versions of the word problem are designed: Versions A and B. The language features in view are the only aspect varying between Versions A and B (Marton & Pang, 2006). This systematic variation combined with Characteristic 1 opens up the possibility of noticing the syntactic features, for instance by comparing the two versions of a word problem being presented at the same time. In Figure 1, students can notice the pronouns as being the only difference between Versions A and B.
• Characteristic 3: Invariance of language features: In order to generalize the impact of the language feature in view, it is kept invariant during a sequence of word problems while other features (mathematical operations, numbers, etc.) vary (Pang et al., 2017). These characteristics focus on the process of interpreting syntactic features as the central theme in a reflection with students on the mental model of the situation.
An intervention using all three characteristics focuses on different syntactic features such as elliptic comparisons, which means a comparison where the comparison word is left out (Boonen et al., 2013; Halliday, 2004; Schleppegrell, 2007). For instance, in the second sentence of the second version below, "than each penguin chick" is left out after the comparison with "more":
• Each of the 5 penguin chicks eats 4 fishes per day. Each of the 12 grown-up penguins eats 5 fishes.
• Each of the 5 penguin chicks eats 4 fishes per day. Each of the 12 grown-up penguins eats 5 fishes more.
The intervention uses the variation principle to sensitize students for these subtle differences: By comparing the two phrases above or by comparing the two word problems from Figure 1, they become aware of the subtle differences in the syntactic features.
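How strongly the single word "more" changes the mathematization can be made explicit with a short worked sketch. The problem question is not given in the two versions above, so the sketch assumes the question "How many fishes do all penguins eat per day?"; the function and constant names are hypothetical.

```python
# Illustrative sketch: the total-per-day question is an assumption,
# since the text gives the two sentence pairs without a problem question.

CHICKS = 5         # number of penguin chicks
ADULTS = 12        # number of grown-up penguins
CHICK_RATION = 4   # fishes per chick per day
STATED_NUMBER = 5  # the "5" in the second sentence


def total_fish_per_day(with_more):
    """Total fishes per day for the version without or with 'more'."""
    if with_more:
        # "eats 5 fishes more" (than each penguin chick): 4 + 5 per adult
        adult_ration = CHICK_RATION + STATED_NUMBER
    else:
        # "eats 5 fishes": 5 is the absolute amount per adult
        adult_ration = STATED_NUMBER
    return CHICKS * CHICK_RATION + ADULTS * adult_ration


print(total_fish_per_day(with_more=False))  # 5*4 + 12*5 = 80
print(total_fish_per_day(with_more=True))   # 5*4 + 12*(4+5) = 128
```

Under this assumed question, overlooking the elliptic comparison would lead to 80 instead of 128, although both versions contain exactly the same numbers.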
Empirical evidence for the efficacy of the designed intervention was provided in a controlled trial with a repeated-measures design and paper-and-pencil pre- and post-tests for 277 fifth graders (Dröse & Prediger, 2021). The intervention group of fifth graders (n = 118) who followed the designed intervention revealed a high effect size for their intra-group learning gains (d = 0.87) and significantly higher learning gains than the control groups (F_time×group(1, 274) = 13.41, p < 0.01, η² = 0.02). Therefore, this intervention can be used in the current study to disentangle the students' processes of coping with the syntactic features in detail.
As the qualitative analysis of design experiments revealed (Dröse & Prediger, 2020), students' processes of interpreting seemed to have undergone changes while working in the intervention with varied word problems. However, the interpretation of video data revealed significant limitations in inferring students' mental processes of noticing from their utterances, so the next study was planned with an eye-tracking methodology, which will be presented in the next section.

Examining the Processing of Syntactic Language Features Through Eye-Tracking
Eye-tracking methodologies have been used in educational research for more than a decade (see Lai et al., 2013, for an overview), because the method is considered to provide reliable evidence for cognitive processes related to understanding and learning. It has also been applied to mathematical word problems (Hegarty et al., 1992, 1995; van der Schoot, Bakker Arkema, Horsley, & van Lieshout, 2009; Verschaffel et al., 1992). The methodologies are connected to the eye-mind hypothesis (Just & Carpenter, 1980): Individual eye fixations are assumed to be closely related to the cognitive focus of attention.
In recent years, eye-tracking methodologies have also been increasingly applied in mathematics education research, as they allow deeper insights into the students' cognitive processes (see overview in Norqvist, Jonsson, Lithner, Qwillbard, & Holm, 2019; Strohmaier, MacKey, Obersteiner, & Reiss, 2020). Studies have been reported about proofs (Inglis & Alcock, 2012), geometry (Schindler & Lilienthal, 2019), fraction comparison strategies (Obersteiner & Tumpek, 2016), reading heuristic worked-out examples (Beitlich, Obersteiner, & Reiss, 2015), and complex word problems for adults (Strohmaier et al., 2019). In both psycholinguistics and mathematics education, reading word problems has been described as a suitable research focus for eye-tracking methodologies, as reading activities are guided by the eyes and heavily depend on the focus of attention (overview in Rayner, Pollatsek, Ashby, & Clifton, 2012). Using eye-tracking, the eye movements are captured unobtrusively while the student reads and thinks about a word problem text. Technical challenges might be that the eye tracker must be carefully calibrated beforehand and that the students cannot look away from the screen for the whole duration of the experiment.
The captured eye movements are usually not continuous and regular; they can thus be characterized by
• fixations on specific locations in a written text (which typically last about 100 to 500 ms);
• saccades, which are quick, straight movements to the next fixated location;
• revisits, which are movements going back against the reading direction to a previously read word (Rayner et al., 2012); and
• reading time, defined as the time the student needed to solve a particular task.
The temporal or counting measures usually applied for the analysis of eye movements are based on the event detection of fixations and saccades. A saccade is defined as a rapid change in the coordinates where the person is looking, and a fixation is defined as the period between two saccades, with a minimal duration of 80 ms and a maximal dispersion of 100 pixels. Based on these parameters of automatic event detection, it is possible to obtain the following measures for the analysis of the eye-tracking data for a pre-defined area of interest (AOI):
• Count of fixations on the AOI
• Count of saccades between the AOIs
• Count of revisits to the AOI, in other words, each time a participant returns their gaze to an AOI after leaving it
• Net dwell time, as the total amount of gaze duration on an AOI, in other words, the sum of all time spent looking at the AOI, continuously or discontinuously, for the whole duration of the task (Andrá, Lindström, Arzarello, Holmqvist, Robutti, & Sabena, 2013)
Previous eye-tracking studies on word problems have already shown that the semantic structure of a word problem can influence the eye-tracking pattern; for example, lexical structures that are inconsistent with the mathematical operation of the word problem are read more slowly, with more saccades and revisits (Hegarty et al., 1992; Verschaffel et al., 1992). In addition, unsuccessful problem solvers have been shown to adopt superficial strategies such as direct translation strategies that become visible by more total revisits to the numbers and relational terms, whereas successful problem solvers have more revisits to other relevant words in the text that might have an impact on the mental model of the situation (Hegarty et al., 1995).
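The AOI-based measures listed above (count of fixations, count of revisits, net dwell time) can be sketched for a single rectangular AOI as follows. This is a minimal illustration under stated assumptions, not the analysis software used in the study: fixations are assumed to be already detected and ordered in time, the class and function names are hypothetical, and net dwell time is simplified to the sum of fixation durations inside the AOI, whereas vendor software may also count saccades within the AOI.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float         # horizontal gaze position in pixels
    y: float         # vertical gaze position in pixels
    start_ms: float  # fixation onset
    end_ms: float    # fixation offset


@dataclass
class AOI:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation):
        """True if the fixation lies inside the rectangle (borders included)."""
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def aoi_measures(fixations, aoi):
    """Fixation count, revisit count, and net dwell time (ms) for one AOI.

    `fixations` must be ordered in time.
    """
    fixation_count = 0
    visits = 0            # number of separate entries into the AOI
    net_dwell_ms = 0.0
    was_inside = False
    for fixation in fixations:
        inside = aoi.contains(fixation)
        if inside:
            fixation_count += 1
            net_dwell_ms += fixation.end_ms - fixation.start_ms
            if not was_inside:
                visits += 1   # gaze has just entered the AOI
        was_inside = inside
    return {
        "fixation_count": fixation_count,
        "revisits": max(visits - 1, 0),  # every entry after the first one
        "net_dwell_ms": net_dwell_ms,
    }


# Hypothetical data: an AOI around a pronoun and four fixations.
pronoun_aoi = AOI(left=0, top=0, right=100, bottom=50)
example_fixations = [
    Fixation(10, 10, 0, 200),       # inside
    Fixation(500, 500, 250, 400),   # outside
    Fixation(12, 11, 450, 600),     # inside again: one revisit
    Fixation(15, 12, 650, 700),     # still inside
]
result = aoi_measures(example_fixations, pronoun_aoi)
print(result)
```

In this made-up example, the participant fixates the AOI three times, returns to it once after leaving, and spends 400 ms on it in total.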
With respect to language obstacles, Rayner and colleagues conducted several studies on lexical features: Inhoff and Rayner (1986) showed that the reading of words is influenced by whether the words are expectable in their context, which is also dependent on the frequency of their use. This also supposes that the less expected language structures in view might have an impact on the reading process, as they are used less frequently. The predictability of words also has an influence on eye movement processes (Rayner & Well, 1996). Rayner and Well (1996) showed that words that are expectable in a given context are often left out or fixated on only briefly. As subject and object pronouns might be inferred from reading each of them in the order of the sentence, this might have an impact dependent on both the characteristics of the given structure (expected or less expected) and the characteristics of the reader (e.g., reading proficiency). The characteristics of the reader have also been considered when examining the noticing and interpreting of syntactic features via eye-tracking in Dröse, Prediger, Neugebauer, Delucchi Danhier, & Mertins (submitted). This study shows that the noticing is influenced by characteristics of the reader (reading proficiency and multilingual background) combined with characteristics of the syntactic structure (expected or less expected); these results about lexical features will now be investigated with respect to transferability to syntactic features.
Apart from eye-tracking studies on prototype word problems, Strohmaier et al. (2019) conducted an eye-tracking study with adults on complex word problems using global measures such as reading speed, number of saccades, proportion of regressions, and mean fixation duration to compare word problems of different problem difficulty and adults with different performances. Strohmaier et al. (2019) were able to infer from their experiment that the chosen measures were sufficient for displaying the processes of encoding a text base and forming a situation or problem model. In addition, they provided empirical evidence for their hypothesis that the process of word problem solving is influenced by both the difficulty of the word problem text and individual characteristics. They were able to show that the tested adults made longer fixations in more difficult word problems. The more difficult the problems were, the more the revisits increased, but the increase was stronger for low-performing adults than for high-performing adults. This means that apart from the complexity of a word problem, the success and therefore the correctness of a mathematization and mathematical solution might also have an impact on reading processes. Therefore, it seems worthwhile to keep the performance of successful or unsuccessful problem solving as a relevant variable for investigating students' processes of noticing and interpreting. In addition, Strohmaier et al. (2019) describe their study's results as a confirmation of Inglis and Alcock's (2012) emphasis on revisits as a relevant measure in their study on reading proofs. The current paper will therefore pay particular attention to the measure of revisits in investigating word problem reading processes via eye-tracking.
Although the above-mentioned studies revealed substantial findings on students' strategies of coping with expected or less expected semantic structures of word problems and lexical features, so far little is known about syntactic structures. From the existing results, we hypothesize that in the key sentence for the mathematization of the situation, the net dwell time on the pronouns might depend on whether they are an expected structure or not. Net dwell time was chosen since it considers length of fixations, revisits, and saccades on a target, so it is indirectly also a good measure of reading speed. Since pronouns are frequent but short words in the German language, it is easy to overlook them, unless the participants detect a less common syntactic structure. The syntactic structure of the sentences is thus systematically varied in our study.
Additionally, because all previous studies have only tested the participants once, little is known about whether the patterns of the eye movements of the participants can change over time, for instance, after an intervention, which we will call the changeability of the eye movement pattern.

Research Questions
The current state of research suggests that it might be worthwhile to extend the eye-tracking research on word problems, which has so far mainly focused on semantic structures and lexical language features, to syntactic language awareness, namely, to students' processes of noticing syntactic features, and to relate these to the processes of interpreting language features (captured by students' utterances and mathematizations). The existing research on students' noticing of expected and less expected lexical features (Inhoff & Rayner, 1986; Rayner & Well, 1996) has suggested that this difference could also be crucial for syntactic features. This raises the methodological challenge of how processes of noticing syntactic features in word problems can be characterized by eye-tracking methodology and then be related to interpreting the text, as the correct interpretation cannot be derived directly from eye-tracking data. The eye-tracking study of Strohmaier et al. (2019) provided the first measures focusing on revisits. Therefore, one of the aims of the current study is to also investigate to what extent the process of noticing can be captured through these eye-tracking measures.
Apart from this methodological challenge, the processes of noticing and interpreting should be investigated for word problems with expected and less expected syntactic features. Hence, the features should be compared systematically, which suggests our first research question:
RQ1. Which differences occur in students' fixation times on pronouns in items with expected or less expected subject-object-order syntactic structures?
This comparison might reveal processes of noticing syntactic language features by taking correct and incorrect mathematizations into account during the examination of the process of interpreting syntactic language features. That means that these differences are also compared with respect to correct or incorrect problem mathematizations: We aim to capture whether students choose the correct operation, not whether they complete it. This focus resonates with the study of Strohmaier et al. (2019), which indicated the connection between the correctness of mathematization, rather than the completion of the operation, and the reading processes. From this, we derive the second research question:
RQ2. How are these differences in expected or less expected structures connected to students' correct or incorrect mathematizations?
A second extension of the existing eye-tracking research adds a longitudinal perspective, asking whether an intervention can change students' eye movement patterns, and therefore possibly the processes of noticing and interpreting. This intervention has already been shown to have an impact on performance in paper-and-pencil pre- and post-tests (Dröse & Prediger, 2021); here, the pre-intervention eye-tracking experiment (abbreviated pre-ET) is compared with the post-intervention eye-tracking experiment (post-ET). This suggests the third research question:
RQ3. How do the identified patterns change from pre-ET to post-ET?
Figure 2 gives an overview of the research design of the longitudinal eye-tracking study with pre- and post-ET with 10 fifth graders.
In order to investigate the changeability of the eye movement pattern, the longitudinal eye-tracking study with a focus sample of 10 students was embedded in a large intervention study whose entire sample consisted of 126 students. We briefly present the larger intervention, the measures for background variables, and the sampling procedure for the focus sample. The noticing of syntactic features was investigated using eye-tracking methodology. We present the eye-tracking tasks, the eye-tracking equipment, and the methods of data analysis.

Embedding in a Larger Intervention Study
The longitudinal eye-tracking study was embedded in a large intervention study whose entire sample consisted of 126 students. The intervention aimed at enhancing students' ability to deal with word problems, such as the one shown in Figure 1, drawing upon two design principles: strategic scaffolding (not the focus of this article) and raising students' language awareness for syntactic features using the variation principle (for a detailed description of the intervention see Dröse, 2019; Dröse & Prediger, 2021).
The intervention lasted five to six classroom lessons of 90 minutes each. Significant effects on students' subject-specific comprehension of word problems have been shown using a paper-and-pencil pre- and post-test design, in which students in the intervention group showed significantly higher learning gains in solving the word problems than students in a no-treatment control group (Dröse, 2019; Dröse & Prediger, 2021).

Measures of Background Variables
The following background variables were assessed to control for differences among the students in the intervention sample and the focus sample:
• Age, gender, socio-economic status and multilingual background were captured using a self-report questionnaire. Participants' multilingual backgrounds were operationalized by listing all languages spoken at home. Socio-economic status was assessed using the book scale as an economical and reliable instrument (r = 0.80, Paulus, 2009).
• Reading proficiency was captured using a speed-reading comprehension test (Lenhard & Schneider, 2006), with a retest reliability of r = 0.93 and a maximum score of 120.
• Strategy use for reading and understanding word problems was assessed using a self-constructed test (see Dröse & Prediger, 2021). The test contains four rich word problems for which written solutions were coded and evaluated with respect to the use of the requested strategies for (S1) finding relevant information concerning the problem question, (S2) identifying the meaning of the relevant information and processing it, and (S3) identifying the relationship between information in multistep word problems and processing it. The maximum score for these problems was 12, and the coding reached an interrater reliability of Cohen's κ = 0.95.

Participants: Intervention Sample and Focus Sample
The eye-tracking study was conducted with a focus sample of 10 fifth graders who were selected from the entire intervention sample based on the following criteria:
• Students who were representative of the whole sample concerning background variables. As the data in Table 1 shows, the focus sample of eye-tracking students did not differ in reading proficiency, strategy use, and age from the whole intervention sample, but had a slightly higher proportion of monolingual students and students of higher socio-economic status.
• Students whose eye movements could be captured via eye-tracking. Recordings with a recording accuracy over 1.5 degrees were excluded from the study. In some cases, calibration was not successful. These cases also had to be excluded from the study.
• Students who participated in both the pre-ET and the post-ET. Due to illness, the sample of participating students was reduced. Participation was voluntary and parental consent was required.

Eye-Tracking Tasks and Procedure
In the eye-tracking experiments before and after the intervention (pre-ET and post-ET), the students worked on eight word problems individually. Between the presentations of two items, hidden object games were shown as fillers. The word problems were systematically contrasted: Each item list contained (a) one-step or two-step problems (examples shown in Figure 3), with one sentence being varied among four possibilities, according to (b) grammatical gender of the pronouns (male/female) and (c) expected subject-object position (subject-verb-object) or less expected subject-object position (object-verb-subject). In Figure 3, the AOIs are printed in bold. The item lists were randomly compiled out of two possibilities each for pre-ET and post-ET.

Eye-Tracking Equipment and Procedures
The eye movements were recorded with a remote eye tracker (SMI RED250) using a sampling rate of 250 Hz. We used a mobile one-computer set-up, comprising a laptop and a monitor, to be able to record the participants at their school. The participating children sat in front of the monitor at a distance of approximately 65 cm, and their eyes were tracked without having to wear any accessories (such as eye-tracking glasses or a chin-rest), permitting them to speak freely. This was to ensure that the recordings were as stress-free and non-invasive as possible since the children were quite young (10 to 11 years old). Monocular recording captured only the right eye. The monitor presented the stimuli with a ratio of 16:9 and a resolution of 1920 x 1080 pixels. The one-computer remote set-up also allowed for more accuracy and reliability in the recorded eye-tracking data than other eye-tracking equipment (e.g., eye-tracking glasses), which was of importance since the AOIs were relatively small (pronouns that are single words). The eye-tracking measures are described in the next section.
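To give a sense of scale for why accuracy matters with single-word AOIs, the following minimal sketch converts an angular gaze error into an on-screen offset at the reported viewing distance of approximately 65 cm. The function name is hypothetical, the simple tangent geometry is an assumption, and the 1.5 degrees used in the example is the exclusion threshold named in the sampling criteria, not a measured value.

```python
import math


def angular_error_to_cm(error_deg: float, viewing_distance_cm: float) -> float:
    """On-screen offset (in cm) that corresponds to an angular gaze error.

    Assumes the gaze ray is roughly perpendicular to the screen, so that
    offset = distance * tan(angle). This is an illustrative simplification.
    """
    return viewing_distance_cm * math.tan(math.radians(error_deg))


# Exclusion threshold (1.5 degrees) at the approximate viewing distance (65 cm)
offset_cm = angular_error_to_cm(1.5, 65.0)
print(f"{offset_cm:.2f} cm")  # prints "1.70 cm"
```

Under these assumptions, an error at the exclusion threshold corresponds to an offset of about 1.7 cm on the screen, which is in the order of magnitude of a short printed word.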

One step word problem:
Emma and Jonas are friends. Emma has 20 € and Jonas has 15 €.

((Insert here Sentence A, B, C or D))
How much money does Jonas have?

Two step word problem:
Emma and Jonas are friends. Emma has 20 € and Jonas has 15 €. Emma gets 10 € from her parents.

((Insert here Sentence A, B, C or D))
How much money does Jonas have?

Sentence A: (Expected, male)
His friend Emma gives him kindly 5 €.

Sentence B: (Less expected, male)
His friend Emma gives he kindly 5 €. (literal translation meaning "he gives her")

Methods of Data Analysis and Measures
The eye-tracking data was analyzed using BeGaze software (Version 3.7). The tracking ratio for the pre-ET measures was on average 90.34% (SD 8.53%) and for the post-ET 93.98% (SD 3.30%). This is an acceptable tracking ratio, especially taking into consideration that the participants were young and thus move more and their eyes are closer together.
In order to be able to determine fixation times and count revisits, the areas of interest (AOIs) must be pre-defined (Strohmaier et al., 2019). We chose the pronouns of Sentences A, B, C, and D, as they have to be focused on to disentangle the subject-object relationship of the sentence before choosing the mathematization (addition or subtraction). The subject pronoun formed the subject AOI, the object pronoun the object AOI.
For a first rough overview of the process of noticing syntactic features, we determined the most common time measure for the fixation length because "time spent gazing in each area of interest is a relevant measure of how the participants regard and process the information" (Norqvist et al., 2019, p. 8). In line with these authors, we chose the net dwell time as the measure best capturing the noticing of syntactic language features. The net dwell time is defined as the sum of sample durations for all gaze data samples that hit a particular AOI, in other words, the sum of the durations of all fixations and saccades that hit the AOI. The relative net dwell time is this sum expressed as a quotient over the total viewing time for each trial. By considering relative times, we accounted for the fact that some participants took longer than others to solve the tasks (because of reading some parts of the texts again).
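The definition of net dwell time can be made concrete with a minimal sketch. It is not the BeGaze implementation: the `RectAOI` class, the function names, and the rectangular AOI shape are hypothetical, and the total viewing time is assumed to be the duration of all recorded samples of a trial. Only the 250 Hz sampling rate is taken from the study.

```python
from dataclasses import dataclass

SAMPLING_RATE_HZ = 250                        # sampling rate reported for the eye tracker
SAMPLE_DURATION_MS = 1000 / SAMPLING_RATE_HZ  # 4 ms per gaze sample


@dataclass(frozen=True)
class RectAOI:
    """Hypothetical rectangular area of interest in screen pixels."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def net_dwell_time_ms(samples, aoi):
    """Sum of sample durations for all gaze samples (x, y) that hit the AOI."""
    hits = sum(1 for x, y in samples if aoi.contains(x, y))
    return hits * SAMPLE_DURATION_MS


def relative_net_dwell_time(samples, aoi):
    """Net dwell time as a quotient over the total viewing time of the trial."""
    total_ms = len(samples) * SAMPLE_DURATION_MS
    if total_ms == 0:
        return 0.0  # empty trial: avoid division by zero
    return net_dwell_time_ms(samples, aoi) / total_ms


# Made-up trial: 5 samples on the pronoun AOI, 245 samples elsewhere
pronoun_aoi = RectAOI(left=100, top=200, right=160, bottom=230)
trial = [(120, 210)] * 5 + [(500, 400)] * 245
print(net_dwell_time_ms(trial, pronoun_aoi))        # 20.0
print(relative_net_dwell_time(trial, pronoun_aoi))  # 0.02
```

In this made-up trial, 20 ms of a 1000 ms viewing time fall on the AOI, which gives a relative net dwell time of 2%.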
In the statistical analysis, the determined net dwell time was compared for all items between (1) pre-ET and post-ET, (2) subject-AOI and object-AOI, and (3) correct or incorrect mathematization. The correct or incorrect mathematization was analyzed based on the students' verbally given answers. The process of interpreting has to have taken place successfully in order to mathematize the mental model correctly. Different patterns of noticing and interpreting can be identified using eye-tracking data and verbally given answers:
(1) Small amount of net dwell time combined with an incorrect mathematization: Unsuccessful noticing (without awareness of syntactic features)
(2) Small amount of net dwell time combined with a correct mathematization: Successful noticing (without awareness of syntactic features) and successful interpretation
(3) Higher amount of net dwell time combined with a correct mathematization: Successful noticing (with awareness of syntactic features) and successful interpretation
(4) Higher amount of net dwell time combined with an incorrect mathematization: Successful noticing (with awareness of syntactic features) and unsuccessful interpretation.
The comparisons of the data in terms of (1) pre-ET and post-ET and (2) subject AOI and object AOI also reveal tendencies, similarities, and differences in the processes of noticing and interpreting (see Section Results of the Eye-tracking Analyses), even if they do not become statistically significant due to the limited number of items and participants. Because of the small sample size, t-test results are not reported in this paper.
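The four patterns combine two binary observations, so they can be written as a small decision rule. The sketch below is only an illustration: the study does not state a numeric cut-off between a "small" and a "higher" amount of net dwell time, so the `threshold` parameter and the value used in the example are assumptions, as are the function and variable names.

```python
PATTERN_LABELS = {
    1: "unsuccessful noticing",
    2: "successful noticing (without awareness) and successful interpretation",
    3: "successful noticing (with awareness) and successful interpretation",
    4: "successful noticing (with awareness) and unsuccessful interpretation",
}


def classify_pattern(relative_dwell: float, correct: bool, threshold: float) -> int:
    """Map relative net dwell time and correctness to Patterns 1-4.

    `threshold` separates "small" from "higher" dwell time; it is a
    hypothetical cut-off that is not given in the study.
    """
    higher_dwell = relative_dwell >= threshold
    if not higher_dwell:
        return 2 if correct else 1
    return 3 if correct else 4


# Example with an assumed cut-off of 2% relative net dwell time
pattern = classify_pattern(relative_dwell=0.025, correct=False, threshold=0.02)
print(pattern, PATTERN_LABELS[pattern])
```

A rule of this kind depends heavily on the chosen cut-off, which is one reason why net dwell time alone may not separate the patterns reliably.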
In a deeper analysis of the dynamics of the reading process (similar to Schindler & Lilienthal, 2019), students' scan paths were traced systematically from the videos. Taking into account the high relevance of revisits for reading mathematical texts (Inglis & Alcock, 2012;Strohmaier et al., 2019), scan paths allow deeper insights into students' reading processes when considering the different reading cycles.
To analyze the scan paths, it is convenient to distinguish reading cycles, that is, each time a participant reads the text linearly from the beginning again. In this study, students showed up to four global reading cycles of the sentence. In a second step of analysis, the revisits to the AOIs were captured for each reading cycle. This procedure not only captured revisits, but also their placement in a reading cycle, which allows for deeper comparison of pre-ET and post-ET (see Section Results of the Eye-tracking Analyses).
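In the study, the scan paths were traced from the videos. Purely as an illustration of the two analysis steps, the following sketch shows one possible way to automate them for a sequence of fixated word positions. It assumes that a new reading cycle starts whenever the gaze returns to the first word of the text; this operationalization, the function names, and the example data are hypothetical.

```python
def split_reading_cycles(fixated_words, start_index=0):
    """Split a sequence of fixated word positions into reading cycles.

    Assumption: a new cycle begins when a fixation lands on the first word
    (`start_index`) after the gaze has been somewhere else in the text.
    """
    cycles = []
    current = []
    for word in fixated_words:
        if word == start_index and current and current[-1] != start_index:
            cycles.append(current)
            current = []
        current.append(word)
    if current:
        cycles.append(current)
    return cycles


def aoi_hits_per_cycle(cycles, aoi_words):
    """Count the fixations on AOI words separately for each reading cycle."""
    return [sum(1 for word in cycle if word in aoi_words) for cycle in cycles]


# Made-up scan path: word positions in fixation order; AOIs at positions 1 and 4
scan_path = [0, 1, 2, 3, 4, 5, 0, 1, 4, 5, 0, 3]
cycles = split_reading_cycles(scan_path)
hits = aoi_hits_per_cycle(cycles, aoi_words={1, 4})
revisits = hits[1:]  # hits from the second read onwards count as revisits
print(len(cycles), hits, revisits)  # 3 [2, 2, 0] [2, 0]
```

Keeping the counts per cycle, rather than one total per item, preserves the placement of each revisit in the second, third, or fourth read.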

RESULTS OF THE EYE-TRACKING ANALYSES
The presented analysis starts with investigating the classical condensed measure for fixation length, namely average net dwell times. Afterwards, we unpack these results in more detail with dynamic measures counted in the students' scan paths: We first illustrate the complexity of the processes for three cases and then present the counts of revisits for all students.

Measured net dwell times
The classical measure for capturing students' noticing is the fixation length, which is operationalized here as net dwell time on the AOIs for the subject and object pronouns. In order to investigate the methodological question of how processes of noticing syntactic features in word problems can be characterized using eye-tracking methodology and be related to interpreting the text (since correct interpretation cannot be derived from eye-tracking data directly), we compare the net dwell time for all conditions listed in the research questions: expected or less expected subject-object order (RQ1), for item reading processes that end with students' correct or incorrect mathematizations (RQ2), and for the pre-ET and post-ET (RQ3). Table 2 presents these results for the absolute net dwell time (in milliseconds), and Table 3 presents the same data for the relative net dwell time (i.e., in percentages of the whole reading time). Both reveal differences and similarities in the net dwell time for the subject AOI and object AOI fixation for different kinds of tasks and mathematizations.
A first look shows that the absolute net dwell time for an AOI varies between 412 ms and 662 ms, which corresponds to a relative net dwell time of between 1.3% and 2.5%. This means that the students fixated on the subject or object pronouns in 1.3% to 2.5% of their measured whole reading time.

Analysis of differences and similarities in net dwell time on AOI in pre-ET
A comparison of the items with expected or less expected subject-object order in the pre-ET shows differing success in mathematization, with 72.5% correct mathematizations for items with expected structure and 62.5% for items with less expected structure. With respect to the relative net dwell time, the average measures seem to be similar: students fixated on the AOI in items with expected structures slightly shorter (1.8% for subject and object AOIs) than in items with less expected structures (2.0% on subject AOIs and 2.2% on object AOIs).
However, the differences get much larger when processes of reading items with correct mathematizations are distinguished from those with incorrect mathematizations: For items with expected structures that were correctly mathematized, students fixated on both AOIs equally long (1.9%), but on the object AOIs much longer in items with less expected structures (1.7% on subject and 2.4% on object). In processes of reading items with expected structures that were incorrectly mathematized, students fixated on both AOIs for equally short amounts of time (1.3% subject and 1.4% on object), but on subject AOIs much longer in items with less expected structures (2.5% subject and 2.0% object). This means that in correctly mathematized items with less expected structures, the net dwell time on object AOIs was higher than for subject AOIs, whereas for incorrectly mathematized items the relative net dwell time on subject AOIs was higher than on object AOIs. Therefore, small differences across different items can be identified with regard to the success of mathematization. Hence, we see differentiated patterns that do not allow identification of long fixation with successful noticing or vice versa. Rather, the success of mathematization seems to depend on what to fixate on and in which context.

Analysis of changes in AOI fixation from pre-ET to post-ET
When comparing pre-ET to post-ET, the percentage of correct mathematizations increases only slightly for items with expected structures (from 72.5% to 75%), but much more for items with less expected structures (from 62.5% to 77.5%), which are even mathematized slightly better than the items with expected structures.
Additionally, Table 2 shows the dominant pattern of decreasing absolute net dwell time for all kinds of items (with a decrease varying from 58.37 ms to 276.89 ms). However, these decreases correspond to an overall decrease in the measured whole reading time, so they are not reflected by the relative net dwell times in Table 3. The changes in relative net dwell times vary from -0.8% to +0.6%, and students seem to make the longest fixations on the object AOIs for items with less expected structures (2.5% and 2.6% relative net dwell time for correctly and incorrectly mathematized items, respectively).
For incorrectly mathematized items, the relative net dwell time on the AOIs is higher in post-ET than in pre-ET. This might indicate that students pay more attention to the syntactic features in view; therefore, it is likely that processes of noticing take place. However, the students do not always interpret the features correctly, so the items are not always mathematized correctly. Hence, these results suggest that students can notice syntactic features after the intervention, but more than 25% of the items are still not interpreted correctly.
This might indicate that students become faster in noticing syntactic features and that many students become more successful in interpreting and mathematizing syntactic features. The point of increased speed in particular could not yet be shown in earlier publications (Dröse & Prediger, 2020). As this analysis did not distinguish the students but rather grouped the processes of items according to the correctness of the mathematization, the next section changes the perspective and considers individual students.

Case Analysis of Three Students' Noticing and Interpreting Syntactic Features
For a deeper and student-related answer to the methodological question of how processes of noticing syntactic features in word problems can be characterized by eye-tracking methodology and be related to interpreting the text (since correct interpretations cannot be derived from eye-tracking data directly), we start by presenting three cases of students. The three focus students, Lars, Timo, and Mike, were chosen because they reveal different but typical processes of noticing and interpreting in their AOI fixation in the second, third, and fourth readings of the word problem Sentences A, B, C, and D. In order to examine the presented average values in terms of the dynamics of the reading process, Figure 4 shows different scan paths during students' reading of items with less expected structures (Sentence Types B and D) in pre-ET and post-ET. In the scan paths, fixations are represented by circles: the bigger the circle, the longer the fixation. Saccades between fixations are represented by straight lines (even though the saccade itself was not necessarily linear). The main purpose of this representation is to provide a visual representation of the number, order, and length of fixations in an intuitive way. The presented scan paths illustrate typical reading processes of items with less expected structures, which can be summarized and contrasted with respect to the visible processes of noticing and interpreting (Dröse & Prediger, 2020) as follows: In the pre-ET, three different patterns can be seen when taking the scan paths and the mathematizations of the pronouns into consideration:
• (S1a): Scan path S1a shows that Lars did not focus on the AOIs "Seiner" and "er" in the second read and afterwards, and he also did not mathematize their meaning correctly. We infer that he did not notice the pronouns as relevant for the mathematization of the whole word problem. As a consequence, he could not interpret them correctly. [Pattern 1 of noticing and interpreting]
• (S2a): Timo focused on both AOIs during his second read and afterwards, but he also did not mathematize their meaning correctly. Possibly, in contrast to Lars, he did notice the pronouns; however, he might have had problems in interpreting the meaning of their grammatical cases as indicating subject and object. [Pattern 4 of noticing and interpreting]
• (S3a): Mike focused on the AOI "er" during his second read. He showed a scan path similar to that of Timo, but he also mathematized the sentence correctly. This implies that he was able to notice the pronouns and interpret their grammatical cases correctly. [Pattern 3 of noticing and interpreting]
In order to get first insights into RQ3 (How do the identified patterns change from the pre-ET to the post-ET?), the three pre-ET scan paths can be compared to the post-ET scan paths. In post-ET, all three students focus on the AOIs during the second read, whereas in the third and fourth read, they rarely fixate on the AOIs. All three students solve between one and all items with less expected structures correctly. In those reading processes the scan paths reveal the following patterns:
• (S1b): In post-ET, Lars focused on the AOIs and mathematized them correctly. This indicates that he was now able to notice and interpret them during the process of word problem solving. [Pattern 3 of noticing and interpreting]
• (S2b): Timo incorrectly mathematized most of the items with less expected structures in post-ET. But for those items that he mathematized correctly, his focus on the pronouns was visible during the second, third, and fourth reads. This indicates that he noticed the pronouns, but required high effort to interpret those structures correctly. [Patterns 3 and 4 of noticing and interpreting]
• (S3b): Mike focused on the AOIs, but only in the first read, and mathematized them correctly. This shows that he noticed and interpreted the pronouns during the first read and did not even need to refocus on them in the further reads. [Pattern 2 of noticing and interpreting]
So far, we have only investigated six scan paths from Figure 4, two for each of the three students. Table 4 provides two generalizations: (a) it gives an overview of the three students' revisits of all 48 items (eight in pre-ET and eight in post-ET for each student), and (b) it expands these results to the other seven students. The table shows the percentages of items with revisits to subject and/or object AOIs during the second, third, and fourth reads and the percentage of correct mathematizations. Table 4 also allows comparisons between pre-ET and post-ET and between items with more and less expected structures and provides the data on the percentage of items in which the interpretation of the syntactic features was mathematized correctly. The order of students is arranged based on their percentage of correctly mathematized word problems with less expected structures. The colors show nearly no, slight, and high numbers of revisits to the AOIs. The cases of the three students illustrate how processes of noticing and interpreting can be revealed through eye-tracking in pre-ET. Similarities and differences can be found among all 10 students:
(1) The number of counted revisits in Lars' pre-ET was low, which means he rarely refocused on the subject/object pronouns during the process of reading and solving the word problems. This indicates that he did not pay particular attention to the syntactic features in view. In connection with the low rate of correct mathematizations, this indicates (as described above) that those features were not noticed in detail. For Lars, this equally applies for items with more and less expected structures [Pattern 1 of noticing and interpreting]. Sina and Simon show the same pattern of revisits to AOIs in their scan paths, but in contrast to Lars, they correctly mathematized more than 75% of the word problems with less expected structures. This indicates that they were able to interpret those structures correctly and therefore also must have noticed them. However, the structures might be so familiar to them that they did not need to fixate again to notice them [Pattern 2 of noticing and interpreting].
(2) Timo's revisits in the pre-ET show a different pattern in his re-reading processes: In items with more and less expected structures, Timo shows many revisits to the chosen AOIs. This suggests that Timo already paid particular attention to those features. Only for items with expected structures did he often mathematize correctly [Pattern 3], whereas the items with less expected structures seem to have been noticed but not correctly interpreted (as already described for the exemplary scan path) [Pattern 4]. We infer that noticing was likely to take place successfully, whereas interpreting was not successful for items with less expected structures. Eric, Lea, Fynn, and Alex show a similar pattern of revisiting the AOIs, but mathematized only up to 50% of the word problems with less expected structures correctly.
(3) Mike's revisits and percentage of correct mathematizations are different from the other ones. He shows many revisits for items with expected structures and less expected structures, even if not as many as Timo. He also mathematized nearly all items correctly. Hence, the indications from the exemplary scan path that the process of noticing and the process of interpreting have been gone through successfully are likely to be extendable to all items. Ahmed shows a similar pattern, although his number of revisits during the second read is not as high as Mike's [Pattern 3].
This comparison shows that the revisits cannot be interpreted alone but only in connection with the correct mathematizations. Taking both into consideration, patterns different from the original Patterns 1-4 might be revealed.
Extending the analysis of revisits from the pre-ET to the post-ET, similar changes become visible for most of the 10 students:
• First, nearly all students revisit the subject and object pronouns in the second read of the post-ET (apart from Alex), which indicates that they pay attention to those syntactic features and therefore possibly notice them (as described in Lars' scan path above). Timo, Mike, and Simon show a higher percentage of revisits than the other seven students.
• Second, all of them increased the percentage of correct mathematizations for more and less expected structures, which is considered an indicator that the students' processes of noticing and interpreting were completed more successfully in post-ET than in pre-ET.
These findings resonate with the qualitative analysis of students' utterances in design experiments in Dröse and Prediger (2020) and Dröse (2019).
Thus, capturing the revisits in eye-tracking scan paths seems to provide valid information and seems to enrich and deepen the insights into students' processes of noticing and interpreting without communicative or cooperative situations and even without self-report by students.
Apart from the individual changes in noticing and interpreting syntactic features, the presented case studies hint at possible existing differences between items with more and less expected structures and between correctly and incorrectly mathematized items. These distinctions will be further examined in the next section for the overall development in noticing and interpreting via eye-tracking.

Methodological challenge: How can processes of noticing syntactic features in word problems be characterized by eye-tracking methodology and be related to interpreting the text?
The results indicate that the processes of noticing and interpreting syntactic features can only be partially captured by net dwell time; they are better characterized, on the one hand, by revisits, which have been shown to be an important measure for investigating the encoding towards a textbase and the construction of a situation or problem model by Inglis and Alcock (2012) and Strohmaier et al. (2019), and, on the other hand, by the analysis of the students' scan paths. Both measures (revisits and analysis of scan paths) seem more adequate than the measure of average net dwell time. Whereas the average net dwell time revealed no big changes when measured relative to the whole reading time, analyzing the scan paths and the revisits in second, third, and fourth reads of the word problem text revealed interesting patterns. This methodological lesson learned corresponds to the methodological conclusions by Inglis and Alcock (2012).
From the perspective of syntactic language awareness, the eye-tracking data can be used to measure students' noticing. The process of noticing can be revealed through revisits to pronouns, combined with the correct or incorrect mathematization that indicates the processes of interpreting. The percentages of revisits in pre-ET show that there are students who do not notice the relevant syntactic features as they do not revisit them and do not mathematize them correctly (e.g., Lars). These results could strengthen earlier hypotheses gained in purely observational studies that syntactic language features might be overlooked by students (Gürsoy et al., 2013; Haag et al., 2015).
Furthermore, there are students (e.g., Timo) who revisit the pronouns and therefore pay special attention to the relevant syntactic structures, but nonetheless do not necessarily mathematize them correctly. This indicates that noticing a feature can be clearly separated from interpreting it successfully. In this aspect also, the eye-tracking data resonate with hypotheses gained in observational studies (Dröse & Prediger, 2020).
Lastly, there are students like Mike who are already capable of noticing and interpreting the features correctly in pre-ET. In post-ET, nearly all of the 10 students focused on the pronouns during the second read, which indicates that after the intervention, they paid more attention to the linguistically encoded detailed information and therefore noticed it more often. In addition, there were more correct mathematizations of the structures, indicating that the processes of interpreting were being completed successfully. This deepens the insights into the evaluation of the intervention as already presented in Dröse and Prediger (2021). However, there seem to be differences in items with more and less expected structures and in items with correct and incorrect mathematizations. The positive effects of the intervention have been traced back to previous studies in the field of bianshi teaching and variation theory (Gu, Huang, & Gu, 2017;Huang, Barlow, & Prince, 2016;Marton & Pang, 2006;Pang, Bao, & Ki, 2017).
RQ1. + RQ2. Which differences occur in students' fixation of pronouns in items with more or less expected structures? And how are these differences in more or less expected structures connected to students' correct or incorrect mathematizations?
Items with expected syntactic structures are focused on for slightly shorter amounts of time than items with less expected structures in pre-ET. This resonates with previous findings concerning the influence of the expectedness and predictability of lexical features on reading processes and eye movements (Rayner & Well, 1996). For items with expected structures, both AOIs are focused on for equal lengths of time regardless of correct or incorrect mathematization. This indicates that both AOIs are equally considered and that the success of mathematizations might be mostly attributed to processes of interpreting the syntactic features. For items with less expected structures, correctly mathematized items have a slightly higher net dwell time on the object AOI, whereas incorrectly mathematized items have a slightly higher net dwell time on the subject AOI. These findings extend the findings of Strohmaier et al. (2019) concerning the connections between the performance in word problem solving and eye movements. To interpret those findings, the analysis had to be deepened to the revisits in the scan paths.

RQ3. How do the identified patterns change from the pre-ET to the post-ET?
Comparing pre-ET and post-ET reveals that the reading process of the given word problems got faster, as there was a reduction of net dwell time on all AOIs. Meanwhile, the relative dwell time on the AOIs did not decrease, but there was a slight tendency for an increasing relative dwell time on the pronouns. This might indicate that more processes of noticing took place. In addition, more correct mathematizations can be documented, revealing more processes of successful interpretation of the syntactic structures. Nevertheless, the students still did not all interpret the syntactic features correctly. These results underpin and extend the findings from previous studies (Dröse, 2019;Dröse & Prediger, 2021).

Limitations and Future Research
The presented study is limited due to the small sample size and the limited number of eight items. Future research will have to extend the eye-tracking study to a higher number of participants in pre-ET and post-ET. Furthermore, syntactic features other than subject-object positions (e.g., active-passive construction and comparatives) should be taken into account. In the designed teaching-learning arrangement, elliptic omission of comparative phrases is also chosen as a learning content. The video-based qualitative analysis by Dröse (2019) indicates that there are similarities but also differences in the students' learning process for both syntactic structures. Therefore, these differences might be further investigated using eye-tracking. Extending the study to other syntactic features also implies extending it to other grades and age levels and other mathematical learning content (see Prediger & Zindel, 2017, for functional relationships-without eye-tracking).