5 Method: LongSAL - the Longitudinal Study

To investigate the research questions and hypotheses discussed in Chapter 4, we conducted the LongSAL study. The following sections discuss the study design, apparatus and procedure.

5.1 Study Design and Participants

LongSAL (Longitudinal Search as Learning study) is a remote, exploratory, longitudinal study that was conducted between January and June 2022 (Spring semester) at the School of Information, University of Texas at Austin (UT Austin). The study was approved by The University of Texas at Austin Institutional Review Board (Submission ID: STUDY00002136, Date Approved: December 8, 2021).

Participants were recruited from the student pool enrolled in the required undergraduate core course: Ethical Foundations for Informatics (Fleischmann et al., 2022). Eighteen participants originally signed up for the study; 10 completed all phases of the study, and the remaining 8 dropped out at different points during the semester. Students enrolled in the course had to submit a research paper of 2,000-2,500 words as the final project for the course. There were four checkpoints spread across the semester for submitting drafts in progress: (i) paper proposal, (ii) outline, (iii) rough draft, and (iv) final paper. Writing the research paper required choosing an informatics ethical dilemma, and applying three ethical perspectives covered in the course to explore potential solutions to the selected dilemma. This involved searching and navigating information online, finding at least 20 relevant external sources, combining ideas, and weaving a narrative around the information found in the selected sources.

The study design was informed by running a pilot study during Summer 2021 semester, in partnership with two courses at UT Austin School of Information: Information in Cyberspace, and Academic Success in the Digital University. More details of the pilot study are presented in Appendix A.

5.2 Apparatus

5.2.1 YASBIL Browsing Logger

The YASBIL browsing logger (Bhattacharya & Gwizdka, 2021) was utilised for this study. YASBIL (Yet Another Search Behaviour and Interaction Logger) is a two-component logging solution for ethically recording a user’s browsing activity for Interactive IR user studies. It was developed by the author in early Spring 2021, and was employed in the pilot study for data collection and testing. YASBIL comprises a Firefox browser extension and a WordPress plugin. The browser extension logs browsing activity on the participants’ machines. The WordPress plugin collects the logged data on the researcher’s data server. YASBIL captures participants’ behavioural data, such as webpage visits, time spent on pages, visits to popular search engines and their SERPs, mouse clicks and scrolls, and the order and sequence of these events. The logging works on any webpage, without the need to own the webpage or to know its HTML structure. To protect the privacy of participants, the logger software can be switched on or off by the participant. Participants received regular reminders to turn YASBIL on only when they were searching for information related to the course.

YASBIL offers ethical data transparency and security for participants, by enabling them to view and obtain copies of the logged data, as well as securely upload the data to the researcher’s server over an HTTPS connection. Although developed using the cross-browser WebExtension API, YASBIL currently works only in the Firefox web browser. Participants were therefore instructed to install Firefox and YASBIL on their machines when they volunteered to participate in the study.

5.3 Procedure

Figure 5.1: Longitudinal study procedure.

The longitudinal study consisted of six data collection components, as illustrated in Figure 5.1. They comprise three asynchronous questionnaires (QSNR1, QSNR2, QSNR3), two remote synchronous study phases over Zoom video conferencing software (PHASE1, PHASE3), and a set of four asynchronous longitudinal tracking phases (PHASE2A, PHASE2B, PHASE2C, PHASE2D). These phases are discussed in detail in the following sections.

5.3.1 QSNR0: Recruitment Questionnaire (Appendix B.1)

Participants were recruited for the study via the recruitment questionnaire (QSNR0). The questionnaire contained questions about the demographics of the participant pool. The description of the study and the link to the questionnaire were posted in the Canvas Learning Management System used for the I303 course.

5.3.2 QSNR1: Entry Questionnaire

After recruitment, participants completed the entry questionnaire (QSNR1). The purpose of QSNR1 was to capture their individual differences, or moderating variables, at the beginning of the semester. Details of the data captured in QSNR1 are described below, with references to sections in the Appendix, where the full text of the questionnaire can be found.

5.3.2.2 Motivation (Appendix B.3)

This measure was adapted from the Intrinsic Motivation Inventory (IMI) by Ryan (1982), a multidimensional measurement device intended to assess participants’ subjective experience related to a target activity (here, the assignments for the course they were taking). The instrument assesses participants’ interest/enjoyment, perceived competence, effort/importance, pressure/tension, perceived choice, and value/usefulness while performing a given activity, thus yielding six subscale scores. The pressure/tension and perceived choice subscales were not included in the entry questionnaire QSNR1, but were present in the mid-term (QSNR2) and exit (QSNR3) questionnaires.

5.3.2.3 Self-regulation (Appendix B.4)

This measure was adapted from the Self-Regulation Questionnaire (SRQ) by Brown et al. (1999), which assesses seven self-regulatory processes through self-report: receiving relevant information, evaluating the information and comparing it to norms, triggering change, searching for options, formulating a plan, implementing the plan, and assessing the plan’s effectiveness (Section 2.5.4).

5.3.2.4 Metacognition (Appendix B.5)

This measure was adapted from the Metacognitive Awareness Inventory (MAI), originally proposed by Schraw and Dennison (1994) as a 52-item true/false questionnaire, and later revised by Terlecki and McMahon (2018) to use five-point Likert scales. The instrument measures two components of cognition through self-report: knowledge about cognition, and regulation of cognition (Section 2.5.2).

After completing QSNR1 offline, participants were instructed to prepare for the initial synchronous phase, PHASE1, by installing the Firefox web browser and the YASBIL extension on their machines. This was a one-time step. If a participant could not find the time for this step, they were informed that an extra 5-10 minutes would be taken at the beginning of PHASE1 to complete it.

The entry questionnaire and the software installation took about 10-15 minutes to complete. Participants were compensated with USD 5 for their time for completing this step. The questionnaire was published to the I-303 course students in the first week of the Spring 2022 semester.

5.3.3 PHASE1: Initial Phase

PHASE1 of the data collection took place at the beginning of the semester. Data were collected over a Zoom video call, combined with the YASBIL browsing logger installed on the participants’ machines. Participants were asked to share their screen for the whole duration of the phase. Their screens and audio were recorded for the entire duration. They had the freedom to turn off their video. The total time for PHASE1 was expected not to exceed 1.5 hours (90 minutes). Participants were compensated with USD 25 for this phase. The different components of PHASE1 are described below.

5.3.3.1 Training Search Task

Participants performed a training search task to familiarize themselves with how to operate the YASBIL browser extension to log their browsing activity. The training task took around 2-5 minutes.

5.3.3.2 PHASE1-FINANCE and PHASE1-UBUNTU: Two Actual Search Tasks

Figure 5.2: Prompts for the search task that was repeated in the final phase, on the topic of financial literacy.

Participants performed two search tasks: PHASE1-FINANCE and PHASE1-UBUNTU. The PHASE1-FINANCE task was repeated at the end of the semester as the PHASE3-FINANCE task. The PHASE1-UBUNTU task was not repeated; instead, the PHASE3-BIAS task took its place. This design helps to answer research question RQ2 (Chapter 4). The order of the two search tasks was randomized.

The repeated search task FINANCE was on the topic of financial literacy, a topic that we posit can be considered universally important to college students, and part of lifelong learning. The prompts for the PHASE1-FINANCE and PHASE3-FINANCE tasks are presented in Figure 5.2. The non-repeated search tasks were on topics that were taught in the I303 course: Ubuntu ethics (for PHASE1) and Algorithmic Bias (for PHASE3). The prompts for these tasks are presented in Figure 5.3.

Figure 5.3: Prompts for the non-repeating search tasks. Topics were selected from the I-303 course content.

Each search task began with a pre-task questionnaire (Appendix C.1), which asked participants to self-rate their pre-search knowledge level and interest in the topic. Participants then turned on the YASBIL browsing logger and started searching. The deliverable for each search task was a written summary (artefact). Once participants were satisfied with the quality of the deliverable, they turned off the YASBIL browsing logger and proceeded to the post-task questionnaire.

The post-task questionnaire (Appendix C.2) asked participants to self-rate their perceived learning and search outcomes, search experience, interest and motivation, and overall perceptions. The pre-task and post-task questionnaires were adapted from Collins-Thompson et al. (2016) and Crescenzi (2020).

5.3.3.3 Memory Span Test

PHASE1 concluded with the assessment of the participant’s working memory capacity (WMC) using a memory span task (Francis et al., 2004). The memory span assessment was kept in the synchronous phase because it is a timed task, and needs to be conducted in a controlled (experimenter-observed) condition. The task has 25 trials. On each trial, participants saw a list of items presented one at a time in random order, and were asked to recall the items in the same order in which they were presented. If they got a list correct, the list length increased by 1 for that type of material. If they got a list incorrect, the list length decreased by 1.

The types of material participants were asked to recall were: digits, letters that sound dissimilar, letters that sound similar, short words, and long words. The outcome score was the list length of the last list that participants could correctly recall.
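
The adaptive rule described above can be illustrated with a minimal simulation sketch. It is not the CogLab implementation: the starting list length, the minimum list length, the even split of the 25 trials across the five material types, and the function and variable names are all assumptions made for illustration. The participant's recall is replaced by a stand-in function, `is_correct`.

```python
import random

# The five material types named in the text.
MATERIALS = ["digits", "dissimilar_letters", "similar_letters",
             "short_words", "long_words"]


def run_span_task(is_correct, n_trials=25, start_length=3, min_length=1, seed=0):
    """Simulate the up/down list-length rule of a memory span task.

    is_correct(material, length) -> bool stands in for a participant's recall.
    start_length and min_length are assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # Assumption: trials are split evenly across materials and interleaved.
    schedule = MATERIALS * (n_trials // len(MATERIALS))
    rng.shuffle(schedule)

    length = {m: start_length for m in MATERIALS}   # current list length
    last_correct = {m: 0 for m in MATERIALS}        # outcome score per material

    for material in schedule:
        n = length[material]
        if is_correct(material, n):
            last_correct[material] = n               # remember last success
            length[material] = n + 1                 # correct: one item longer
        else:
            length[material] = max(min_length, n - 1)  # incorrect: one shorter
    return last_correct


# Simulated participant who recalls any list of up to five items.
scores = run_span_task(lambda material, n: n <= 5)
print(scores)
```

Under these assumptions, a simulated participant who can hold up to five items ends with a score of 5 for every material type, because the list length climbs from 3 to 6, fails once, and then succeeds again at 5.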

5.3.4 PHASE2A - PHASE2D: Longitudinal Tracking Phase

Figure 5.4: Final project description, setting up the longitudinal tracking phase of the study throughout the duration of the Spring 2022 semester. Text taken from I-303 course syllabus (Fleischmann et al., 2022); emphasis and annotations our own.

The four-part longitudinal tracking phases PHASE2A - PHASE2D were conducted asynchronously over the duration of the semester, to understand the change (or lack thereof) in participants’ search behaviour and knowledge gain over time. Whenever participants worked on different parts of their final project (the ethical dilemma research paper for the I-303 course), as described in Figure 5.4, they used the Firefox web browser and logged their browsing activity using the YASBIL browsing logger. To protect their privacy, participants were regularly instructed to turn YASBIL on only when they were searching for information related to coursework. After each checkpoint assignment, participants self-uploaded an anonymized version of the working draft of their research paper, and answered a post-task questionnaire. The post-task questionnaires were similar to those used in the PHASE1 and PHASE3 search tasks, where participants self-reported, among other things, their perceived learning outcome and perceived search outcome (Collins-Thompson et al., 2016). Participants received reminder emails before the deadline of each assignment, to remind them to use Firefox, turn YASBIL on, and upload the anonymized working draft. To prevent participant drop-off, a staggered payment model was adopted during PHASE2. Participants received USD 5 each when they completed PHASE2A and PHASE2B, USD 10 for PHASE2C, and USD 15 for PHASE2D, for a total of USD 35 for the entire PHASE2.

5.3.5 QSNR2: Mid-Term Questionnaire

The mid-term questionnaire QSNR2 took place around the mid-point of the semester (Week 8-9). The purpose was to track whether any of the participants’ individual difference measures (motivation, metacognition, and self-regulation) changed during the first half of the semester. This questionnaire was essentially a replica of the Entry Questionnaire QSNR1, with two modifications. First, the consent form and the demographics sections were absent. Second, the Intrinsic Motivation Inventory (IMI) included the ‘pressure/tension’ and the ‘perceived choice’ subscales, as these scales are more meaningful after an activity has taken place (Ryan, 1982). The IMI was also reworded to reflect the mid-point of the semester. Participants were compensated with USD 10 for completing this step.

5.3.6 PHASE3: Final Phase

The Final Phase PHASE3 was similar in structure to the Initial Phase (PHASE1), and took place at the end of the semester, after all the course-related tasks were completed by the participant. The purpose of the session was to record the ‘evolved’ search behaviour and the final knowledge state. Participants performed two search tasks: PHASE3-FINANCE and PHASE3-BIAS.

At the end of PHASE3, a semi-structured interview was conducted. The questions aimed to collect the participants’ reflections on their searching and learning experience throughout the semester, with respect to the I303 course. While a full-scale qualitative analysis of the interview responses is beyond the scope of this dissertation, some preliminary qualitative quotes are presented in the results and discussion sections, to support the quantitative results as necessary.

5.3.6.1 PHASE3-FINANCE and PHASE3-BIAS: Two Actual Search Tasks

Of the two search tasks, the topic of one was repeated from PHASE1 (financial literacy, Figure 5.2), while the topic of the other came from the course material: algorithmic bias (Figure 5.3). In both search tasks, participants were given the option of not searching if they felt confident enough to answer the search task questions from their prior knowledge (Crescenzi, 2020). The deliverable for each search task, as before, was a written summary (artefact).

Similar to PHASE1, participants were asked to share their screen for the whole duration of the phase. Their screen and audio were recorded for the entire duration. They had the freedom to turn off their video. The total time for PHASE3 was expected not to exceed 1.5 hours (90 minutes). Participants were compensated with USD 30 for PHASE3. At the end of PHASE3, participants were instructed to complete the Exit Questionnaire QSNR3 as soon as convenient.

5.3.7 QSNR3: Exit Questionnaire

The exit questionnaire QSNR3 took place after the Final Phase PHASE3. The purpose was to record the final state of the participants’ individual difference measures (motivation, metacognition, self-regulation), and whether these characteristics changed during the second half of the semester. As before, the QSNR3 questionnaire was essentially a replica of QSNR2, with the Intrinsic Motivation Inventory (IMI) reworded to reflect the end-point of the semester. Participants were compensated with USD 15 for completing this step.

After QSNR3 was complete, participants received a bonus compensation of USD 30, if they completed all the phases of the LongSAL study without missing anything.
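
The compensation amounts stated across Sections 5.3.2 to 5.3.7 can be tallied with a short sketch. The amounts are those reported above; the function name, the dictionary layout, and the reading of "without missing anything" as completing all nine paid components are assumptions made for illustration. No payment is stated for QSNR0, so it is left out.

```python
# Per-component payments in USD, as stated in the text.
PAYMENTS_USD = {
    "QSNR1": 5,
    "PHASE1": 25,
    "PHASE2A": 5,
    "PHASE2B": 5,
    "PHASE2C": 10,
    "PHASE2D": 15,
    "QSNR2": 10,
    "PHASE3": 30,
    "QSNR3": 15,
}
COMPLETION_BONUS_USD = 30


def total_compensation(completed):
    """Sum the payments for the completed components.

    Assumption: the bonus is paid only when every paid component is completed.
    """
    done = set(completed)
    base = sum(PAYMENTS_USD[name] for name in done)
    bonus = COMPLETION_BONUS_USD if done == set(PAYMENTS_USD) else 0
    return base + bonus


print(total_compensation(PAYMENTS_USD))           # full completion
print(total_compensation(["QSNR1", "PHASE1"]))    # dropped out after PHASE1
```

Under this reading, a participant who completed every component would have received USD 150 in total (USD 120 in per-component payments plus the USD 30 bonus), while one who left after PHASE1 would have received USD 30.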

5.4 Measures to Address Ethical Concerns

  • Participation in the study (which was voluntary and compensated separately) and participation in the I303 course (which was required for graduation from the Informatics major) were sufficiently disentangled. The course instructors were never aware of which students participated in the study, and did not share any student data with the researchers. This avoided any undue pressure or expectation on the students.

  • Participants logged their browsing activity using the Firefox browser extension YASBIL, which was developed by the authors. The extension has an ON-OFF button, which put the participants in full control of when they wished to start and stop the logging. Participants were trained to use the browser extension, and were repeatedly reminded to log data only when they were working on the research paper assignments for the course, and not at other times.

  • This study has been approved by The University of Texas at Austin Institutional Review Board (Submission ID: STUDY00002136, Date Approved: December 8, 2021).

After data collection for all the phases was complete, data analysis was performed on the collected data, which is discussed in the next chapter.

References

Bhattacharya, N., & Gwizdka, J. (2021). YASBIL: Yet another search behaviour (and) interaction logger. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2585–2589.
Brown, J. M., Miller, W. R., & Lawendowski, L. A. (1999). The self-regulation questionnaire. In L. VandeCreek & T. L. Jackson (Eds.), Innovations in clinical practice: A sourcebook (Vol. 17, pp. 281–292). Professional Resource Press/Professional Resource Exchange.
Collins-Thompson, K., Rieh, S. Y., Haynes, C. C., & Syed, R. (2016). Assessing learning outcomes in web search: A comparison of tasks and query strategies. Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval, 163–172.
Crescenzi, A. M. C. (2020). Adaptation in Information Search and Decision-Making under Time Pressure [PhD thesis, The University of North Carolina at Chapel Hill University Libraries]. https://doi.org/10.17615/YT6K-AC37
Fleischmann, K., Verma, N., Gursoy, A., Bautista, J. R., & Day, J. (2022). I 303: Ethical foundations for informatics [Syllabus]. School of Information, University of Texas at Austin.
Francis, G., MacKewn, A., & Goldthwaite, D. (2004). CogLab on a CD. Wadsworth Publishing Company.
Ryan, R. M. (1982). Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory. Journal of Personality and Social Psychology, 43(3), 450.
Schraw, G., & Dennison, R. S. (1994). Assessing Metacognitive Awareness. Contemporary Educational Psychology, 19(4), 460–475. https://doi.org/10.1006/ceps.1994.1033
Terlecki, M., & McMahon, A. (2018). A Call for Metacognitive Intervention: Improvements Due to Curricular Programming in Leadership. Journal of Leadership Education, 17(4), 130–145. https://doi.org/10.12806/V17/I4/R8