Tutoring and the Big Picture

The period since the end of the COVID lockdowns has been difficult for K-12 schools. Students have struggled to attain traditional rates of academic progress. Student absenteeism has reached alarming levels. Where we live, some districts have struggled to find teachers to fill vacant positions while other districts are cutting staff because of massive budget overruns. Educators feel a lack of support from politicians and from groups of parents who complain about various issues and sometimes pull their children out of one school to enroll them in another.

I certainly have no answer to this collection of challenges; it seems a proposed solution to one problem only generates complaints about the consequences of that recommendation for a different issue. I look to educational research for potential solutions, and there are certainly strategies with proven positive effects in specific areas, but that is as far as it seems practical to go.

Tutoring is one of the areas I have been investigating recently. My specific interest is in the potential of AI as a substitute for human tutors, but the ample research on the effectiveness of human tutors has also been fruitful ground for considering which proven experiences and roles AI might approximate. I have continued to review what we know about the effectiveness of human tutors and what it is tutors provide that students do not receive from their traditional classroom experiences.

For those wanting to review this research themselves, I recommend a fairly recent working paper by Nickow and colleagues (2020). This paper is both a general descriptive review of how tutoring is practiced and a meta-analysis of a large number (96) of quality studies of the effectiveness of this addition to K-12 instruction. The meta-analysis considers only the impact of adult tutors, but the descriptive review does include some comments on both peer tutoring and technology-based tutoring; the technology comments were written before recent advances in AI. My summary of this paper follows.

Key Findings

I will begin with the overall conclusion. While there are more and less effective implementations and situations, adult tutoring consistently results in positive outcomes. The authors concluded that tutoring was responsible for substantial and consistent positive effects on learning outcomes, with an overall effect size of 0.37 standard deviations. The results are strongest for teacher-led and paraprofessional tutoring programs, with diminished but still positive effects for nonprofessional and parent-led tutoring. Notably, tutoring is most effective in the earlier grades and shows comparable impacts across reading and mathematics interventions, although reading tends to benefit more in earlier grades and math in later ones. A large proportion of the studies and of tutoring activity is focused on the first couple of years of school and on the subject area of reading; there are far more studies of tutoring in the early grades than in secondary school, and only 7% of the studies included students above 6th grade.
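To put that 0.37 figure in perspective, a quick calculation (a rough illustration that assumes normally distributed outcomes) translates the effect size into percentile terms:

```python
from statistics import NormalDist

# An effect size of 0.37 means the average tutored student scores
# 0.37 standard deviations above the average untutored student.
effect_size = 0.37

# Under a normal distribution, a student who would have scored at the
# 50th percentile moves to roughly this percentile after tutoring.
new_percentile = NormalDist().cdf(effect_size) * 100
print(f"50th percentile -> {new_percentile:.0f}th percentile")
# Output: 50th percentile -> 64th percentile
```

In other words, the average tutored student would be expected to outperform roughly 64 percent of comparable students who did not receive tutoring.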

Categories of Tutors

The study by Nickow, Oreopoulos, and Quan (2020) identifies four categories of tutors:

  1. Teachers: These are certified professionals who have undergone formal training in education. They are typically employed by schools and likely have a deeper understanding of the curriculum, teaching strategies, and student assessment.
  2. Paraprofessionals: These individuals are not certified teachers, but they often work in educational settings. Paraprofessionals may include teaching assistants, educational aides, or other support staff in schools. They often work under the supervision of certified teachers.
  3. Nonprofessionals: This category typically includes volunteers who may not have formal training in education. They could be community members, older students, or others who offer tutoring services. Their level of expertise can vary widely.
  4. Parents: Parents often play a tutoring role, especially in the early years of a child’s education. Parental involvement in a child’s education can have a significant impact, but the level of expertise and the ability to provide effective tutoring can vary greatly among parents.

The study found that paraprofessionals are the largest providers of tutoring services, followed by nonprofessionals. Tutoring was more effective when provided by teachers and paraprofessionals than when provided by nonprofessionals and parents. Tutors with teaching certificates and classroom experience were rare in the studies but were the most effective. As mentioned in the introduction to this post, how schools are able to allocate their budgets appears highly influential in whether these known sources of impact can be applied.

Mechanisms of Tutoring Effectiveness

This paper reviewed mechanisms potentially responsible for the effectiveness of tutoring. While these hypothesized mechanisms were not evaluated in the meta-analysis, I find such lists useful in considering other approaches such as AI. Which of these mechanisms could AI tutoring address, and which would be beyond its reach?

The paper discusses several mechanisms through which tutoring enhances student learning:

  1. Increased Instruction Time: Tutoring provides additional instructional time, crucial for students who are behind.
  2. Customization of Learning: The one-on-one or small group settings of tutoring allow for tailored instruction that addresses the individual needs of students, a significant advantage over standard classroom settings where diverse learning needs can be challenging to meet. The authors propose that tutoring might be thought of as an extreme manipulation of class size. I know that studies of class size have produced inconsistent results, but the extremely small class sizes tutoring represents are not investigated in such research.
  3. Enhanced Engagement and Feedback: Tutoring facilitates a more interactive learning environment where students receive immediate feedback, enhancing learning efficiency and correction of misunderstandings in real time.
  4. Relationship Building: The mentorship that develops between tutors and students can motivate students, fostering a positive attitude toward learning and educational engagement.

Program Characteristics

The document evaluates the variation in the effectiveness of tutoring programs based on several characteristics:

  1. Tutor Type: Teacher and paraprofessional tutors are found to be more effective than nonprofessionals and parents, suggesting that the training and expertise of the tutor play a critical role in the success of tutoring interventions.
  2. Intervention Context: Tutoring programs integrated within the school day are generally more effective than those conducted after school, possibly due to higher levels of student engagement and less fatigue.
  3. Grade Level: There is a trend of diminishing returns as students age, with the most substantial effects seen in the earliest grades. This finding emphasizes the importance of early intervention.

Summary

This research and analysis paper might be summarized by proposing that schools cannot go wrong with tutoring. The paper suggests several policy implications based on its findings. Given the strong evidence supporting the effectiveness of tutoring, especially when conducted by trained professionals, the authors advocate for increased investment in such programs. They suggest that education policymakers and practitioners should consider tutoring as a key strategy for improving student outcomes, particularly in early grades and for subjects like reading and mathematics.

Reference

Nickow, A., Oreopoulos, P., & Quan, V. (2020). The impressive effects of tutoring on PreK-12 learning: A systematic review and meta-analysis of the experimental evidence (NBER Working Paper No. 27476). National Bureau of Economic Research. https://www.nber.org/papers/w27476


AI tutoring based on retrieval-augmented generation

Here is a new phrase to add to your repertoire – retrieval-augmented generation (RAG). It is the term I should have been using in past posts to explain my emphasis on focusing AI on notes I had written or content I had selected. Aside from my own applications, the role I envision for retrieval-augmented generation is as an educational tutor or study buddy.

RAG works in two stages. The system first retrieves information from a designated source and then uses generative AI to take some requested action using the retrieved information. So, as I understand an important difference, you can interact with a large language model based on the massive corpus of content on which that model was trained, or you can designate specific content to which the generative capabilities of that model will be applied. I don't pretend to understand the specifics, but this description seems accurate enough to be useful. Among the benefits is a reduction in the frequency of hallucinations. When I propose using AI tools in a tutoring relationship with a student, directing the tool to focus on specific information sources seems a reasonable approximation of some of the benefits a tutor brings.
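To make the two stages concrete, here is a minimal sketch of a RAG loop in Python. The retrieval step is deliberately naive (keyword overlap rather than the embedding search a real system would use), and call_llm is a placeholder for whatever generative model an actual tool relies on; this is an illustration of the idea, not the implementation of any particular service:

```python
def retrieve(query, notes, k=3):
    """Stage 1: score each note by word overlap with the query and
    return the k best matches (a stand-in for embedding search)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(n.lower().split())), n) for n in notes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for score, note in scored[:k] if score > 0]

def call_llm(prompt):
    """Stage 2 placeholder: a real system would send the prompt to a
    generative model; here we just show what the model would receive."""
    return f"[LLM response based on prompt]\n{prompt}"

def answer_from_notes(question, notes):
    # Assemble retrieved notes into a prompt that confines the model
    # to the designated content.
    context = "\n".join(retrieve(question, notes))
    prompt = (
        "Answer the question using ONLY the notes below.\n"
        f"Notes:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

notes = [
    "Tutoring produces an average effect size of 0.37 SD.",
    "RAG retrieves designated content before generating a response.",
    "Generative activities include summarizing and self-explanation.",
]
print(answer_from_notes("What is the effect of tutoring?", notes))
```

Because the model is instructed to work only from the retrieved notes, its responses stay anchored to the designated content, which is the property credited with reducing hallucinations.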

I have tried to describe what this might look like in previous posts, but it occurred to me that I should just record a video of the experience so those with little experience might see for themselves how this works. I found trying to generate this video an interesting personal experience. It is not like other tutorials you might create in that it is not possible to carefully orchestrate what you present. What the AI tool does cannot be perfectly predicted. However, trying to capture the experience as it actually happens seems more honest.

A little background. The tool I am using in the video is Mem.ai, which I have used for some time to collect notes on what I read, so I have a large collection of content I can ask the RAG capabilities of this tool to draw on. To suggest how a student might study course content the same way, I draw some parallels based on the use of note tags and note titles. Instead of using titles and tags the way I do, a student would likely take course notes and tag those relevant to the next exam with something like “Psych1” to indicate a note taken during the portion of a specific course covered by the first exam. I hope the parallels I explain make sense.
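The tagging scheme is easy to picture in code. A small sketch (the note structure and tags are hypothetical, following the example above) of how a tag such as “Psych1” narrows the collection of notes a RAG tool would draw on:

```python
# Hypothetical notes, each tagged with the course segment it belongs to.
notes = [
    {"title": "Encoding and retrieval", "tags": ["Psych1"], "text": "..."},
    {"title": "Operant conditioning",   "tags": ["Psych1"], "text": "..."},
    {"title": "Developmental stages",   "tags": ["Psych2"], "text": "..."},
]

def notes_for_exam(notes, tag):
    """Return only the notes tagged for a given exam segment."""
    return [n for n in notes if tag in n["tags"]]

# Studying for the first exam: only 'Psych1' notes become the RAG corpus.
study_corpus = notes_for_exam(notes, "Psych1")
print([n["title"] for n in study_corpus])
```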


Content-focused AI for tutoring

My explorations of AI use to this point have resulted in a focus on two applications – AI as tutor and AI as a tool for note exploration. Both uses are based on the ability to focus the AI on information sources I designate rather than allowing the service to rely on its own body of information. I see the use of AI to interact with the body of notes I have created as a way to inform my writing. My interest in AI tutoring is more related to imagining how AI could be useful to individual students as they study assigned content.

I have found that I must use different AI services for these different interests. The reason is that two of the most popular services (NotebookLM and OpenAI's Custom GPTs) limit the number of inputs that can be accessed. I had hoped that I could point these services at a folder of notes (e.g., Obsidian files) and then interact with this body of content. However, both services presently allow only a small number of individual files (10 and perhaps 20) to be designated as source material. This is not about the amount of content, as the focus of this post involves using these two services to interact with a single file of 27,000 words. I assume in a year the number of files will be less of an issue.
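Until those limits loosen, one workaround I can imagine (my suggestion, not a feature of either service) is to merge a folder of notes into the single file these services will accept. A minimal sketch, assuming a folder of plain-text Markdown notes such as Obsidian produces (the folder and file names are hypothetical):

```python
from pathlib import Path

# Hypothetical folder of Obsidian-style Markdown notes.
notes_folder = Path("MyNotes")

# Concatenate every .md file into one document, labeling each note
# with its filename so the AI can still indicate which note it used.
combined = []
for note in sorted(notes_folder.glob("*.md")):
    combined.append(f"## {note.stem}\n{note.read_text(encoding='utf-8')}")

Path("combined_notes.txt").write_text("\n\n".join(combined), encoding="utf-8")
```

The resulting single file can then be uploaded as one source, sidestepping the per-file cap while preserving all of the content.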

So, this post will explore the use of AI as a tutor applied to assigned content, as a secondary or higher ed student might want to do. In practice, what I describe here would require that a student have access to a digital version of assigned content that is not protected in some way. For my explorations, I am using the manuscript of a Kindle book I wrote, before the material was converted to Kindle format. I wanted to work with a multi-chapter source of a length students might be assigned.

NotebookLM

NotebookLM is a newly released AI service from Google. Prompts can be focused on content that is available in Google Drive or uploaded to the service. The service is available at no cost, but it should be understood that this is likely to change when Google is ready to offer a more mature product. Investing time in this service now allows the development of skills and the exploration of potential, but in the long run some costs will be involved.

Once a user opens NotebookLM and creates a notebook (see the red box surrounding New Notebook), external content to be the focus of user prompts can be added (second image). I linked the notebook to the file I used in preparation for creating a Kindle book. Educators could create a notebook based on unprotected content they want students to study.

The following image summarizes many of the essential features of NotebookLM. Starting with the right-hand column, the textbox near the bottom (enclosed in a red box) is where prompts are entered. The area above (another red box) provides access to the content used by the service in generating the response to a prompt. The large area on the left-hand side displays the context associated with one of the referenced areas, with the specific content used highlighted.

Access to a notebook can be shared, and this would be the way an educator would provide students access to a notebook prepared for their use. In the image below, you will note the icon (at the top) used to share content; when this icon is selected, a textbox appears for entering the emails of individuals (or of a class if already prepared).

Custom GPTs (OpenAI)

Once you have subscribed to the monthly payment plan for ChatGPT-4, accessing the service will bring up a page with the display shown below. The page allows access to ChatGPT and to any custom GPTs you have created. To create a custom GPT, you select Explore and then Create a GPT. Describing the process of creating a GPT would require more space than I want to use in this post, but the process might best be described as conversational. You basically interact by describing what you are trying to create, and you upload external resources if you want prompts to be focused on specific content. Book Mentor is the custom GPT I created for this demonstration.

Once created, a GPT is used very much in the same way a NotebookLM notebook is used. You use the prompt box to interact with the content associated with that GPT.

What follows are some samples of my interactions with the content. You should be able to see the prompt (Why is the word layering used to describe what the designer does to add value to an information source?).

Prompts can generate many kinds of interaction (see the section below that describes what some of these interactions might be). One type I think has value in using AI as a tutor is to have the service ask you a question. An example of this approach is displayed in the following two images. The first image shows a request for the service to generate a multiple-choice question about generative activity, to which I respond (correctly) and receive feedback. The second image shows the flexibility of the AI. When responding to the question, I thought a couple of the responses could be correct. After I answered the question and received feedback, I asked about an answer I did not select, wondering why this option could not also be considered correct. As you see in the AI reply, the system understands my issue and acknowledges how it might be correct. This seems very impressive to me and demonstrates that interaction with the AI system allows opportunities that go beyond self-questioning.
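The back-and-forth in those images follows an ordinary multi-turn chat pattern. For readers who would rather script the same kind of exchange than use the web interface, here is a sketch using OpenAI's Python library; the model name and the study-text placeholder are my assumptions, and a custom GPT itself is configured through the web interface rather than through this API, so this only approximates the interaction shown above:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The conversation history carries the quiz across turns, which is what
# lets the model respond sensibly when a distractor is challenged.
messages = [
    {"role": "system", "content": (
        "You are a study tutor. Quiz the learner with multiple-choice "
        "questions about the supplied text, grade each answer, and "
        "discuss any answer option the learner asks about."
    )},
    {"role": "user", "content": "Text: <assigned content would go here>\n"
                                "Ask me a multiple-choice question."},
]

def send(messages):
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    content = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": content})
    return content

print(send(messages))                 # the generated question
messages.append({"role": "user", "content": "B"})
print(send(messages))                 # feedback on the answer
messages.append({"role": "user", "content": "Why is option C not also correct?"})
print(send(messages))                 # discussion of the distractor
```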

Using AI as tutor

I have written previously about the potential of AI services to interact with learners to mimic some of the ways a tutor might work with a learner. I make no claims of equivalence here. I am proposing only that tutors are often not available and an AI system can challenge a learner in many ways that are similar to what a human tutor would do. 

Here are some specific suggestions for how AI can be used in the role of tutor:

  1. Ask the AI to summarize a section of the content you are studying.
  2. Ask for an explanation of a difficult idea at a different level or with a new example.
  3. Ask the AI to quiz you over the content and evaluate your answers, following up on any answer option you want to discuss.
  4. Explain a concept in your own words and ask the AI whether your understanding is accurate.

Summary

This post describes two systems now available that allow learners to work with assigned content in ways that mimic how a tutor might work with a student. Both systems allow a designer to create a tool focused on specific content that can be shared. ChatGPT custom GPTs require that those using a shared GPT have an active $20-per-month account, which probably means this approach would not presently be feasible for common application. Google's notebooks can be created at no cost to the designer or user, but this will likely change when Google decides the service is beyond the experimental stage. Perhaps the capability will be included in present services designed for educational situations.

While I recognize that cost is a significant issue, my intent here is to propose services that can be explored as proof of concept so that educators interested in AI opportunities might explore future productive classroom applications of AI.


Why is tutoring effective?

We know that tutoring is one of the most successful educational interventions, with meta-analyses demonstrating an advantage of between 0.3 and 2.3 standard deviations. In some ways, the explanation of this advantage seems obvious: tutoring provides personal attention that cannot be matched in a classroom. The challenges in applying tutoring more generally are the cost and the availability of personnel. One of my immediate interests in the AI tools now available is exploring how students might use these tools as tutors. This is different from the long-standing interest of others in intelligent tutoring systems designed to personalize learning. The advantage of the new AI tools is that they are not designed to support specific lessons and can be applied as needed. I assume AI large language model chatbots and intelligent tutoring systems will eventually merge, but I am interested in what students and educators can explore now.

My initial proposal for the new AI tools was to take what I knew about effective study behavior and about the capabilities of AI chatbots and suggest some specific things a student might do with AI tools to make studying more productive and efficient. Some of my ideas were demonstrated in an earlier post, and I would still suggest interested students try some of those suggestions. However, I wondered if an effort to understand what good tutors do could offer additional suggestions that move beyond what is known about effective study strategies. Tutors seem to function differently from study buddies. I assumed there must be a research literature based on studies of effective tutors and what it is these individuals do that less effective tutors do not. Perhaps I could identify some specifics a learner could coax from an AI chatbot.

My exploration turned out to be another example of finding that what seems likely is not always the case. There have been many studies of tutor competence (see Chi et al., 2001), and these studies have not revealed simple recommendations for success. Factors such as tutor training or the age difference between tutor and learner do not seem to matter a great deal; neither the advice typically offered to tutors nor what might be assumed to be gained from experience translates into consistent differences in effectiveness.

Chi and colleagues proposed that efforts to examine what might constitute skilled tutoring begin with a model of tutoring interactions they call a tutoring frame. The steps in a tutoring session were intended to isolate different actions that might make a difference depending on the proficiency with which the actions are implemented.

Steps in the tutoring frame:

  1. Tutor asks an initiating question.
  2. Learner provides a preliminary answer.
  3. Tutor gives confirmatory or negative feedback on whether the answer is correct.
  4. Tutor scaffolds to improve or elaborate the learner's answer in a successive series of exchanges (taking 5–10 turns).
  5. Tutor gauges the learner's understanding of the answer.
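For readers experimenting with AI tutors, the frame translates naturally into instructions for a chatbot. A minimal sketch of one way to encode it (the wording of the prompt is my own, not from Chi and colleagues):

```python
# The five steps of the tutoring frame, rewritten as chatbot instructions.
TUTORING_FRAME_PROMPT = """You are a tutor. For each topic, follow this cycle:
1. Ask the learner an initiating question about the topic.
2. Wait for the learner's preliminary answer.
3. Tell the learner whether the answer is correct or not.
4. Scaffold: over several short exchanges, prompt the learner to improve
   or elaborate the answer (hints, sub-questions, partial examples) rather
   than explaining everything yourself.
5. Check the learner's understanding before moving to the next topic.
"""
```

A learner could paste these instructions into any chat-based AI along with the content to be studied.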

One way to look at this frame is to compare what a tutor provides with what happens in a regular classroom. While steps 1-3 occur in regular classrooms, tutors typically apply these steps with much greater frequency. There are approaches classroom teachers could use to provide these experiences more frequently and effectively (e.g., ask questions and pause before calling on a student, make use of student response systems allowing all students to respond), but whether classroom teachers do so is a different issue from whether effective tutors differ from less effective tutors in their use of questions. The greatest interest for researchers seems to be in step 4. What variability exists during this step, and do identifiable categories of tutor actions differ significantly in their impact on learning?

Step 4 involves a back-and-forth between the learner and tutor that goes beyond the tutor declaring the initial response from the learner correct or incorrect. Both tutor and learner might take the lead during this step. When the tutor controls what unfolds, the sequence might be described as scaffolded or guided. The tutor might break the task into smaller parts, complete some of the parts for the student (demonstrate), direct the student to attempt a related task, remind the student of something they might not have considered, etc. After any of these actions, the student could respond in some way.

A common research approach might evaluate student understanding before tutoring, identify strategy frequencies and sequence patterns during a tutoring session, evaluate student understanding after tutoring, and see if relationships can be identified between the strategy variables and the amount learned.
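The logic of that design is easy to express in code. A toy sketch with invented numbers (purely illustrative, not data from any study; statistics.correlation requires Python 3.10+):

```python
from statistics import correlation

# Invented data for five tutoring sessions: how often the tutor used a
# given strategy, and the student's pretest-to-posttest gain.
scaffolding_moves = [2, 5, 3, 8, 6]      # frequency during each session
gains = [4.0, 9.0, 5.5, 12.0, 10.0]      # posttest minus pretest

# A positive correlation would suggest the strategy is associated with
# greater learning (association only, not causation).
print(f"r = {correlation(scaffolding_moves, gains):.2f}")
```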

As I looked at research of this type, I happened across a study that applied new AI not to implement tutoring but to search for patterns within tutor/learner interaction (Lin et al., 2022). The researchers first trained an AI model by feeding it examples of different categories identified within tutoring sessions and then attempted to see what could be discovered about the relationships among categories within new sessions. While potentially a useful methodology, the approach was not adequate to account for differences in student achievement. A one-sentence summary from that study follows:

More importantly, we demonstrated that the actions taken by students and tutors during a tutorial process could not adequately predict student performance and should be considered together with other relevant factors (e.g., the informativeness of the utterances)

Chi and colleagues (2001)

Chi and colleagues offer an interesting observation they sought to investigate. They proposed that researchers may be assuming that the success of tutoring is based on differences in the actions of the tutors and so look for explanations consistent with this assumption. This would make some sense if the intent was to train or select tutors.

However, they propose that other perspectives should be examined and suggest the effectiveness of tutoring experiences is largely determined by some combination of the following:

  1. the ability of the tutor to choose ideal strategies for specific situations (Tutor-Centered),
  2. the degree to which the learner engages in generative cognitive activities during tutoring, in contrast to the more passive, receptive activities of the classroom (Learner-Centered), and
  3. the joint efforts of the tutor and learner (Interactive).

In differentiating these categories, the researchers proposed that under the learner-centered and interactive labels, the tutor will have enabled an effective learning environment to the extent that the learner asks questions, summarizes, explains, and answers questions (learner-centered) or is encouraged to speculate, explore, and continue generating ideas (interactive).

These researchers attempted to test this three-component model in two experiments. In the first, the verbalizations of tutoring sessions were coded for these three categories and related to learning gains. In the second experiment, the researchers asked tutors to minimize tutor-centered activities (giving explanations, providing feedback, adding additional information) and instead to invite more dialog – what is going on here, can you explain this in your own words, do you have any other ideas, can you connect this with anything else you read, etc. The idea was to compare learning gains with tutoring sessions from the first study in which the tutor took a more direct role in instruction. 

In the first experiment, the researchers found evidence for the impact of all three categories, but the learner-centered and interactive codes were particularly related to performance outcomes relying on deeper learning. The second experiment found equal or greater benefits for learner-centered and interactive events when tutor-centered events were minimized.

The researchers argued that tutoring research focused on what tutors should or should not do may continue to be disappointing because the focus should instead be on what learners do during tutoring sessions. Again, tutoring is portrayed as a follow-up to classroom experiences, so the effectiveness of experiences during tutoring sessions should be interpreted in light of what else is needed in this situation.

A couple of related comments. Other studies have reached similar conclusions. For example, Lepper and Woolverton (2002) concluded that tutors are most successful when they “draw as much as possible from the students” rather than focus on explaining. These researchers' advocacy of a “Socratic approach” is very similar to what Chi labeled as interactive.

One of my earlier posts on generative learning offered examples of generative activities and proposed a hierarchy of effectiveness among these activities. At the top of this hierarchy were activities involving interaction.  

Using an AI chatbot as a tutor:

After my effort to read a small portion of the research on effective tutors, I am more enthusiastic about the application of readily available AI tools to the content to be learned. My earlier post, which I presented more as a way to study with such tools, could also be viewed as a way for a learner to take greater control of a learner/AI-tutor session. In the examples I provided, I showed how the AI agent could be asked to summarize, explain at a different level, and quiz the learner over the content being studied. Are such inputs possibly more effective when the learner asks for them? There is a danger that a learner does not recognize which topics require attention, but an AI agent can be asked questions with or without designating a focus. In addition, the learner can explain a concept and ask whether his or her understanding is accurate. AI chats focused on designated content offer students a responsive rather than a controlling tutor. Whether or not AI tutors are a reasonable use of learner time, studies such as Chi et al. (2001) and Lepper and Woolverton (2002) suggest that more explanations may not be what students need most. Learners need opportunities that encourage their thinking.

References

Chi, M. T., Siler, S. A., Jeong, H., Yamauchi, T., & Hausmann, R. G. (2001). Learning from human tutoring. Cognitive Science, 25(4), 471-533.

Fiorella, L., & Mayer, R. (2016). Eight Ways to Promote Generative Learning. Educational Psychology Review, 28(4), 717-741.

Lepper, M. R., & Woolverton, M. (2002). The wisdom of practice: Lessons learned from the study of highly effective tutors. In Improving academic achievement (pp. 135-158). Academic Press.

Lin, J., Singh, S., Sha, L., Tan, W., Lang, D., Gašević, D., & Chen, G. (2022). Is it a good move? Mining effective tutoring strategies from human-to-human tutorial dialogues. Future Generation Computer Systems, 127, 194-207.


ChatPDF as tutor

Educators concerned about AI but unable to generate productive ways their students could use AI tools need to check this out. The tool is called ChatPDF and is available in a browser or on an iPad. At this point, it is free and available without an account.

Once connected, you upload a PDF. I wanted to give it a significant challenge and something I could evaluate easily for accuracy, so I took a chapter I had written (the chapter on learning as applied to technology from the textbook I wrote with my wife, Integrating Technology for Meaningful Learning) and uploaded it as a PDF file. I then began to ask for explanations, examples, and questions relevant to that chapter. I responded to the questions the AI tool generated and had my answers evaluated. What I have long thought potentially valuable is the role AI might play in functioning as a tutor. How can learners get flexible assistance when studying that they can shape to their needs? How can students discover what their needs are and then have their challenges addressed?

While the system did require that I restart a couple of times, perhaps because I was working from a coffee shop with a sketchy connection, I was very impressed with the quality of the system. By quality, I mean primarily the accuracy of the content. Were the explanations accurate and different enough from the wording in the chapter to offer a reasonable opportunity for a learner to achieve a better understanding? Were the questions posed more than simplistic keyword vocabulary checks? Was the system flexible enough to understand me even when I got a little sloppy?

Any educator should evaluate similar issues for themselves using material they might assign. I understand that content they might like to evaluate may not be available in PDF format, but as I understand the developers' plans, there is already a Google Docs version and soon there will be a Word version.

There are a few differences I observed between the browser and app versions. The app version references short segments following its replies, and the browser version gives a page number. My preference would be the page number, as I see value in a learner being able to go back and forth between the book (PDF) and the AI tool. In what I have read about this tool, there was a warning about the system's difficulty making connections across different sections of a document, and this must apply to transfer/applications external to the document as well. I make no claim that using this AI tool as a tutor is the equivalent of working with a knowledgeable human tutor, but I would argue few students have a knowledgeable human tutor available at all times.

Take a look. Some example questions and responses the system generated are included in the following images.

The following image may be a little difficult to read, but I was trying to show the text segments the system tells you it primarily used to generate the content it displayed.
