Improve AI-Generated Multiple-Choice Questions

In a recent episode of the EdTech Situation Room, host Jason Neiffer briefly observed that educators could improve the effectiveness of AI-generated multiple-choice questions by adding a list of rules the AI tool should apply when writing the questions. This made sense to me, and I understood the issue that probably led to this recommendation. I have written multiple times that students and educators can use AI services to generate questions of all types. In my own experience doing this, I found too many of the questions used structures I did not like, and I found myself continually requesting rewrites that excluded a type of question I found annoying; for example, questions that offered a response such as “all of the above” or a question stem asking for the response that was “not correct.” Taking a preemptive approach made sense and set me on an exploration of how this idea might be implemented. Neiffer proposed making use of an online source on how to write quality questions. I found it more effective to review such sources but then put together my own list of explicit rules.

My approach used a prompt that looked something like this:

Write 10 multiple-choice questions based on the following content. The correct answer should appear in parentheses following each question. Apply the following rules when generating these questions. 

There should be 4 answer options.

“All of the Above” should not be an answer option.

“None of the Above” should not be an answer option.

All answer options should be plausible.

Order of answer options should be logical or random.

Question should not ask which answer is not correct. 

Answer options should not be longer than the question.

I would alter the first couple of sentences of this prompt depending on whether I was asking the AI service to use its own information base or including a content source that should be the focus of the questions. If I was asking for questions generated from the large language model’s content alone, I would include a comment about the level of the students who would be answering the questions (e.g., high school students). For example, questions about mitosis and meiosis without this addition would include concepts I did not think most high school sophomores would have covered. When providing the AI service the content to be covered, I did not use this addition.
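If you prefer to script this rather than paste the prompt into a chat window, the rules can live in a reusable template. The sketch below is only an illustration of that idea; the function and variable names are my own and not tied to any particular service.

```python
# Illustrative sketch: assemble the question-writing prompt from an explicit rule list
# so the same rules can be reused with different content sources and services.

RULES = [
    "There should be 4 answer options.",
    '"All of the Above" should not be an answer option.',
    '"None of the Above" should not be an answer option.',
    "All answer options should be plausible.",
    "Order of answer options should be logical or random.",
    "Question should not ask which answer is not correct.",
    "Answer options should not be longer than the question.",
]

def build_prompt(content="", n_questions=10, level=None):
    """Combine the task description, optional student level, the rules, and any source content."""
    parts = [
        f"Write {n_questions} multiple-choice questions based on the following content.",
        "The correct answer should appear in parentheses following each question.",
    ]
    if level:  # e.g., "high school sophomores" when relying on the model's own knowledge
        parts.append(f"Write the questions at a level appropriate for {level}.")
    parts.append("Apply the following rules when generating these questions:")
    parts.extend(RULES)
    if content:
        parts.append("Content:\n" + content)
    return "\n".join(parts)
```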

Questions based on a chapter

I have been evaluating the potential of an AI service to function as a tutor by interacting with a chapter of content. My wife and I have written a college textbook, so I have authentic content to work with. The chapter is close to 10,000 words in length. In this case, I loaded this content and the prompt into ChatPDF, NotebookLM, and ChatGPT. I pay $20 a month for ChatGPT and use the free versions of the other two services. All proved to be effective.

ChatPDF

NotebookLM

With NotebookLM, you are allowed to upload multiple files that a prompt uses as a focus for the chat. For some reason, I had better results (an approach suggested by the service) when I included the rules I wanted the system to apply as a second source rather than as part of the prompt itself.

ChatGPT

The process works a little differently with ChatGPT. I first copied the text from the pdf and pasted this content into the prompt window. I then scrolled to the beginning of this content and added my prompt. I could then ask the service to produce multiple question samples by asking for another 10 or 20 questions. I found some interesting outcomes when asking for multiple samples of questions. Even the format of the output sometimes changed (see the position of the answer in the following two examples).

**4. According to Clinton (2019), what is a potential impact of reading from a screen on metacognition?**

(A) Increased understanding

(B) Enhanced critical thinking

(C) Overconfidence and less effort

(D) Improved retention

(**C**)

**7. Which skill is considered a “higher order thinking skill”?**

(A) Word identification

(B) Critical thinking (**Correct**)

(C) Fact memorization

(D) Basic calculation

From sample to sample, some of the rules I asked ChatGPT to apply were ignored. This slippage seemed unlikely in the initial response to the prompt but showed up in later batches.

**What is an important consideration when designing project-based learning activities?**

(A) The amount of time available to students

(B) The availability of resources

(C) The level of student autonomy

(D) All of the above

(**D**)

Summary

The quality of multiple-choice questions generated using AI tools can be improved by adding rules for the AI service to follow as part of the prompt. I would recommend that educators wanting to use the approach I describe here generate their own list of rules based on their preferences. The questions used on an examination should always be selected for appropriateness, but the AI-based approach is an easy way to generate a large pool of questions from which an examination can be assembled. Multiple-choice exams should include a range of question types, and it may be more efficient for educators to write application questions themselves because they are in the best position to understand the background of their students and determine what extension beyond the content in the source material would be appropriate.


AI Tutoring Update

This is an erratum in case my previous posts have misled anyone. I looked up the word erratum just to make certain I was using it correctly. I have written several posts about AI tutoring, and in these posts I made reference to the effectiveness of human tutoring. I tend to provide citations when research articles are the basis for what I say, and I know I have cited several sources for comments I made about the potential of AI tutors. I have not claimed that AI tutoring is the equal of human tutoring, but suggested that it was better than no tutoring at all; in doing so I have claimed that human tutoring was of great value, just too expensive for wide application. My concern is that I have proposed that the effectiveness of human tutoring was greater than it has actually been shown to be.

The reason I am bothering to write this post is that I have recently read several posts proposing that the public (i.e., pretty much anyone who does not follow the ongoing research on tutoring) has an inflated understanding of the impact of human tutoring (Education Next, Hippel). These authors propose that too many remember Bloom’s two-sigma challenge and fail to adjust Bloom’s proposal that tutoring has this high level of impact on student learning to what the empirical studies actually demonstrate. Of greater concern, according to these writers, is that nonresearchers, including educational practitioners but also those donating heavily to new efforts in education, continue to proclaim that tutoring has this potential. Included in this collection of wealthy investors and influencers would be folks like Sal Khan and Bill Gates. I assume they might also include me in this group, although I obviously have little impact compared to those with big names. To be clear, the interest of Khan, Gates, and me is really in AI rather than human tutoring, but we have made reference to Bloom’s optimistic comments. We have not claimed that AI tutoring is as good as human tutoring, but by referencing Bloom’s claims we may have created false expectations.

When I encountered these concerns, I turned to my own notes from the research studies I had read to determine if I was aware that Bloom’s claims were likely overly optimistic. It turns out that I had read clear indications identifying what the recent posters were concerned about. For example, I highlighted the following in a review by Kulik and Fletcher (2016). 

“Bloom’s two sigma claim is that adding undergraduate tutors to a mastery program can raise test scores an additional 0.8 standard deviations, yielding a total improvement of 2.0 standard deviations.”

My exposure to Bloom’s comments on tutoring originally had nothing to do with technology or AI tutoring. I was interested in mastery learning as a way to adjust for differences in the rate of student learning. The connection with tutoring at the time Bloom offered his two-sigma challenge was that mastery methods offered a way to approach the benefits of the one-to-one attention and personalization provided by a human tutor. Some of my comments on mastery instruction and the potential of technology for making such tactics practical are among my earlier posts to this site. Part of the misapplication of Bloom’s claim stems from his combination of personalized instruction via mastery tactics with tutoring. He was also focused on college-aged students in the data he cited. My perspective reading the original paper many years ago was not “see how great tutoring is”. It was more that tutoring on top of classroom instruction is about as good as it is going to get, and mastery learning offers a practical tactic that is a reasonable alternative.

To update what I may have claimed, here are some additional findings from the Kulik and Fletcher meta-analysis of intelligent tutoring software.

The studies reviewed by these authors show lower benefits for tutoring when outcomes are measured on standardized rather than local tests, sample size is large, participants are at lower grade levels, the subject taught is math, a multiple-choice test is used to measure outcomes, and Cognitive Tutor is the ITS used in the evaluation.

However, on a more optimistic note, the meta-analysis conducted by these scholars found that in 50 evaluations intelligent tutoring systems led to an improvement in test scores of 0.66 standard deviations over conventional levels. 

The two sources urging a less optimistic perspective point to a National Bureau of Economic Research study (Nickow and colleagues, 2020) indicating that the effect of human tutoring for K-12 learners was approximately 0.35 sigma. This is valuable, but not close to the 2.0 level.

Summary

I have offered this update to clarify what might be interpreted from my previous posts, but also to provide some other citations for those who now feel the need to read more of the original literature. I have no idea whether Khan, Gates, and others have read the research that would likely indicate their interest in AI tutoring and mastery learning was overly ambitious. Just to be clear, I had originally interpreted what the tech types were promoting as mastery learning (personalization), which later morphed into a combination with AI tutoring. This combination was what Bloom was actually evaluating. The impact of a two-sigma claim, when translated into what such an improvement would actually mean in terms of rate of learning or change in a metric such as assigned grade, seems improbable. Two standard deviations would move an average student (50th percentile) to the 98th percentile. This only happens in Garrison Keillor’s Lake Wobegon.
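For anyone who wants to check these figures, the effect sizes mentioned above translate directly into percentiles, assuming normally distributed scores; the short calculation below simply restates the numbers already cited.

```python
# Convert the effect sizes discussed above into the percentile an average
# (50th percentile) student would reach, assuming a normal distribution.
from statistics import NormalDist

effects = {
    "Bloom's two-sigma claim": 2.0,
    "Kulik & Fletcher (2016) ITS estimate": 0.66,
    "Nickow et al. (2020) K-12 human tutoring": 0.35,
}

for label, d in effects.items():
    percentile = NormalDist().cdf(d) * 100
    print(f"{label}: +{d} SD puts the average student at about the {percentile:.0f}th percentile")

# Roughly the 98th, 75th, and 64th percentiles, respectively.
```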

References:

Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4-16.

Kulik, J. A., & Fletcher, J. D. (2016). Effectiveness of intelligent tutoring systems: A meta-analytic review. Review of Educational Research, 86(1), 42-78.

Nickow, A., Oreopoulos, P., & Quan, V. (2020). The impressive effects of tutoring on preK-12 learning: A systematic review and meta-analysis of the experimental evidence. https://www.nber.org/papers/w27476


AI, Tech Tools, and the Writing Process

Student access to AI has created a situation in which educators must consider when AI should and should not be used. I think about this question by considering which skill or skills are the focus of instruction and whether AI will replace a skill to improve the efficiency of the writing task or will support a specific skill in some way. It may also be useful to differentiate learning to write from writing to learn. My assumption is that unless specific skills are used by the learner, those skills will not be improved. Hence, when AI is simply used to complete an assignment, a learner learns little about writing but may learn something about using AI.

Writing Process Model

The writing process model (Flower & Hayes, 1981) is widely accepted as a way to describe the various component skills that combine to enable effective writing. This model has been used to guide both writing researchers and the development of instructional tactics. For researchers, the model is often used as a way to identify and evaluate the impact of the individual processes on the quality of the final project. For example, better writers appear to spend more time planning (e.g., Bereiter & Scardamalia, 1987). For educators and instructional designers, understanding the multiple processes that contribute to effective writing and how these processes interact is useful in focusing instruction. 

Here, the writing process model will be used primarily to identify the subskills to be developed as part of learning to write and writing to learn, and I will offer my own brief description of this model. It is worth noting that, aside from composing and rewriting products, uses of technology to improve writing and increase the frequency of writing experiences seldom receive much attention (Gillespie, Graham, Kiuhara & Hebert, 2014).

The model

The model identifies three general components: a) planning, b) translation, and c) reviewing.

Planning involves subskills that include setting a goal for the project, gathering information related to this goal which we will describe as research, and organizing this information so the product generated will make sense. The goal may be self-determined or the result of an assignment. 

Research may involve remembering what the author knows about a topic or acquiring new information. Research should also include the identification of the characteristics of the audience. What do they already know? How should I explain things so that they will understand? Finally, the process of organization involves establishing a sequence of ideas in memory or externally to represent the intended flow of logic or ideas.

What many of us probably think of as writing is what Flower and Hayes describe as translation. Translation is the process of getting our ideas from the mind to the screen and this externalization process is typically expected to conform to conventions of expression such as spelling and grammar.

Finally, authors read what they have written and make adjustments. This review may occur at the end of a project or at the end of a sentence. In practice, authors may also call on others to offer advice rather than relying on their own review.

One additional aspect of the model that must not be overlooked is the iterative nature of writing. This is depicted in the figure presenting the model by the use of arrows. We may be tempted, even after initial examination of this model, to see writing as a mostly linear process – we think a bit and jot down a few ideas, we use these ideas to craft a draft, and we edit this draft to address grammatical problems. However, the path to a quality finished product is often more circuitous. We do more than make adjustments in spelling and grammar. As we translate our initial ideas, we may discover that we are vague on some points we thought we understood and need to do more research. We may decide that a different organizational scheme makes more sense. This reality interpreted using our tool metaphor would suggest that within a given project we seldom can be certain we have finished the use of a given tool and the opportunity to move back and forth among tools is quite valuable.

Tech tools and writing

Before I get to my focus on AI tools, it might be helpful to note that technology tools used to facilitate writing subprocesses have existed for some time. Examples include spelling and grammar checkers, outlining and concept mapping tools, note-taking and note-storage systems, citation managers, online writing environments allowing collaboration and commenting, and probably many other tools that improve the efficiency and effectiveness of writing and learning to write. Even the use of a computer offers advantages such as storage of digital content in a form that can easily be modified, rather than the challenge of making improvements to content stored on paper. The digital alternative to paper changes how we go about the writing process. I have written about technology for maybe 20 years, and one of the textbooks offered the type of analysis I am offering here, not about AI tools, but about the advantages of writing on a computer and using various digital tools.

A tool can substitute for a human process, or a tool can supplement or augment a human process. This distinction is important when it comes to writing to learn and learning to write. When the process is what is to be learned, substitution is likely to be detrimental because it allows a learner to skip needed practice. In contrast, augmentation often does the opposite: a busywork activity or some incapability is taken care of, allowing more important skills to become the focus.

Here are the types of tools I see as supporting individual writing processes. 

Planning – Organization and Research

Prewriting involves developing a plan for what you want to get down on paper (or screen in this case). A writer can go about the two subprocesses of research and organization in different ways. You can think or learn about a topic (research) and then organize these ideas in some way to present them. Or, you can generate a structure for your ideas (organization) and then research the topics to come up with the specifics to be included in a presentation. Again, these are likely iterative processes no matter which subskill goes first.

One thing AI does very well is to propose an outline if you are able to generate a prompt describing your goals. You could then simply ask the AI service to generate something based on this outline, but this would defeat the entire purpose of learning about the topic by doing the research to translate the outline into a product or developing writing skills by expanding the outline into a narrative yourself.

Since I am writing about how AI might perform some of the subskills identified by the writing process model, I asked ChatGPT to create an outline using the following prompt. 

“Write an outline for ways in which ai can be used in writing. Base this outline on the writing subprocesses of the writing process model and include examples of AI services for the recommended activity for each outline entry.”

The following shows part of the outline ChatGPT generated. I tend to trust ChatGPT when it comes to well-established content, and I found the outline, although a little different from the graphic I provided above, to be quite credible and to offer reasonable suggestions. As a guide for writing on the topic I described, it would work well.

I had read that AI services could generate concept maps, which would offer a somewhat different way to identify topics that might be included in a written product. I tried this several times using a variety of prompts with ChatGPT’s DALL-E. The service did generate a concept map, but despite making several follow-up requests, which ChatGPT acknowledged, I could not get the map to contain intelligible concept labels. Not helpful.

Translation

Tools for improving the translation process have existed in some form for a long time. The newest versions are quite sophisticated in providing feedback beyond basic spelling and grammatical errors. I write in Google docs and make use of the Grammarly extension.

I should note that Grammarly is adding AI features that will generate text. Within the perspective I am taking here I have some concerns about these additions. Since I am suggesting that writing subskills can be replaced or supported, student access to Grammarly could allow writing subskills the educator was intending students to perform themselves to be performed to some degree by the AI. 

If you have not tried Grammarly, the tool proposes different types of modifications the writer might consider (spelling, missing or incorrect punctuation, alternate wording, etc.) and will make these modifications if the writer accepts the suggestion. The different types of recommendations are color-coded (see following image).

Revision

I am differentiating changes made while translating (editing) from changes made after translating (revision). By this distinction, minor changes such as spelling and grammar would more often be fixed as edits, and major modifications (addition of examples, restructuring of sections, deletion of sections, etc.) would be made while revising. Obviously, this is a simplistic differentiation, and both types of changes occur during both stages.

I don’t know if I can confidently recommend a role for AI for this stage. Pre-AI, one might recommend that a writer share their work with a colleague and ask for suggestions. The AI version of Grammarly seems to be moving toward such capabilities. Already, a writer can ask AI to do things like shorten a document or generate a different version of a document. I might explore such capabilities out of curiosity and perhaps to see how modifications differ from my original creations, but for work that is to be submitted for evaluation of writing skill would that be something an educator would recommend? 

I have also asked an AI tool to provide an outline, identify main ideas, or generate a summary of a document I have written just to see what it generates. Does the response to one of these requests surprise me in some way? Sometimes. I might add headings and subheadings to make explicit a structure that was apparently not as obvious as I had assumed.

Conclusion:

My general point in this post is that the question of whether learners can use AI tools when assigned writing tasks should be considered in a more nuanced way. Rather than the answer being a simple yes or no, I am recommending that because learning to write and writing to learn are based on subprocesses, the AI tool question should be considered in light of whether the learner was expected to be developing proficiency in executing a given subprocess. In addition, it might be important to suggest that learning how to use AI tools could be a secondary goal.

Subprocesses here were identified based on the Writing Process Model, and a couple of suggestions were provided to illustrate what I mean by using a tool to drastically reduce the demands of one of the subprocesses. There are plenty of tools out there I have not discussed, and my intention was to use these examples to get you thinking about this way of developing writing skills.

References:

Bereiter, C., & Scardamalia, M. (1987). An attainable version of high literacy: Approaches to teaching higher-order skills in reading and writing. Curriculum Inquiry, 17(1), 9-30.

Flower, L., & Hayes, J. R. (1981). A cognitive process theory of writing. College Composition and Communication, 32(4), 365-387.

Gillespie, A., Graham, S., Kiuhara, S., & Hebert, M. (2014). High school teachers’ use of writing to support students’ learning: A national survey. Reading and Writing, 27, 1043-1072.


Obsidian/Smart Connections Workflow

I use Obsidian plus the plugin Smart Connections to inform my blog writing activities. I write for educational practitioners and academics, so I try to carefully base my content on sources that I have read and, in many cases, intend to cite in the content I generate. With this goal, Obsidian represents an archive I have developed over several years to store and organize notes from hundreds of books, journal articles, and websites. I explore my collection in different ways, sometimes seeking notes on a specific article I want to emphasize and sometimes looking for relevant material I have read but perhaps do not recall at the time.

In some cases, I want to use an AI tool to support my writing. I seldom use AI to actually generate the final version of content I post, but I may explore the possible organization of material for something I want to write or I might use an AI tool to generate an example of how I might explain something based on the notes I have made available to the AI tool. 

The combination of Obsidian augmented by the Smart Connections plugin allows me to implement a workflow I have found useful and efficient. I have several specific expectations of this system:

  1. I have already read the source material and taken some notes or generated some highlights now stored in Obsidian. I want to write based on this content.
  2. I may not recall relevant sources I have stored in Obsidian because of the passage of time and the accumulation of a large amount of material. I want the AI system to understand my goals and locate relevant content. 
  3. I want the AI system to identify specific sources from the content I have reviewed rather than the large body used to train the LLM. I want the system to identify the specific source(s) from this material associated with specific suggestions so that I am aware of the source and can cite a source if necessary.
  4. When a specific source has been identified, I want to be able to go directly to the original document and to the location within that document of the note or highlight that prompted the inclusion in the AI content, so that I can reread the context for that note or highlight.

Obsidian with the Smart Connections plugin does these things and is to some extent unique because all of the material (the original content) is stored locally (actually within iCloud, which functions as an external hard drive), allowing the maintenance of functioning links between the output from Smart Connections, the notes/highlights stored in Obsidian, and the original documents (pdfs of journal articles, Kindle books, web pages).

I do not know for certain that the Obsidian-based approach I describe is the only way to accomplish what I am after. I am guessing my approach works in part because I am not relying on an online service and online storage. I also use Mem.ai because it allows me to focus on my own content, but linking back to source documents does not work with this service. Mem.ai does include the AI capabilities as part of the subscription fee, but I don’t know whether this might be an advantage. The Smart Connections plugin does require the use of the OpenAI API (ChatGPT), and there is a fee for this access.

Example:

Here is an example of what working with the Obsidian/Smart Connections setup is like. I am working on a commentary on the advantages and disadvantages of K12 students having access to AI in learning to write and writing to learn. I propose that writing involves multiple subprocesses and that it is important to consider how AI might relate to each of these subprocesses. My list of subprocesses is based on the classic Flower and Hayes Writing Process Model. I had written a description of the Writing Process Model for a book, and this section of content was stored within Obsidian along with notes from multiple sources on AI advantages and disadvantages in the development of writing skills. I have not seen the writing process model combined with ideas about the advantages and disadvantages of AI, so this is the basis for what I think is an original contribution.

The following is a screenshot of Obsidian. The Smart Connection appears as a panel on the right side of the display. The left-hand panel provides a hierarchical organization of note titles and the middle panel provides access to an active note or a blank space for writing a new note. 

In the bottom textbox of the Smart Connections panel, I have entered the following prompt:

Using my notes, how might AI capabilities be used to improve writer functioning in the different processes identified by the writing process model. When using information from a specific note in your response, include a link to that note. 

Aside from the focus of the output, two other inclusions are important. First, there is the request to “use my notes”. This addition is recommended to ensure a RAG (retrieval augmented generation) approach. In other words, it asks the AI service to use my notes rather than the general knowledge of the AI system as the basis for the output. The second supplemental inclusion is the request to include a link to each note, which is intended to do just what it says: add links I can use to see where ideas in the output came from.

The output from Smart Connections is in markdown. I copied this output into a new blank note and the links included are now active.

I purposefully selected a note that initially was part of a web page for this final display. I had originally used a tool that allowed the annotation of web pages and then the exporting of the annotated and highlighted content as a markdown file I added to Obsidian. This file included the link from the note file back to the online source. As you can see, the link from Obsidian brought up the web page and, with the assistance of the service activated as an extension in my browser, displayed what I had highlighted within this web page. Interesting and useful.

Conclusion:

We all have unique workflows and use digital tools in different ways because of differences in what we are trying to accomplish. What I describe in this post is an approach I have found useful and I have included related comments on why. I hope you find pieces of this you might apply yourself.


AI tutoring based on retrieval augmented generation

Here is a new phrase to add to your repertoire: retrieval augmented generation (RAG). I think it is the term I should have been using to explain my emphasis in past posts on focusing AI on notes I had written or content I had selected. Aside from my own applications, the role I envisioned for retrieval augmented generation is as an educational tutor or study buddy.

RAG works in two stages. The system first retrieves information from a designated source and then uses generative AI to take some requested action using this retrieved information. So, as I understand the important difference, you can interact with a large language model based on the massive corpus of content on which that model was trained, or you can designate specific content to which the generative capabilities of that model will be applied. I don’t pretend to understand the specifics, but this description at least captures the distinction. Among the benefits is a reduction in the frequency of hallucinations. When I propose using AI tools in a tutoring relationship with a student, suggesting to the tool that you want to focus on specific information sources seems a reasonable approximation to some of the benefits a tutor brings.
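For readers who want a more concrete picture of the two stages, here is a stripped-down sketch of what a RAG pipeline can look like. It is not how NotebookLM, Mem.ai, or Smart Connections actually implement the idea; the embedding model, similarity measure, and client calls are assumptions I am making for illustration.

```python
# Minimal two-stage RAG sketch: (1) retrieve the notes most similar to the question,
# (2) generate an answer from only those notes. Assumes the OpenAI Python client;
# the model names are illustrative.
from openai import OpenAI

client = OpenAI()

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in resp.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

def answer_from_notes(question, notes, top_k=3):
    # Stage 1: retrieval - rank the designated notes by similarity to the question.
    note_vectors = embed(notes)
    question_vector = embed([question])[0]
    ranked = sorted(zip(notes, note_vectors),
                    key=lambda pair: cosine(question_vector, pair[1]),
                    reverse=True)
    context = "\n\n".join(note for note, _ in ranked[:top_k])

    # Stage 2: generation - answer using only the retrieved notes as context.
    prompt = (f"Using only the notes below, answer the question.\n\n"
              f"Notes:\n{context}\n\nQuestion: {question}")
    reply = client.chat.completions.create(model="gpt-4o",
                                           messages=[{"role": "user", "content": prompt}])
    return reply.choices[0].message.content
```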

I have tried to describe what this might look like in previous posts, but it occurred to me that I should just record a video of the experience so those with little experience might see for themselves how this works. I found trying to generate this video an interesting personal experience. It is not like other tutorials you might create in that it is not possible to carefully orchestrate what you present. What the AI tool does cannot be perfectly predicted. However, trying to capture the experience as it actually happens seems more honest.

A little background. The tool I am using in the video is Mem.ai. I have used Mem.ai for some time to collect notes on what I read, so I have a large collection of content I can ask the RAG capabilities of this tool to draw on. To provide a reasonable comparison to how a student would study course content, I draw some parallels based on the use of note tags and note titles. Instead of using titles and tags in the way I do, I propose a student would likely take course notes and tag those relevant to the next exam with something like “Psych1” to indicate a note taken during the portion of a specific course covered by the first exam. I hope the parallels I explain make sense.


Content focused AI for tutoring

My explorations of AI use to this point have resulted in a focus on two applications – AI as tutor and AI as tool for note exploration. Both uses are based on the ability to focus on information sources I designate rather than allowing the AI service to rely on its own body of information. I see the use of AI to interact with the body of notes I have created as a way to inform my writing. My interest in AI tutoring is more related to imagining how AI could be useful to individual students as they study assigned content.

I have found that I must use different AI services for these different interests. The reason for this differentiation is that two of the most popular services (NotebookLM and OpenAI’s Custom GPTs) limit the number of inputs that can be accessed. I had hoped that I could point these services at a folder of notes (e.g., Obsidian files) and then interact with this body of content. However, both services presently allow only a small number of individual files (10 and perhaps 20) to be designated as source material. This is not about the amount of content, as the focus of this post involves using these two services to interact with a single file of 27,000 words. I assume in a year the number of files will be less of an issue.
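One workaround, while these limits last, is to combine a folder of notes into a single document before uploading. A small script along the lines below could do this for an Obsidian vault; the folder path and heading convention are hypothetical and not tied to either service’s requirements.

```python
# Workaround sketch: merge a folder of Obsidian markdown notes into one file that can
# be uploaded as a single source. The path and labeling convention are illustrative.
from pathlib import Path

notes_folder = Path("~/Obsidian/Vault/Readings").expanduser()  # hypothetical location
combined = []

for note in sorted(notes_folder.glob("*.md")):
    combined.append(f"## Source note: {note.stem}\n\n{note.read_text(encoding='utf-8')}")

Path("combined_notes.md").write_text("\n\n---\n\n".join(combined), encoding="utf-8")
print(f"Merged {len(combined)} notes into combined_notes.md")
```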

So, this post will explore the use of AI as a tutor applied to assigned content, as a secondary or higher ed student might want to do. In practice, what I describe here would require that a student have access to a digital version of assigned content that is not protected in some way. For my explorations, I am using the manuscript of a book I wrote before the material was converted to a Kindle book. I wanted to work with a multi-chapter source of a length students might be assigned.

NotebookLM

NotebookLM is a newly released AI service from Google. AI prompts can be focused on content that is available in Google Drive or uploaded to the service. This service is available at no cost, but it should be understood that this is likely to change when Google is ready to offer a more mature service. Investing time in this service rather than others allows the development of skills and the exploration of potential, but in the long run some costs will be involved.

Once a user opens NotebookLM and creates a notebook (see red box surrounding new notebook), external content to be the focus of user prompts can be added (second image). I linked Notebook to the file I used in preparation for creating a Kindle book. Educators could create a notebook on unprotected content they wanted students to study.

The following image summarizes many of the essential features involved in using NotebookLM. Starting with the right-hand column, the textbox near the bottom (enclosed in a red box) is where prompts are entered. The area above (another red box) provides access to content used by the service in generating the response to a prompt. The large area on the left-hand side displays the context associated with one of the referenced sources, with the specific content used highlighted.

Access to a notebook can be shared and this would be the way an educator would provide students access to a notebook prepared for their use. In the image below, you will note the icon (at the top) used to share content, and when this icon is selected, a textbox for entering emails for individuals (or for a class if already prepared) appears.

Custom GPTs (OpenAI)

Once you have subscribed to the monthly payment plan for ChatGPT-4, accessing the service will bring up a page with the display shown below. The page allows access to ChatGPT and to any custom GPTs you have created. To create a Custom GPT, you select Explore and then select Create a GPT. Describing the process of creating a GPT would require more space than I want to use in this post, but the process might best be described as conversational. You basically interact by describing what you are trying to create, and you upload external resources if you want prompts to be focused on specific content. Book Mentor is the custom GPT I created for this demonstration.

Once created, a GPT is used very much in the same way a NotebookLM notebook is used. You use the prompt box to interact with the content associated with that GPT.

What follows are some samples of my interactions with the content. You should be able to see the prompt (Why is the word layering used to describe what the designer does to add value to an information source?)

Prompts can generate all kinds of ways of interacting (see a section below that describes what some of these interactions might be). One type I think has value in using AI as a tutor is to have the service ask you a question. An example of this approach is displayed in the following two images. The first image shows a request for the service to generate a multiple-choice question about generative activity, to which I then respond (correctly) and receive feedback. The second image shows the flexibility of the AI. When responding to the question, I thought a couple of the responses could be correct. After I answered the question and received feedback, I asked about an answer I did not select, wondering why this option could not also be considered correct. As you see in the AI reply, the system understands my issue and acknowledges how it might be correct. This seems very impressive to me and demonstrates that interaction with the AI system allows opportunities that go beyond self-questioning.

Using AI as tutor

I have written previously about the potential of AI services to interact with learners to mimic some of the ways a tutor might work with a learner. I make no claims of equivalence here. I am proposing only that tutors are often not available and an AI system can challenge a learner in many ways that are similar to what a human tutor would do. 

Here are some specific suggestions for how AI can be used in the role of tutor:

Summary

This post describes two systems now available that allow learners to work with assigned content in ways that mimic how a tutor might work with a student. Both systems would allow a designer to create a tool focused on specific content that can be shared. ChatGPT custom GPTs require that those using a shared GPT have an active $20 per month account, which probably means this approach would not presently be feasible for common application. Google’s notebooks can be created at no cost to the designer or user, but this will likely change when Google decides the service is beyond the experimental stage. Perhaps the capability will be included in existing services designed for educational situations.

While I recognize that cost is a significant issue, my intent here is to propose services that can be explored as proof of concept so that educators interested in AI opportunities might explore future productive classroom applications of AI.


Turning AI on my content

In reviewing the various ways I might use AI, I am starting to see a pattern. There are uses others are excited about that are not relevant to my life. There are possible uses that are relevant, but I prefer to continue doing these things myself because I either enjoy the activity or feel there is some personal benefit beyond the completion of a given project. Finally, there are some tasks for which AI serves a role that augments my capabilities and improves the quality or quantity of projects I am working on. 

At this time, the most beneficial way I use AI is to engage an AI tool in discussing a body of content I have curated or created as notes and highlights in service of a writing project I have taken on. There are two capabilities here that are important. First, I value the language skills of an AI service, but I want the service to use this capability only as a way to communicate with me about the content I designate. I am not certain I know exactly what this means, as it would be similar to saying to an expert with whom I was interacting: tell me about these specific sources without adding in ideas from sources I have not asked you to explore. Use your general background, but use this background only as a way to explain what these specific sources are proposing. What I mean is don’t add in material to address my prompt that does not exist within the sources I gave you.

Second, if I ask an AI service about the content I have provided, I want the service to be able to identify the source and possibly the specific material within a source that was the basis for a given position taken. Think of this expectation as similar to the expectation one might have in reading a scientific article to which the author provides citations for specific claims made. My desire here is to be able to evaluate such claims myself. I have a concern in simply basing a claim on the language of sources not knowing the methodology responsible for producing data used as a basis for a claim. For serious work, you need to read more than the abstract. Requiring a precise methodology section in research papers is important because the methodology establishes the context responsible for the generation of the data and ultimately the conclusions that are reached. Especially in situations in which I disagree with such conclusions, I often wonder if the methodology applied may explain the differences between my expectations and the conclusions reached by the author. Human behavior is complex and variables that influence behavior are hardly ever completely accounted for in research. Researchers do not really lie with statistics, but they can mislead by broad conclusions they share based on a less-than-perfect research method. There are no perfect research methods hence the constant suggestion that more research is needed. 

Several services approximate the characteristics I am looking for, and I will identify three of them. I had hoped to add a fourth and intended to subscribe to the new OpenAI applications recently announced, but new signups for the $20 a month subscription necessary to use these functions were recently suspended, so I will have to wait to explore these functions until OpenAI decides to expand the user base.

The three services I have worked with include NotebookLM, Mem.ai, and Smart Connections with Obsidian. I have written about Mem.ai and Smart Connections in previous posts, so I will use NotebookLM for extended comments and then offer quick examples produced by the other two services.

NotebookLM

NotebookLM recently was made available to me so I have had less experience using this tool than the other two I have included. 

At this time you can create a notebook based on 5 pdfs or 10 text segments. There is a limit to how much content these units of information can total, but the pdfs I used were research articles so this is a lot of content. In the following image, icons for the 5 pdfs appear in the column on the left of the screen. All of the pdfs have been selected for the interaction I intend with the AI tool and a check beside each pdf indicates this is the case. The window for submitting my prompts appears at the bottom of the image.

The second image shows the results from the prompt I have submitted (I will explain my interest in interacting with these pdfs following this description). The response to this query is provided above the prompt window (I will share the text) and below this window you will see that the summary was based on 10 “citations”. Selecting one of these citations brings up the content that now appears in the left-hand column. This window shows the content of the pdf and some summary comments about that document generated by the AI. The AI also proposes additional prompts a user might want to apply to that content. 

The task I asked all three services to address was related to a recent post I wrote describing generative cognitive tasks and comparisons among the impacts of these generative activities. The pdfs were research studies I had reviewed and were the basis for that post. I had detailed familiarity with this content from reviewing and annotating the material and writing the previous post, and I use this approach of first engaging a process without AI to provide a basis for commenting on what the AI provides. The prompt that follows is what I submitted to NotebookLM to see what would be generated. I will use a similar prompt with the other two AI services.

Prompt: Use only my notes, generate a 400 word summary of generative learning activities and include what is known about the relative effectiveness of different activities.

NoteBook response

Generative learning is a type of learning in which students are actively engaged in creating new knowledge by interacting with the material. This is in contrast to passive learning, in which students simply listen to a lecture or read a textbook.

Generative learning activities are designed to encourage students to think critically about the material and to make connections between new information and their existing knowledge.

There is a significant body of research evidence that supports the effectiveness of generative learning activities. A meta-analysis of 500 studies found that generative learning activities resulted in better learning outcomes than traditional, passive learning methods (Coffey et al., 2003).

Some of the most effective generative learning activities include:

  • Summarizing: Students are asked to summarize a text in their own words. This activity helps students to identify the main ideas of the text and to understand the relationships between them.
  • Questioning: Students are asked to generate questions about a text. This activity helps students to think critically about the text and to engage with the material in a deeper way.
  • Discussion: Students are asked to discuss a text with their classmates. This activity helps students to share their understanding of the text and to learn from each other.

The relative effectiveness of different generative learning activities depends on the specific learning goals. For example, summarizing is a good activity for helping students to remember information, while questioning is a good activity for helping students to understand the material in a deeper way.

Generative learning activities can be used in a variety of settings, including K-12 classrooms, higher education, and corporate training. They are a valuable tool for helping students to learn and remember new information in a more effective way.

***

You will note in this material that NoteBook included a reference. This was not one of the pdfs I provided for review so I thought it might have been included in the reference section of one of the pdfs. I asked Notebook to provide a full citation and it generated the following. 

Coffey, J. E., & McDonald, J. (2003). The effects of generative learning activities on student achievement: A meta-analysis. Contemporary Educational Psychology, 28(4), 240-263. doi:10.1016/j.cedpsych.2003.09.001

While this citation looks authentic and has the proper APA format, it is made up. I tried searching Google Scholar and used my library online access to search the location in Contemporary Educational Psychology and no such article exists. Google asks for user feedback and I provided the fake citation in my evaluation. 

***

Mem.AI

My use of Mem.ai and Smart Connections requires a slightly different approach. The content stored in these services results from my exporting notes and highlights from the research articles I read. Both services are directed at my entire collection of notes, so this is different from what NotebookLM could access. However, the citations indicated that the output was based on very similar source material.

Prompt: Use only my notes, generate a 400 word summary of generative learning activities and include what is known about the relative effectiveness of different activities.

Mem.ai response

Generative learning is a process where learners actively engage with material, creating new knowledge and connections. This approach involves strategies such as summarization, teaching, self-testing, and self-explaining, which require learners to select and organize relevant information, elaborate on the material based on personal knowledge, and integrate new information with existing knowledge 1^.

Fiorella and Mayer (2016) identified eight strategies that promote generative learning, including summarization, mapping, drawing, imagining, self-testing, self-explaining, teaching, and enacting. These strategies help learners to create a product based on existing knowledge 1^.

Brod (2021) focused on age differences in generative learning strategies and emphasized the importance of producing a meaningful product that goes beyond simple paraphrasing or retelling. For instance, highlighting does not result in new content, but summarization does. Brod’s review suggests that summarizing becomes useful from high school onwards, while testing is beneficial for all ages 2^.

Chi’s (2009) framework differentiates learning activities into active, constructive, and interactive categories. Active engagement with the material can activate existing knowledge and store new input, while constructive activities involve creating a product based on existing knowledge. Interactive activities involve engaging with others to produce a product that integrates external input with personal knowledge 1^.

Fiorella (2023) proposed a new framework of sense-making that distinguishes among three sense-making modes—explaining, visualizing, and enacting—that serve unique and complementary cognitive functions. The explaining mode generalizes knowledge, the visualizing mode organizes knowledge, and the enacting mode simulates knowledge 3^.

In conclusion, generative learning involves actively engaging with the material to create new connections and deeper understanding. The effectiveness of different generative learning activities can vary based on factors such as age and the specific learning context. However, strategies such as summarization, teaching, self-testing, and notetaking with revision and elaboration can promote generative learning and enhance understanding of the material.

Summarize my notes on generative learning Tue 4/18

Brod’s generative Fri 4/14

Fiorellaet2023 Fri 10/13

1^: Summarize my notes on generative learning

2^: Brod’s generative

3^: [Fiorellaet2023](https://mem.ai/m/cqo3CLWb5tkb8syut

At the end of each paragraph in this output, you see a number. The number corresponds to a citation at the bottom. Above the three links that take a user to each note, you see the names of the notes. So, this output is carefully linked to sources. I will provide an example of what my notes look like in association with the output from Smart Connections, but each note contains a citation of the article from which I created that note.

***

Smart Connections (within Obsidian)

Obsidian contains my notes from research articles I have read and also highlights from other sources provided to Obsidian through Readwise. I tell you this again because I do not have a comparison among AI services based on exactly the same base of resources. However, again, you will note similar sources are found. Note also that the prompt is a little different to get Smart Connections to include citations. 

Prompt: Use only my notes, generate a 400 word summary of generative learning activities and include what is known about the relative effectiveness of different activities. Include links to the individual source notes within your summary

Response:

Based on your notes, generative learning involves actively making sense of new information by mentally reorganizing and integrating it with one’s prior knowledge (Fiorella and Mayer, 2016). It is a constructive process that requires learners to select important information, reorganize it, and relate it to what they already know (Fiorellaetal2016). 

Fiorella and Mayer (2016) identified eight strategies that promote generative learning: summarizing, mapping, drawing, imagining, self-testing, self-explaining, teaching, and enacting. Summarizing involves selecting the main ideas, organizing them into a logical structure, and integrating new information with prior knowledge (Fiorellaetal2016). It is more effective for relatively simple concepts and requires training to use effectively (Fiorellaetal2016). 

Mapping and drawing also involve transforming information into different forms of representation (Fiorellaetal2016). These strategies help learners build connections and organize the material (Fiorellaetal2016). 

Self-testing, or retrieval-based learning, requires learners to actively retrieve information from memory without access to the provided materials (Fiorellaetal2016). It strengthens memory and makes it more accessible in the future (EndresHigh). 

Self-explaining involves explaining the content of a lesson to oneself during learning (Fiorellaetal2016). It promotes meaningful learning outcomes when learners engage in reflective knowledge building (Fiorellaetal2016). 

Teaching, on the other hand, involves selecting relevant information, organizing it into a coherent structure, and elaborating on it by incorporating existing knowledge (Fiorellaetal2016). It sets itself apart from other generative strategies by focusing on helping others learn (Fiorellaetal2016). 

Based on the research by Fiorella and Mayer (2016), it is more appropriate to consider each generative strategy as suitable for certain types of learning and learning contexts rather than regarding one strategy as the most effective (Fiorellaetal2016). 

In a study by Koh, Lee, and Lim (2018), the learning benefits of teaching were investigated. The results showed that teaching improved learning outcomes when it involved retrieval practice, but not when learners used a prepared script (Generative > TeachingAsRetrieval). This supports the idea that retrieval practice is an effective generative learning activity. 

Overall, generative learning activities have been shown to be effective in promoting meaningful learning outcomes. However, the effectiveness of each activity may vary depending on the complexity of the material and the training provided to learners. It is important to consider the specific learning goals and contexts when selecting and implementing generative learning strategies (Fiorellaetal2016). 

Based on your notes, here are the individual notes that provide more information on each generative learning activity:

– [[Generate1]]: Provides an overview of generative learning theory and the eight strategies identified by Fiorella and Mayer (2016).

– [[Summarizing]]: Describes the summarizing strategy and its effectiveness in selecting main ideas and creating quality summaries.

– [[Teaching]]: Discusses the goal of teaching and its distinction from other generative strategies.

– [[Self-Testing]]: Explains the self-testing strategy and its role in retrieval-based learning.

– [[Promote generative learning]]: Provides a link to the article by Fiorella and Mayer (2016) that identifies the eight strategies and discusses their effectiveness.

– [[Fiorellaetal2016]]: Summarizes the key points from Fiorella and Mayer’s (2016) article on generative learning strategies.

– [[EndresHigh]]: Discusses the importance of task differences in retrieval practice and the effects of different types of recall tasks.

– [[Yang2021]]: Highlights a meta-analysis on the effectiveness of retrieval practice in classroom learning and raises questions about the mechanism and student perception of its benefits.

***

Note: The output of Smart Connections is in markdown, and the terms included in double brackets are links that connect to the source note. So, if the summary above was opened in an app that interprets markdown, the text within the double brackets would appear as a link, and the link would take me to a file stored on my computer. The file is named Generate1.

Here is an example of one of the original notes that was identified as source material. 

Generative learning makes sense of new information by reorganizing it and relating it to existing knowledge. This position comes from Wittrock, but is similar to other theorists (Mayer, Piaget). This specific article identified eight learning strategies that promote generative learning and provides a review of research relevant to each strategy.

[[Summarizing]]

Mapping

Drawing

Imagining

[[Self-Testing]]

Self-Explaining

[[Teaching]]

Enacting

The first four strategies (summarizing, mapping, drawing, and imagining) involve changing the input into a different form of representation.

The final four strategies (self-testing, self-explaining, teaching, and answering practice questions) require additional elaboration. 

Fiorella, L., & Mayer, R. E. (2016). Eight ways to promote generative learning. Educational Psychology Review, 28(4), 717-741.

***

Summary

Keeping in mind that the AI of the three services was applied to slightly different content, I would argue that Smart Connections and Mem.ai are presently more advanced than NotebookLM. Eventually, I assume a user will be able to direct NotebookLM at a folder of files so the volume of content would be identical. Google does acknowledge that NotebookLM is still in the early stages and that access is limited to a small number of individuals willing to test and provide feedback. The content generated by all of the services was reasonable, but NotebookLM did hallucinate a reference.

My experience in comparing services indicates it is worth trying several in the completion of a given task. I have found it productive to keep both Smart Connections and Mem.ai around as the one I find most useful seems to vary. I do pay to use both services.
