Evaluating AI Tools

Like so many other educators, I am trying to decide what I think of AI tools. Granted, the tools produce some reasonable prose. However, how accurate is the content generated? Here is an example.

I recently generated a post focused on the AI article summarization capability added to MEMEX Garden. The summary I offered as my example was based on this article describing the new AI capabilities being added to Khan Academy resources. I understand there are several references to AI here, but the focus should be on the summary of the article. The summary appears below.

Khan Academy has launched a limited pilot of a new learning guide called Khanmigo for educators, powered by GPT-4, the next generation of OpenAI’s large language model technology. GPT-4 is the successor to OpenAI’s GPT-3.5, which ChatGPT is built upon. It can accept images as input, handle up to 25,000 words of text at once, and is designed with more safety and alignment features, among other changes.

Khanmigo is designed to help teachers create personalized learning plans for their students. It uses GPT-4 to analyze student data and provide recommendations for activities and resources that will help them learn more effectively. The tool also includes features like progress tracking and assessment tools to help teachers monitor student progress.

The pilot program is currently limited to a small group of educators, but Khan Academy plans to expand it in the future. The company believes that Khanmigo has the potential to revolutionize education by providing teachers with powerful tools to help them personalize learning for each student.

The summary sounds great, but after reading the article I realized I had a very different interpretation of its focus. By my interpretation, the key new capability of Khan Academy’s Khanmigo is to engage interactively with students as a tutor: asking about their understanding and then attempting to guide their thinking with questions that encourage them to think differently about what they are trying to understand. In other words, the most important capability is serving as a tutor to students rather than providing guidance to teachers.

So, compare my interpretation with the second paragraph of the AI-generated summary, the one beginning “Khanmigo is designed to help teachers.” I encourage you to read the original article and see whether you agree with my assessment.

I copied a few of the statements from the original article to show where I got my interpretation.

“In a blog post, Khan Academy Founder Sal Khan wrote: ‘When GPT-4 is carefully adapted to a learning environment like Khan Academy, it has enormous potential. It can guide students as they progress through courses and ask them questions like a tutor would. AI can assist teachers with administrative tasks, which saves them valuable time so they can focus on what’s most important — their students.’”

I think there is a big difference between arguing that a product helps the student and arguing that it helps the teacher. These positions mean very different things to me as someone interested in the history of mastery learning and the role of tutors in that instructional approach. Is this quibbling? If my interpretation is correct, I don’t think the difference is inconsequential.
