It’s easy to get swept up in the hype about artificial intelligence tutors. But the evidence so far suggests caution.
Some studies have found that chatbot tutors can backfire: students lean on them too heavily, get spoon-fed solutions and fail to absorb the material. Even when AI tutors are designed not to give away answers, they haven’t consistently produced better results than learning the old-fashioned way, without AI.
Even so, the researchers behind these skeptical studies haven’t given up hope. Some are still experimenting, trying to build better AI tutors.
One promising idea has less to do with how an AI tutor explains concepts and more to do with what it asks students to practice next.
A team at the University of Pennsylvania, which included some AI skeptics, recently tested this approach in a study of close to 800 Taiwanese high school students learning Python programming. All the students used the same AI tutor, which was designed not to give away answers.
But there was one key difference. Half the students were randomly assigned to a fixed sequence of practice problems, progressing from easy to hard. The other half received a personalized sequence, with the AI tutor continuously adjusting the difficulty of each problem based on how the student was performing and interacting with the chatbot.
The idea is based on what educators call the “zone of proximal development.” When problems are too easy, students get bored. When they’re too hard, students get frustrated. The goal is to keep students in a sweet spot: challenged, but not overwhelmed.
The researchers found that students in the personalized group did better on a final exam than students in the fixed-sequence group. The difference was characterized as the equivalent of 6 to 9 months of additional schooling, an eye-catching claim for an after-school online course that lasted only five months. The AI tutor’s inventor, Angel Chung, a doctoral student at the Wharton School, acknowledged that her conversion of statistical units was “not a perfect estimate.” (A draft paper about the experiment was posted online in March 2026, but has not yet been published in a peer-reviewed journal.)
Still, this is early evidence that small tweaks — in this case, calibrating the difficulty of the practice problems to the student — can make a difference.
Chung said that ChatGPT’s responses may already feel very personal because they are directly responding to a student’s unique questions. But that level of personalization isn’t enough. “Students usually don’t know what they don’t know,” said Chung. “The student doesn’t have the ability to ask the right questions to get the best tutoring.”
To address this, Chung’s team combined a large language model with a separate machine-learning algorithm that analyzes how students interact with the online course platform — how they answer the practice questions, how many times they revise or edit their coding, and the quality of their conversations with the chatbot — and uses that information to decide which problem to serve up next.
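The article doesn’t detail the team’s actual algorithm, but the basic mechanism it describes — reading interaction signals and nudging problem difficulty to keep a student in the zone of proximal development — can be sketched in a few lines. The signal names, thresholds and difficulty scale below are all illustrative assumptions, not the Penn team’s system:

```python
# Illustrative sketch only: pick the next problem difficulty from simple
# interaction signals, aiming to keep the student challenged but not
# overwhelmed. All thresholds and field names are hypothetical.
from dataclasses import dataclass


@dataclass
class Signals:
    correct_rate: float  # fraction of recent practice answers correct (0 to 1)
    revisions: int       # times the student revised their code on the last problem
    chat_depth: int      # number of back-and-forth exchanges with the chatbot


def next_difficulty(current: int, s: Signals, lo: int = 1, hi: int = 10) -> int:
    """Nudge difficulty up when the student is cruising, down when struggling."""
    if s.correct_rate > 0.8 and s.revisions <= 1:
        step = 1    # too easy: raise the challenge
    elif s.correct_rate < 0.4 or s.revisions > 5:
        step = -1   # too hard: ease off
    else:
        step = 0    # in the sweet spot: hold steady
    # Heavy reliance on the chatbot suggests the material is still shaky,
    # so don't raise difficulty yet.
    if s.chat_depth > 8 and step > 0:
        step = 0
    return max(lo, min(hi, current + step))
```

A real system would replace these hand-set thresholds with a trained model, but the control loop is the same: observe how the student practices, then choose what they practice next.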
[Chart: How different students interact with the chatbot tutor]
In other words, personalization isn’t just about tailoring explanations. It’s about tailoring the learning path itself.
That idea isn’t new.
Long before generative AI tools like ChatGPT were invented, education researchers developed “intelligent tutoring systems” that tried to do something similar: estimate what a student knew and deliver the right next problem. These earlier systems couldn’t produce natural conversations, but they could provide hints and instant feedback. Rigorous studies found that well-designed versions helped students learn significantly more.
Their Achilles’ heel was engagement. Many students simply didn’t want to use them.
Today’s AI tools could help address that problem. Students might feel more interested in a chatbot that converses with them in an almost human way.
In the University of Pennsylvania study, students in the personalized group spent more time practicing: about three additional minutes per problem, which added up to roughly an hour per module of the Python course, compared with a half hour or less for students in the fixed-sequence group. The researchers think these students did better because they were more engaged in their practice work.
Students’ previous knowledge of a subject affected how well the personalized sequencing worked. Students who were new to Python gained more than those who already had Python experience, who did just as well with the fixed sequence of practice problems. Students from less elite high schools also appeared to benefit more.
[Chart: How students’ background affected results]
All the Taiwanese students in this study volunteered for an optional computer programming course that could strengthen their college applications. Many were highly motivated, had well-educated parents and already had coding experience.
It’s not clear whether the chatbot would work as well with less motivated students who are behind at school and most in need of extra help.
One possible solution: fusing new and old.
Ken Koedinger, a professor at Carnegie Mellon University and a pioneer of intelligent tutoring systems, is experimenting with using new AI models to alert remote human tutors who can motivate struggling students who are drifting off. “We are having more success,” said Koedinger.
Humans aren’t obsolete — yet.
Contact staff writer Jill Barshay at 212-678-3595, jillbarshay.35 on Signal, or barshay@hechingerreport.org.
This story about AI tutors was produced by The Hechinger Report, a nonprofit, independent news organization that covers education. Sign up for Proof Points and other Hechinger newsletters.
This <a target="_blank" href="https://hechingerreport.org/proof-points-ai-tutor-python/">article</a> first appeared on <a target="_blank" href="https://hechingerreport.org">The Hechinger Report</a> and is republished here under a <a target="_blank" href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License</a>.
