AI as Answer Machine? Or Wordsmith? Or Something Else?

Last month on the blog, I floated the idea of thinking about generative AI tools like ChatGPT, Bing, and Bard as technologies for learning. I drew an analogy to the digital camera I used as I learned photography. As I was developing a conceptual understanding of photography, I was also learning how to use the tool in my hands, with all its knobs and buttons, and these two kinds of learning informed and supported each other. Could we use generative AI tools in a similar fashion, teaching students to use the tools well as part of learning other skills, like writing and researching? If so, it might help to know what kind of tools we’re dealing with when we talk about ChatGPT and other AI text generation tools.

For instance, I saw this post by Simon Harris go around Mastodon a couple of weeks ago:

15yo: I use ChatGPT quite a bit.
Me: Oh?
15yo: Today, I asked it to give me the last digit of Pi.
Me: 🤔 but there is no…
15yo: …exactly 😁
Me: 😂
15yo: Apparently, it’s 4 😂

This is funny because ChatGPT is so, so wrong about pi. As an irrational number, pi has a nonterminating decimal representation; there’s no last digit of pi. The joke, however, relies on an assumption that ChatGPT is supposed to be some kind of answer machine, a tool for answering questions correctly. ChatGPT is famously not that. It’s well known for “hallucinating” facts and inventing citations. There are, in fact, a lot of educators who worry that students will rely too heavily on ChatGPT and similar tools because these tools produce text that is factually incorrect. For example, check out this YouTube video of a chemistry instructor evaluating the accuracy of ChatGPT’s answers to various organic chemistry questions. Spoiler: It didn’t do that great.

A vintage "Underwood" manual typewriter seen in a window displaySo if these tools aren’t answer machines, what are they? I would argue that ChatGPT and similar tools are actually wordsmiths. They are powered by large language models that mix words together to create fluent prose. Ask it to draft an essay for you on a given topic, and it will put words together to do just that. Ask it to revise that draft based on your feedback, and it will do that, too. See, for instance, how Ethan Mollick used Bing to help him craft the final paragraph of his April 26th newsletter. Ask it to summarize a long text, and, because it’s good with words, you’ll get a pretty decent summary. It’s a wordsmith that actually doesn’t know what it’s writing but still produces intelligible texts.

This is, in fact, the other reason educators are worried about AI text generators: they can quickly produce acceptable responses to pretty standard essay prompts, and educators worry that students will use these tools to cheat, whether the essay question is of the “writing as assessment,” “writing to learn,” or “learn to write” variety. ChatGPT and its peers are, after all, good with words.

Midjourney-generated image of a team of robots playing basketball

I’m struck by how the initial reporting and framing of these tools affect the ways we think about them. While AI text generators were seen as answer machines (albeit flawed ones), AI image generators have been seen as “smiths” of a different kind, working with pixels to create images. While a tool like ChatGPT making stuff up is seen as a problem, if I ask Midjourney or DALL-E (a couple of popular image generators) to give me a scene of robots playing basketball, no one worries about the result not being real-world realistic.

If AI text generators are tools for learning, as I argued on the blog last month, then I think it’s helpful to think of them as wordsmithing tools. Does a student struggle to come up with counterarguments for their thesis? Ask an AI tool to generate a few. Does a student freeze up when confronted with a blank page? Ask an AI tool to write a first draft. Does a student find it hard to adopt the right voice for their writing? Ask an AI tool to rewrite a draft essay in different voices. These are all wordsmithing tasks that ChatGPT and its ilk can handle pretty well.

A yellow balance bike for kids, with wheels but no pedals, leaning against a wall
“Balance Bike,” Justin Shanks, Flickr (CC)

Please note that I’m not suggesting we shouldn’t help students learn to develop counterarguments or to get past writer’s block or to figure out their writing voice. I am suggesting, however, that AI tools might be used to help students learn these skills, just as my digital camera helped me learn photography. Cynthia Alby uses the metaphor of learning to ride a bicycle with training wheels to talk about how AI tools might provide a kind of scaffolding for students learning writing skills. When I heard that metaphor, I was reminded of a 2012 blog post by Mark Sample, in which he talks about balance bikes as a different way to teach kids to ride bicycles. Learning with training wheels means getting help with balancing while learning to pedal; learning with a balance bike means not worrying about pedaling while learning to balance, which is apparently easier for most kids. We might need to think carefully about where our students need the kind of support that AI tools can provide, and the answer might vary by student.

I’ve been using the training wheels and balance bike metaphors in my recent talks on teaching with AI, and I was all ready to bust out the “AI as wordsmith” argument above at a talk I gave earlier this week. But then I signed up for ChatGPT Plus to see what ChatGPT’s plugins are all about. Back in March, OpenAI, the company behind ChatGPT, announced that the tool would have plugins that allowed it to communicate (in a fashion) with other web tools and websites. It’s taken me three months, but I finally started playing around with these plugins. I quickly realized that some of the advice I had given faculty earlier in the year about designing “ChatGPT-proof” assignments hadn’t been accurate for a while!

For instance, if I ask the free version of ChatGPT, which uses GPT-3.5, to tell me about the May 2023 Supreme Court decision that dealt with Andy Warhol’s art, here’s the response I get:

I apologize for the confusion, but as of my knowledge cutoff in September 2021, I’m not aware of any specific Supreme Court decision in May 2023 related to Andy Warhol’s art. It’s possible that such a decision may have occurred after my knowledge cutoff or it may be a hypothetical scenario that hasn’t actually taken place. I recommend checking reliable news sources or legal references to get the most accurate and up-to-date information on any recent Supreme Court decisions regarding Andy Warhol’s art.

If I think about ChatGPT as an answer machine, I’m not getting the answer I want, but at least the tool is telling me what it doesn’t know. That’s an improvement over how the chatbot presented information at the start of 2023. But watch what happens when I use the same prompt in ChatGPT Plus with the VoxScript plugin enabled:

In May 2023, the U.S. Supreme Court ruled against the Andy Warhol Foundation in a copyright infringement case. The case was centered around a series of silk screen images that Warhol created based on a photograph of the late musician Prince, taken by photographer Lynn Goldsmith in 1981.

The court ruled in a 7-2 vote that Warhol infringed on Goldsmith’s copyright. The case hinged on whether Warhol’s images of Prince transformed Goldsmith’s photograph to a great enough degree to stave off claims of copyright infringement and therefore be considered as fair use under copyright law.

Goldsmith had sued the Andy Warhol Foundation for the Visual Arts (AWF) for copyright infringement after the foundation licensed an image of Warhol’s titled “Orange Prince” based on Goldsmith’s image of the pop artist to Condé Nast in 2016 for use in its publication Vanity Fair…

It goes on for another couple of paragraphs, and it gives me the news source it used to generate this summary of the case, a May 18th NPR report. This version of ChatGPT is indeed acting like an answer machine!
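For the technically curious, here’s a rough sketch of the pattern a plugin like VoxScript seems to follow: retrieve some current text the model’s training data doesn’t contain, then let the language model do the wordsmithing. This is my own simplified illustration in Python, not OpenAI’s actual plugin interface, and the search_news and ask_language_model functions below are hypothetical stand-ins for a web search and a language model call.

# A minimal sketch of the "language model + plugin" pattern. The two helpers are
# stand-ins: a real system would call a search API and a language model API.

def search_news(query):
    # Stand-in for the plugin's retrieval step (e.g., a news or web search).
    return ("Full text of a relevant news article would go here.",
            "https://example.com/warhol-ruling")

def ask_language_model(prompt):
    # Stand-in for the call to the underlying language model.
    return "A fluent summary of the article, with the source cited."

def answer_with_plugin(question):
    # 1. Retrieve current information the model cannot know on its own.
    article_text, source_url = search_news(question)
    # 2. Let the model do what it is good at: wordsmithing a grounded answer.
    prompt = ("Using only the article below, answer the question and cite the source.\n\n"
              "Question: " + question + "\n\n"
              "Article (" + source_url + "):\n" + article_text)
    return ask_language_model(prompt)

print(answer_with_plugin("What did the Supreme Court decide in May 2023 about Andy Warhol's art?"))

Something roughly like that retrieve-then-summarize loop, I’d guess, is what was happening behind the scenes when ChatGPT handed me that NPR-sourced summary.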

Quick sidebar: It is also now possible to share one’s ChatGPT conversations with others. A few months ago, you needed a third-party tool to do so, but now ChatGPT has a share button for every chat. Thus the claim that there’s no “paper trail” for how one uses ChatGPT no longer holds.

Screenshot of the Bing query about the 2023 Andy Warhol Supreme Court case

I used ChatGPT for the above examples, but please remember that ChatGPT isn’t the only game in town. I asked Bing about the Andy Warhol case, and I got a similar summary, this time with three different news sources cited. Google Bard gave me a similar summary and added some text about how fair use works in the United States. Its sources, however, were meeting notes from a 2004 Congressional subcommittee hearing and a Quizlet created for maybe a law course? Your mileage may vary with these tools.

So are these tools now, in fact, answer machines? I’m going to argue that while they can be used in that way, a better way to think of them is as interfaces to other systems. Just as Siri or Alexa act as a voice-based interface to the apps on your digital devices, ChatGPT and Bing and such are text-based interfaces to, for instance, all the transcripts attached to all the YouTube videos. These tools are still wordsmiths, but wordsmiths used as interfaces.

To use a left-field example, the fast food chain Wendy’s recently announced that it was working on an AI chatbot to take drive-thru orders. That’s AI-as-interface, with the chatbot, which is good with words, taking spoken orders for a cheeseburger and a Frosty and turning them into an order in the restaurant’s food production system. For a more academic example, consider Elicit, an AI research assistant that can search its corpus of 115 million academic papers for research that might address your question. Ask it something like “How should I form groups of students when using group work in the college classroom?” and (because it’s good with words) it will return a table of potentially relevant papers, summarized and tagged by intervention, sample size, and outcomes.
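To make the drive-thru example a bit more concrete, here’s a minimal sketch of the AI-as-interface idea: words in, structured order out. Again, this is my own illustration, not Wendy’s actual system, and ask_language_model is the same kind of hypothetical stand-in as in the earlier sketch.

import json

def ask_language_model(prompt):
    # Stand-in for a language model call; a real system would hit an API here.
    return json.dumps({"items": [{"name": "cheeseburger", "quantity": 1},
                                 {"name": "Frosty", "size": "small", "quantity": 1}]})

def take_drive_thru_order(spoken_order):
    # The language model acts as the interface: free-form speech becomes a
    # structured order the restaurant's production system can process.
    prompt = ("Convert this drive-thru order into JSON with an 'items' list, where each "
              "item has a name, a quantity, and (if mentioned) a size.\n\nOrder: " + spoken_order)
    return json.loads(ask_language_model(prompt))

print(take_drive_thru_order("Can I get a cheeseburger and a small Frosty, please?"))

The wordsmithing is still the model’s core job; the interface work is just a matter of asking it to put the words into a shape another system can use.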

AI as answer machine, AI as wordsmith, AI as interface… I’m not sure where we’ll land with how we think about these tools, but I believe these framings are helpful in assessing what the tools can and cannot do and what roles they might play as learning technologies.

One more caveat: Most of what I’ve presented here is focused squarely on AI text generation tools powered by large language models. There are lots of other kinds of AI tools that don’t fall into any of the categories I’ve described here, like Cornell University’s Merlin app, which can identify birds by their songs, or the AI-powered agent that excelled at the strategic negotiation game Diplomacy.
