This page, and any articles that come from external sources, are opinions. You will find a range of opinions within the group, as well as across articles like these. Discussion is encouraged!
The university has a policy concerning the use of genAI. You should read this and abide by it. University policy supersedes any opinion found on this page.
1 tl;dr: ChatGPT and all its kin are bullshit
genAI tools such as LLMs are statistical models that produce plausible-looking output. They are, however, bullshit.
- Hicks, M.T., Humphries, J. & Slater, J. ChatGPT is bullshit. Ethics Inf Technol 26, 38 (2024). https://doi.org/10.1007/s10676-024-09775-5
I’m not just being rude. Bullshit has a technical definition:
“these programs cannot themselves be concerned with truth, and because they are designed to produce text that looks truth-apt without any actual concern for truth, it seems appropriate to call their outputs bullshit”
As these tools are mathematical models designed to map a prompt onto a statistically plausible output inferred from a corpus of training text or images, they are not concerned with the truth, correctness, or accuracy of any particular answer. It is this disregard for truth and accuracy, and the aim only of producing a plausible answer, that makes ChatGPT and any tool like it a bullshitter, and its answers bullshit.
The bullshit nature of genAI is exemplified by image generation using stable diffusion. This method constructs new images from noise, constrained by your text prompts and its training set.
ChatGPT bullshits. It is the equivalent of the guy in the pub confidently telling you that France is the capital of Italy, that the square of 11 is 1,111, and that the Pyramids were built by aliens.
1.1 How image generation by stable diffusion works
It can seem almost like magic that we can now enter a plain text prompt and receive in seconds an illustration or graphic design representing that text. Often these images are generated by a process called “stable diffusion,” which is explained very well in the slides linked below. The mathematics can look complicated (it is in many ways a complicated process), but the general principles are understandable at a high level.
- Take a whole bunch of images (the training corpus) with text annotations, and use an algorithm to turn these into visual “noise,” which looks like static.
- Use machine learning to learn how to go backwards from the “noisy” image (like static) to an image like the original¹, guided by (conditioned on) the words in the text annotations.
The details are important, but this is essentially the high-level process. When you provide a text prompt to, say, Bing Image Generator, a new “noisy” image that looks like static is generated. Constrained by the training corpus and your text prompt, the tool then constructs a new image by translating “static” into components of images.
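The two bullet points above can be illustrated with a toy numerical sketch. This is not a real diffusion model: there is no neural network, no text conditioning, and `toy_denoise_step` is a hypothetical stand-in for the learned denoiser. It only shows the shape of the process: push the data forward into static, then step repeatedly back toward an image like the original.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(image, t, T):
    """Forward process: blend the image toward pure Gaussian noise.
    At t=0 the image is untouched; at t=T it is pure static."""
    alpha = 1.0 - t / T          # fraction of original signal remaining
    noise = rng.normal(size=image.shape)
    return alpha * image + (1.0 - alpha) * noise

def toy_denoise_step(noisy, guess):
    """Reverse process: nudge the noisy image a little toward a 'guess'
    of the clean image. In a real model that guess comes from a neural
    network conditioned on the text prompt; here it is a placeholder."""
    return 0.9 * noisy + 0.1 * guess

# A tiny 4x4 "image" standing in for a training example.
clean = np.ones((4, 4))

# Forward: turn the image into static.
static = add_noise(clean, t=10, T=10)

# Reverse: repeatedly step back toward an image like the original.
img = static
for _ in range(50):
    img = toy_denoise_step(img, guess=clean)

# After many small steps the result sits close to the clean image.
print(np.abs(img - clean).max())
```

The point of the sketch is the asymmetry: destroying an image with noise is trivial, and all of the learning effort goes into the reverse direction, which is why the output is a *new* image constrained by the prompt rather than a retrieved copy of a training image.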
2 How well does ChatGPT do in exams and assessments?
The performance of ChatGPT in assessments obviously varies by assessment but, speaking personally, I design assessment questions with ChatGPT in mind. What does that mean?
- I submit draft questions to ChatGPT and assess the answers
- I aim to set questions for which a ChatGPT answer would not typically receive a passing mark
I understand that I am not alone in this strategy.
ChatGPT, as noted above, bullshits. It is incapable of arithmetic manipulation or of constructing a logical argument. It does, to be sure, produce text that superficially looks like text that could answer each question. But even a faintly close examination of the answers tends to show that stated conclusions are not supported by the text that precedes them, that statements of “fact” are made up and incorrect, and that citations and references are hallucinated and do not exist.
As educators, we know that ChatGPT exists and adapt our assessments accordingly. Answers obtained via ChatGPT are unlikely to score well.
2.1 But you can’t tell when I’ve used ChatGPT, right?
It’s true that there’s no reliable tool or method to distinguish between bullshit produced by AI and bullshit produced by a human. So, to that extent, a really poor answer might as well have come from a computer as from a student.
3 So what good is genAI for me?
There are a number of reasonable uses for genAI. It’s often OK at helping restructure or shorten text, although you have to take care to establish for yourself that the meaning hasn’t changed. And it can be useful in revision for setting example questions to test yourself. Just don’t rely on it to mark your work as well, or give you correct answers.
ChatGPT has some helpful uses, though always exercise caution as it is, fundamentally, a bullshit generator. These include:
- helping to shorten and refine text to make a point more succinctly
- generating example questions on a topic as a revision aid
- generating code snippets
- summarising large bodies of text (though be cautious - ChatGPT can misrepresent text)
Footnotes
The “like the original” part is important. The algorithm is not “remembering” the image(s) it was trained on; it is constructing a new image constrained by the static and the text prompts (hence “generative” AI).↩︎