45 Comments

The real question is whether the AI’s exam performance means anything at all. Studies show very little overlap between what AIs do in school and the skills they actually need on the job.

If your goal is to actually identify breakthrough technologies even slightly ahead of the curve, then I don't think it's helpful to apply base rates, for this exact reason. You will always predict “no”, you will be right 95+% of the time, and you will miss every transformative technology until it's too obvious to ignore.

I think AI is on a strong trajectory to be extremely useful, but I'm not sure I would take this bet. “Passing exams” is not an economically useful function (except to students who want to cheat?) and it's not clear to me that AI will be engineered or optimized for this. If you picked something with a clear economic value, like generating marketing copy or writing scripts for TV and movies, I would be much more likely to take the bet.

A D is not bad for a guy who didn't attend your lectures.

Shouldn't you use a third-party grader or even a set of graders? Grading is inherently subjective. What you consider a D, another professor might consider a C depending on the rubric, their mood, student quality, etc. And even if we assume no progress in this technology, which seems unlikely, a beta version of a new tech scored a marginally passing grade in an advanced economics course - probably as good or better than a substantial percentage of all college students in the country. That seems pretty amazing to me.

Jan 19, 2023·edited Jan 19, 2023

Who wants to parlay this with Bryan’s AI bet with Eliezer Yudkowsky about the world ending by 2030? A few options:

No humans on the face of the earth because of AI but the AI still can’t pass Caplan’s exam.

AI can’t pass Caplan’s exams but destroys the world within 1 year.

How about blinding the AI’s exam by including it with all the other students’ exams for grading? That way, Bryan won’t know whether he’s grading a human student or the AI.

Wouldn't the right way to do this be to include the AI test among the exams you actually grade during the semester, without identifying it as an AI test? Grading without knowing the identity of the student who wrote the test is probably good for a variety of reasons (though it can introduce complications if you're dealing with essays that students have worked on drafts of) and would make the test more fair.

You should do this in a blinded way! You likely will grade the AI very differently because you know it is an AI. My old econ teacher used to do this to avoid bias – have students write their name on the back of the last page.

This is probably the first Bryan bet I've thought he was way off the mark on. Exciting!

I have a feeling that Caplan will either become an especially hard grader or that he will lose this bet!

Now we need a prediction market on this bet. I'd go for the AI's side, certainly at evens.

I am a pretty big proponent of AI, and I think it will be transformative sooner rather than later. That being said, I actually like your odds of winning this bet. Getting an A- on 5/6 midterms is a *really* stringent criterion.

The reason I think the AI will struggle has nothing to do with intelligence. I don't think Larry Summers would get an A- on 5/6 GMU econ exams if he were given them with little context. A big part of getting good grades is knowing the context of the class: What concepts did the teacher emphasize? How much detail is expected? Do you need to use exact jargon, or are more colloquial synonyms appropriate?

I think this bet would be more "fair" if the AI had access to either (a) past exams with solutions or (b) lecture notes/lecture videos for the entire class leading up to the midterm. This would be more comparable to the situation students find themselves in when taking the exam.

Somebody posted this joke on Twitter (apologies, I forgot who), but I think it is highly relevant:

I was in the park the other day, and walked past a man playing chess against a dog. "Wow," I said, "that's a smart dog."

"Not that smart," the man replied. "I'm winning 3 games to 1."

Seriously, what % of the population could get a D or higher on a labor econ midterm? Maybe 10%?

For certain tasks, ChatGPT is already outperforming humans (e.g., some coding tasks, organizing rough notes into a coherent structure). It's underperforming on internal consistency of answers and general knowledge. But I can't imagine those things won't be fixed in six years.

One thing I don't understand: if Matthew is right, why would he pick the 6 latest midterms from ~2028? If he's right, professors might be forced to change their assignments and midterms by that point. I think you should use the 6 latest midterms from today, not from 6 years from now.

Additionally, by allowing "any AI selected by Matthew," does that mean you'd allow Matthew to train an AI on your class lectures and midterms? Because if so, there's a chance ChatGPT could pass right now with the right training.

You miss 100% of the moonshots you don't take. That's the problem with the base rate argument.

That said, I think you're correct when it comes to generative AI.

I think ChatGPT was a PR stunt for potentially more valuable but far less flashy use cases, such as B2B automation, data aggregation, the workplace, etc.

There is a reason that Microsoft is the biggest investor in OpenAI.

My prediction: By 2029, it will be common knowledge that AI aces college exams, in general.

However, Bryan's exams are idiosyncratic enough that the AI might not quite hit this high grading bar, due to being trained on conventional economics textbooks (Krugman, etc.). So I think Bryan will win the bet. The AI would need to be trained on his lecture transcripts to avoid this issue.
