Let’s just assemble a casual grab bag of hypotheses here:
• Hypothesis: GPT is good at identifying / repeating patterns that show up when people talk on the internet
• Hypothesis: GPT is bad at logic, accuracy, and the other things we generally call “reasoning”
• Hypothesis: Some SAT questions test for specific cultural knowledge and are not just abstract intelligence tests (whatever that means)