⚡️ Assessment Unlocked: Assessing thinking in the AI era
Students now have access to powerful AI tools that can draft essays, summarize readings, and generate solutions. That reality forces a simple but uncomfortable question for higher education: are we still assessing student thinking, or the ability to prompt a machine?
This week’s issue explores how programs can redesign assessment to measure higher‑order thinking while using GenAI as a productive learning partner instead of treating it as a threat.
🧭 Introduction
Generative AI has changed how students approach assignments. Tasks that once required hours of writing or summarizing can now be completed in seconds with AI assistance. This shift does not eliminate the need for assessment; it clarifies what we should assess.
The goal of assessment has always been to measure thinking, judgment, and application of knowledge. If an assignment can be fully completed by AI without meaningful student reasoning, the assessment may not be measuring learning.
Takeaway: AI exposes weak assessments. It also creates an opportunity to design stronger ones.
📚 Background
Higher education assessment has long emphasized higher‑order cognitive skills such as analysis, evaluation, and creation. Bloom’s taxonomy, later revised by Anderson and Krathwohl, places these complex cognitive processes at the upper levels of learning, emphasizing that meaningful education requires more than recall or summary (Anderson & Krathwohl, 2001).
Constructive alignment theory reinforces this principle. When learning outcomes, instructional activities, and assessments are aligned, students engage in the kinds of thinking instructors intend to develop (Biggs & Tang, 2011). If a program claims to teach critical thinking but evaluates students using tasks that AI can easily replicate, the alignment breaks down.
Assessment scholars have also emphasized authentic assessment, which evaluates student performance through realistic tasks that mirror real-world application of knowledge. Wiggins argued that meaningful assessment should ask students to apply knowledge in context rather than reproduce information (Wiggins, 1998). Authentic tasks often involve judgment, synthesis, or decision making, areas where human reasoning remains essential.
AAC&U’s VALUE rubrics were developed to support consistent evaluation of complex learning outcomes such as critical thinking, written communication, and integrative learning (AAC&U, 2009). These rubrics focus on evidence of reasoning processes rather than the final product alone.
More recently, assessment organizations have begun discussing how AI changes the nature of student work. NILOA notes that institutions must rethink evidence of learning in environments where technology can assist with the production of academic artifacts (NILOA, 2023). The central challenge is not banning technology but designing assessment that captures student judgment and intellectual engagement.
Generative AI complicates assessment because it can produce polished outputs quickly. At the same time, AI can also support deeper learning when students use it as a thinking partner, for example by generating alternative explanations or testing arguments. Assessment design must therefore focus on documenting the student’s reasoning process rather than simply evaluating the final artifact.
Takeaway: Higher education assessment has always aimed to measure thinking. AI simply raises the stakes.
References
- Anderson, L. W., & Krathwohl, D. R. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives.
- Biggs, J., & Tang, C. (2011). Teaching for quality learning at university.
- Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance.
- Association of American Colleges and Universities. (2009). VALUE rubrics.
- National Institute for Learning Outcomes Assessment. (2023). NILOA resources on AI and assessment.
🛠️ Best practices & tips
Programs do not need to eliminate AI from learning environments. Instead, assessment should focus on documenting thinking, reasoning, and decision making. The following practices help shift evaluation toward those elements.
🧠 Ask students to explain their reasoning
Require short reflective explanations that describe how conclusions were reached. This can include explaining evidence selection, argument structure, or design choices.
Example prompt
“Describe two alternative explanations you considered and explain why you rejected them.”
This reveals thinking that AI outputs alone cannot easily demonstrate.
📊 Evaluate the decision process, not just the final answer
Add rubric criteria that focus on reasoning steps such as:
- justification of claims
- evaluation of competing evidence
- explanation of assumptions
AAC&U VALUE rubrics already emphasize reasoning quality, which makes them useful in AI‑rich environments.
🤖 Require transparent AI use
Instead of banning AI, ask students to document how they used it. This can include prompts used, edits made, and reflections on where AI suggestions were accepted or rejected.
Example assignment element
“Provide a short log describing how AI supported your work and how you verified its accuracy.”
This encourages responsible use rather than hidden use.
🔍 Use comparison and critique tasks
AI performs well at generating information but less well at evaluating competing arguments with discipline‑specific nuance. Assignments that require critique, comparison, or evaluation of multiple perspectives push students toward higher‑order thinking.
⚡ Quick practical workflow
1. Identify an assignment currently used to assess a learning outcome.
2. Ask a GenAI tool to complete the assignment.
3. Examine whether the output would receive a high score.
If the answer is yes, revise the task so that reasoning, judgment, or reflection becomes part of the graded evidence.
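For programs that want to run this check across many assignments at once, a minimal Python sketch is below. It assumes the OpenAI Python SDK and an API key are available; the model name, file paths, and prompt wording are illustrative, and the generated drafts still need to be scored by a person against the course rubric.

```python
# Minimal sketch: run each assignment prompt through a GenAI model and save the
# drafts so faculty can judge whether they would earn a high rubric score.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable; the model name and folder layout are illustrative.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assignment_dir = Path("assignments")   # one .txt file per assignment prompt
output_dir = Path("ai_drafts")
output_dir.mkdir(exist_ok=True)

for prompt_file in sorted(assignment_dir.glob("*.txt")):
    prompt = prompt_file.read_text()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable model works for this stress test
        messages=[
            {"role": "system", "content": "You are a student completing this assignment."},
            {"role": "user", "content": prompt},
        ],
    )
    draft = response.choices[0].message.content
    (output_dir / f"{prompt_file.stem}_ai_draft.txt").write_text(draft)
    print(f"Saved AI draft for {prompt_file.name}; score it against your rubric.")
```

If most of the resulting drafts would earn high marks, that is the signal to add the reasoning, reflection, or AI-use documentation elements described above.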
Takeaway: Good assessment asks students to show how they think, not just what they produce.
🏫 Example or case illustration
Setting: A political science program assessing policy analysis skills in a senior seminar.
Students traditionally submitted a policy brief evaluating a public policy issue. Faculty noticed that many papers were polished but oddly similar in structure and argument style. During a departmental conversation, one instructor demonstrated that a generative AI tool could produce a full policy brief in less than a minute.
The faculty realized the assignment primarily rewarded organization and surface explanation rather than deep analysis.
The assessment coordinator proposed a redesign.
Students would still submit a policy brief, but they would also include a short analytical reflection describing how they evaluated competing policy options. The reflection required students to explain the evidence they prioritized and the limitations of their analysis.
Faculty also added a rubric dimension evaluating justification of policy recommendations.
During the first semester of implementation, instructors noticed clear differences in student work. Some students presented thoughtful explanations of tradeoffs and uncertainties, while others struggled to articulate their reasoning. These differences created stronger evidence of learning and clearer feedback for improvement.
AI tools were not banned. Instead, students were encouraged to use them to explore background information or generate counterarguments. What mattered was the student’s ability to interpret and evaluate that information.
Takeaway: When reasoning becomes visible, assessment becomes more meaningful.
🔮 What’s next
Next week we explore how GenAI can help faculty analyze large sets of student work to identify patterns in learning strengths and gaps.
Prep action for next week: collect three anonymized samples of student work that assess the same learning outcome.
❓ Question of the day
If a student used AI to help complete an assignment, what evidence would convince you that the thinking was truly their own?
🚀 Call to action
Take one assignment used to assess a learning outcome and ask a GenAI tool to complete it. If the result would score highly, redesign the task so students must demonstrate their reasoning process.

