Part VII: End-to-end Practice and Teaching Kit
Chapter 33

Teaching the Material as a Course

33.5 Assessment, Outputs, and Resources

Complete assessment frameworks, assignment bank, discussion prompts, and curated resource lists for running an effective course.

Course Outputs

Students produce the following artifacts throughout the course. They mirror real product deliverables and serve as the primary assessment basis.

Opportunity memo (week 3 or 4): A 3-5 page document identifying an AI product opportunity, with market context, user need, and an initial feasibility assessment.

AI PRD (week 5 or 8): A full product requirements document with eval-based success criteria, user stories, and acceptance tests for AI features.

UX flow and trust plan (week 4, 6, or 7): User flow diagrams showing AI interaction points, plus a trust design plan addressing uncertainty communication.

Working prototype (week 6 or 9): A prototype demonstrating the core AI feature, built with vibe coding techniques.

Architecture document (week 7 or 10): The technical architecture, including AI model selection, retrieval system design, and orchestration patterns.

Eval suite (week 9 or 12): A comprehensive automated test suite covering happy paths, edge cases, and known failure modes.

Governance plan (week 10 or 13): A plan addressing data privacy, security, bias mitigation, and regulatory compliance.

Launch plan (week 11): A launch strategy with staged rollout plan, monitoring dashboard, and incident response procedures.

Capstone demo and report (week 12 or 14): A 15-minute product demo plus a 5-page postmortem reflecting on decisions and learnings.

Course Artifact Summary

Week 3-4: Opportunity memo (3-5 pages)

Week 4-7: UX flow and trust plan

Week 5-8: AI PRD with evals

Week 6-9: Working prototype

Week 7-10: Architecture document

Week 9-12: Eval suite

Week 10-13: Governance plan

Week 11: Launch plan

Week 12-14: Capstone demo + postmortem

Assessment Rubrics

Each major assignment uses a criterion-referenced rubric with four levels: Excellent, Proficient, Developing, and Beginning.

Rubric Levels

Excellent (90-100): Exceeds expectations, shows innovation

Proficient (80-89): Meets expectations, solid work

Developing (70-79): Approaching expectations, needs work

Beginning (<70): Below expectations, significant gaps
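As a grading sanity check, the four bands above can be encoded as a small lookup. This is just a sketch; the boundaries follow the rubric levels listed here.

```python
def rubric_level(score: float) -> str:
    """Map a numeric score (0-100) to its rubric level."""
    if score >= 90:
        return "Excellent"
    if score >= 80:
        return "Proficient"
    if score >= 70:
        return "Developing"
    return "Beginning"
```

Embedding the same boundaries in a grading spreadsheet or script keeps instructor and TA scores consistent across sections.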

Opportunity Memo Rubric

Excellent (90-100): Clearly articulates the user need with evidence; identifies an AI-native opportunity that leverages unique AI capabilities; thorough competitive and feasibility analysis; actionable next steps.

Proficient (80-89): Solid user need with some supporting evidence; the opportunity has an AI component but could be achieved without AI; adequate competitive analysis; clear direction for further exploration.

Developing (70-79): User need stated but not compellingly justified; relies on AI enhancement rather than an AI-native approach; limited competitive or feasibility analysis.

Beginning (<70): Unclear user need or opportunity; AI used as a buzzword without clear value; competitive or feasibility analysis missing.

AI PRD Rubric

Excellent (90-100): Evalable success criteria covering all critical user journeys; complete acceptance tests; clear failure-mode handling; well-scoped for MVP.

Proficient (80-89): Most success criteria evalable; good coverage of happy paths; some edge cases addressed; reasonable scope.

Developing (70-79): Some requirements lack evalability; edge cases missing; scope too broad or too narrow.

Beginning (<70): Requirements not written as evalable tests; critical acceptance criteria missing; unclear scope.

AI PRD: Evalability is the Key

The most important criterion: can you write automated tests that verify each requirement? If yes, it is evalable. If not, it needs work.
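One way to make the evalability test concrete is to write the requirement as an assertion over model output. The sketch below assumes a hypothetical `summarize()` wrapper around the product's model (stubbed here so the example runs); the requirement "summaries stay under 50 words and name the product" becomes a check any CI job can execute.

```python
def summarize(text: str) -> str:
    # Hypothetical model wrapper; stubbed with a fixed response for illustration.
    return "Acme Notes condenses long meeting transcripts into short action lists."

def eval_summary_requirement(source: str, product_name: str) -> bool:
    """Evalable requirement: summary is at most 50 words and mentions the product."""
    summary = summarize(source)
    return len(summary.split()) <= 50 and product_name in summary
```

A requirement phrased this way passes or fails automatically; one that cannot be phrased this way ("the summary should feel helpful") needs rewriting before it enters the PRD.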

Prototype Rubric

Excellent (90-100): Core AI feature works reliably; clean user experience; graceful error handling; clear demonstration of AI value; production-ready code quality.

Proficient (80-89): AI feature works with minor issues; decent UX despite some rough edges; error handling present; demonstrates AI value.

Developing (70-79): AI feature partially functional; UX needs work; error handling incomplete; AI value not clearly demonstrated.

Beginning (<70): AI feature does not work; poor UX; no error handling; no evidence of AI value.

Prototype Success Criteria

Must work: Core AI feature functions end-to-end

Must show value: User can see AI helping them

Must handle errors: Graceful degradation when AI fails

Must be usable: Clean UX, clear next steps
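"Must handle errors" is the criterion students most often miss. Below is a minimal sketch of graceful degradation, assuming a hypothetical `ask_model()` call that may raise on timeouts or provider errors (forced to fail here for illustration).

```python
def ask_model(prompt: str) -> str:
    # Hypothetical model call; raises on timeout or provider error.
    raise TimeoutError("model unavailable")

def answer_with_fallback(prompt: str) -> str:
    """Degrade gracefully: never surface a raw stack trace to the user."""
    try:
        return ask_model(prompt)
    except Exception:
        # Fall back to a clear, honest message with a next step.
        return ("The AI assistant is unavailable right now. "
                "Your question was saved; please try again in a moment.")
```

The point being graded is not the try/except itself but the product decision behind it: when the AI fails, the user still gets an honest message and a path forward.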

Capstone Rubric

Excellent (90-100): Polished, functional product; thoughtful postmortem with specific learnings; strong demo presentation; all course concepts integrated effectively.

Proficient (80-89): Working product with minor polish needed; adequate postmortem with some insights; clear presentation; most concepts integrated.

Developing (70-79): Product works but is incomplete; superficial postmortem; unclear presentation; limited concept integration.

Beginning (<70): Product not functional; postmortem missing; unclear presentation; concepts not integrated.

Peer Review Guidelines

Peer review assignments occur at three points during the course: opportunity memo draft, prototype version 1, and eval suite. Students provide structured feedback using the "Stars and Wishes" format where Stars highlight two specific things the work does well, Wishes provide two specific suggestions for improvement, and Questions offer one clarifying question for the author. Peer review counts for 5% of the grade for each assignment, and participation in peer review sessions is required regardless of assignment submission.


Stars and Wishes Format

Stars (2): Specific things done well

Wishes (2): Specific suggestions for improvement

Questions (1): One clarifying question for the author

Beginner vs. Advanced Cohorts

For classes with mixed experience levels, consider these differentiation strategies:

Beginner Pathway

The beginner pathway includes an additional Week 0 module covering AI fundamentals and product management basics, starter code templates for prototypes to reduce implementation friction, more structured lab activities with step-by-step guidance, paired programming during labs pairing beginners with advanced students, and extra office hours in weeks 7-8 covering RAG and architecture content.

Advanced Pathway

The advanced pathway includes additional readings from research papers for each topic, open-ended extensions requiring architectural innovations, a mentorship role for advanced students supporting beginners, and a research paper review assignment replacing one standard assignment.

Cohort Differentiation Strategy

Beginners: Week 0 foundation + templates + structured labs + pairing

Advanced: Research papers + open extensions + mentorship role

Discussion Question Bank

For Weeks 1-2 (AI Fundamentals): What is the difference between narrow AI and general AI, and why does it matter for product decisions? How should product managers think about AI failures differently from traditional software bugs? What mental models help non-technical stakeholders understand AI capabilities? How do you set appropriate user expectations for AI-powered features?

For Weeks 3-4 (Strategy and UX): When should you build AI-native features versus enhancing existing features with AI? How do you balance differentiation through AI with commoditization of AI capabilities? What makes AI UX fundamentally different from traditional UX? How do you build trust when AI behavior is inherently probabilistic?

For Weeks 5-7 (Requirements and Architecture): Why is eval-first thinking particularly important for AI products? How do you scope an AI MVP without overcommitting on AI capability? What are the key architectural decisions when building retrieval-augmented systems? When should an AI product use memory versus always starting fresh?

For Weeks 8-10 (Models, Evals, Governance): How do you decide between using a more capable versus a faster or cheaper model? What is the relationship between evals and observability in AI products? What governance measures should be non-negotiable before launching an AI product? How do you balance innovation speed with responsible AI development?

Discussion Format Guide

Week 1-2: Frame-setting questions

Week 3-4: Strategy tradeoff questions

Week 5-7: Technical decision questions

Week 8-10: Ethics and governance questions

Assignment Bank

Weekly reflection posts (1 page): Ungraded reflections on how the week's content connects to the student's work or interests. Completed each week, these build a learning journal that students review before the capstone.

Reading quizzes (ungraded): Brief multiple-choice quizzes opening each class covering assigned readings. Used for accountability, not assessment.

Product teardown assignment: Students select an existing AI product and document its architecture, UX decisions, trust design, and eval approach. Individual presentation in week 2.

Model comparison report: Students compare three or more AI models for their specific use case, documenting methodology, results, and recommendations. 3-5 pages, submitted week 8.

Case study analysis: Detailed written analysis of one of the book's case studies, examining the decisions made and their outcomes. 5-8 pages, submitted week 10.

Peer teaching session: Pairs of students prepare and deliver a 20-minute lesson on an assigned topic to the class. Participation grade.

Assignment Types

Ongoing: Weekly reflections, reading quizzes

Major: Teardown, model report, case study

Participation: Peer teaching sessions

Suggested Readings and Tools

Core readings per week: Each week lists 2-3 chapters from the main book plus 1-2 supplementary articles or chapters from other texts. Supplementary readings are marked as required or optional.

Week 1-2 supplementary: Russell and Norvig, Chapters 1-2; Goodfellow, Bengio, and Courville, "Deep Learning," Chapter 1; Marcus, "The Limits of AI"

Week 3-4 supplementary: Cawood, "AI Product Manager"; Moore, "Crossing the Chasm," Chapter 1; Horowitz, "How to Build AI Products"

Week 5-7 supplementary: "RAG in Production"; LangChain documentation; Anthropic model guides

Week 8-10 supplementary: OpenAI/Anthropic/Gemini system guides; EU AI Act text; NIST AI Risk Management Framework

Tool Recommendations

For prototyping, Cursor, Replit, v0, Lovable, and bolt.new provide excellent environments for building AI product prototypes rapidly. For LLM access, OpenAI API, Anthropic API, Google AI Studio, and Azure OpenAI offer various capabilities and pricing models. For vector databases, Pinecone, Weaviate, Chroma, and pgvector provide retrieval capabilities for RAG architectures. For evals, RAGAS, PromptLayer, Braintrust, and LangSmith offer frameworks for testing and measuring AI output quality. For observability, LangSmith, Helicone, and Weights and Biases help monitor AI system behavior in production. For collaboration, Notion works well for documentation, Figma for UX flows, and GitHub for code management.

Tool Stack Summary

Prototype: Cursor, Replit, v0, Lovable, bolt.new

LLM: OpenAI, Anthropic, Google AI Studio, Azure

Vector DB: Pinecone, Weaviate, Chroma, pgvector

Evals: RAGAS, PromptLayer, Braintrust, LangSmith

Observability: LangSmith, Helicone, W&B

Supplementary Resources

Several supplementary resources support instructors in delivering the course effectively. An instructor-only Slack channel enables coordination and resource sharing among teaching staff. Slide deck templates are available in both Google Slides and PowerPoint formats for easy customization. Grading rubrics come as Google Sheets with formula-based score calculation to streamline assessment. Sample assignments at each milestone show excellent, proficient, and developing work to help calibrate grading standards. Lab setup guides for each tool include troubleshooting FAQs to help students overcome technical hurdles.