8.3 LLM-as-Judge Evaluation
Rubric design, pairwise comparison, and calibration for LLM-as-judge evaluation.