"""add_scorer.py: your turn, write one more EVAL scorer and add it to the suite.

A scorer is a small function that takes (case, answer, trace) and returns a 3-tuple:
    (label, passed, detail)
where label is a short name, passed is True/False, and detail is a human-readable note shown on failure.
Worked scorers are in ../solution/evals.py. Ideas for a new one:
  - score_no_repeat_tool: fail if the same tool was called more than N times (a loop smell)
  - score_latency_budget: fail if trace total duration is over a threshold
  - score_token_budget: fail if trace.total_tokens() exceeds a cap (cost control)

Steps:
  1. Write your scorer below.
  2. Add it to the SCORERS list (or pass scorers=[...] to run_suite).
  3. Run:  python add_scorer.py
"""

import os
import sys

# Make ../solution importable so the worked harness can be loaded from here.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "solution"))
import evals  # the worked harness


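# TODO: write your scorer (the one below is a starting example).
# A scorer receives the test case, the agent's answer, and the Trace.
def score_token_budget(case, answer, trace):
    """Fail when the run used more tokens than the case's budget allows."""
    cap = case.get("max_tokens", 5000)  # per-case override, else 5000
    used = trace.total_tokens()
    return ("token_budget", used <= cap, f"used {used} tokens, cap {cap}")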
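

# A possible second scorer, sketching the score_no_repeat_tool idea from the
# module docstring: fail when any single tool is called too often (a loop smell).
# ASSUMPTION: the Trace exposes the names of the tools it called as an iterable
# attribute named `tool_calls`. That attribute name is hypothetical, so check
# the Trace class used by ../solution/evals.py and adjust it to match.
# ASSUMPTION: `max_tool_repeats` is a made-up per-case key; 3 is an arbitrary default.
# To try it, append score_no_repeat_tool to the scorers list passed to run_suite.
def score_no_repeat_tool(case, answer, trace):
    limit = case.get("max_tool_repeats", 3)
    # Count calls per tool name; treat a Trace without the attribute as "no calls".
    counts = {}
    for name in getattr(trace, "tool_calls", []):
        counts[name] = counts.get(name, 0) + 1
    # Keep only the tools that exceeded the limit.
    over = {name: n for name, n in counts.items() if n > limit}
    detail = f"tools called more than {limit} times: {over}" if over else ""
    return ("no_repeat_tool", not over, detail)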


if __name__ == "__main__":
    # Run the standard suite with your extra scorer appended.
    my_scorers = evals.SCORERS + [score_token_budget]
    evals.run_suite(scorers=my_scorers)
    # Tip: compare this scorecard with the one ../solution/evals.py produces without your scorer.
