Metrics
From TAMP Benchmarking
- Planning time (not for comparing systems, but for evaluating scalability)
- Success rate, over randomized problem instances
- Plan quality. Several options to discuss:
  - symbolic: plan length (number of actions)
  - geometric: plan length (in workspace or configuration space?), safety margins, etc.
- Number of calls to the motion planner
To begin with, I suggest limiting the quality measure to (i) symbolic plan length and (ii) geometric plan length in configuration space.
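
As a rough illustration of how the success rate and the two suggested quality measures could be computed, here is a minimal Python sketch. Everything in it is an assumption made for the example, not part of any existing benchmark code: the `Trial` record, its field names, and the choice of an unweighted Euclidean metric between consecutive configurations. Robots that mix revolute and prismatic joints would probably need per-joint weights.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Trial:
    """One run on a randomized problem instance (hypothetical record layout)."""
    solved: bool
    actions: Sequence[str]            # symbolic plan, one entry per action
    path: Sequence[Sequence[float]]   # configuration-space waypoints
    planning_time_s: float


def symbolic_plan_length(actions: Sequence[str]) -> int:
    """Symbolic plan length: the number of actions in the plan."""
    return len(actions)


def cspace_path_length(path: Sequence[Sequence[float]]) -> float:
    """Sum of Euclidean distances between consecutive waypoints.

    Assumes an unweighted Euclidean metric in configuration space.
    """
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def success_rate(trials: Sequence[Trial]) -> float:
    """Fraction of trials that were solved."""
    if not trials:
        raise ValueError("success rate is undefined for zero trials")
    return sum(t.solved for t in trials) / len(trials)
```

Quality measures would only be aggregated over solved trials, since a failed trial has no plan to measure.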