Metrics
 - Planning time (not for comparing systems, but for evaluating scalability)
 - Success rate over randomized problem instances
 - Plan quality. There are several options to discuss:
   - symbolic: plan length
   - geometric: plan length (in the workspace (WS) or the configuration space (CS)?), safety margins, etc.
 - Number of calls to the motion planner
 
    
To begin with, I suggest limiting the quality measure to (i) symbolic plan length and (ii) geometric plan length in the CS.
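
As a starting point for discussion, here is a minimal sketch of how the two proposed quality measures and the success rate could be computed. It makes several assumptions not settled above: a symbolic plan is a list of action names, a geometric plan is a list of configurations (joint-value tuples), and CS length is the sum of unweighted Euclidean distances between consecutive configurations. The function names are hypothetical, and a different CS metric (e.g. per-joint weights) may turn out to be more appropriate.

```python
import math
from typing import Sequence


def symbolic_plan_length(actions: Sequence[str]) -> int:
    """Symbolic plan length: number of actions in the task plan."""
    return len(actions)


def cs_path_length(path: Sequence[Sequence[float]]) -> float:
    """Geometric plan length in the CS.

    Assumption: unweighted Euclidean distance between consecutive
    configurations; a path with fewer than two configurations has length 0.
    """
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of randomized problem instances that were solved."""
    if not outcomes:
        raise ValueError("need at least one problem instance")
    return sum(outcomes) / len(outcomes)


# Illustrative values only, not benchmark data.
plan = ["pick", "move", "place"]
path = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
print(symbolic_plan_length(plan))                # 3
print(cs_path_length(path))                      # 6.0
print(success_rate([True, True, False, True]))   # 0.75
```

Planning time and the number of motion planner calls are left out here, since they would have to be logged by the planner itself rather than computed from the resulting plan.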