CEO-Bench: Can Agents Play the Long Game? Source repository: zlab-princeton/ceobench-src on GitHub.
Abstract: Ensuring the safety of autonomous driving systems (ADSs) through rigorous verification in simulated environments is crucial before real-world deployment. However, using simulation ...