Given our current codebase and its ability to parallelize, it makes sense to implement PQN, since it leverages that parallelization and removes the need for a replay buffer. Moreover, it would be interesting to see it combined with our meta-learning framework and exploration strategies.