Enforcing Constraints over Learned Policies via Nonlinear MPC: Application to the Pendubot