Deterministic policy gradient algorithms for semi-Markov decision processes