Conventional reinforcement learning (RL) techniques suffer from high sample complexity and a heavy computational load, which hinder RL's applicability to real-world tasks. To tackle this challenge, Warm-Start RL is emerging as a promising paradigm whose basic idea is to accelerate online learning by starting from an initial policy trained offline. Owing to the knowledge transferred from the initial policy, Warm-Start RL has been successfully applied in AlphaZero and ChatGPT, demonstrating its great potential to speed up online learning. Despite these remarkable successes, a fundamental understanding of Warm-Start RL is still lacking. The primary objective of this study is to quantify the impact of function approximation errors on the sub-optimality gap of Warm-Start RL. We consider the widely used Actor-Critic method for RL. For the unbiased case, we give sufficient conditions on how good the warm-start policy needs to be in order to achieve fast convergence. For the biased case, our findings reveal that a ‘good’ warm-start policy (obtained by offline training) may be insufficient, and that bias reduction in online learning also plays an essential role in lowering the sub-optimality gap. We then investigate bias reduction using adaptive ensemble learning and planning.