Refactored and renamed Q-Learning, SARSA, and Expected SARSA, including their variants.
Fixed bugs and removed redundant code in some of the algorithms listed above.