Reinforcement Learning
  3. ZipRecruiter on Classifying Job Titles With Noisy Labels Using REINFORCE - Fine-grained job title classification with noisy labels using the REINFORCE algorithm and multi-task learning
    1. This article has a very nice trick: it adds a reward component to the loss function to mitigate the unbalanced class label problem, instead of the usual class balancing.
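
To make the reward-in-the-loss idea concrete, here is a minimal sketch of a REINFORCE-style loss for a single classification example. It is not the article's formulation, which these notes do not reproduce. The inverse-frequency reward, the -1 penalty for a wrong label, and the names `reward_for` and `reinforce_loss` are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward_for(sampled, true_label, class_counts):
    """Assumed reward scheme: a correct label on a rare class earns more.

    A correct prediction gets total / (n_classes * count[class]), so a
    class seen less often yields a larger reward. A wrong one gets -1.
    """
    if sampled != true_label:
        return -1.0
    total = sum(class_counts)
    return total / (len(class_counts) * class_counts[true_label])


def reinforce_loss(logits, sampled, reward):
    """REINFORCE surrogate loss: -reward * log p(sampled label).

    Minimizing it raises the probability of labels that earned a
    positive reward and lowers it for labels that were penalized.
    """
    probs = softmax(logits)
    return -reward * math.log(probs[sampled])


# Hypothetical class counts: class 0 is nine times more common than class 1.
counts = [90, 10]
rare_reward = reward_for(1, 1, counts)    # correct on the rare class
common_reward = reward_for(0, 0, counts)  # correct on the common class
loss = reinforce_loss([0.0, 0.0], 1, rare_reward)
```

In a real training loop the label would be sampled from the model's softmax output and the loss averaged over a batch; here the sampled label is passed in explicitly so the arithmetic is easy to follow.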

  • Modeled as a Markov decision process: each step is a tuple (state, action, new state, reward).
  • Lots of exploration in the beginning, then exploitation.
  • Returns the optimal policy, given enough exploration.
  • Refer to YouTube here
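
The bullets above can be sketched as tabular Q-learning, assuming that is the algorithm these notes describe. The corridor environment, the linear epsilon decay, and all hyperparameters below are hypothetical choices for illustration, not taken from the notes.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for ep in range(episodes):
        # Linear epsilon decay: lots of exploration early, exploitation later.
        eps = eps_start + (eps_end - eps_start) * ep / max(1, episodes - 1)
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < eps:
                action = rng.randrange(2)  # explore: random action
            else:
                # exploit: greedy action (ties go to "right")
                action = 0 if q[state][0] > q[state][1] else 1
            new_state = max(0, state - 1) if action == 0 else state + 1
            done = new_state == n_states - 1
            reward = 1.0 if done else 0.0
            # Update from the (state, action, new state, reward) tuple.
            target = reward if done else reward + gamma * max(q[new_state])
            q[state][action] += alpha * (target - q[state][action])
            state = new_state
            if done:
                break
    # Greedy policy for every non-terminal state.
    policy = [0 if q[s][0] > q[s][1] else 1 for s in range(n_states - 1)]
    return q, policy


q, policy = q_learning()
```

The returned `policy` lists the greedy action for each non-terminal state; in this corridor the best action is always "right" (1), since the only reward sits at the far end.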
