Connect - 4

Unlike supervised learning, reinforcement learning uses no target values in its training data. It relies solely on the interaction between an agent and its environment and, under the Markov decision process assumption, the agent is expected to find its own way to complete the task. In essence, the agent is not told how the task should be done; it figures out what to do by learning from sampled transitions.
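To make "learning from sampled transitions" concrete, here is a minimal sketch of an agent-environment loop for Connect - 4, written with the Python standard library only. It is not the notebook's environment: the helper names (`new_board`, `drop`, `wins`, `play_episode`), the board encoding (0 for empty, 1 and -1 for the two players), the random move choice, and the reward of 1.0 for a winning move are all assumptions made for illustration.

```python
import random

ROWS, COLS = 6, 7  # standard Connect-4 board

def new_board():
    """Empty board: 0 = empty, 1 and -1 are the two players (assumed encoding)."""
    return [[0] * COLS for _ in range(ROWS)]

def legal_moves(board):
    """Columns whose top cell is still empty."""
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    """Return a new board with the player's piece dropped into col."""
    nxt = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):  # fill the lowest free row first
        if nxt[r][col] == 0:
            nxt[r][col] = player
            return nxt
    raise ValueError("column is full")

def wins(board, player):
    """True if the player has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == player for rr, cc in cells):
                    return True
    return False

def play_episode(rng):
    """Random play; returns (state, action, reward, next_state, done) tuples."""
    board, player, transitions = new_board(), 1, []
    while True:
        action = rng.choice(legal_moves(board))
        nxt = drop(board, action, player)
        won = wins(nxt, player)
        done = won or not legal_moves(nxt)  # win, or board full (draw)
        # Assumed reward scheme: 1.0 for the winning move, 0.0 otherwise.
        transitions.append((board, action, 1.0 if won else 0.0, nxt, done))
        if done:
            return transitions
        board, player = nxt, -player

episode = play_episode(random.Random(0))
print(len(episode), "transitions; final reward:", episode[-1][2])
```

Each tuple is one sampled transition. A learning agent would replace the random move choice with its own policy and use the stored tuples as training data.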

Things inside the notebook

Connect - 4 game environment.
Demonstration of a deep convolutional network as a function approximator for state-action values.
Application of the Bellman optimality equation to training a deep Q-network.
Importance of sampling with experience replay, and a comparison to the AlphaGo Zero self-play algorithm.
An agent learning from scratch and eventually winning Connect - 4 with no human-designed rules.
Training procedure and a demo of the agent playing the game.
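The two training ingredients named above, experience replay and the Bellman optimality equation, can be sketched as follows. This is a simplified illustration rather than the notebook's code: `ReplayBuffer`, `td_target`, and the discount factor of 0.99 are assumed names and values, and Q-values are passed in as plain lists instead of network outputs. It also ignores the two-player detail that the next state belongs to the opponent, which a real Connect - 4 agent has to handle in some way.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor (assumed value)

class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest entries are dropped first

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive moves.
        return rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)

def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Bellman optimality target: r + gamma * max_a' Q(s', a'), or r if terminal."""
    if done or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)

buffer = ReplayBuffer(capacity=3)
for i in range(5):
    buffer.push(f"s{i}", i % 7, 0.0, f"s{i + 1}", False)
batch = buffer.sample(2, random.Random(0))
print(len(buffer), "stored;", len(batch), "sampled")
```

In a deep Q-network, the value returned by `td_target` serves as the regression target for the network's prediction Q(s, a) on each sampled transition.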

Results

The agent achieved effective and stable performance after 20000 epochs, demonstrating a high win rate and a decreasing number of steps needed for a win. The trained policy can be used by reloading the PyTorch model policy_net, which expects an image of the board state of size 6 * 7 * 1.
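As a rough guide to using the trained policy, the sketch below shows how a board could be shaped into a 6 * 7 * 1 single-channel image and how a move could be picked from the network's outputs. It makes several assumptions: `encode_board` and `greedy_legal_action` are hypothetical helpers, `q_values` stands in for the seven outputs of policy_net, and the cell encoding (0, 1, -1) is a guess. How the notebook loads the model and orders the tensor dimensions is not stated here, so check the notebook before relying on this layout.

```python
ROWS, COLS = 6, 7

def encode_board(board):
    """Turn a 6x7 grid of ints into a 6x7x1 nested list of floats (one channel)."""
    assert len(board) == ROWS and all(len(row) == COLS for row in board)
    return [[[float(cell)] for cell in row] for row in board]

def greedy_legal_action(q_values, board):
    """Pick the highest-valued column whose top cell is still empty."""
    legal = [c for c in range(COLS) if board[0][c] == 0]
    if not legal:
        raise ValueError("board is full")
    return max(legal, key=lambda c: q_values[c])

# Example: column 3 is full, so the best remaining column is chosen instead.
board = [[0] * COLS for _ in range(ROWS)]
for r in range(ROWS):
    board[r][3] = 1 if r % 2 == 0 else -1

image = encode_board(board)
q_values = [0.1, 0.2, 0.3, 0.9, 0.4, 0.0, -0.1]  # placeholder network outputs
print(greedy_legal_action(q_values, board))
```

Masking out full columns matters at play time, because a Q-network can assign its highest value to a move that is no longer legal.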

Reference

Volodymyr Mnih et al., "Playing Atari with Deep Reinforcement Learning", 2013.

