GitHub - Spider101/Deep-3D-Tictactoe: An attempt to implement Deep Q Learning with 3D Tictactoe

Overview

This project is an attempt to adapt Deep Q-Learning, as described in "Playing Atari with Deep Reinforcement Learning" by Mnih et al., to 3D Tictactoe.
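As a rough illustration of the idea being adapted, the sketch below computes the one-step Q-learning target used in the Mnih et al. paper: the reward alone for a terminal transition, otherwise the reward plus the discounted best next-state value. Restricting the maximum to legal moves is an assumption made here for a board game, and `td_target`, `next_q`, and `legal` are hypothetical names, not taken from this repository. The sketch also ignores the fact that the next state belongs to the opponent in self-play.

```python
def td_target(reward, next_q_values, legal_moves, done, gamma=0.99):
    """One-step Q-learning target.

    Returns r for a terminal transition, otherwise
    r + gamma * max over legal actions a' of Q(s', a').
    """
    if done or not legal_moves:
        return reward
    return reward + gamma * max(next_q_values[a] for a in legal_moves)


# Hypothetical Q-values for the 9 cells of a 2D board; cells 0 and 4 are taken,
# so they are excluded from the maximum even though their values are highest.
next_q = [0.9, 0.1, 0.3, 0.2, 0.8, 0.0, -0.1, 0.5, 0.4]
legal = [1, 2, 3, 5, 6, 7, 8]

print(td_target(0.0, next_q, legal, done=False, gamma=0.9))
print(td_target(1.0, next_q, legal, done=True))
```

In a DQN, a network's predicted Q-value for the chosen move would be regressed toward this target; the same function works unchanged for a 3D board if `next_q_values` simply has one entry per cell.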

- ~~Enumerate best states for 2D Tictactoe using minimax~~
- ~~Implement Q-learning for 2D Tictactoe~~
- ~~Extend Q-learning to 3D Tictactoe and see what breaks~~ (couldn't finish enumerating states in the state table: 80 million and counting)
- ~~Implement Deep Q-Learning using a simple 2-layer neural net for 2D Tictactoe~~ (then 3D Tictactoe)
- Implement policy gradient learning using a simple 2-layer neural net for 2D Tictactoe (then 3D Tictactoe)
- ~~Establish reward rubrics and input format for the Tictactoe DQN pipeline~~
- ~~Design model pipeline for DQN~~
- Design model pipeline for policy gradient learning
- ~~Experiment with model architecture to improve performance~~
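To make the state-table problem noted in the list concrete, here is a minimal sketch that enumerates every reachable position of ordinary 2D tic-tac-toe and compares it with a naive upper bound for a 3D board. It is not the repository's code: the board encoding (a 9-character string), the helper names, and the 3x3x3 board size used for the 3D bound are all assumptions chosen for illustration.

```python
# Index triples that form a winning line on a 3x3 board stored as a 9-char string.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def reachable_states():
    """Collect every position reachable from the empty board (X moves first)."""
    start = "." * 9
    seen = {start}
    stack = [start]
    while stack:
        board = stack.pop()
        # Finished games (win or full board) have no successors.
        if winner(board) or "." not in board:
            continue
        player = "X" if board.count("X") == board.count("O") else "O"
        for i, cell in enumerate(board):
            if cell == ".":
                nxt = board[:i] + player + board[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


states = reachable_states()
print(len(states), "reachable 2D states out of", 3 ** 9, "raw board fillings")
# Naive upper bound for a 3D board, assuming 3x3x3 = 27 cells.
print(3 ** 27, "raw board fillings for an assumed 3x3x3 board")
```

The 2D table is small enough to hold in a dictionary, while the raw 3D count is in the trillions even before choosing a larger board, which is one way to see why a function approximator such as a neural network replaces the table.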

Files (110 commits)

__pycache__
agents
games
memory
results
.gitignore
2dminimax.py
3dttt.py
README.md
deepQ3d.py
dqn2D.py
policy_grad_learning.py
selfplay_threshold_0.05.png
td_learning_eval.py
weights.dat