SoK/2025/StatusReport/Shubham Shinde

Enhancing MankalaEngine by Adding New Efficient Algorithms

Project Abstract

This project focused on enhancing MankalaEngine by integrating new gameplay algorithms and adding support for the Pallanguli variant. The primary goals were to explore algorithms such as Monte Carlo Tree Search (MCTS) and Q-learning (reinforcement learning) and to evaluate their performance against the engine's existing agents, Minimax and MTDF.

While MCTS did not perform well against strong opponents, Q-learning performed significantly better. Additionally, the Pallanguli variant was refined and integrated by reviewing and improving existing contributions.

Deliverables

  • Exploration and implementation of new efficient gameplay algorithms in MankalaEngine.
  • Performance evaluation of these new algorithms (MCTS and Q-learning) against existing agents such as Minimax and MTDF.
  • Optimization and refinement experiments for improving Q-learning performance.
  • Research into more advanced reinforcement learning techniques for potential future integration into the engine.
  • Review and improvement of the Pallanguli variant implemented by Srisharan V S.


Mentors


Weekly Progress

Week 1-2: Research on various gameplay algorithms

  • Explored several gameplay algorithms including MCTS, Iterative Deepening, and reinforcement learning approaches like Q-learning.
  • Set up the MankalaEngine repository locally on my machine.
  • Collaborated with Srisharan V S to identify and discuss reliable resources for implementing the rules of the Pallanguli variant.

Week 3-4: Implementation of the Monte Carlo Tree Search (MCTS) Technique

  • Analyzed the existing MankalaEngine codebase to ensure seamless integration of new algorithms.
  • Implemented the MCTS algorithm within the engine (a simplified sketch follows this list).
  • Used the benchmarking utility to evaluate MCTS performance against existing agents like Minimax, MTDF, and Random.
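
For reference, here is a minimal C++ sketch of the UCT-style MCTS loop (selection, expansion, random playout, backpropagation). It does not use MankalaEngine's actual board or agent interfaces; a tiny "take 1-3 stones, last move wins" game stands in for a Mancala position so the example stays self-contained, and all names (State, Node, mctsBestMove) are illustrative only.

    // Minimal UCT-style MCTS sketch; a stand-in game replaces the board.
    #include <cmath>
    #include <cstdio>
    #include <memory>
    #include <random>
    #include <vector>

    struct State {
        int stones = 21;  // stand-in position, not a Mancala board
        int player = 0;   // player to move (0 or 1)
        bool terminal() const { return stones == 0; }
        std::vector<int> moves() const {
            std::vector<int> m;
            for (int k = 1; k <= 3 && k <= stones; ++k) m.push_back(k);
            return m;
        }
        State play(int k) const { return {stones - k, 1 - player}; }
    };

    struct Node {
        State state;
        Node *parent = nullptr;
        int move = -1;  // move that led to this node
        double wins = 0;
        int visits = 0;
        std::vector<std::unique_ptr<Node>> children;
        std::vector<int> untried;
        explicit Node(State s, Node *p = nullptr, int m = -1)
            : state(s), parent(p), move(m), untried(s.moves()) {}
    };

    // Selection: descend while fully expanded, maximizing the UCT value
    // w/n + c * sqrt(ln(N) / n).
    Node *select(Node *n, double c = 1.41) {
        while (n->untried.empty() && !n->children.empty()) {
            Node *best = nullptr;
            double bestVal = -1e18;
            for (auto &ch : n->children) {
                double v = ch->wins / ch->visits +
                           c * std::sqrt(std::log(n->visits) / ch->visits);
                if (v > bestVal) { bestVal = v; best = ch.get(); }
            }
            n = best;
        }
        return n;
    }

    int mctsBestMove(const State &root, int iterations, std::mt19937 &rng) {
        Node rootNode(root);
        for (int i = 0; i < iterations; ++i) {
            Node *n = select(&rootNode);
            // Expansion: add one randomly chosen untried child.
            if (!n->untried.empty()) {
                std::uniform_int_distribution<size_t> d(0, n->untried.size() - 1);
                size_t idx = d(rng);
                int mv = n->untried[idx];
                n->untried.erase(n->untried.begin() + idx);
                n->children.push_back(
                    std::make_unique<Node>(n->state.play(mv), n, mv));
                n = n->children.back().get();
            }
            // Simulation: uniformly random playout to the end of the game.
            State s = n->state;
            while (!s.terminal()) {
                auto m = s.moves();
                std::uniform_int_distribution<size_t> d(0, m.size() - 1);
                s = s.play(m[d(rng)]);
            }
            int winner = 1 - s.player;  // the side that made the last move
            // Backpropagation: credit a win to the player who moved into
            // each node along the path back to the root.
            for (Node *p = n; p; p = p->parent) {
                p->visits++;
                if (1 - p->state.player == winner) p->wins++;
            }
        }
        Node *best = nullptr;  // play the most-visited root child
        for (auto &ch : rootNode.children)
            if (!best || ch->visits > best->visits) best = ch.get();
        return best ? best->move : -1;
    }

    int main() {
        std::mt19937 rng(42);
        printf("MCTS suggests taking %d stones\n",
               mctsBestMove(State{}, 5000, rng));
    }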

Week 5-6: Evaluation Results and a Shift to Machine Learning Techniques

  • Assessed the performance of MCTS: it performed poorly against strong agents (Minimax and MTDF searching at depth 7).
  • Explored techniques to improve MCTS, but results remained unsatisfactory.
  • Shifted focus toward machine learning techniques, especially reinforcement learning (Q-learning).

Week 7-8: Implementing and Training Q-learning, and Reviewing the Pallanguli Implementation

  • Reviewed the Pallanguli implementation merge request by Srisharan V S and left some comments.
  • Implemented and trained a Q-learning agent in MankalaEngine (a sketch of the core update rule follows this list).
  • Evaluated the performance of the Q-learning agent using the benchmarking utility; the results showed a significant improvement over MCTS, though some weaknesses remained.
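
The agent's actual training details (state encoding, reward design, schedules) are not reproduced here; the snippet below is only a minimal C++ sketch of the tabular Q-learning update Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The 64-bit state key and the six actions (one per pit on a player's side) are assumptions made for the example.

    // Tabular Q-learning update sketch (core rule only).
    #include <algorithm>
    #include <array>
    #include <cstdio>
    #include <unordered_map>

    using QRow = std::array<double, 6>;                  // one value per pit
    using QTable = std::unordered_map<long long, QRow>;  // state -> action values

    double maxQ(QTable &q, long long s) {
        QRow &row = q[s];  // rows are zero-initialized on first access
        return *std::max_element(row.begin(), row.end());
    }

    // One learning step: action a was taken in state s, reward r observed,
    // and the game moved to sNext. A terminal transition drops the
    // bootstrapped term.
    void qUpdate(QTable &q, long long s, int a, double r, long long sNext,
                 bool terminal, double alpha = 0.1, double gamma = 0.95) {
        double target = terminal ? r : r + gamma * maxQ(q, sNext);
        q[s][a] += alpha * (target - q[s][a]);
    }

    int main() {
        QTable q;
        // Hypothetical transition: in state 42, playing pit 3 gave a
        // reward of 1.0 and led to state 97.
        qUpdate(q, 42, 3, 1.0, 97, false);
        printf("Q(42, 3) = %.3f\n", q[42][3]);
    }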

Week 9-10: Optimization and Refinement of Q-learning

  • Conducted experiments to improve Q-learning through techniques such as epsilon decay (see the sketch after this list), increased reward incentives, and training against stronger agents (Minimax, MTDF).
  • Compared refined results with earlier outcomes to measure improvements.
  • Created a merge request enabling users to input custom initial counters for the Pallanguli variant.
  • Finalized performance tests and documented the outcomes.
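
As an illustration of the epsilon-decay experiment, here is a small self-contained C++ sketch of an epsilon-greedy training loop with a multiplicative decay schedule; the constants (start value, decay rate, floor) are assumptions, not the values used in the actual experiments.

    // Epsilon-decay sketch for epsilon-greedy training.
    #include <algorithm>
    #include <cstdio>

    int main() {
        double epsilon = 1.0;        // start fully exploratory
        const double decay = 0.999;  // per-episode decay factor
        const double epsMin = 0.05;  // exploration floor
        for (int episode = 1; episode <= 5000; ++episode) {
            // ... play one training game here, picking a random move with
            // probability epsilon and the greedy Q-move otherwise ...
            epsilon = std::max(epsMin, epsilon * decay);
            if (episode % 1000 == 0)
                printf("episode %d: epsilon = %.3f\n", episode, epsilon);
        }
    }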


My Blogs

Read more about my SoK journey in the following blogs: