Execute the BACKPROPAGATION phase of MCTS to update node statistics from leaf to root
You are executing the BACKPROPAGATION phase of Monte Carlo Tree Search.
Call mcts_backpropagate with:
node_id: The leaf node where simulation ended
reward: The reward from simulation
path: (optional) Explicit path to update

The tool returns:
nodes_updated: List of updated node IDs
new_statistics: Updated Q and N for each node
tree_depth: Current maximum depth

For each node in the path from leaf to root:
node.N += 1
node.Q += reward
node.avg_reward = node.Q / node.N
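
The update rule above can be sketched in Python as follows. This is a minimal illustration, not the implementation behind mcts_backpropagate: the `Node` class, its `parent` link, and the `backpropagate` function are hypothetical names introduced here. The optional `gamma` argument is an assumption that covers the discounted form given further down, with the leaf itself counted as `depth_from_leaf = 0`; leaving it at 1.0 reproduces the plain update.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; field names follow the update rule above."""
    node_id: str
    parent: Optional["Node"] = None
    N: int = 0               # visit count
    Q: float = 0.0           # cumulative reward
    avg_reward: float = 0.0  # Q / N, refreshed on every update


def backpropagate(leaf: Node, reward: float, gamma: float = 1.0) -> List[str]:
    """Walk from leaf to root, updating N, Q, and avg_reward on each node.

    gamma=1.0 gives the plain update; gamma<1.0 scales the reward by
    gamma ** depth_from_leaf (assumed to be 0 at the leaf itself).
    Returns the IDs of the updated nodes in leaf-to-root order.
    """
    updated = []
    node = leaf
    depth_from_leaf = 0
    while node is not None:
        node.N += 1
        node.Q += reward * (gamma ** depth_from_leaf)
        node.avg_reward = node.Q / node.N
        updated.append(node.node_id)
        node = node.parent
        depth_from_leaf += 1
    return updated


# Usage: a three-node chain root -> child -> leaf, one simulation with reward 1.0.
root = Node("root")
child = Node("child", parent=root)
leaf = Node("leaf", parent=child)
updated = backpropagate(leaf, reward=1.0)
```

Following parent links keeps the sketch short; a variant that takes the explicit `path` argument would iterate over that list instead.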
For the current context: $ARGUMENTS
With a discount factor γ, the Q update becomes:
node.Q += reward * (γ ^ depth_from_leaf)

After backpropagation:
After updating, check:
After backpropagation, report:
If continuing, return to SELECTION phase. If converged or budget exhausted, extract the solution.