Roam

2026-01-27
Original implementation credits go to Hugo Cisneros

Here lie bottom-up (not top-down) notes.

Peg Solitaire RL

Output

Training Data + Round 1

I ran depth-first search (DFS) on each of the puzzles to extract solutions for the training data:

(kits) z5362216@k201:~/minesweeper $   python3 train_peg_katana.py \
>       --boards english european 6x6 triangle5 triangle6 star \
>       --diamond41-solutions diamond41_solutions.txt \
>       --epochs 400 \
>       --batch-size 128 \
>       --lr 0.001 \
>       --channels 128 \
>       --res-blocks 8 \
>       --attention-layers 3 \
>       --drop-path 0.2 \
>       --label-smoothing 0.1 \
>       --mixup-alpha 0.2 \
>       --ema-decay 0.9995 \
>       --temperature 0.5 \
>       --eval-games 200 \
>       --solutions-per-start 5 \
>       --out models

============================================================
Universal Peg Solitaire Training
============================================================
Started: 2026-01-15 13:27:09

GPU: NVIDIA H200

============================================================
Step 1: Collecting Training Data
============================================================
Boards: ['english', 'european', '6x6', 'triangle5', 'triangle6', 'star']
Solutions per start: 5

Processing english...
Solving english from (3, 3) (32 pegs)...
  Solved in 0.37s, 31 moves
Solving english from (0, 3) (32 pegs)...
  Solved in 0.38s, 31 moves
Solving english from (2, 0) (32 pegs)...
  Solved in 0.15s, 31 moves
Solving english from (1, 3) (32 pegs)...
  Solved in 0.47s, 31 moves
  Total: 2480 training samples from 4 starting position(s)

Processing european...
Solving european from (1, 3) (36 pegs)...
  Solved in 66.04s, 35 moves
  Total: 700 training samples from 1 starting position(s)

Processing 6x6...
Solving 6x6 from (2, 2) (35 pegs)...
  Solved in 4.91s, 34 moves
Solving 6x6 from (0, 0) (35 pegs)...
  Solved in 10.63s, 34 moves
Solving 6x6 from (0, 2) (35 pegs)...
  Solved in 0.02s, 34 moves
Solving 6x6 from (1, 1) (35 pegs)...
  Solved in 54.10s, 34 moves
  Total: 2720 training samples from 4 starting position(s)

Processing triangle5...
Solving triangle5 from (0,) (14 pegs)...
  Solved in 0.00s, 13 moves
Solving triangle5 from (3,) (14 pegs)...
  Solved in 0.00s, 13 moves
Solving triangle5 from (10,) (14 pegs)...
  Solved in 0.00s, 13 moves
Solving triangle5 from (12,) (14 pegs)...
  Solved in 0.00s, 13 moves
  Total: 260 training samples from 4 starting position(s)

Processing triangle6...
Solving triangle6 from (0,) (20 pegs)...
  Solved in 0.00s, 19 moves
Solving triangle6 from (3,) (20 pegs)...
  Solved in 0.02s, 19 moves
Solving triangle6 from (6,) (20 pegs)...
  Solved in 0.00s, 19 moves
Solving triangle6 from (15,) (20 pegs)...
  Solved in 0.02s, 19 moves
  Total: 380 training samples from 4 starting position(s)

Processing star...
Solving star from (0,) (9 pegs)...
  Solved in 0.00s, 8 moves
Solving star from (5,) (9 pegs)...
  Solved in 0.00s, 8 moves
  Total: 80 training samples from 2 starting position(s)

Loading diamond41 solutions from diamond41_solutions.txt...
Loaded 248 solutions for diamond41
Added 1560 diamond41 samples
Saved to models/training_data.json

Total samples: 8180
  6x6: 2720
  diamond41: 1560
  english: 2480
  european: 700
  star: 80
  triangle5: 260
  triangle6: 380

============================================================
Step 2: Training
============================================================
Parameters: 2,653,620

Training for 400 epochs...
Epoch   1/400 | Loss: 4.3549 | Acc: 18.5% | LR: 0.000994
Epoch  10/400 | Loss: 3.1665 | Acc: 35.0% | LR: 0.000501
Epoch  20/400 | Loss: 2.4748 | Acc: 57.4% | LR: 0.001000
Epoch  30/400 | Loss: 2.7241 | Acc: 49.0% | LR: 0.000854
Epoch  40/400 | Loss: 2.6360 | Acc: 52.6% | LR: 0.000501
Epoch  50/400 | Loss: 2.5210 | Acc: 56.0% | LR: 0.000147
Epoch  60/400 | Loss: 2.4930 | Acc: 57.6% | LR: 0.001000
Epoch  70/400 | Loss: 2.3036 | Acc: 61.8% | LR: 0.000962
Epoch  80/400 | Loss: 2.6344 | Acc: 53.1% | LR: 0.000854
Epoch  90/400 | Loss: 2.5161 | Acc: 55.6% | LR: 0.000692
Epoch 100/400 | Loss: 2.1946 | Acc: 64.6% | LR: 0.000501
Epoch 110/400 | Loss: 2.1105 | Acc: 65.4% | LR: 0.000309
Epoch 120/400 | Loss: 2.4915 | Acc: 55.9% | LR: 0.000147
Epoch 130/400 | Loss: 2.2627 | Acc: 62.9% | LR: 0.000039
Epoch 140/400 | Loss: 2.2844 | Acc: 60.1% | LR: 0.001000
Epoch 150/400 | Loss: 2.4334 | Acc: 56.1% | LR: 0.000990
Epoch 160/400 | Loss: 2.4256 | Acc: 58.2% | LR: 0.000962
Epoch 170/400 | Loss: 2.2687 | Acc: 62.9% | LR: 0.000916
Epoch 180/400 | Loss: 2.3684 | Acc: 60.5% | LR: 0.000854
Epoch 190/400 | Loss: 2.2158 | Acc: 61.9% | LR: 0.000778
Epoch 200/400 | Loss: 2.4369 | Acc: 58.2% | LR: 0.000692
Epoch 210/400 | Loss: 2.2957 | Acc: 58.6% | LR: 0.000598
Epoch 220/400 | Loss: 2.1747 | Acc: 65.2% | LR: 0.000501
Epoch 230/400 | Loss: 2.1394 | Acc: 66.1% | LR: 0.000403
Epoch 240/400 | Loss: 2.2170 | Acc: 62.1% | LR: 0.000309
Epoch 250/400 | Loss: 2.5308 | Acc: 54.3% | LR: 0.000223
Epoch 260/400 | Loss: 1.9152 | Acc: 69.7% | LR: 0.000147
Epoch 270/400 | Loss: 2.3018 | Acc: 62.2% | LR: 0.000085
Epoch 280/400 | Loss: 2.1810 | Acc: 62.9% | LR: 0.000039
Epoch 290/400 | Loss: 2.3095 | Acc: 60.3% | LR: 0.000011
Epoch 300/400 | Loss: 2.2366 | Acc: 61.0% | LR: 0.001000
Epoch 310/400 | Loss: 2.5112 | Acc: 55.8% | LR: 0.000998
Epoch 320/400 | Loss: 2.2704 | Acc: 62.5% | LR: 0.000990
Epoch 330/400 | Loss: 2.3118 | Acc: 60.1% | LR: 0.000978
Epoch 340/400 | Loss: 2.2709 | Acc: 61.0% | LR: 0.000962
Epoch 350/400 | Loss: 2.3365 | Acc: 61.4% | LR: 0.000941
Epoch 360/400 | Loss: 2.2527 | Acc: 62.7% | LR: 0.000916
Epoch 370/400 | Loss: 2.2394 | Acc: 57.2% | LR: 0.000887
Epoch 380/400 | Loss: 2.3419 | Acc: 59.7% | LR: 0.000854
Epoch 390/400 | Loss: 2.2350 | Acc: 63.5% | LR: 0.000817
Epoch 400/400 | Loss: 2.3246 | Acc: 55.9% | LR: 0.000778

Training completed in 7.0 minutes
Best accuracy: 74.6%

============================================================
Step 3: Evaluation
============================================================

Temperature = 0.5 (200 games each):
  6x6         :  18.5%
  english     :   0.0%
  european    :  14.5%
  star        :  11.5%
  triangle5   :   0.0%
  triangle6   :   0.0%
  Average     :   7.4%

Temperature = 0 (greedy):
  6x6         :   0.0%
  english     :   0.0%
  european    :   0.0%
  star        :  17.5%
  triangle5   :   0.0%
  triangle6   :   0.0%
  Average     :   2.9%

============================================================
Step 4: Saving
============================================================
Saved PyTorch model to models/peg_universal.pth
Saved ONNX model to models/peg_universal.onnx (10.19 MB)

============================================================
Done!
============================================================
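
The LR column in the log above looks like cosine annealing with warm restarts: it resets to 0.001 at epochs 20, 60, 140 and 300, so each cycle is twice as long as the previous one. The sketch below is a guess reconstructed from the logged values, not code taken from train_peg_katana.py. The first cycle length of 20, the multiplier of 2 and the floor of 1e-6 are all inferred assumptions, and the function name is hypothetical.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=1e-3, eta_min=1e-6, t0=20, t_mult=2):
    """Learning rate after `epoch` completed epochs.

    Assumed schedule (inferred from the log, not from the training script):
    cosine decay from base_lr to eta_min within a cycle, restart at the end
    of each cycle, and every cycle t_mult times longer than the last.
    """
    cycle_len, t = t0, epoch
    # Walk forward through the cycles until `t` falls inside the current one.
    while t >= cycle_len:
        t -= cycle_len
        cycle_len *= t_mult
    cosine = 0.5 * (1 + math.cos(math.pi * t / cycle_len))
    return eta_min + (base_lr - eta_min) * cosine


if __name__ == "__main__":
    for epoch in (1, 20, 30, 50, 130, 290, 400):
        print(f"Epoch {epoch:3d}/400 | LR: {cosine_warm_restart_lr(epoch):.6f}")
```

Under these assumptions the function agrees with the logged LR, up to rounding, at the epochs spot-checked here. That also means the run ended at epoch 400 partway through a cycle, with the LR still at 0.000778 rather than near its floor.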

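This first run-through was underwhelming, though: despite a best accuracy of 74.6% during training, the model solved only 7.4% of games on average at temperature 0.5 and 2.9% under greedy play.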
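
For reference, the DFS used to collect solutions in Step 1 can be sketched roughly as follows. This is a minimal illustration, not the solver in train_peg_katana.py: the (row, col) board encoding, the `solve` function and the dead-position cache are all assumptions made for the example (the log indexes triangle holes with a single number instead).

```python
def solve(cells, pegs, directions):
    """Depth-first search for a jump sequence that leaves exactly one peg.

    cells: set of all holes on the board, as (row, col) tuples.
    pegs: set of holes that currently hold a peg.
    directions: unit steps along which a jump is allowed.
    Returns a list of (src, over, dst) jumps, or None if unsolvable.
    """
    dead = set()  # positions already proven to have no solution

    def dfs(pegs):
        if len(pegs) == 1:
            return []
        if pegs in dead:
            return None
        for (r, c) in sorted(pegs):
            for dr, dc in directions:
                over = (r + dr, c + dc)
                dst = (r + 2 * dr, c + 2 * dc)
                # A jump needs a peg to hop over and an empty hole to land in.
                if over in pegs and dst in cells and dst not in pegs:
                    rest = dfs((pegs - {(r, c), over}) | {dst})
                    if rest is not None:
                        return [((r, c), over, dst)] + rest
        dead.add(pegs)
        return None

    return dfs(frozenset(pegs))


# 15-hole triangle (assumed to correspond to "triangle5"): row r has r + 1 holes.
TRIANGLE5 = frozenset((r, c) for r in range(5) for c in range(r + 1))
TRIANGLE_DIRS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]

if __name__ == "__main__":
    start = TRIANGLE5 - {(0, 0)}  # apex hole empty, 14 pegs
    solution = solve(TRIANGLE5, start, TRIANGLE_DIRS)
    print(len(solution))  # 13 jumps: 14 pegs down to 1
```

Each (position, move) pair along a returned solution would then become one supervised training sample. Caching dead positions keeps the search small on the triangle, but larger boards such as european would likely need move ordering or symmetry pruning on top of this.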


HTTP Status Codes

| Code | Name | Category | Meaning |
|------|------|----------|---------|
| 100 | Continue | Informational | Server received headers, client should send body |
| 101 | Switching Protocols | Informational | Server switching protocols per client request |
| 200 | OK | Success | Request succeeded |
| 201 | Created | Success | Request succeeded, new resource created |
| 202 | Accepted | Success | Request accepted but not yet processed |
| 204 | No Content | Success | Request succeeded, no content to return |
| 206 | Partial Content | Success | Partial GET request succeeded |
| 301 | Moved Permanently | Redirection | Resource permanently moved to new URL |
| 302 | Found | Redirection | Resource temporarily at different URL |
| 304 | Not Modified | Redirection | Resource hasn’t changed since last request |
| 307 | Temporary Redirect | Redirection | Temporary redirect, method preserved |
| 308 | Permanent Redirect | Redirection | Permanent redirect, method preserved |
| 400 | Bad Request | Client Error | Malformed request syntax |
| 401 | Unauthorized | Client Error | Authentication required |
| 403 | Forbidden | Client Error | Server understood but refuses to authorize |
| 404 | Not Found | Client Error | Resource doesn’t exist |
| 405 | Method Not Allowed | Client Error | HTTP method not supported |
| 409 | Conflict | Client Error | Request conflicts with current state |
| 429 | Too Many Requests | Client Error | Rate limit exceeded |
| 500 | Internal Server Error | Server Error | Generic server error |
| 501 | Not Implemented | Server Error | Server doesn’t support functionality |
| 502 | Bad Gateway | Server Error | Invalid response from upstream server |
| 503 | Service Unavailable | Server Error | Server temporarily unavailable |
| 504 | Gateway Timeout | Server Error | Upstream server timeout |
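
The Category column follows directly from the first digit of the code, and Python's standard library already carries the Name column. A small sketch, where the `CATEGORIES` mapping and `describe` helper are names made up for this example:

```python
from http import HTTPStatus

# Category is determined by the hundreds digit of the status code.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return (code, name, category) for a registered HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return status.value, status.phrase, CATEGORIES[code // 100]


if __name__ == "__main__":
    for code in (200, 404, 503):
        print(describe(code))
```

Unregistered codes raise `ValueError` here. A client that needs to handle them could skip the `HTTPStatus` lookup and fall back on the category alone.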