Open Division Final
San Mateo, California
16 March 1997
11 Point Match
Ron Karr (Black) vs. Richard McIntosh (White)
Score: 4 - 4
Analysis by Nick Ballard
On Sunday, March 16th, Backgammon by the Bay (directed by Beth Skillman and Joan Clark) held its monthly tournament. Richard McIntosh faced Ron Karr in the final round.
Richard McIntosh started playing backgammon in late 1994, and discovered the r.g.b. newsgroup and FIBS a year later (his handle is "rambeau"). He has played in every BGBB tournament since the second one (March '96), shortly after which he volunteered to develop and maintain this WWW site. Besides the two BGBB Intermediate wins that propelled him into the Open division, he won the recently concluded Zapped #1 on GamesGrid.
Richard is a confirmed UNIX and Macintosh zealot, and an internet user since 1978, back in the dark ages of character-mode FTP and 300 baud modems over what was then called the ARPA-net. In the 70s, he ran marathons, played basketball, packed in the Sierras, and studied statistical methods and computer modeling of voting behavior. In the 80s he gravitated to software development, and in the 90s to customer satisfaction and business metrics.
Earlier in this tournament (round 3, the quarter-finals), I was given the opportunity to be impressed by Richard, with whom I was paired. After a long game he took a 1-0 lead. The next game was a seesaw battle in which he redoubled me, but fortune turned the tides and we reached a bearoff in which he suddenly faced an 8 cube. He had the choice of being down 1-4, or letting the conclusion of this bearoff dictate whether he would trail 1-8 or win the match. His take point was 28%, but his chance of winning this particular position was only 25%.
Richard says he did not do all the math, but felt that our skill difference made this a straightforward take-and-pray situation. This reasoning is simultaneously flattering and frustrating -- he was clever enough to realize it was not in his interest to go with the theoretical odds only to be ground down. Richard was rewarded with a 6-6 the following roll. He went on to defeat his next opponent, to face Ron in the finals.
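The take decision described above can be framed as a small match-equity calculation: compare what passing is worth against the two possible outcomes of taking. The sketch below is illustrative only. The function name `take_point` is hypothetical, and the match-winning chances fed to it are placeholder values chosen only so that the result reproduces the 28% take point quoted above; they are not taken from a match equity table or from the original analysis.

```python
def take_point(mwc_pass: float, mwc_take_lose: float, mwc_take_win: float) -> float:
    """Minimum game-winning chance at which taking is as good as passing.

    All arguments are match-winning chances (0..1) for the player
    deciding whether to take the cube.
    """
    risk = mwc_pass - mwc_take_lose   # cost of taking and losing, versus passing
    gain = mwc_take_win - mwc_pass    # profit of taking and winning, versus passing
    return risk / (risk + gain)


# Placeholder match-winning chances (assumed for illustration only):
MWC_TRAILING_1_4 = 0.352   # after passing
MWC_TRAILING_1_8 = 0.100   # after taking and losing
MWC_MATCH_WON = 1.000      # after taking and winning

needed = take_point(MWC_TRAILING_1_4, MWC_TRAILING_1_8, MWC_MATCH_WON)
actual = 0.25  # winning chance of the position, as stated in the text

print(f"take point: {needed:.2f}, winning chance: {actual:.2f}")
print("take" if actual >= needed else "pass (on the odds alone)")
```

On the odds alone the comparison favors passing, which is why the take rested on the skill-difference argument rather than on the arithmetic.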
Ron Karr played backgammon on and off from 1978 to 1989 (and won the intermediate jackpot in the 1986 Nevada State tournament). He resurfaced when he started playing on FIBS in December 1994 (his handle is "ronkarr"), and later placed 2nd in the Championship Consolation of the November 1996 Las Vegas Open.
Professionally, Ron once made a living as a card-counter at blackjack (1976-1983). He now works at Apple Computer, writing technical documentation for software developers. At one time, he played a lot of duplicate bridge, but more recently his hobbies include piano (jazz, pop), and writing (philosophy).
I should make clear in advance that I have never worked directly with any programs. My knowledge of JellyFish's strengths and weaknesses is based only on having been present at several computer backgammon conversations amongst top players (some Jellyficionados, some not). Incidentally, I'm grateful to Malcolm Davis for having proffered the most guidance to me in this regard, but just to protect the man from ridicule, please understand that my views are not necessarily his.
The following example illustrates what can happen when JellyFish is followed blindly, without applying common sense. This position (also shown in the April/May Chicago Point) was reached in an advanced round of March's Copenhagen tournament (Mike Svobodny was Black, Perry Gartner was White):
Score: Black needs 3, White needs 2
Play A: Black to play 2

Candidate Plays   Equities
22/20             +1.000
7/5               +0.950
5/3               +0.942
Black, who rolled 3-2, has entered with a 3 and has a 2 to play. The three choices are to move the builder to the 3 point, stack on the 5 point, or come up to the 20 point.
Elliott Winslow did only a short rollout using JF 2.02 (seed 32, 1296 games, settlement 0.550), for respective Black equities of +0.942, +0.950, and +1.000, but apparently there were longer rollouts with similar data. Such data "prove" that Black's best play with the deuce is to come up to the 20 point!
How can this be? Doesn't Black want to pick up a second checker? Easing the gammon pressure on White flies in the face of logic. Can it really be that Black loses so many games cracking his six-prime through awkward sequences (allowing White ace followed by 6)?
Without having watched JF play, I speculate; but I strongly suspect the answer is that it misplays -- it does not maximize gammons by leaving the ace point slotted for Black after hitting there. JF just has Black close White out at the first available opportunity.
If so, JF will lose out on a plethora of gammons, and naturally will support the play that wins a tad more games. Coming up to the 20 point will rate best because Black will never get trapped. Stacking on the 5 point will rank second because it doesn't risk 4-3 and 4-4 cracking when White keeps the bar point.
At the table, Black chose to slide to the 3 point. Perhaps stacking is a hair better, perhaps it isn't. Black risks losing to more nightmare sequences, but hits on the ace point more often, which, with correct play, should lead to an increased gammon percentage.
[It is conceivable that coming up to 20 is correct anyway because the improved 7-5 distribution of the spares is more important for recirculation than staying back on the 22 is valuable in the short term. In this case, JF would luck out, getting the right answer for the wrong reason. In other words, if JF chooses coming up to 20 over not playing the deuce, it would surely be demonstrably wrong.]
The point here is that I do not believe JF's rollouts of the illustrated position can be trusted, and that it would seem reasonable to question JF's rollouts of any position in which a rolling prime is involved.
You have seen an example of what can happen if a prime that only has one pip left to be rolled forward is mishandled. Now back up the prime, and you can imagine how the error could be multiplied. By making lower board points (particularly the ace point), even when wrong, JF avoids reaching a position where it won't misplay again!
Newer versions of JellyFish continue to emerge and improve, yet still need to fully appreciate the value of a rolling prime. When JF hits a shot from a backgame, its containment policy places too great an emphasis on closing inner board points and too little on forming/strengthening a prime. This means that JF is vulnerable to a greater chance of a fluky escape, and it fails to capitalize on the optimal strategy of generating numbers that will force a second blot -- to maximize the number of such opportunities. As a consequence, JF's rollouts evaluate backgames and (to a lesser degree) deep holding games to be weaker than they are in actual play.
This defect has a retrograde effect on checker moves in earlier, undeveloped positions: JF perceives too much upside and not enough downside in making lower board points. The larger the gap between the ace point and the next point made, the more likely the potential strategical resource of a prime has been lost, and the greater the JF misevaluation could potentially be.
- JF's evaluation of a candidate move should be adjusted favorably towards the side who is more likely to end up with multiple checkers back as a result of vulnerability to hits or a hit exchange.
- Candidate moves which make (or even hit or slot) lower board points (particularly when higher points are gapped, and there is still the potential for that side to reach a backgame-ish position) should be evaluated lower than JF indicates. Look for these themes in the notes to the game below.
How much should one dock "safe" plays and ace-point make/hit/slot moves? One must be careful not to get carried away. JF does, after all, play most positions very well, and to reach backgames or deep holding games often requires parlays.
JF worshippers may argue that such adjustments should be relatively small (or even negligible), and in most cases they're probably right. However, at this point, I do not believe even the top JF addicts have enough experience with the creature to authoritatively determine the size of such adjustments. What is required is a comparison of JF rollouts with very extensive top-player rollouts in a variety of positions.
In these notes, candidate moves will be written in shorthand notation, listing only the checkers' destinations. For example, when the opening move "24/18 13/11" appears next to the diagram, I refer to it as "18,11." "JellyFish" is usually abbreviated as "JF."
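The shorthand described above amounts to keeping only the point each checker lands on. A minimal sketch follows; the function name `shorthand` is hypothetical, and the sketch assumes plain "from/to" notation with no hit markers or repeat counts, which it does not attempt to handle.

```python
def shorthand(move: str) -> str:
    """Reduce a move such as '24/18 13/11' to its destinations: '18,11'."""
    destinations = []
    for part in move.split():
        # Each part is 'from/to' (possibly 'from/via/to'); keep the final point.
        destinations.append(part.split("/")[-1])
    return ",".join(destinations)


print(shorthand("24/18 13/11"))
```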
And now, without further ado, let the game begin.
The game was recorded on tape and transcribed by Richard McIntosh.
Rollouts were made by Richard McIntosh, using JellyFish Analyzer 2.02. Rollout results show equities for the player on move. Candidate plays were better than or within 0.100 equity of the actual plays, evaluated at level 7. The following plays were judged by the annotator not to warrant separate diagrams:
- Play 12b: 8/2 0.070 weaker than 13/7
- Play 14b: 13/8 6/4 0.057 weaker than 13/6
Parameter values for rollouts on moves were:
- level 5
- 7776 games (36x216)
- horizon 7
- seed 316
Standard deviations of equity estimates were between 0.003 and 0.011.
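One way to read these standard deviations is to ask how large an equity gap between two plays is relative to the noise in the estimates. The sketch below is a rough illustration under loud assumptions: it treats the two rollout errors as independent (rollouts sharing a seed may in fact be correlated), it uses the largest reported standard deviation for both plays, and it treats the 0.070 and 0.057 gaps quoted for Plays 12b and 14b as if they were rollout results, which the text does not confirm. The function name is hypothetical.

```python
import math


def gap_in_standard_errors(equity_gap: float, sd_a: float, sd_b: float) -> float:
    """Equity gap between two plays, in units of the gap's standard error.

    Assumes the two estimates have independent errors; this is an
    assumption of the sketch, not something stated in the source.
    """
    return equity_gap / math.sqrt(sd_a ** 2 + sd_b ** 2)


WORST_SD = 0.011  # largest standard deviation reported for the move rollouts

for label, gap in [("Play 12b", 0.070), ("Play 14b", 0.057)]:
    z = gap_in_standard_errors(gap, WORST_SD, WORST_SD)
    print(f"{label}: gap {gap:.3f} is about {z:.1f} standard errors")
```

Under these assumptions, a gap of several standard errors is unlikely to be sampling noise alone, though it says nothing about any systematic bias in how the program plays the position.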
Parameter values for rollouts on cube decisions were:
- level 5
- 23328 games (36x648)
- full game
- seed 316
- settlement limit 0.550
Copyright © 1996-2010 BackGammon By the Bay