Update: The EBU believe that their robot NGS data are unreliable.
When I queried the EBU after publishing this post, they stated that the data they receive do not reliably distinguish between basic and advanced BBO robots. We therefore cannot compare the two types of robot using EBU NGS data as described below. The idea of using pairs of identical robots playing together to obtain a single-player NGS rating is valid, but the BBO data do not identify the type of robot used. The person relaying the data to the EBU has to set the EBU number that identifies the robot, and this step is not being done reliably. The original post continues below:
When you play on Bridge Base Online you can play with or against bridge-playing programs instead of real people. These programs are called robots and come in two types – Basic or Advanced. When you want a robot to make up the numbers at a table you can rent one. A basic robot costs 2.99 BB$ per week while an advanced robot costs 1.99 BB$ per day. Only one system is described for the robots, and the only clue about the difference between them is the statement "Advanced robots are usually better, if slower". It therefore looks as if advanced robots use more complex algorithms that need more computing time.
Given that an advanced robot costs more than a basic robot (nearly 5 times as much on a cost-per-hour basis), you might expect a decent advantage in performance – but can we measure it?
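The "nearly 5 times" figure follows from the quoted prices. As a quick sanity check of the arithmetic (the prices are from the post above; the conversion to an hourly rate is mine):

```python
# Cost-per-hour comparison of the two robot rental options,
# using the quoted prices: 2.99 BB$/week basic, 1.99 BB$/day advanced.
basic_per_hour = 2.99 / (7 * 24)    # basic: price per week / hours per week
advanced_per_hour = 1.99 / 24       # advanced: price per day / hours per day
ratio = advanced_per_hour / basic_per_hour

print(f"Basic:    {basic_per_hour:.4f} BB$/hour")
print(f"Advanced: {advanced_per_hour:.4f} BB$/hour")
print(f"Advanced costs {ratio:.2f}x as much per hour")  # about 4.66x
```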
We should be able to since, somewhat surprisingly, the robots have an EBU NGS rating! For those who don't know what this is, the English Bridge Union (EBU) runs a National Grading System (NGS) to rate how well its members play. Ratings are based on members' weighted average percentage results over the last 2000 boards played for which results have been uploaded to the EBU.
The possible grades run through the cards from a deuce to an ace, with the four ace grades split into the individual suits and the Ace of Spades as the highest possible grade. Currently only 24 people hold this highest grade. It was originally stated that the average club player would be a grade 8 (49% to 51%), but the average rating is currently a little below 49%, probably because the number of new and improving players skews the distribution to lower values.
You can look up NGS ratings here. On a computer, just enter a registered EBU player’s name in the search box at the top of the page and press return. On a smartphone or tablet you will probably need to click the hamburger menu (three horizontal lines) to access the search box.
To search for the BBO robots you need to enter 'Basic Bot A' or 'Advanced Bot A' (without the quote marks). Today (5th October 2023) I find that Basic Bot A has a rating of 60.15% while Advanced Bot A has a rating of only 59.35% – slightly worse than the Basic robot!
Clearly, we need to ask how fair this comparison is and whether the difference is significant. The results used to generate a player's rating come from the performances of pairs of players, but the NGS produces an individual rating for each player. It does this by assuming that the rating of a pair of players is the average of their individual ratings.
For example, suppose Alice and Bob usually average 50% while Alice and Charlie average 55% and Bob and Charlie average 45%. If we assume that each pair rating is the average of the two individual ratings, we can match these pair ratings by assigning Alice an individual rating of 60%, Bob 40% and Charlie 50%.
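The Alice/Bob/Charlie example is just a system of three linear equations in three unknowns, and it has a neat closed-form solution: the three pair averages sum to the three individual ratings, so each individual rating drops out by subtraction. A minimal sketch (the helper function name is mine, not part of the NGS):

```python
def solve_individual(ab, ac, bc):
    """Given pair averages for (Alice, Bob), (Alice, Charlie) and
    (Bob, Charlie), return the individual ratings (alice, bob, charlie),
    assuming each pair rating is the mean of the two individual ratings.

    Since ab = (a+b)/2 etc., the sum ab + ac + bc equals a + b + c,
    so each individual rating is that total minus twice the pair
    average of the *other* two players.
    """
    total = ab + ac + bc  # equals alice + bob + charlie
    return total - 2 * bc, total - 2 * ac, total - 2 * ab

alice, bob, charlie = solve_individual(50, 55, 45)
print(alice, bob, charlie)  # 60 40 50, matching the example above
```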
This is a questionable assumption when the difference in ratings is large, and it may be a source of error if the people who play with the two different types of robot have different distributions of ability. In the case of robots we can avoid the assumption entirely, since a robot can partner a copy of itself without cheating! We should therefore be able to generate a genuine individual rating.
If you look up Basic Bot A you will see its overall grade on the left (60.15% on 5th October). On the right, however, you also see the grades that the robot achieves with the partners it plays with frequently, and at or near the top of the list is Basic Bot B. This is another copy of the same program, so the figure here is a truly individual rating, which today is 60.82%. Doing the same thing for Advanced Bot A playing with Advanced Bot B we find 58.28%, so in this more reliable comparison we see a larger difference of about 2.5%.
The question, however, is whether this is a significant difference or just the result of random fluctuations. The NGS documentation states that the standard deviation of results is 2% and, from observation of my own results, I find that my own rating goes up or down by a couple of percent over the months, so the claimed standard deviation is reasonable.
My final conclusion is, therefore, that the NGS rating of the basic and advanced robots is essentially the same within the variability of the results. There is certainly no evidence from these data that it is worth paying extra for advanced robots.
I have my doubts, however, about whether these data are reliable. The results come from EBU events run on BBO and from events run by EBU virtual clubs such as the Axe Virtual Bridge Club, which I help to run. We, in common with other virtual clubs, use pairs of robots in our events to avoid sit-outs, and I believe that these are always advanced robots. I don't see how data can be generated for a basic robot playing with itself. I do know that submission of the BBO results to the EBU by the event director requires the entry of the robot's EBU number, and I suspect that some directors may be using the wrong EBU numbers. I will discuss this with the EBU and BBO and report on their comments.