Chess Analysis Project FAQ (Frequently Asked Questions)
Project manager: Dann Corbit
Email: dcorbit@solutionsiq.com
C.A.P. FAQ authors: Dann Corbit; Richard Fowell; Shaun Brewer
C.A.P. Newsgroup http://www.dejanews.com/~c_a_p
Chess Data: ftp://38.168.214.175/pub
The location of this document:
HTML: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.htm
Word97: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.doc
RTF: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.rtf
Text: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.txt
Benefactors (more to come, pending permission):
name | url | notes
Applied Computer Concepts Ltd. | http://www.acc-ltd.demon.co.uk/ | Mark Uniacke has been very helpful in supplying the C.A.P. project with the advanced HIARCS 7 computer program.
Bookup Corporation | http://www.bookup.com/ | Mike Leahy "The Database Man" has been very helpful in supplying excellent Bookup software for our C.A.P. project.
Chess Assistant Co. | http://www.chessassistant.com/ | Victor Zakharov of Chess Assistant has been absolutely instrumental in supplying the C.A.P. team with top-notch database software. Victor has given many hours of special consulting and has even written custom software especially for the project. Victor is also the most active current participant, with a huge collection of machines operating towards project goals. Victor Zakharov is probably the most important and influential member of the C.A.P. team (including the founder).
Dr. Robert M. Hyatt, Ph.D. | http://www.cis.uab.edu/info/faculty/hyatt/hyatt.html | Dr. Robert Hyatt of the University of Alabama at Birmingham has been incredibly helpful, both in supplying the Crafty chess program and also incredibly useful advice to the C.A.P. project.
Pace University | http://www.pace.edu/mainN.html | Pace University has been very helpful in active participation in the project, through use of computer equipment, time and resources. We are very grateful. Peter Knopf and Erich Markert have been instrumental in lending assistance to the project.
Schröder BV | http://www.rebel.nl/ | Ed Schröder has been very helpful in supplying the C.A.P. project with the powerful Rebel 10 chess program.
Current Active Participants
How do I use Rebel to analyze .EPD files?
How do I use Hiarcs to analyze .EPD files?
Why is the project so "Crafty-Centric?"
I don't see my name in the list of contributors, where is it?
I can’t run analyze.cmd. My computer says “Bad command or file name.”
I don’t understand this junk in crafty.rc
Can I use my opening books during analysis?
Can I use my tablebase files during analysis?
Can I use “time cpu” instead of “time elapsed” (the default)?
Can I use sd instead of st?
I shut crafty down by killing the task and all the analysis went away. What should I do?
How can I get the utmost performance out of Crafty?
I get “EG fault: a problem occurred during epdpfga processing.”
Sending Results
The mailer gave me an error. What should I do?
I heard about another mailer. Can I use that?
Since I can’t get the automatic mailers to work, what should I do?
Can I combine result sets?
Workload or processing difficulties
I ran out of time and had to use my machine. Should I start the batch over?
Your daily batches are much too long. Can you shorten them?
Your weekend batches are much too long. Can you shorten them?
I just can’t keep up with all these files you are sending. Should I drop out of the project?
I want out. What do I have to do to quit?
I want back in. What do I have to do to get back in?
The ‘magic’ directory
How can I get to the ‘magic’ directory, and what is stored there?
I’ve been trying to get on to your FTP site for the last couple of days. I can get to the /usr/analyzer directory, but not to any of the subdirectories. Is this intentional?
Generic project miscellaneous goo
What are the goals of the project?
What is the status of the project?
What about using the resources of Distributed.net?
Can I make a project suggestion?
Project ECO:
Project OrAnG UtAn:
Project Apocalypse:
Project Heartwood:
Project Stonewall:
Project Connect The Dots:
Project Brainy-SOC:
Project NOSE:
Project Ghost Games -- the road not traveled:
Project Whirlwind:
Project Bulldozer:
Project "What a War in '24":
Project "Clash of the Titans":
Project "Johnny Walker Takes the Fifth":
As a sanity check have you run any statistics on the data for the current project?
Do I have to use Crafty to do the analysis?
I asked to join, but you never sent me a batch. Where is it?
I thought this project was to study openings. Why does my output have checkmates in it?!
Credit where credit is due: Where does the idea for this project originate?
What is the best way to view C.A.P. data?
What is this 'Brainy' thing and how does it work?
I am finally seeing the ECO project data. Am I to infer that the analysis shows that 1. d4 is white's strongest opening since it's the highest value of (+20)?
"Active Participants" |
Anonymous Mr.
Anthony Crawley {On Break}
Bart Van Hoyweghen
Bernhard Bauer {On Break}
Bert Quijalvo
Bill Murphy
Brandon Beasley {On Break}
Brandon Galbraith
Brian Deane
Brian Schroeder {On Break}
Bruce Ford
Bryan R. Drysdale {On Break}
cell . {On Break}
Dan Andersson
Danniel Corbit
Dave Gomboc
Dave Pantaleo
Derek Adair
Douglas Elznic
Ed Seid {On Break}
Erich Markert {On Break}
Francesco Di Tolla {On Break}
Glenn Frazier
Jeremiah Penery {On Break}
Joao Rita
John Hartley {On Break}
John Perry {On Break}
Johnny McMenamin {On Break}
Larry Applebaum {On Break}
Les Fernandez
Manuel J. Petit de Gabriel
Mark Ping {On Break}
McRiley . {On Break}
Michel Langeveld
Mike McKee
Mr. Anonymous
Paul Walker
Pete Berger {On Break}
Pete Rihaczek
Quenton Fyfe {On Break}
Ricardo Sant'Ana
Richard Fowell {On Break}
Rob Shultz {On Break}
Roger Davis {On Break}
Russ Garratt
S Warren Lohr
Secret Secret
Shaun Brewer
Stefan Hildingstam {On Break}
Terry Bohannon
thewiz . {On Break}
Thomas F. Mooney, III
Tom Davie
Tony Day {On Break}
Victor Zakharov
A few of these are on "unofficial" break…
How do I use Rebel to analyze .EPD files?
Contributed by Shaun Brewer
1. Ensure the EPD file to analyze has the .EPD extension. (Note: analysis is written to this file, so it would be prudent to make a backup first.)
2. If the first line of the EPD file contains TIME=mm:ss, that value is used as the analysis time; otherwise you will be asked to enter a time.
3. Start Rebel10. For the best performance, run it in DOS; this gives approximately a 5% improvement in NPS. Machines with 64 MB or more of RAM may experience problems if you reboot to DOS from Windows 95. To avoid this, shut down Windows 95 and restart in DOS safe mode; you will have to load the mouse drivers yourself. If you have problems with this, email me at Shaun@brewer-1.freeserve.co.uk and I will try to help.
4. Ensure Rebel10 has the following settings. These are Rebel10's default settings apart from the Combination setting. (Settings can be found under the 'Options' menu.)
· Anti-GM set to SMART
· EOC use (Encyclopedia of Chess) set to 'N'
· Combination set to Off. This is probably the only setting you will have to change. Off is the default for game play and therefore this will, I believe, give the best results for normal positions.
5. From the 'Extra' menu select 'Analyze EPD file', then select the file to analyze. (This is a fairly standard Windows-type file selection.)
6. If the 'TIME' line was not present in your EPD file, you will be asked for the time control; otherwise analysis will commence immediately.
Example of an EPD file for Rebel10, set up for a 15-minute analysis. A valid file name would be 'TEST.EPD'.
TIME=15:00
2b3k1/p4qpp/p1p2n2/3P1P2/8/N3P3/r2PQ1PP/5RK1 w - -
2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/4Q1PP/5RK1 w - -
2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/6PP/3Q1RK1 b - -
2b3k1/p4qpp/p1P2n2/5P2/4P3/3P4/1r3RPP/3Q2K1 b - -
Example output (30 second analysis).
TIME=00:30
2b3k1/p4qpp/p1p2n2/3P1P2/8/N3P3/r2PQ1PP/5RK1 w - - ce -317; pv 00:00:18 4.00 -3.18 Na3-b1 c6xd5 Nb1-c3 Ra2-b2 Nc3-d1 Rb2-c2 Nd1-c3 ; c0 Analysis by Rebel 10.0; c1 fixed: ; c2 Key move not found after 30 seconds; c3 Total moves found sofar 0; c4 Total time sofar 30 seconds;
2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/4Q1PP/5RK1 w - - ce -417; pv 00:00:12 6.00 -4.18 Qe2-d2 Qf7-b3 Qd2-e1 Kg8-h8 Qe1-g3 Nf6-e8 ; c0 Analysis by Rebel 10.0; c1 fixed: ; c2 Key move not found after 30 seconds; c3 Total moves found sofar 0; c4 Total time sofar 60 seconds;
2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/6PP/3Q1RK1 b - - ce 508; pv 00:00:22 7.00 5.09 Qf7-c7 Kg1-h1 Qc7xc6 Qd1-a1 Qc6-b7 Rf1-c1 Rb4-b2 ; c0 Analysis by Rebel 10.0; c1 fixed: ; c2 Key move not found after 30 seconds; c3 Total moves found sofar 0; c4 Total time sofar 90 seconds;
2b3k1/p4qpp/p1P2n2/5P2/4P3/3P4/1r3RPP/3Q2K1 b - - ce 493; pv 00:00:23 8.00 4.93 Qf7-b3 Qd1xb3+ Rb2xb3 Rf2-d2 Rb3-c3 Rd2-a2 Rc3xc6 ; c0 Analysis by Rebel 10.0; c1 fixed: ; c2 Key move not found after 30 seconds; c3 Total moves found sofar 0; c4 Total time sofar 120 seconds;
The Rebel homepage is here:
http://www.rebel.nl/edindex.htm
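For planning a run, the TIME=mm:ss header and the number of positions give a rough estimate of the total analysis time, in the same way the "Total time sofar" counter grows in the sample output. The following is a minimal Python sketch of that arithmetic; the function names are made up for illustration and are not part of Rebel.

```python
def parse_time_header(line):
    """Return the analysis time in seconds from a 'TIME=mm:ss' line, or None."""
    line = line.strip()
    if not line.upper().startswith("TIME="):
        return None
    minutes, seconds = line[5:].split(":")
    return int(minutes) * 60 + int(seconds)


def estimate_total_seconds(epd_lines, default_seconds=None):
    """Estimate the run time for a Rebel-style EPD file given as a list of lines."""
    lines = [ln for ln in epd_lines if ln.strip()]
    per_position = parse_time_header(lines[0]) if lines else None
    if per_position is not None:
        lines = lines[1:]  # the header line is not a position
    else:
        per_position = default_seconds  # the time you would type in when asked
    if per_position is None:
        raise ValueError("no TIME= header and no default time given")
    return per_position * len(lines)


sample = [
    "TIME=00:30",
    "2b3k1/p4qpp/p1p2n2/3P1P2/8/N3P3/r2PQ1PP/5RK1 w - -",
    "2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/4Q1PP/5RK1 w - -",
    "2b3k1/p4qpp/p1P2n2/5P2/1r2P3/3P4/6PP/3Q1RK1 b - -",
    "2b3k1/p4qpp/p1P2n2/5P2/4P3/3P4/1r3RPP/3Q2K1 b - -",
]
print(estimate_total_seconds(sample))  # 4 positions x 30 seconds = 120
```

With TIME=15:00 the same four positions would take about an hour.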
Using HIARCS to analyze .EPD files
Contributed by Richard A. Fowell
Suppose you have a file full of .epd problems. To be specific, perhaps Dann Corbit sent you a file and asked you to analyze each problem for 907 seconds.
1) Strip off all the non-epd lines from the file, to get a file that consists of nothing but lines like the six below.
(If you're curious, the last few characters of the first line mean: Black (b) is on the move, White can legally castle on either the king or queen side (KQ), Black can legally castle only on the king side (k), and the side on the move can capture en passant on the f3 square. For the position in the second line, White is on the move (w), neither side can legally castle (-), and there are no legal en passant captures (-).)
1nb1kb1r/1p1p1ppp/1qp2n2/4r3/2PNpP2/P1N1P3/3P2PP/R1BQKB1R b KQk f3
1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - -
1nb1kb1r/1p1p1ppp/1qp2n2/4r3/2P5/P1N1PN2/3P2PP/R1BQKB1R b KQk -
1nb1k2r/1pq1bppp/2pp1n2/1P6/8/2N5/3PPPPP/B2QKBNR w Kk -
1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/2Q2RK1 b - -
1nb1r1k1/1p3ppp/2p1p3/q2p4/2PPn3/2N1PN2/4BPPP/2Q2RK1 b - d3
2) Start up HIARCS, and set up the following settings (do this in the order specified; some commands are order dependent):
Level: Infinite
HIARCS: unselect "Permanent Brain"
HIARCS: Selectivity = 5
HIARCS: Style = Normal
Options: Book = Off
Options: Max Time = 907
(This last setting is what sets the thinking time.)
3) Go to File: Analyse EPD, and select the file with the EPD positions.
4) Go to lunch, work, sleep, whatever. Don't try to run something else while analyzing the problems (no programs in the background, either), or HIARCS won't get the number of CPU cycles intended.
5) Wait for HIARCS to complete. This takes about the number of lines times the thinking time. (When it is done, it will say something like "Your Move. 5 positions done." in the display window at the lower right of the main window.)
6) The .epd file should now be processed. If you look at it in a text editor, it will look something like the lines below. The "ce" stands for "centipawns" (evaluation expressed in 1/100 of a pawn). The "pv" stands for "principal variation", the best line HIARCS found.
1nb1kb1r/1p1p1ppp/1qp2n2/4r3/2PNpP2/P1N1P3/3P2PP/R1BQKB1R b KQk f3 ce -14; pv exf3 Nxf3 Rh5 Be2 Bd6 O-O Qc7 h3 ;
1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - - ce -87; pv Qc2 dxc4 Bxc4 b5 Bd3 e5 Ng5 g6 ;
1nb1kb1r/1p1p1ppp/1qp2n2/4r3/2P5/P1N1PN2/3P2PP/R1BQKB1R b KQk - ce -14; pv Rh5 Be2 Bd6 O-O Qc7 h3 ;
1nb1k2r/1pq1bppp/2pp1n2/1P6/8/2N5/3PPPPP/B2QKBNR w Kk - ce -38; pv bxc6 bxc6 e4 O-O d4 Be6 Bb2 Bg4 ;
1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/2Q2RK1 b - - ce 112; pv dxc4 Bxc4 b5 Qb2 Na6 Be2 Bb7 Ra1 Qb4 Qxb4 Nxb4 d4 Nc2 ;
7) Email the processed file to Dann.
Hiarcs Home page is here:
http://www.acc-ltd.demon.co.uk/
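To make the field layout described in step 1 and the "ce" and "pv" opcodes from step 6 concrete, here is a minimal Python sketch that splits one processed EPD line into its parts. It is only an illustration: the function name and the dictionary keys are invented here, and it assumes simple opcodes without quoted strings containing semicolons.

```python
def parse_epd(line):
    """Split one EPD record into its four position fields and its opcodes."""
    fields = line.split(None, 4)
    placement, side, castling, en_passant = fields[:4]
    rest = fields[4] if len(fields) > 4 else ""
    opcodes = {}
    for operation in rest.split(";"):
        operation = operation.strip()
        if not operation:
            continue
        name, _, operand = operation.partition(" ")
        opcodes[name] = operand.strip()
    return {
        "placement": placement,                                  # piece layout, rank 8 first
        "white_to_move": side == "w",
        "castling": "" if castling == "-" else castling,         # e.g. "KQk"
        "en_passant": None if en_passant == "-" else en_passant, # e.g. "f3"
        "opcodes": opcodes,                                      # e.g. {"ce": "-14", "pv": "..."}
    }


record = parse_epd(
    "1nb1kb1r/1p1p1ppp/1qp2n2/4r3/2PNpP2/P1N1P3/3P2PP/R1BQKB1R b KQk f3 "
    "ce -14; pv exf3 Nxf3 Rh5 Be2 Bd6 O-O Qc7 h3 ;"
)
pawns = int(record["opcodes"]["ce"]) / 100  # centipawns to pawns
print(record["en_passant"], pawns, record["opcodes"]["pv"].split()[0])
```

Here the evaluation of -14 centipawns is -0.14 pawns, and the first move of the principal variation is the en passant capture exf3.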
Why is the project so "Crafty-Centric?"
For those who don't know about Crafty, it is a freely available chess program that runs on a large number of platforms (Unix, Mac, PCs, etc.) and can be used to accomplish all of the project needs. You don't have to use Crafty. If you have another tool that you would prefer to use, that would be just as good or better.
Currently, most of the systems in use are running Crafty. There are a few Hiarcs and Rebel entries as well. Any new system can be used, as long as the EPD output format is documented.
A broad range of systems and software should give better results: where one program is weak, another may be strong, and vice versa.
If you have suggestions as to additional tools to expand the functionality of the project, by all means, send a note to dcorbit@solutionsiq.com and/or user923005@aol.com.
I don't see my name in the list of contributors, where is it?
You probably won't see your name in the list until I receive your first batch of data. If you have sent the data and are still not in the list, please send me an email and I will update the list; it's just an oversight. Also, if you are using Microsoft Internet Explorer, please remember to press the "refresh" button.
I can’t run analyze.cmd. My computer says “Bad command or file name.”
A: That’s my fault. I should have named it “analyze.bat” because Windows 95 does not know what to do with a file with an extension of “.cmd”. If you simply rename the file to analyze.bat, it should run.
I don’t understand this junk in crafty.rc
A: (piece by piece; for more details, get ftp://ftp.cis.uab.edu/pub/hyatt/v15/crafty.doc, which explains the settings in exquisite detail)
st=720
The st command sets the number of seconds spent on each position. This number will be different for each computer. I use the NPS value given by crafty to determine this setting.
time cpu
The default is “time elapsed”, and it turns out that “time cpu” is the better setting. If your computer is multi-tasking a bunch of other programs, this setting ensures that process time is used instead of wall time. It probably won’t make a lot of difference, since Crafty has a remarkable ability to hog cpu, but it could make some difference.
The display settings:
display notime
display nochanges
display novariation
display nostats
display noextstats
display nomovenum
display nomoves
display nogeneral
These display settings turn off the updating of the screen. In the case of the current analysis, they will make little or no difference in the EPD throughput. When the phase of the project to analyze at very short time intervals arrives, they will make a large difference.
log off
Another thing to prevent wasting time. I don’t need the log files, and they just chew up a lot of space if you are going to run crafty frequently. During the initial stages of the project you can leave logging on, if you like to read the logs. But when it is time for short interval analysis, please turn it off.
name ferret
This is necessary to fool crafty into thinking it is playing against a tough computer opponent. It has internal tables of names and this is one that it recognizes as a formidable opponent. If you change it to your name or something else, it will play differently.
hash=64M
This is how much memory crafty uses for position hashing. If you don’t know what to do with this figure, then divide your machine’s memory by two and use that. Crafty will use slightly less than that amount. You can’t make it as big as the machine’s memory, because a bunch of memory is used by the operating system, other programs, etc. If you are a crafty expert and you have a scientifically derived favorite setting, go ahead and use that. Finding an optimal setting will not make enormous improvements, but it can make crafty slightly faster.
hashp=16M
This number is the memory used for hashing of pawns. If you don’t know what value to set it to, just set it to one fourth of hash. This number should be a lot smaller than hash, but it should be at least two, if possible.
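The two rules of thumb above (hash is half of the machine's memory, hashp is one fourth of hash) can be written out as a small Python sketch. The function name is invented, and reading "at least two" as a floor of 2M is an assumption made for this example, not something crafty enforces.

```python
def suggest_hash_settings(ram_mb):
    """Return (hash_mb, hashp_mb) using the rules of thumb from the text."""
    hash_mb = ram_mb // 2             # half of the machine's memory
    hashp_mb = max(hash_mb // 4, 2)   # one fourth of hash; floor of 2M is an assumption
    return hash_mb, hashp_mb


hash_mb, hashp_mb = suggest_hash_settings(128)  # a machine with 128 MB of RAM
print(f"hash={hash_mb}M")
print(f"hashp={hashp_mb}M")
```

For a 128 MB machine this reproduces the hash=64M and hashp=16M lines shown in the sample crafty.rc.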
book random 0
Actually, this one should not make any difference, since we should not be using a book at all.
egtb=4 {for crafty v15 and earlier, or v16 specially compiled…}
egtb {for crafty 16 and above}
If you have the tablebase files, then set this number to the appropriate value. I happen to have tablebase files to level 4. These files occupy an enormous amount of disk space. They are not required for analysis but will make crafty do a better job when we are near the end of a game. If you want to get them, you can find them at Dr. Hyatt’s ftp site:
ftp://ftp.cis.uab.edu/pub/hyatt/TB/
I’m not asking anybody to use them. Use them only if you have a real desire to do so. If you do not have any tablebase files, definitely set this value to zero (0) for Crafty version 15 and earlier.
If you are using Crafty version 16.0 or higher, just say egtb. You don't have to set the level number for the Nalimov endgame tablebase files like you do for the Edwards format tablebase files used in Crafty 15.xx and earlier.
epdpfga epd.epd epd.out
This is where the magic happens. You will notice that I always send you a file called epd.epd. This is a file of EPD positions. You can learn more about EPD, if you are interested, by reading the PGN specification, which is found here:
http://www.clark.net/pub/pribut/standard.txt
and specifically section “16.2: EPD.” The command epdpfga reads epd.epd, sends each row to the crafty engine, and writes the resultant analyzed row to epd.out. The file epd.out is what gets mailed back to me.
exit
This command tells crafty that from now on, commands will be coming from the keyboard instead of from the file crafty.rc.
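Putting the pieces together, the settings walked through above can be assembled into one crafty.rc. The Python sketch below only strings together the commands quoted in this FAQ, in the order they were explained; the function name and its parameters are invented for illustration, and the exact contents of the crafty.rc shipped with the project may differ.

```python
DISPLAY_OPTIONS = [
    "notime", "nochanges", "novariation", "nostats",
    "noextstats", "nomovenum", "nomoves", "nogeneral",
]


def build_crafty_rc(seconds, hash_mb, hashp_mb, egtb_line=None):
    """Return the text of a crafty.rc built from the commands described in the FAQ."""
    lines = [f"st={seconds}", "time cpu"]
    lines += [f"display {option}" for option in DISPLAY_OPTIONS]
    lines += [
        "log off",
        "name ferret",
        f"hash={hash_mb}M",
        f"hashp={hashp_mb}M",
        "book random 0",
    ]
    if egtb_line:
        # "egtb" for Crafty 16 and above, "egtb=4" or "egtb=0" style for v15 and earlier
        lines.append(egtb_line)
    lines += ["epdpfga epd.epd epd.out", "exit"]
    return "\n".join(lines) + "\n"


print(build_crafty_rc(720, 64, 16, egtb_line="egtb"), end="")
```

Only the st, hash, hashp and egtb lines should need changing from one machine to the next.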
Crafty FAQ:
ftp://ftp.cis.uab.edu/pub/hyatt/crafty.faq
Dr. Hyatt's Official Crafty site:
ftp://ftp.cis.uab.edu/pub/hyatt/v16/
Can I use my opening books during analysis?
No, you must not use your opening books during analysis. Crafty will simply skim over any row that is already in its books, spending a tiny fraction of the time it would spend on a normal position. Since we want to carefully analyze anything and everything we look at, please do not use an opening book. If you don’t know what an opening book is, don’t worry about it. I did not send one in the sample files, so if you are using those, there is no difficulty.
Can I use my tablebase files during analysis?
Yes. In this case, it is a very good idea, if you have them. Tablebase files can give crafty a proof of outcome and hence give better choices when in the endgame. However, they consume an enormous amount of space, so if you do not want to use them, don’t bother.
Can I use “time cpu” instead of “time elapsed” (the default)?
Yes. Actually, “time cpu” is much better than the default time setting of “time elapsed.”
Can I use sd instead of st?
Please do not change to sd (which limits the search by number of plies). On some positions crafty will be able to reach 20 or 30 plies; on others, only 10 or 11. If you change to sd, crafty may quit a position much too early, or analyze one position for so long that it never even finishes a single row.
I shut crafty down by killing the task and all the analysis went away. What should I do?
Just send me a note that the batch got killed and we won’t worry about it. When you kill crafty by clicking on the little ‘x’ in the corner, it does not flush the current calculations to disk, so quite likely all the effort will be lost. Next time, type ‘quit’ at the crafty prompt instead. That way, all the work up to that point is saved, and you can mail the part that was complete.
How can I get the utmost performance out of Crafty?
Fiddling with memory hash settings can get you a tiny bit of speed. It’s not really worth the bother unless you are keenly interested in it. If you get the tablebase files, crafty will play much better in the endgame, but that takes a ton of disk space. If you are using Windows NT, then you can issue the command
START /REALTIME CRAFTY2
or something similar to that. However, I do not recommend this, because it will be virtually impossible to do anything else on the machine. Even getting crafty to quit can be a chore. If something else gets focus (a popup from another app, etc.) it can be difficult to get control back to the crafty window. The way to get control back is to get focus on the crafty window and then type ‘quit.’
I get “EG fault: a problem occurred during epdpfga processing.”
If we are talking about an initial test run, then nothing is amiss. The file I sent has a DOS end-of-file marker in it (ASCII 26), which makes epdpfga processing croak. It has already processed the two identical rows by that time. If you get that error during a production batch, then something is wrong. Please send me a note and both the input and output files.
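If you would rather remove the end-of-file marker before running a file, something like the following Python sketch would do it. It is not part of the project tools; it simply treats the first ASCII 26 byte the way DOS does, as the end of the data.

```python
def strip_dos_eof(data: bytes) -> bytes:
    """Drop everything from the first DOS end-of-file marker (ASCII 26) onward."""
    marker = data.find(b"\x1a")
    return data if marker == -1 else data[:marker]


# Two identical rows followed by a Ctrl-Z, like the initial test file described above.
row = b"1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - -\r\n"
raw = row + row + b"\x1a"
clean = strip_dos_eof(raw)
print(len(raw) - len(clean), "byte(s) removed")
```

To clean a real file, read it in binary mode, pass the bytes through strip_dos_eof, and write the result back.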
Sending Results
The mailer gave me an error. What should I do?
Don’t use it. It’s a piece of junk. I thought it would work nicely, since it works great here. It only works for CMC-compliant mail systems such as x.25 and x.400. It fails for most people.
I heard about another mailer. Can I use that?
The other mailer is in the special magic ftp directory. If you have SMTP mail, and if you know the name of your SMTP server, you can give it a try. Most likely, you will just have to attach the file to a mail message manually. One thing I would really like is a generic mailer that will work on all or most systems.
Since I can’t get the automatic mailers to work, what should I do?
Just attach the file epd.out to a mail message manually.
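For anyone who does have SMTP mail and wants to script the attachment, here is a minimal Python sketch using only the standard library. It is not the project mailer; the sender address and the SMTP server name are placeholders you would replace with your own, and the subject line is made up.

```python
import smtplib
from email.message import EmailMessage


def build_result_message(sender, recipient, filename, payload):
    """Build a mail message with the analyzed EPD file attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "C.A.P. results"  # made-up subject line
    msg.set_content("Analyzed EPD batch attached.")
    msg.add_attachment(payload, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


def send_message(msg, smtp_host, port=25):
    """Hand the message to an SMTP server (smtp_host is whatever your provider uses)."""
    with smtplib.SMTP(smtp_host, port) as server:
        server.send_message(msg)


# Placeholder sender and contents; in practice, read the bytes of epd.out from disk.
message = build_result_message(
    "analyzer@example.org", "dcorbit@solutionsiq.com",
    "epd.out", b"1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - - ce -87;\n",
)
print([part.get_filename() for part in message.iter_attachments()])
# send_message(message, "smtp.example.org")  # uncomment with your own SMTP server name
```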
Can I combine result sets?
Please do. Put all Crafty results together into one file called EPD.OUT. Put all Rebel results into a single file called EPD.REB. Put all Hiarcs results into one file called EPD.HIR. This will greatly simplify my processing of the results.
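Combining is just a matter of concatenating the result rows for one engine. The Python sketch below shows one way to do it; the function name is invented, and dropping blank lines and exact duplicate rows is an extra tidy-up chosen for this example, not a project requirement.

```python
# File names requested in the text, one per engine.
ENGINE_FILES = {"crafty": "EPD.OUT", "rebel": "EPD.REB", "hiarcs": "EPD.HIR"}


def combine_results(*batches):
    """Merge several lists of result rows, keeping the first copy of each row."""
    seen = set()
    combined = []
    for batch in batches:
        for line in batch:
            row = line.strip()
            if row and row not in seen:
                seen.add(row)
                combined.append(row)
    return combined


monday = ["pos1 w - - ce 10;", "pos2 b - - ce -5;"]
tuesday = ["pos2 b - - ce -5;", "", "pos3 w - - ce 40;"]
rows = combine_results(monday, tuesday)
print(f"{len(rows)} rows for {ENGINE_FILES['crafty']}")
```

The placeholder rows stand in for real EPD output; results from different engines should still go to their own files.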
Workload or processing difficulties
I ran out of time and had to use my machine. Should I start the batch over?
No. Just send whatever got finished. The stored procedure that generates a batch of epd rows to process will re-issue any rows that are not received back, so it does not matter if some are never received.
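The re-issue logic amounts to a set difference between the rows sent out and the rows that came back. The real work is done by a stored procedure whose code is not shown in this FAQ; the Python sketch below only illustrates the idea, and it assumes that a row is identified by its first four EPD fields.

```python
def position_key(epd_row):
    """Identify a row by its position fields, ignoring any ce/pv opcodes."""
    return " ".join(epd_row.split()[:4])


def rows_to_reissue(issued_rows, received_rows):
    """Return the issued rows for which no analyzed row came back."""
    done = {position_key(row) for row in received_rows}
    return [row for row in issued_rows if position_key(row) not in done]


issued = [
    "1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - -",
    "1nb1k2r/1pq1bppp/2pp1n2/1P6/8/2N5/3PPPPP/B2QKBNR w Kk -",
]
received = [
    "1nb1r1k1/1p3ppp/2p1pn2/q2p4/2P5/2N1PN2/3PBPPP/3Q1RK1 w - - ce -87; pv Qc2 dxc4 ;",
]
print(rows_to_reissue(issued, received))
```

Only the second position would go out again in a later batch.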
Your daily batches are much too long. Can you shorten them?
Yes. Please tell me how much shorter to make them. I can change it to anything you want and it causes no additional work for me. I have a ‘fudge factor’ built into the table, which has a default value of 1. If we change it to 0.5, for instance, your batches will take half as long to run.
Your weekend batches are much too long. Can you shorten them?
Yes. We can shorten them to any size you like. Just let me know how much shorter to make them. I have a fudge factor for these, just like the daily batches.
I just can’t keep up with all these files you are sending. Should I drop out of the project?
If you can’t finish a batch of positions, don’t worry about it. They will get caught up automatically later. It also won’t matter if you send the same batch twice. If you are sick of it, you can drop out if you like. Or you can just ask for a rest. Whatever you like.
I want out. What do I have to do to quit?
Send me an email with REMOVE as the subject. We hate to see you go, but we are very glad of whatever help you did give.
I want back in. What do I have to do to get back in?
If you never sent in a batch, you will have to run the startup tests all over. You will have been deleted from my database, so I will not know anything about your systems. If you have sent in a batch of data, just let me know you want to be reactivated, and I will change a setting.
The ‘magic’ directory
How can I get to the ‘magic’ directory, and what is stored there?
Special instructions for ftp connections to chess project analyzers.
You can log onto my ftp site at ftp://38.168.214.175
then give the username: Analyzer
then give the password: ******* (Send an email to dcorbit@solutionsiq.com for the special password. It is only for team members. Since this FAQ will be found at my ftp site ftp://38.168.214.175/pub where anyone can get at it, I am not telling the password in this FAQ. If you are not a team member, you will not be given this password.)
You would see a session similar to the following:
Connected to 38.168.214.175.
220-Dann Corbit's FTP using WAR ftp daemon WarFTPd 1.70.b01.02 Ready
(C)opyright 1996 - 1998 by Jarle (jgaa) Aase - all rights reserved.
220 Please enter your user name.
User (38.168.214.175:(none)): Analyzer
331 User name okay, Need password.
Password:
230 Welcome, Fellow EPD Analyzers, and thank you!User logged in.
From that point, you can change to a special directory:
ftp> cd /usr/Analyzer
250 "/usr/Analyzer" is current directory.
Notice that you CANNOT change to this directory in two moves as in:
cd /usr
cd Analyzer
since there are no rights granted to the root of the usr directory.
This is the location where project output and information will be stored, for those who would like to see project progress and play with the result sets.
Since this project is public domain, you can do anything you want with the results, even give them to someone else. I do ask that you not give out the user name and password to this special account to non-team members.
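The same session can be scripted. Below is a minimal Python sketch using the standard ftplib module; the helper name is invented, the password shown is a placeholder (ask for the real one by email as described above), and the call that actually connects is left commented out.

```python
from ftplib import FTP


def list_analyzer_directory(ftp, user, password, directory="/usr/Analyzer"):
    """Log in, change to the team directory in a single step, and list it."""
    ftp.login(user, password)
    ftp.cwd(directory)  # one move only: there are no rights on /usr itself
    return ftp.nlst()


# Example use (needs network access and the team password):
# with FTP("38.168.214.175") as ftp:
#     for name in list_analyzer_directory(ftp, "Analyzer", "password-from-Dann"):
#         print(name)
```

Passing the connection in as an argument keeps the helper easy to try out against any object that offers login, cwd and nlst.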
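I’ve been trying to get on to your FTP site for the last couple of days. I can get to the /usr/analyzer directory, but not to any of the subdirectories. Is this intentional?
No, I goofed it up. It has been fixed now. My apologies. Team members now have rights to all subdirectories under /usr/Analyzer.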
Generic project miscellaneous goo
What are the goals of the project?
The goal of the project is to systematically analyze a large database of chess games for tactical errors. This analysis will be made available as public domain. After each phase of the project is complete, team members will have immediate access to the data. I will publish the data at my ftp site three months later. You may do anything you like with it, including transferring it to others immediately if you so desire. The schema for the database will be in standard SQL and will also be released to the public.
What is the status of the project?
Several million rows have been processed. I only have one million in my possession, as the others are still in Russia. I am expecting them to be shipped to me shortly.
In reality, even with the addition of the supercomputer help, we will not realize the true throughput because the actual number of participating units will not be 100% of the physical total. To start with, we will have between four and eight (out of thirty possible) supercomputer units participating. This still gives a total throughput equivalent to between 545 and 945 Pentium Pro 200 machines.
What about using the resources of Distributed.net?
The distributed.net people have been contacted with a proposal. As soon as results become available, I will make a public announcement.
Can I make a project suggestion?
I would be thrilled to receive any and all project suggestions. Once we get the mailing list completed, that would be an excellent place to post them for open discussion. Until that point, you can mail them to me.
Some current suggestions include:
Project ECO: An analysis of every distinct position from the Encyclopedia of Chess Openings standard classified openings. This comprised 4038 unique board positions.
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/ECO/
Project OrAnG UtAn: Calculate the value of positions from 1. b4 (Orangutan) games. This contained 126,082 unique board positions.
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/Orangutan/
System Calibration using a 7000 row EPD test suite. Additional insight may be gleaned by analyzing the correct choice as indicated by 'bm' to see if simulated annealing will find the best move.
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/Apocalypse/
Analysis of the most frequently played chess positions, by order of frequency. First, positions reached at least 100 times in a chess game (there are over 27,000 of these in our database). Then, positions reached at least 75 times, then 50, etc. Imagine a forest of 400 trees (there are 400 combinations for the first move by white and the first response by black). Some are scraggly little shrubs (hardly ever played and not usually successful). Some are mighty Sequoia trees (played thousands or millions of times). We examine the heartwood of the big trees to find out why they grow so tall. The name was changed from Project Onion to Project Heartwood, because the former name "stunk".
Phase 0: All positions achieved 200 or more times [Completed]
Phase 1: All positions achieved 100 or more times [Completed]
Phase 2: All positions achieved 50 or more times [Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/Heartwood/
NOTE: The Brainy-SOC data contains all of Heartwood and is much more complete.
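As a rough illustration of how the Heartwood phases partition positions by frequency, here is a minimal C sketch. The thresholds (200, 100 and 50) come from the phase list above; the function name heartwood_phase and the idea of classifying one occurrence count at a time are assumptions made for this example, not the project's actual tooling.

```c
#include <assert.h>

/* Phase thresholds from the phase list: 200, 100 and 50 occurrences. */
static const long heartwood_threshold[] = { 200, 100, 50 };

/* Hypothetical helper: return the first Heartwood phase (0, 1 or 2)
 * whose threshold the occurrence count meets, or -1 if the position
 * was reached too rarely to belong to any listed phase. */
int heartwood_phase(long times_reached)
{
    int phase;

    assert(times_reached >= 0);
    for (phase = 0; phase < 3; phase++) {
        if (times_reached >= heartwood_threshold[phase])
            return phase;
    }
    return -1;
}
```

A position reached 120 times would fall into Phase 1 under this sketch, since it misses the 200 threshold but meets the 100 threshold.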
An in-depth analysis of the computer killer "Stonewall Attack (ECO D00)"
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/Stonewall/
This is a largely SQL procedure which will attempt to magnify the ply depth by iterative improvement. This could result in preferred variations hundreds of moves in length. The name may not be as exciting as "Simulated Annealing" but it is a lot more descriptive. For any calculated point, we go to the next point and collect the ce and pv from that point to update the old data for this point. We repeat the procedure for every point in the database. We repeat this entire process as many times as we like.
[Programming completed]
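The real procedure is largely SQL and is not reproduced here. The following is only a minimal C sketch of the idea, under several assumptions: the record layout cap_row, the next index (the position reached by the first pv move) and the function names are all hypothetical, and the ce value is assumed to be from the side to move's point of view, so it is negated when backed up one ply. For brevity the sketch tracks only ce and depth; a fuller version would also append the successor's pv to the current one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory record; the project keeps this data in SQL. */
typedef struct {
    int ce;    /* evaluation, hundredths of a pawn, side to move's view */
    int acd;   /* analysis depth in plies */
    int next;  /* index of the position after the first pv move, or -1 */
} cap_row;

/* One pass over every point. Returns the number of rows updated. */
int improve_pass(cap_row *rows, size_t n)
{
    size_t i;
    int changed = 0;

    assert(rows != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int j = rows[i].next;

        if (j < 0 || (size_t)j >= n)
            continue;                    /* successor is not in the table */
        if (rows[j].acd + 1 > rows[i].acd) {
            /* Assumption: the side to move flips, so the sign flips. */
            rows[i].ce = -rows[j].ce;
            rows[i].acd = rows[j].acd + 1;
            changed++;
        }
    }
    return changed;
}

/* Repeat passes until nothing changes or max_passes is reached.
 * Returns the total number of row updates. */
int improve(cap_row *rows, size_t n, int max_passes)
{
    int pass, total = 0;

    for (pass = 0; pass < max_passes; pass++) {
        int changed = improve_pass(rows, n);

        if (changed == 0)
            break;
        total += changed;
    }
    return total;
}
```

The max_passes limit matters because positions can repeat: in a cycle, each pass would keep raising the apparent depth, so the sketch bounds the number of passes rather than relying on convergence.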
"Brainy-SOC" stands for "Brainy - Son of Crafty." The Crafty chess program will be modified to use the C.A.P. database as an opening book. Given some board position, if analysis for that position exists, then the ce value along with win/loss/draw statistics will be given to Crafty for decision making. As the program plays, it will also update the database with any new information. Several computer chess engines now use this data.
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/Brainy/
New Opening Search Experiment will have Brainy-SOC engines play against each other at 12 minutes per move of PII 300 MHz equivalent time. There will be 400 individual games, each one starting with one of the 400 possible opening positions after two plies. This will require 800 computers (or a lesser number with several games each). Since they will have the existing C.A.P. Database, most of the games will run very quickly. [Programming in progress]
For every pv which is suggested, but never has been played, expand the pv a fixed number of moves and analyze the positions suggested by the program.
[Phase 0 -- ECO missed opportunities Completed]
Analyze every single data point at two seconds per position. This will provide tens of millions of data points. While not terribly useful for game playing, these could be valuable for preliminary "connect the dots" experiments.
[In progress]
Analyze positions that have not yet been checked using a "brute force" technique. For each ply starting with a bare board, generate legal positions. For those positions without certain checkmate pending given optimal play, generate the next set of positions. These positions will be reduced by selecting only unique positions and transforming the EPD format to that used by BOOKUP (purely positional).
Analyze each position in the 1924 New York International Chess Tournament. This analysis can be compared and contrasted to the analysis in the on-line book Battle Royale by Steven Lopez:
http://chessbaseusa.com/NY1924/ny1924.htm
This should prove very helpful, because we will have an enormous amount of human analysis on exactly the same topic completed. This may help to show areas where either type of analysis can benefit from the other.
[Completed]
ftp://38.168.214.175/pub/Public_CAP_Results/War_In_24/
Analyze each position in a collection of games between Super GM players rated 2600 and above.
[In progress]
We will analyze all positions from the Scotch Defense, along with variations and related Gambits.
[In progress]
The batches are all carefully scrutinized as a sanity check. If any rerun of a batch shows significant differences from the first run, both batches are carefully reanalyzed by hand.
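A rerun comparison of this kind could be sketched in C as follows. This is an illustration only: what counts as a "significant" difference is not defined here, so the tolerance parameter (in hundredths of a pawn) is a hypothetical knob, and the function name count_disagreements is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Count positions whose evaluation differs by more than `tolerance`
 * (hundredths of a pawn) between the first run and a rerun of the
 * same batch. The two arrays are assumed to list the same positions
 * in the same order. */
size_t count_disagreements(const int *ce_first, const int *ce_rerun,
                           size_t n, int tolerance)
{
    size_t i, count = 0;

    assert(tolerance >= 0);
    for (i = 0; i < n; i++) {
        if (abs(ce_first[i] - ce_rerun[i]) > tolerance)
            count++;
    }
    return count;
}
```

A nonzero count would mark the batch pair as a candidate for reanalysis by hand.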
No. In fact, the use of as many tools as possible will likely yield better analysis. I would like to see Rebel, Hiarcs and other programs used as well. We have only a few Rebel and Hiarcs entries. This sort of cross check should produce superior results to the use of any one tool.
I’m still waiting for your sample run and NPS. If you are using Crafty, you will not receive a batch request until you send me your NPS figure (obtained by running the bench command). If you are running another program, you will not receive a batch until you give me your exact hardware specification.
I am not just sending opening positions, but game positions clear out to the end. This has several implications. First of all, probably 2/3 of the work will do nothing to improve the theory of chess openings. Secondly, the last 1/3 of the moves (towards the end) will probably never be played again, since the odds are stacked against it. Considering these things, it may seem to be a wasted effort. But I want more to come out of the project than simply to improve the understanding of openings. Analysis of middle game and end game play could also benefit. Since the data will eventually be delivered in an ANSI SQL database, you will be able to study whatever you like with it. Also, I hope to extend good openings beyond the current limits. I don't really know where the opening is going to leave off and the middle game is to start. I could impose an arbitrary restriction, but I might be wrong about it. It is much simpler for me to simply fragment the entire set of games into EPD positions and batch them up. After crunching through a few openings this way, I might reevaluate and cut the games off at some sensible point.
The concept for analysis of chess positions is no doubt an ancient one, but a lot of the credit for the concept of this project must be given to "Komputer Korner" from news:rec.games.chess.computer, and specifically the thread entitled "Komputer Korner's World's Greatest Opening book project with CDB" dated 1998/04/12.
Chess Assistant 5.0 will have special features to enable comprehension and study of C.A.P. data. Until that product is released, I suggest using CDB [by Peter Klausler] for smaller subsets of your data that you would like to study. There seems to be a bug in CDB where about 5% of the EPD rows do not import properly, but most of the data can be viewed in a very nice way after importing both the relevant PGN and EPD data into a CDB database. Some of the public release directories listed below also contain CDB database files with the data pre-loaded. There may be other options that I am not aware of.
The 'Brainy' project uses C.A.P. data in a binary database format. The EPD rows are loaded into an abstract data type called a skiplist in a special binary format. This data may then be queried by chess programs or other systems wishing to investigate the data. The initial coding has been completed and the API is formally completed. There are 4 sites beta testing the interface. The data is returned in an array of structures that look like this:
typedef struct {
    short ce;           /* hundredths of a pawn */
    unsigned char acd;  /* Depth in plies */
    char *pm;           /* Predicted move (by computer) */
    char *am;           /* Avoid this move */
    char *bm;           /* Best move */
} epd_info;
where one of these structures is returned for each EPD position requested by the caller.
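To show how a caller might read one of these structures, here is a small C sketch. The epd_info layout is the one given above; the helper epd_describe and its output format are assumptions made for this example and are not part of the Brainy API. It converts ce from hundredths of a pawn into pawns and acd from plies into full moves (two plies per full move).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    short ce;           /* hundredths of a pawn */
    unsigned char acd;  /* depth in plies */
    char *pm;           /* predicted move (by computer) */
    char *am;           /* avoid this move */
    char *bm;           /* best move */
} epd_info;

/* Hypothetical helper: write a one-line summary of an entry into buf.
 * Returns the value returned by snprintf. A missing or empty pm is
 * shown as "-". */
int epd_describe(const epd_info *e, char *buf, size_t size)
{
    const char *pm;

    assert(e != NULL && buf != NULL);
    pm = (e->pm != NULL && strlen(e->pm) > 0) ? e->pm : "-";
    return snprintf(buf, size,
                    "ce %+.2f pawns, depth %d plies (%.1f full moves), pm %s",
                    e->ce / 100.0, (int)e->acd, e->acd / 2.0, pm);
}
```

For an entry with ce 20, acd 13 and pm "d4", this sketch yields a summary reading +0.20 pawns at 13 plies, or 6.5 full moves.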
No, you can infer that with about 7 full moves, the best tactical advantages
are seen. That should not be surprising, since 1. d4 is a real
firecracker compared to the much more ponderous 1. e4 much of the time.
But the value of +20 is 20/100 = 1/5 of a pawn. That is not much, when
you think about it. And the computer analysis does not include strategic
planning of any sort.
Think of this generation of CAP data as "Deep Blunder Checks" rather
than move suggestions. However, this data can be used in combination with
other data to make good move choices. Consider a database where you have
win/loss/draw statistics, and the ELO of players who make the choices. Also, it
includes GM analysis. You can look at the CAP data in conjunction with
this information to make an informed choice of what really is the best move.
The CAP analysis is a tactical dream. Imagine a great GM who does not
miss tactical things and who tirelessly scrutinizes chess positions for
tactical advantage. Will he be able to create millions of entries that
are flawless? Surely not. The CAP
initiative is the only way to arrive at this kind of information. On the
other hand, positional sacrifices and even gambits fool chess programs!
So we need more information to really choose what move to make.
Here is what the chess programs do poorly at:
1. Strategic thinking -- there is *none* because chess programs are
completely incapable of this. Chess programs do not plan. Sometimes
you will see them do something very silly. That is because they are only
looking at combinations and how to form more combinations. They don't
have a real goal at all.
2. Sacrifices and gambits -- if the answer is deep, they won't see
it. The analysis created by CAP averages 13 plies. That is 6 and
1/2 full moves deep. If the benefit is farther than the current analysis
record, it will be completely invisible to a chess program, even though a human
can see it easily.
3. Positional moves -- chess programs do very poorly at this. For a
teaser, try the LCT II EPD test suite against the chess engine of your choice
and let it think all day for each move. It will still get the wrong
answer on several of them.
The CAP data is not an answer -- it is only part of an answer. You will
need more information to choose truly great chess moves. Fortunately, you
can get a package like Chess Assistant 5.0, which will give you all the
information that CAP data lacks. (Thanks to Josh Coleman for this question
-- it has actually been asked [in various forms] many times and each time I
answered it by email)