Background: The chimpanzee genome (Pan troglodytes) was sequenced to 4X coverage using a West African male chimp known as "Clint" from the Yerkes Primate Research Center (Atlanta, GA), in collaboration with the Broad Institute of the Massachusetts Institute of Technology and Harvard University (Cambridge, MA). The genome sequencing strategy employed plasmids, fosmids and BAC end sequences. The sequenced reads were assembled using the ARACHNE and PCAP (Genome Res. 13(9):2164-70 2003) assembly software tools. The assembly described here (Pan_troglodytes-1.0) is the PCAP version. This 4X PCAP assembly was filtered for all known non-chimp sequence contaminants. The ARACHNE assembly was used for the analysis that appeared in the September 1, 2005 issue of Nature. Ongoing sequence improvement efforts at WUGSC will further enhance this 4X draft assembly. For questions regarding this chimp assembly, please visit our existing chimp genome web page and contact the designated person for chimp. Funding for the sequence characterization of the chimp genome was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).
Production Sequencing statistics: Pan_troglodytes-1.0

total reads input: 21466717
total reads placed: 18728807
total reads unplaced: 2737910
chaff rate: 0.13
total input phred20 bases: 12486637483
estimated genome size: 3000000000
total contig length added sum: 2687769690
phred >20 bases sequence redundancy: 4.16 X
total input reads in bases: 17125154663
average phred >20 per read: 581.67
average base pair read length: 797.75

Sequence assembly statistics:

total contig number: 435593
maximum contig length: 144170
major contig (> 1kb) number: 400289
total supercontig number: 81459
maximum supercontig length: 11937040
major supercontig (> 1kb) number: 67734

based on the added contigs sum
contig N50 length: 13096, contig N50 number: 58492
supercontig N50 length: 2344823, supercontig N50 number: 308

total GC counts in the genome: 1095327134 percentage: 0.41
total AT counts in the genome: 1591983718 percentage: 0.59
total NX counts in the genome: 458838
The total repeats bases exceed threshold 50 is about 0.15 % of the total bases

total mate pairs forward reverse constraints: 10037101
total unsatisfied constraints excluding due to singleton, short supercontig, and supercontigs end: 89445
total unsatisfied rate: 0.89 %

No. of satisfied constraints in contigs: 4189805
No. of unsatisfied in distance in contigs: 21952
No. of satisfied links in scaffolds: 3249474
No. of unsatisfied in dist. in scaffolds: 152310
No. of unsatisfied due to singlets: 1724749
No. of unsatisfied due to short scaffolds: 524321
No. of unsatisfied due to scaffold ends: 31667
No. of other unsatisfied constraints: 89445
No. of redundant constraints: 53378
Total no. of satisfied constraints: 7439279
Total no. of unsatisfied constraints: 2544444
Total no. of constraints: 10037101
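
Several of the figures above appear to be derived from the raw counts. The sketch below recomputes them as a consistency check. The formulas are inferred from the reported numbers, not taken from PCAP documentation, and all variable names are illustrative.

```python
# Raw counts copied from the statistics above.
reads_input = 21466717
reads_placed = 18728807
reads_unplaced = 2737910
phred20_bases = 12486637483
genome_size = 3000000000          # estimated genome size
contig_length_sum = 2687769690
input_bases = 17125154663
gc_count = 1095327134
at_count = 1591983718
nx_count = 458838

# Assumed formulas: each ratio below is a guess that reproduces the reported value.
chaff_rate = reads_unplaced / reads_input            # fraction of reads not placed
redundancy = phred20_bases / genome_size             # phred >20 coverage, in X
avg_q20_per_read = phred20_bases / reads_input
avg_read_length = input_bases / reads_input
gc_fraction = gc_count / (gc_count + at_count)       # NX bases excluded
at_fraction = at_count / (gc_count + at_count)

# Mate-pair constraint counts copied from the statistics above.
satisfied = [4189805, 3249474]                       # in contigs, in scaffolds
unsatisfied = [21952, 152310, 1724749, 524321, 31667, 89445]
redundant = 53378
total_constraints = 10037101

total_satisfied = sum(satisfied)
total_unsatisfied = sum(unsatisfied)
# Assumed definition: "other" unsatisfied constraints over all constraints.
unsatisfied_rate_pct = 100 * unsatisfied[-1] / total_constraints

print(f"chaff rate: {chaff_rate:.2f}")
print(f"phred >20 redundancy: {redundancy:.2f} X")
print(f"average phred >20 per read: {avg_q20_per_read:.2f}")
print(f"average read length: {avg_read_length:.2f}")
print(f"GC: {gc_fraction:.2f}  AT: {at_fraction:.2f}")
print(f"unsatisfied rate: {unsatisfied_rate_pct:.2f} %")
```

Under these assumptions the counts are internally consistent: placed plus unplaced reads equal the input reads, GC + AT + NX equals the total contig length, and satisfied + unsatisfied + redundant constraints equal the total number of constraints.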