The Drosophila yakuba Tai18E2 strain used in this project was derived from a single inseminated female captured in the Tai rainforest on the border of Liberia and Ivory Coast by D. Lachaise. This line was inbred for 10 generations via full-sib matings, and the genome was sequenced to a depth of ~8x using short-insert plasmid libraries and large-insert fosmid libraries. The paired-end plasmid and fosmid reads were assembled using PCAP (Genome Res. 13:2164-70 2003), followed by two rounds of primer-directed sequence improvement aimed at regions of low-quality sequence and at closing gaps. The data were then re-assembled, and supercontigs were assigned to chromosomes based on their alignment to the D. melanogaster genome. Inversions based on the D. yakuba assembly were then introduced and checked against polytene chromosome banding data. The resulting assembly was submitted to GenBank under the accession AAEU00000000. It can also be downloaded from the GSC ftp site. For answers to questions about this assembly or project, or any other GSC genome project, please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. Funding for the sequence characterization of the Drosophila yakuba genome was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).

Production Sequencing statistics:

project: Drosophila_yakuba-7.1
total reads input: 2244660
total reads placed: 2029679
total reads unplaced: 214981
chaff rate: 0.10
total input phred20 bases: 1522591737
estimated genome size: 160000000
total contig length added sum: 162665134
phred20 bases sequence redundancy: 9.52 X
total input reads bases: 1686318108
average phred20 per read: 678.32
average read length: 751.26

total contig number: 13496
maximum contig length: 1458687
major contig (> 1kb) number: 11806
total supercontig number: 8361
maximum supercontig length: 10394945
major supercontig (> 1kb) number: 7035

based on the added contigs sum:
  contig N50 length: 115562, contig N50 number: 276
  supercontig N50 length: 2455785, supercontig N50 number: 15

total GC counts in the genome: 68773991 percentage: 0.42
total AT counts in the genome: 93891016 percentage: 0.58
total NX counts in the genome: 127

total mate pairs forward reverse constraints: 842649
total unsatisfied constraints excluding due to singleton, short supercontig, and supercontigs end: 34647
total unsatisfied rate: 4.11 %
No. of satisfied constraints in contigs: 556379
No. of unsatisfied in distance in contigs: 4722
No. of satisfied links in scaffolds: 57521
No. of unsatisfied in dist. in scaffolds: 8833
No. of unsatisfied due to singlets: 96473
No. of unsatisfied due to short scaffolds: 35977
No. of unsatisfied due to scaffold ends: 8027
No. of other unsatisfied constraints: 34647
No. of redundant constraints: 40070
Total no. of satisfied constraints: 613900
Total no. of unsatisfied constraints: 188679
Total no. of constraints: 842649
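
Several of the figures above appear to be simple ratios or sums of the raw counts. The sketch below recomputes them as a consistency check. The formulas are inferred from the reported numbers, not taken from PCAP's documentation, and every name in it (`stats`, `derived`, `constraint_totals`, `n50`) is hypothetical. The `n50` helper uses the common definition of N50 (the length of the sequence at which the cumulative sum of lengths, sorted longest first, reaches half the total) and is shown only on made-up lengths, because the individual contig lengths are not part of this report.

```python
# Counts copied from the production sequencing statistics above.
stats = {
    "reads_input": 2244660,
    "reads_unplaced": 214981,
    "phred20_bases": 1522591737,
    "read_bases": 1686318108,
    "estimated_genome_size": 160000000,
    "total_contig_length": 162665134,
    "gc": 68773991,
    "at": 93891016,
    "nx": 127,
    "constraints_total": 842649,
    "other_unsatisfied": 34647,
    "redundant": 40070,
}

# Mate-pair constraint categories as listed in the report.
satisfied = [556379, 57521]                           # in contigs, in scaffolds
unsatisfied = [4722, 8833, 96473, 35977, 8027, 34647]


def derived(s):
    """Recompute the ratio-style figures (formulas are assumptions)."""
    return {
        "chaff_rate": round(s["reads_unplaced"] / s["reads_input"], 2),
        "redundancy": round(s["phred20_bases"] / s["estimated_genome_size"], 2),
        "avg_phred20_per_read": round(s["phred20_bases"] / s["reads_input"], 2),
        "avg_read_length": round(s["read_bases"] / s["reads_input"], 2),
        "gc_fraction": round(s["gc"] / s["total_contig_length"], 2),
        "at_fraction": round(s["at"] / s["total_contig_length"], 2),
        "unsatisfied_rate_pct": round(
            100 * s["other_unsatisfied"] / s["constraints_total"], 2
        ),
    }


def constraint_totals():
    """Sum the constraint categories: (satisfied, unsatisfied, grand total)."""
    sat = sum(satisfied)
    unsat = sum(unsatisfied)
    return sat, unsat, sat + unsat + stats["redundant"]


def n50(lengths):
    """Return (N50 length, number of sequences needed to reach half the total)."""
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:
            return length, count
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    for name, value in derived(stats).items():
        print(f"{name}: {value}")
    print("constraint totals:", constraint_totals())
    # Toy lengths for illustration only; not data from this assembly.
    print("toy N50:", n50([100, 80, 60, 40, 20]))
```

Under these assumed formulas, the recomputed values agree with the reported ones to two decimal places, and the GC, AT and NX counts sum exactly to the total contig length.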