On 1 August 1997, US Vice President Gore officially announced the creation of a new World Wide Web database that aims to provide powerful new resources to researchers investigating the molecular basis of cancer. The publicly accessible website disseminates data generated by the National Cancer Institute's Cancer Genome Anatomy Project (CGAP), which is intended to drive the detailed molecular characterization of pre-cancerous and malignant cells. A first objective of CGAP is to compile a catalogue, or 'index', of the genes that are expressed in these cells. But what sort of gene-identification data are being collected to form the backbone of this gene-expression index? The answer is expressed sequence tags (ESTs): DNA sequences read from the ends of cDNA molecules. Sequences are to be generated at a rate of 10 000 per week from a large number of cDNA libraries (for a list of libraries, see Ref. 1).