This directory contains assemblies and other organism-specific data.
This data is freely available, but please observe the WU GSC data policy
(contained in the file DATA_POLICY in this directory) if you download or
use the data, or publish work based on it.

DIRECTORY HIERARCHY
-------------------

1. Organism Class

The top level of this directory structure contains a directory for each
organism class or group the WU GSC has sequenced.  The following
classifications are present:

  Primates                         Primate genomic data
  Other Vertebrates                Non-primate vertebrate genomic data
  Invertebrates                    Invertebrate genomic data
  Plants                           Plant genomic data
  Fungi                            Fungal genomic data
  Microbes                         Microbial genomic data
  Other Single Celled Organisms    Other single-celled organism genomic data


2. Organism

Each of the above directories contains a subdirectory for each organism
of that type.  The name of each of the organism directories follows the
format Genus_species.

3. Genomic Data

Each of the organism directories contains the following subdirectories (if
data of the given type is available for that organism):

  assembly        assembly data
  end_sequences   paired-end sequence data
  genes           predicted gene sequences
  map             fingerprint map data

3.1 Assembly Data

The assembly directory contains a directory for each assembly available
for the organism.  Each assembly has its own version number of the
format M.N and the directory name has the format Genus_species-M.N.  All
assemblies with the same value of M were created from the same set of
genomic data but differ in their assembly parameters and/or in pre- or
post-processing.  Each assembly directory contains an ASSEMBLY file that
describes the assembly in detail.  For more information on the contents
of an assembly directory see the README_ASSEMBLY file.
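
As a rough sketch of the naming scheme above, the following Python
splits an assembly directory name into its parts and groups assemblies
that share the same M (and therefore the same underlying genomic data).
The function names and the regular expression are hypothetical; in
particular, the assumption that genus and species contain only letters
and that M and N are plain integers is not stated in this README.

```python
import re
from collections import defaultdict

# Assumed pattern for Genus_species-M.N; the allowed characters are a guess.
ASSEMBLY_DIR = re.compile(r"^([A-Z][a-z]+)_([a-z]+)-(\d+)\.(\d+)$")


def parse_assembly_dir(name):
    """Split 'Genus_species-M.N' into (genus, species, M, N), or return None."""
    match = ASSEMBLY_DIR.match(name)
    if match is None:
        return None
    genus, species, major, minor = match.groups()
    return genus, species, int(major), int(minor)


def group_by_data_set(names):
    """Group directory names by (organism, M); same M means same genomic data."""
    groups = defaultdict(list)
    for name in names:
        parsed = parse_assembly_dir(name)
        if parsed is not None:
            genus, species, major, _minor = parsed
            groups[(f"{genus}_{species}", major)].append(name)
    return dict(groups)
```

For example, the placeholder names Genus_species-1.0 and
Genus_species-1.1 would fall into one group, while Genus_species-2.0
would start a new one.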

Assemblies that have been submitted to the NCBI can be downloaded from
GenBank using their accession numbers.

3.2 End Sequences

This directory contains FASTA files of the BAC-end and/or fosmid-end
sequences.

3.3 Genes (Predicted Gene Sequences)

This directory contains CDS sequences, gene peptide sequences, or both,
depending on what is available.  It is provided for the Human Gut
Microbiome project only.

3.4 Map Data

This directory contains the FPC files for the organism.  Only the most
current version is available.