This directory contains assemblies and other organism-specific data.
This data is freely available, but please observe the WU GSC data policy
(contained in the file DATA_POLICY in this directory) if you download,
use, or publish results based on the data.
DIRECTORY HIERARCHY
-------------------
1. Organism Class
The top level of this directory structure contains a directory for each
organism class or group the WU GSC has sequenced. The following
classifications are present:
    Primates                        Primate genomic data
    Other Vertebrates               Non-primate vertebrate genomic data
    Invertebrates                   Invertebrate genomic data
    Plants                          Plant genomic data
    Fungi                           Fungal genomic data
    Microbes                        Microbial genomic data
    Other Single Celled Organisms   Other single-celled organism genomic data
2. Organism
Each of the above directories contains a subdirectory for each organism
of that type. The name of each of the organism directories follows the
format Genus_species.
3. Genomic Data
Each of the organism directories contains the following subdirectories (if
data of the given type is available for that organism):
    assembly        assembly data
    end_sequences   paired-end sequence data
    genes           predicted gene sequences
    map             fingerprint map data
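The three levels described above can be sketched in a few lines of Python.
This is only an illustration: the function name organism_paths and the
example organism "Genus_species" are hypothetical, and the sketch assumes
the class directory name is passed exactly as it appears on the server. Not
every organism has all four subdirectories, so the returned paths are
candidates rather than guaranteed locations.

```python
from pathlib import PurePosixPath

# Subdirectory names taken from the list in section 3 of this README.
DATA_SUBDIRS = ("assembly", "end_sequences", "genes", "map")


def organism_paths(organism_class, genus, species):
    """Return candidate data paths for one organism, relative to this directory.

    Hypothetical helper: a given subdirectory exists on the server only
    if data of that type is available for the organism.
    """
    # Level 1 is the organism class, level 2 is the Genus_species directory.
    base = PurePosixPath(organism_class) / f"{genus}_{species}"
    # Level 3 is one subdirectory per data type.
    return {sub: str(base / sub) for sub in DATA_SUBDIRS}


if __name__ == "__main__":
    for kind, path in organism_paths("Primates", "Genus", "species").items():
        print(kind, path)
```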
3.1 Assembly Data
The assembly directory contains a directory for each assembly available
for the organism. Each assembly has its own version number of the
format M.N and the directory name has the format Genus_species-M.N. All
assemblies with the same value of M were created from the same set of
genomic data, but with different assembly parameters and/or different
pre- or post-processing. Each assembly directory contains an ASSEMBLY file that
describes the assembly in detail. For more information on the contents
of an assembly directory see the README_ASSEMBLY file.
Assemblies that have been submitted to the NCBI can also be downloaded
from GenBank using their accession numbers.
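As an illustration of the Genus_species-M.N naming scheme, the following
Python sketch splits an assembly directory name into its parts and groups
assemblies that share the same M, that is, the same underlying set of
genomic data. The function names and the regular expression are
assumptions made for this example, as is the premise that M and N are
plain integers; actual directory names may carry variations this pattern
does not cover.

```python
import re
from collections import defaultdict

# Assumed pattern for "Genus_species-M.N", with M and N as integers.
ASSEMBLY_DIR = re.compile(
    r"^(?P<genus>[A-Za-z]+)_(?P<species>[A-Za-z_]+)-(?P<major>\d+)\.(?P<minor>\d+)$"
)


def parse_assembly_dir(name):
    """Split an assembly directory name into (genus, species, M, N)."""
    match = ASSEMBLY_DIR.match(name)
    if match is None:
        raise ValueError(f"not in Genus_species-M.N format: {name!r}")
    return (
        match["genus"],
        match["species"],
        int(match["major"]),
        int(match["minor"]),
    )


def group_by_data_set(names):
    """Group assembly names by (organism, M); each value is a sorted list of N."""
    groups = defaultdict(list)
    for name in names:
        genus, species, major, minor = parse_assembly_dir(name)
        groups[(f"{genus}_{species}", major)].append(minor)
    return {key: sorted(minors) for key, minors in groups.items()}
```

With hypothetical names such as Genus_species-1.0, Genus_species-1.1 and
Genus_species-2.0, the first two fall into the same group because they
share M = 1, while the third forms its own group.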
3.2 End Sequences
This directory contains FASTA files of the BAC and/or fosmid-end sequences.
3.3 Genes (Predicted Gene Sequences)
This directory contains CDS sequences, gene peptide sequences, or both,
where available. It is provided for the Human Gut Microbiome project only.
3.4 Map Data
This directory contains the FPC files for the organism. Only the most
current version is available.
Name                 Last modified       Size
DATA_POLICY          29-Oct-2007 16:13   2k
Fungi/               23-Jun-2006 15:38   -
Invertebrates/       29-Apr-2008 21:35   -
Microbes/            19-Apr-2006 11:04   -
Other_Vertebrates/   27-Mar-2008 11:40   -
Plants/              15-Dec-2006 14:29   -
Primates/            11-Apr-2007 14:54   -
README               15-Nov-2007 11:03   2k
README_ANNOTATION    23-Apr-2008 11:05   7k
README_ASSEMBLY      17-Apr-2007 13:48   6k