We have examined potential links and overlaps
between GDB 6.0 and GSDB 2.0 at the (abstract) level
of OPM classes and attributes (rather than Sybase tables and columns).
The result of our analysis is summarized below.
- Genes.
- Both GDB and GSDB have a Gene class.
In GSDB, a gene is treated as a kind of Feature,
although not as a specialization of Feature, since the same gene can
occur as several features (see attribute Feature.genes).
Other information held on genes in GSDB is
confined to the name of the gene and references to an external
database where primary information on the gene can be found. External
references to either GDB or MOUSEDB are represented by attributes
gdb_xref and mousedb_xref, respectively.
In GDB, class Gene is a subclass of class
GenomicSegment.
Information on genes includes the reason a genomic region is
considered a gene (see tuple attribute Gene.evidence)
and links to the gene families to which the gene belongs
(see derived attribute Gene.families).
In addition, genes are characterized by
mapping information (see derived attribute GenomicSegment.mapsOf)
and references to derived sequences (see derived attribute
GenomicSegment.sequences).
A gene in GDB can be referenced from an external database
using its GDB accession number (represented by attribute accessionID).
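The gene classes described above can be sketched as plain Python dataclasses. This is only an illustration: the class and attribute names follow the text, but the Python types, the `Gsdb`/`Gdb` prefixes, the `resolve_gdb_gene` helper, the sample accession string, and the assumption that gdb_xref holds a GDB accession number are all ours and are not taken from either schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GsdbGene:
    # GSDB side: a name plus external references (field types are assumed).
    name: str
    gdb_xref: Optional[str] = None      # assumed to hold a GDB accession number
    mousedb_xref: Optional[str] = None


@dataclass
class GsdbFeature:
    # Feature.genes: the same gene may occur in several features.
    genes: List[GsdbGene] = field(default_factory=list)


@dataclass
class GdbGenomicSegment:
    accessionID: str
    # mapsOf and sequences are derived attributes in GDB; plain lists stand in.
    mapsOf: List[str] = field(default_factory=list)
    sequences: List[str] = field(default_factory=list)


@dataclass
class GdbGene(GdbGenomicSegment):
    # evidence is a tuple attribute in GDB; a list of strings stands in here.
    evidence: List[str] = field(default_factory=list)
    families: List[str] = field(default_factory=list)


def resolve_gdb_gene(gene: GsdbGene,
                     gdb_genes: Dict[str, GdbGene]) -> Optional[GdbGene]:
    """Follow gdb_xref to the GDB gene with that accession number, if any."""
    if gene.gdb_xref is None:
        return None
    return gdb_genes.get(gene.gdb_xref)


# Hypothetical data: the accession string format is made up for illustration.
gdb_genes = {"GDB:0000001": GdbGene(accessionID="GDB:0000001")}
shared = GsdbGene(name="example-gene", gdb_xref="GDB:0000001")
features = [GsdbFeature(genes=[shared]), GsdbFeature(genes=[shared])]
linked = resolve_gdb_gene(shared, gdb_genes)
```

Modelling Feature.genes as a list on the feature, rather than making the gene a subclass of Feature, mirrors the point that one gene can occur in several features.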
- Sequences.
- Sequences are primary objects in GSDB (represented by object
class Sequence).
Sequence data include the actual sequence (see attribute sequence),
sequence length (see attribute length), and information on the
source of the sequence.
In addition, there are references to
sequences from other GSDB classes. For example,
a feature is associated with a particular point on a sequence.
Sequence information in GDB is represented by objects of
class SequenceLink.
These objects contain annotations linking
primary GDB objects to external sequence databases such as GSDB,
as well as information regarding
the beginning and end points of sequences
(see attributes startPos and endPos).
A SequenceLink object associates a GDB DBObject (attribute dBObject),
which is either a GenomicSegment, a Variation or a GeneProduct,
with an accession number in some external database
(attributes accessionID and externalDB, inherited from class
ExternalLink).
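As a rough sketch of the SequenceLink structure just described, the following Python models the inheritance from ExternalLink and selects the links that point into GSDB. The attribute names come from the text; the Python types, the `DBObject` stand-in, the `gsdb_accessions` helper, the literal "GSDB" used as the externalDB value, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DBObject:
    # Stand-in for a GDB DBObject (GenomicSegment, Variation or GeneProduct).
    accessionID: str


@dataclass
class ExternalLink:
    accessionID: str   # accession number in the external database
    externalDB: str    # name of the external database (value format assumed)


@dataclass
class SequenceLink(ExternalLink):
    dBObject: Optional[DBObject] = None
    startPos: Optional[int] = None
    endPos: Optional[int] = None


def gsdb_accessions(links: List[SequenceLink], obj: DBObject) -> List[str]:
    """Return the GSDB accession numbers linked to one GDB object."""
    return [
        link.accessionID
        for link in links
        if link.externalDB == "GSDB" and link.dBObject == obj
    ]


# Hypothetical data: accession strings and "OTHERDB" are made up.
segment = DBObject(accessionID="GDB:0000001")
links = [
    SequenceLink("SEQ-A", "GSDB", dBObject=segment, startPos=1, endPos=500),
    SequenceLink("SEQ-B", "OTHERDB", dBObject=segment),
]
found = gsdb_accessions(links, segment)
```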
- Sources.
- The GSDB Source class contains
information about the source of a sequence: the organism (species and
so on), the chromosome with which the sequence is associated, and the
corresponding cell type.
In addition, the GSDB Source class contains references to
external taxonomic databases (see attribute taxonomy_xref) and to
GDB probes (see attribute gdb_probe_xref).
Similar data are contained in GDB in the Organism and
Chromosome classes. Objects in class Organism represent
links to an external taxonomic database.
Objects of class Chromosome are characterized by
mapping and organism information.
- Products.
- Both GDB and GSDB contain classes representing products.
In GDB, products are limited to gene products,
while in GSDB a product can be associated with any feature.
In both GDB and GSDB,
these classes seem primarily to serve as a way of
referencing external databases, such as protein databases.
In GDB, class GeneProduct has two subclasses,
Protein and RNA,
meaning that a gene product can be either a protein or a piece of
functional (non-messenger) RNA. It is not clear whether products provide
a useful cross-reference between GDB and GSDB: in particular, they
would not in general give rise to direct inter-database links.
However, products can provide important links to
external protein databases such
as PDB, and hence indirect links between GDB and GSDB.
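One way to picture such an indirect link is as a join on a shared external entry: a GDB product and a GSDB product are related if both cite the same entry in the same external database. The sketch below assumes each product's cross-references are available as (local id, external database name, external accession) triples; that representation, the `indirect_links` function, and all identifiers in the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (local object id, external database name, external accession); all hypothetical.
XRef = Tuple[str, str, str]


def indirect_links(gdb_xrefs: Iterable[XRef],
                   gsdb_xrefs: Iterable[XRef]) -> List[Tuple[str, str]]:
    """Pair GDB and GSDB products that cite the same external entry."""
    # Index the GDB side by (external database, accession).
    by_entry: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for gdb_id, ext_db, accession in gdb_xrefs:
        by_entry[(ext_db, accession)].append(gdb_id)

    # Look up each GSDB cross-reference in that index.
    pairs: List[Tuple[str, str]] = []
    for gsdb_id, ext_db, accession in gsdb_xrefs:
        for gdb_id in by_entry.get((ext_db, accession), []):
            pairs.append((gdb_id, gsdb_id))
    return pairs


# Made-up identifiers; only the database name "PDB" appears in the text.
gdb_side = [("gdb-prod-1", "PDB", "X1"), ("gdb-prod-2", "PDB", "X2")]
gsdb_side = [("gsdb-prod-9", "PDB", "X1"), ("gsdb-prod-8", "OTHERDB", "X2")]
pairs = indirect_links(gdb_side, gsdb_side)
```

Matching on the pair of database name and accession, rather than on the accession alone, avoids treating equal accession strings from different external databases as the same entry.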
- References/Citations.
- Both GSDB and GDB contain
data representing references or citations.
In GSDB, a Reference object is
considered a kind of (i.e., a specialization of) Feature
object.
References in GSDB are characterized by titles, publication
status, lists of authors and editors
(see attributes title, pub_status, authors
and editors, respectively),
and external references to the Medline bibliographic
database (see attribute medline_xref).
In GDB, citations are represented by objects of class CitationLink
and are further classified into subclasses of class Citation
representing books, journals, articles and so on.
(In GDB's HGD, class CitationLink contains only links
to an external database of citations, namely
GDB's Citations Database.)