Database scope and data types
PhosphoSite is a mammalian protein database that provides information about in vivo phosphorylation sites. This datatype refers to protein-level information, providing a list of phosphorylation sites for each protein in the database.
The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.
Animal Transcription Factor Database
Apo- and Holo- structure pairs of proteins
RNAiDB provides access to results from RNA interference (RNAi) studies in C. elegans, including images, movies, phenotypes, and graphical maps.
The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology terms to describe the molecular function, biological process, and subcellular localization of gene products.
Bitter taste: molecules and receptors
Bacterial protein tYrosine Kinase database
Central Aspergillus Data Repository
The database of human DNA Methylation and Cancer (MethyCancer) was developed to study the interplay of DNA methylation, gene expression and cancer. It hosts both highly integrated data on DNA methylation, cancer-related genes, mutations and cancer information from public resources, and the CpG Island (CGI) clones derived from our large-scale sequencing.
Classification of helix cappings in protein structures
Shanghai RAPESEED Database: a resource for functional genomics studies of seed development and fatty acid metabolism of Brassica.
GreenPhylDB v2.0 comprises 16 full genomes from the major phylum of plant evolution. Clustering of these genomes was performed to define a consistent and extensive set of homeomorphic plant families.
The aim of this Oryza sativa database was first to display sequence information such as the T-DNA and Ds flanking sequence tags (FSTs) produced in the framework of the French genomics initiative Genoplante and the EU consortium Cereal Gene Tags. This information was later linked with related molecular data from external rice molecular resources (cDNA full length, Gene, EST, Markers, Expression data...).
Oryza Tag Line consists of a searchable database, developed under the Oracle management system, that integrates phenotypic data resulting from the evaluation of the Genoplante rice insertion line library.
Sequence and structures of peptides expressed by marine cone snails
Corynebacterial Regulation Network
Dragon Antimicrobial Peptide Database
TropGENE DB is a database that manages genetic and genomic information about tropical crops studied by Cirad. The database is organised into crop specific modules.
Transcriptional start sites and adjacent promoters
Human Disease-Related Viral Integration Sites
Community-based pages about non-pathogenic E. coli
The GPCRDB is a molecular-class information system that collects, combines, validates and stores large amounts of heterogeneous data on G protein-coupled receptors (GPCRs). The GPCRDB contains data on sequences, ligand binding constants and mutations. In addition, many different types of computationally derived data are stored, such as multiple sequence alignments and homology models.
Eukaryotic linear motif: functional sites in eukaryotic proteins
Drosophila sequences and genomic information
Genome-wide RNAi analysis data in Drosophila
Functional genomics of fungi
Evolution of novel enzyme functions in enzyme superfamilies
T-DNA insertions in Arabidopsis, their flanking sequence tags
Human Gene and Protein Database: experimental results of human proteomics
Human histone database
PGN is a repository for plant EST sequence data located at Cornell. It comprises an analysis pipeline and a website, and presently contains mainly data from the Floral Genome Project.
Integrative and Conjugative Elements in Bacteria
Intrinsically Disordered proteins with Extensive Annotations and Literature
Data management and analysis system for metagenomes
The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the nightshade family, which includes species such as tomato, potato, pepper, petunia and eggplant.
Evolution of protein-protein Interfaces
Model Legumes Integrative database Platform
Putative transcription factor binding sites in various genomes
Mouse Genome Database
Search tools for short functional motifs involved in posttranslational modifications, binding to other proteins, nucleic acids, or small molecules
Major Intrinsic Protein superfamily Models
Plant microRNA Expression data
Mitochondrial proteomics data
The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization and comparative genomics analysis, together with personal login-protected ‘workbench’ spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature.
The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information.
Data on red spotted newt Notophthalmus viridescens
A database of noncoding RNAs
Families of nuclear hormone receptors
Online GEne Essentiality database
Protein families: Multiple sequence alignments and profile hidden Markov models of protein domains
Experimentally characterized Prokaryotic GlycoProteins
Protein-Chemical Structural Interactions
GWASdb contains more moderate-effect GWAS data than the GWAS Catalog and GWAS Central, and is manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each genetic variant, as well as a structured ontology mapping for GWAS traits.
Group II introns database
Simple modular architecture research tool: signalling, extracellular and chromatin-associated protein domains
Phenotypic effects of human coding SNPs
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline.
Sequences extracted from European Patent Office (EPO) patents
Autism genetics KnowledgeBase
Compilation and Creation of datasets from Protein Data Bank
Functional divergence in human protein families
Protein interaction data for Death Domain superfamily
microRNAs in animal and plant EST sequences
Mitochondrial genomes in Metazoa
Parasitic nematode sequencing project
InterPro is an integrated database of predictive protein "signatures" used for the classification and automatic annotation of proteins and genomes. InterPro classifies sequences at superfamily, family and subfamily levels, predicting the occurrence of functional domains, repeats and important sites.
The Ligand-Gated Ion Channel database provides nucleic acid and protein sequences of the subunits of ligand-gated ion channels. These transmembrane proteins can exist in different conformations, at least one of which forms a pore through the membrane connecting two neighbouring compartments. The database can be used to generate multiple sequence alignments from selected subunits, and gives the atomic coordinates of subunits, or portions of subunits, where available.
Links between RNA splicing and disease
Experimentally validated Viral siRNA/shRNA
Structural motifs of protein superfamilies
Genome database on S. pombe
Prokaryotic Operon DataBase
PDBsum provides an at-a-glance overview of every macromolecular structure deposited in the Protein Data Bank (PDB), giving schematic diagrams of the molecules in each structure and of the interactions between them.
Therapeutic target database
Archaeal genome assemblies and annotation
Invertebrate Vectors of Human Pathogens
Virulence Factors Database
Yeast Metabolome Database
A comprehensive online knowledgebase for the monkey research community.
Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes.
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. GenBank growth statistics for both the traditional GenBank divisions and the WGS division are available from each release.
mycoCLAP is a searchable resource for the knowledge and annotation of Characterized Lignocellulose-Active Proteins of fungal origin.
Picture atlas of annotated bacterial genomes
The Virus Pathogen Database and Analysis Resource (ViPR, www.ViPRbrc.org) is an integrated repository of data and analysis tools for multiple virus families, supported by the National Institute of Allergy and Infectious Diseases (NIAID) Bioinformatics Resource Centers (BRC) program. ViPR contains information for human pathogenic viruses belonging to the Arenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae, Flaviviridae, Filoviridae, Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae, Rhabdoviridae and Togaviridae families, with plans to support additional virus families in the future. ViPR captures various types of information, including sequence records, gene and protein annotations, 3D protein structures, immune epitope locations, clinical and surveillance metadata and novel data derived from comparative genomics analysis. Analytical and visualization tools for metadata-driven statistical sequence analysis, multiple sequence alignment, phylogenetic tree construction, BLAST comparison and sequence variation determination are also provided. Data filtering and analysis workflows can be combined and the results saved in personal ‘Workbenches’ for future use. ViPR tools and data are available without charge as a service to the virology research community to help facilitate the development of diagnostics, prophylactics and therapeutics for priority pathogens and other viruses.
Mining of modENCODE data
Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium leprae information.
Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium marinum information.
Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium smegmatis information.
Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium tuberculosis information.
This database provides a platform to query and compare gene expression data during the development of the major model animals (zebrafish, Drosophila, medaka, mouse). The high-resolution expression data was acquired through whole-mount in situ hybridisation, antibody or transgenic experiments.
The Simple Modular Architecture Research Tool (SMART) is an online tool for the identification and annotation of protein domains, and the analysis of domain architectures.
STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations.
eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) is a database of orthologous groups of genes. The orthologous groups are annotated with functional description lines (derived by identifying a common denominator for the genes based on their various annotations), with functional categories (i.e. derived from the original COG/KOG categories).
CryptoDB is an integrated genomic and functional genomic database for the parasite Cryptosporidium. CryptoDB integrates whole genome sequence and annotation along with experimental data and environmental isolate sequences provided by community researchers. It also includes supplemental bioinformatics analyses and a web interface for data-mining.
EuPathDB Bioinformatics Resource Center for Biodefense and Emerging/Re-emerging Infectious Diseases is a portal for accessing genomic-scale datasets associated with the eukaryotic pathogens (Cryptosporidium, Encephalitozoon, Entamoeba, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma).
A detailed study of Giardia lamblia's genome will provide insights into an early evolutionary stage of eukaryotic chromosome organization as well as other aspects of the prokaryotic / eukaryotic divergence.
MicrosporidiaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
Hierarchical clustering of UniProt proteins
PlasmoDB is a genome database for the genus Plasmodium, a set of single-celled eukaryotic pathogens that cause human and animal diseases, including malaria.
ToxoDB is a genome database for the genus Toxoplasma, a set of single-celled eukaryotic pathogens that cause human and animal diseases, including toxoplasmosis.
TrichDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
TriTrypDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
The Human Oral Microbiome Database (HOMD) provides a site-specific comprehensive database for the more than 600 prokaryote species that are present in the human oral cavity. It contains genomic information based on a curated 16S rRNA gene-based provisional naming scheme, and taxonomic information.
The goal of The Gene Index Project is to use the available EST and gene sequences, along with the reference genomes wherever available, to provide an inventory of likely genes and their variants and to annotate these with information regarding the functional roles played by these genes and their products.
DNASU is a central repository for plasmid clones and collections. Currently we store and distribute over 197,000 plasmids including 75,000 human and mouse plasmids, full genome collections, the protein expression plasmids from the Protein Structure Initiative as the PSI:Biology Material Repository (PSI:Biology-MR), and both small and large collections from individual researchers. We are also a founding member and distributor of the ORFeome Collaboration plasmid collection.
The Protein Classification Benchmark Collection was created to provide standard datasets on which the performance of machine learning methods can be compared.
The ICDS database contains ICDS (interrupted coding sequences) detected by a similarity-based approach. The definition of each interrupted gene is provided, as well as the ICDS genomic localisation with the surrounding sequence.
The aim of the PEROXISOME database (PeroxisomeDB) is to gather, organise and integrate curated information on peroxisomal genes, their encoded proteins, their molecular functions, the metabolic pathways they belong to, and their related disorders.
IMGT is a high-quality integrated knowledge resource specialized in the immunoglobulins (IG) or antibodies, T cell receptors (TR), major histocompatibility complex (MHC) of human and other vertebrate species, and in the immunoglobulin superfamily (IgSF), major histocompatibility complex superfamily (MhcSF) and related proteins of the immune system (RPI) of vertebrates and invertebrates.
ABS: A database of Annotated regulatory Binding Sites from known binding sites identified in promoters of orthologous vertebrate genes.
The Database of Protein Disorder (DisProt) is a curated database that provides information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part.
Covers D. melanogaster and eight other eukaryote model genomes, with gene predictions from several groups. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species, and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
euGenes provides a common summary of gene and genomic information from eukaryotic organism databases including gene symbol and full name, chromosome, genetic and molecular map information, Gene Ontology (Function/Location/Process) and gene homology, product information.
wFleaBase includes data from all species of the genus Daphnia, yet the primary species are D. pulex and D. magna, because of the broad set of genomic tools that have already been developed for these animals.
The Aphid Genome Database's aim is to improve the current pea aphid genome assembly and annotation, and to provide new aphid genome sequences as well as tools for analysis of these genomes.
An integrated database for the genomics of the Lepidoptera Spodoptera frugiperda
A database dedicated to the analysis of the genome of Mycobacterium ulcerans, the Buruli ulcer bacillus. It provides a complete dataset of DNA and protein sequences derived from the epidemic strain Agy99, linked to the relevant annotations and functional assignments.
Its purpose is to collate and integrate various aspects of the genomic information from E. coli, the paradigm of Gram-negative bacteria. Colibri provides a complete dataset of DNA and protein sequences derived from the paradigm strain E. coli K-12, linked to the relevant annotations and functional assignments. It allows one to easily browse through these data and retrieve information, using various criteria (gene names, location, keywords, etc.).
GenoList is an integrated environment for comparative exploration of microbial genomes. The current release integrates genome data for over 700 species (Genome Reviews). The query and navigation user interface includes specialized tools for subtractive genome analysis and dynamic synteny visualization.
LegioList is a database dedicated to the analysis of the genomes of Legionella pneumophila strain Paris (endemic in France), strain Lens (epidemic isolate), strain Philadelphia 1, and strain Corby. It also includes the genome of Legionella longbeachae strain NSW150.
ListiList is a database dedicated to the analysis of the genomes of the food-borne pathogen, Listeria monocytogenes, and its non-pathogenic relative, Listeria innocua. Its purpose is to collate and integrate various aspects of the genomic information from L. monocytogenes, a paradigm for bacterial-host interactions.
Its purpose is to collate and integrate various aspects of the genomic information from M. pulmonis, a mollicute causal agent of murine respiratory mycoplasmosis. MypuList provides a complete dataset of DNA and protein sequences derived from the strain M. pulmonis UAB CTIP, linked to the relevant annotations and functional assignments.
PhotoList contains a database dedicated to the analysis of the genome of Photorhabdus luminescens. This analysis has been described in: "The genome sequence of the entomopathogenic bacterium Photorhabdus luminescens".
SagaList contains a database dedicated to the analysis of the genomes of the food-borne pathogen, Streptococcus agalactiae.
Its purpose is to collate and integrate various aspects of the genomic information from B. subtilis, the paradigm of sporulating Gram-positive bacteria. SubtiList provides a complete dataset of DNA and protein sequences derived from the paradigm strain B. subtilis 168, linked to the relevant annotations and functional assignments.
The PeptideAtlas Project provides a publicly accessible database of peptides identified in tandem mass spectrometry proteomics studies and software tools.
BACTIBASE contains calculated or predicted physicochemical properties of 177 bacteriocins produced by both Gram-positive and Gram-negative bacteria. The information in this database is easy to extract and allows rapid prediction of structure/function relationships and target organisms of these peptides, and therefore better exploitation of their biological activity in both the medical and food sectors.
The Human Protein Reference Database represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.
MycoBank is an online database, documenting new mycological names and combinations, eventually combined with descriptions and illustrations.
SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains the most current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. The quantitative trait loci (QTL) represent more than 18 years of QTL mapping of more than 90 unique traits. SoyBase also contains the well-annotated 'Williams 82' genomic sequence and associated data mining tools. The genetic and sequence views of the soybean chromosomes and the extensive data on traits and phenotypes are extensively interlinked. This allows entry to the database using almost any kind of available information, such as genetic map symbols, soybean gene names or phenotypic traits. SoyBase is the repository for controlled vocabularies for soybean growth, development and trait terms, which are also linked to the more general plant ontologies. SoyBase can be accessed at http://soybase.org.
The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI (DoE Joint Genome Institute) microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses.
PSIbase is a molecular interaction database based on PSIMAP (PDB, SCOP) that focuses on structural interaction of proteins and their domains.
BeetleBase is being developed as an important community resource for Tribolium genetics, genomics and developmental biology. The database is built on the Chado generic data model, and is able to store various types of data, ranging from genome sequences to mutant phenotypes.
HUGE is a database for human large proteins newly identified in the Kazusa cDNA project, the aim of which is to predict the primary structure of proteins from the sequences of human large cDNAs (>4 kb).
The ROUGE protein database is a sister database of HUGE protein database which has accumulated the results of comprehensive sequence analysis of human long cDNAs (KIAA cDNAs). The ROUGE protein database has been created to publicize the information obtained from mouse homologues of the KIAA cDNAs (mKIAA cDNAs).
YEASTRACT (Yeast Search for Transcriptional Regulators And Consensus Tracking) is a curated repository of more than 48333 regulatory associations between transcription factors (TF) and target genes in Saccharomyces cerevisiae, based on more than 1200 bibliographic references.
The Protein Model DataBase (PMDB) is a database that collects manually built three-dimensional protein models, obtained by different structure prediction techniques.
A 16S rRNA gene database which provides chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
MaizeGDB is the maize research community's central repository for genetics and genomics information.
Rather than being a complete record of a proteomics experiment, this database holds the minimum amount of information necessary for certain bioinformatics-related tasks, such as sequence assignment validation. Most of the data is held in a set of XML files.
The NEW Antirrhinum majus (Snapdragon) genetic and genomic database
Rat Genome Database seeks to collect, consolidate, and integrate rat genomic and genetic data with curated functional and physiological data and make these data widely available to the scientific community.
The Mouse Genome Database (MGD) project includes data on gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data.
Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes, representing major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain.
The dbEST contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms.
Influenza Virus Resource presents data obtained from the NIAID Influenza Genome Sequencing Project as well as from GenBank, combined with tools for flu sequence analysis, annotation and submission to GenBank. In addition, it provides links to other resources that contain flu sequences, publications and general information about flu viruses.
We present the second generation of centrosomeDB, with a significant expansion of 1357 human and Drosophila centrosomal genes and their corresponding information. The active research done during the past decades has produced lots of data related to centrosomal proteins. Unfortunately, the accumulated data is dispersed among diverse and heterogeneous sources of information. This was our motivation to introduce CentrosomeDB, a collection of human centrosomal proteins that were reported in the literature and other sources. Using our database, the researcher is offered the possibility to study the evolution, function, and structure of the centrosome. We have compiled information from many sources, including Gene Ontology, disease-association, single nucleotide polymorphisms, and associated gene expression experiments.
This database is compiled from the human genome nucleotide sequences obtained mostly in the Human Genome Projects. The database makes it possible to continuously improve classification and characterization of retroviral families. The HERV database now contains retroviruses from more than 90% of the human genome.
The SNP discovery software autoSNP was implemented within a relational database to enable the efficient mining of the identified polymorphisms and the detailed interrogation of the data. AutoSNP was selected because it does not require sequence trace files and is thus applicable to a broader range of species and datasets.
miRNEST is a database of animal, plant and virus microRNAs, containing miRNA predictions conducted on Expressed Sequence Tags of animal and plant species.
Description of plasmids used in experiments
The Assembling the Fungal Tree of Life (AFTOL) project is dedicated to significantly enhancing our understanding of the evolution of the Kingdom Fungi, which represents one of the major clades of life.
AgBase is a curated, open-source, Web-accessible resource for functional analysis of agricultural plant and animal gene products.
DoOP is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7,548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
Allergome aims to supply information on Allergenic Molecules (Allergens) causing an IgE-mediated (allergic, atopic) disease (anaphylaxis, asthma, atopic dermatitis, conjunctivitis, rhinitis, urticaria).
Evola contains ortholog information of all human genes among vertebrates. Orthologs are a pair of genes in different species that evolved from a common ancestral gene by speciation. In Evola, orthologs were detected by comparative genomics and amino acid sequence analysis (Computational analysis).
The Telomerase Database is a Web-based tool for the study of structure, function, and evolution of the telomerase ribonucleoprotein. The objective of this database is to serve the research community by providing a comprehensive compilation of information known about telomerase enzyme and its substrate, telomeres.
The Aspergillus Genome Database is a resource for genomic sequence data and gene and protein information for Aspergilli. AspGD is based on the Candida Genome Database and is funded by the National Institute of Allergy and Infectious Diseases at the US National Institutes of Health.
LOCATE is a curated database that houses data describing the membrane organization and subcellular localization of proteins from the RIKEN FANTOM4 mouse and human protein sequence set.
The Drosophila Polymorphism Database is a secondary database designed to provide a collection of all the existing polymorphic sequences in the Drosophila genus. It allows, for the first time, the search for any polymorphic set according to different parameter values of nucleotide diversity.
Relative evolutionary importance of amino acids within a protein sequence
In BGI-RIS, sequence contigs of Beijing indica and Syngenta japonica have been further assembled and anchored onto the rice chromosomes. The database has annotated the rice genomes for gene content, repetitive elements, and SNPs. Sequence polymorphisms between different rice subspecies have also been identified.
The chicken Variation Database (ChickVD) is an integrated information system for storage, retrieval, visualization and analysis of chicken variation data.
IVDB hosts complete genome sequences of influenza A virus generated by BGI and curates all other published influenza virus sequences after expert annotations. IVDB provides a series of tools and viewers for analyzing the viral genomes, genes, genetic polymorphisms and phylogenetic relationships comparatively.
The Pig Genomic Informatics System (PigGIS) presents accurate pig gene annotations in all sequenced genomic regions. It integrates various available pig sequence data, including 3.84 million whole-genome shotgun (WGS) reads and 0.7 million Expressed Sequence Tags (ESTs) generated by the Sino-Danish Pig Genome Project, and 1 million miscellaneous GenBank records.
The YH database was produced to present the entire DNA sequence assembled from 3.3 billion reads (117.7 Gbp of raw data) generated by the Illumina Genome Analyzer. In total, 102.9 Gbp of nucleotides were mapped onto the NCBI human reference genome (Build 36) by the self-developed software SOAP (Short Oligonucleotide Alignment Program), and 3.07 million SNPs were identified.
The Barcode of Life Data Systems (BOLD) is an online workbench that aids collection, management, analysis, and use of DNA barcodes. It consists of 3 components (MAS, IDS, and ECS) that each address the needs of various groups in the barcoding community.
The VIRsiRNA database contains details of siRNA/shRNA that target viral genome regions. It provides the siRNA sequence, viral target and subtype, and target genomic region, along with efficacy information where available.
CIPRO is an integrated protein database that has been developed to provide wide-ranging information on the proteins expressed in the ascidian Ciona intestinalis, especially for researchers who want advanced and useful information for starting biological and biomedical research. The protein information in CIPRO directly links to gene expression, a tool for peptide mass fingerprinting (PMF), intracellular localization, 3D images of early development, and transgenic resources.
PRODORIC is a comprehensive database about gene regulation and gene expression in prokaryotes. It includes a manually curated and unique collection of transcription factor binding sites.
The NMPDR provided curated annotations in an environment for comparative analysis of genomes and biological subsystems, with an emphasis on the food-borne pathogens Campylobacter, Listeria, Staphylococcus, Streptococcus, and Vibrio; as well as the STD pathogens Chlamydiaceae, Haemophilus, Mycoplasma, Neisseria, Treponema, and Ureaplasma.
siRecords is a database of experimentally validated mammalian siRNAs with efficacy ratings. Currently, 17,192 records of experimentally validated siRNAs, targeting 5,086 genes and originating from 6,122 independent studies, are hosted in siRecords.
H-DBAS offers unique data and a viewer for human alternative splicing (AS) analysis, including genome-wide representative alternative splicing variants (RASVs), RASVs affecting protein functions, and conserved RASVs compared with the mouse genome (full-length cDNAs).
The Human Gene and Protein Database (HGPD) is a unique database that stores information from a set of human Gateway entry clones.
CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians.
StellaBase is the Nematostella vectensis genomics database.
Tandem Repeats Database (TRDB) is a public repository of information on tandem repeats in genomic DNA and contains a variety of tools for their analysis.
The CFGP (Comparative Fungal Genomics Platform) was designed for comparative genomics projects with diverse fungal genomes.
The Magnaporthe grisea genome project is a partnership between the International Rice Blast Genome Consortium and the Broad Institute. The project is facilitated by an Advisory Board made up of members of the rice blast research community.
The goal of MutDB is to annotate human variation data with protein structural information and other functionally relevant information, if available. The mutations are organized by gene.
The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more.
NCBI Gene provides information for genes from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
This collection of virus genomic sequences is a part of Entrez Genome that provides curated sequence data and related information for the community.
The organelle genomes on this site are part of the NCBI Reference Sequence (RefSeq) project that provides curated sequence data and related information for the community to use as a standard.
The list of plant sequencing projects in this page includes those that have reached the stage where active sequence determination is currently producing, or is expected to produce in the near future, GenBank accessions toward the goal of determining the sequence of that plant genome.
ProtClustDB is a collection of related protein sequences (clusters) consisting of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences.
UniVec is a database that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination). In addition to vector sequences, UniVec also contains sequences for those adapters, linkers, and primers commonly used in the process of cloning cDNA or genomic DNA.
A searchable mouse cDNA library.
This website provides genome sequence from the Nipponbare subspecies of rice and annotation of the 12 rice chromosomes. These data are available through search pages and the Genome Browser that provides an integrated display of annotation data.
A database for facilitating the search for drug adverse reaction targets. It contains information about known drug adverse reaction targets, their functions and properties.
A collection of information about restriction enzymes and related proteins. It contains published and unpublished references, recognition and cleavage sites, isoschizomers, commercial availability, methylation sensitivity, crystal, genome, and sequence data.
The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC) system. The TC system is analogous to the Enzyme Commission (EC) system for classification of enzymes, except that it incorporates both functional and phylogenetic information. Descriptions, TC numbers, and examples of over 600 families of transport proteins are provided.
SitEx is a database containing information on eukaryotic protein functional sites. It stores the amino acid sequence positions in the functional site in relation to the exon structure of the encoding gene. This can be used to detect the exons involved in shuffling in protein evolution, or to design protein-engineering experiments.
Oryzabase is a comprehensive rice science database established in 2000 by a rice researchers' committee in Japan. Oryzabase consists of five parts: (1) genetic resource stock information, (2) gene dictionary, (3) chromosome maps, (4) mutant images, and (5) fundamental knowledge of rice science.
HOVERGEN is a database of homologous vertebrate genes that allows one to select sets of homologous genes among vertebrate species, and to visualize multiple alignments and phylogenetic trees.
The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
ArachnoServer is a manually curated database containing information on the sequence, three-dimensional structure, and biological activity of protein toxins derived from spider venom.
DPVweb provides a central source of information about viruses, viroids and satellites of plants, fungi and protozoa. Comprehensive taxonomic information, including brief descriptions of each family and genus, and classified lists of virus sequences are provided. The database also holds detailed, curated, information for all sequences of viruses, viroids and satellites of plants, fungi and protozoa that are complete or that contain at least one complete gene (currently, n~9000).
TargetDB, a target registration database, provides information on the experimental progress and status of targets selected for structure determination.
GeneDB is a genome database for prokaryotic and eukaryotic organisms and provides a portal through which data generated by the "Pathogen Genomics" group at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be accessed.
The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.
The Pfam database contains information about protein domains and families. For each entry, a protein sequence alignment and a Hidden Markov Model are stored.
The Vertebrate Genome Annotation (VEGA) database is a central repository for high quality manual annotation of vertebrate finished genome sequence.
The database of 3D Interaction Domains (3did) is a collection of domain-domain interactions in proteins for which high-resolution three-dimensional structures are known. 3did exploits structural information to provide critical molecular details necessary for understanding how interactions occur.
The Saccharomyces Genome Database (SGD) collects and organizes information about the molecular biology and genetics of the yeast Saccharomyces cerevisiae. SGD contains a variety of biological information and tools with which to search and analyze it.
The Pseudomonas Genome Database is a resource for peer-reviewed, continually updated annotation for all Pseudomonas species. It includes gene and protein sequence information, as well as regulation and predicted function and annotation.
The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence.
HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is applied to bacterial, archaeal and plastid-encoded proteins.
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them.
T1DBase focuses on two research areas in type 1 diabetes (T1D): the genetics of T1D susceptibility and beta cell biology.
Domain mapping of disease mutations (DMDM) is a database in which each disease mutation can be displayed by its gene, protein or domain location.
UniProtKB consists of the manually annotated and reviewed Swiss-Prot, the automatically annotated TrEMBL, and the PIR protein databases.
ACLAME is a database dedicated to the collection and classification of mobile genetic elements (MGEs) from various sources, comprising all known phage genomes, plasmids and transposons.
The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in this hierarchy; Class (secondary structure classification, e.g. mostly alpha), Architecture (classification based on overall shape), Topology (fold family) and Homologous superfamily (protein domains which are thought to share a common ancestor). This collection is concerned with superfamily classification.
The Human Metabolome Database (HMDB) is a database containing detailed information about small molecule metabolites found in the human body. It contains or links 1) chemical, 2) clinical and 3) molecular biology/biochemistry data.
Toxin and Toxin Target Database (T3DB) is a bioinformatics resource that combines detailed toxin data with comprehensive toxin target information.
GpDB is a publicly accessible, relational database of G-proteins and their interactions with GPCRs and effector molecules. The sequences are classified according to a hierarchy of different classes, families and sub-families, based on extensive literature search.
Xenbase is the model organism database for Xenopus laevis and X. (Silurana) tropicalis. It contains genomic and developmental data and community information for Xenopus research. It includes gene expression patterns that incorporate image data from the literature, large-scale screens and community submissions.
The Database of Interacting Proteins (DIP) stores experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions.
VBASE2 is an integrative database of germ-line variable genes from the immunoglobulin loci of human and mouse. All variable gene sequences are extracted from the EMBL-Bank.
STITCH is a resource to explore known and predicted interactions of chemicals and proteins. Chemicals are linked to other chemicals and proteins by evidence derived from experiments, databases and the literature.
The miRBase Sequence Database is a searchable database of published miRNA sequences and annotation. The data were previously provided by the miRNA Registry. The miRBase Registry continues to provide gene hunters with unique names for novel miRNA genes prior to publication of results. The miRBase Targets database is a new resource of predicted miRNA targets in animals.
The Molecular INTeraction database (MINT) stores, in a structured format, information about molecular interactions by extracting experimental details from work published in peer-reviewed journals.
UNITE is primarily a fungal rDNA ITS sequence database, although we also welcome additional genes and genetic markers. UNITE focuses on high-quality ITS sequences generated from fruiting bodies collected and identified by experts and deposited in public herbaria.
Bacteriome.org is a database integrating physical (protein-protein) and functional interactions within the context of an E. coli knowledgebase.
Poxvirus Bioinformatics Resource Center has been established to provide specialized web-based resources to the scientific community studying poxviruses.
The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database providing centralized access to Rosaceae genomics and genetics data and analysis tools to facilitate cross-species utilization of data.
The National Center for Research Resources' Yeast Resource Center is located at the University of Washington in Seattle, Washington. The mission of the center is to facilitate the identification and characterization of protein complexes in the yeast Saccharomyces cerevisiae.
Annotated collection of all publicly available nucleotide and protein sequences
The European Mouse Mutant Archive (EMMA) is a non-profit repository for the collection, archiving (via cryopreservation) and distribution of relevant mutant strains essential for basic biomedical research. The laboratory mouse is the most important mammalian model for studying genetic and multi-factorial diseases in man. Thus the work of EMMA will play a crucial role in exploiting the tremendous potential benefits to human health presented by the current research in mammalian genetics. EMMA is supported by the partner institutions, national research programmes and by the EC's FP7 Capacities Specific Programme.
EMBL's database of bioactive drug-like small molecules