Genes encoding NBS-LRR proteins comprise one of the most prevalent gene
classes in plant genomes. To date, the only demonstrated function for these
genes is in disease or pest resistance. However, NBS-LRR proteins may also
be involved in other aspects of plant biology. These genes form one of ~5
classes of plant resistance genes (reviewed in Baker et al., 1997; Bent et
al., 1996).
The NBS (nucleotide binding site) is a common protein motif in all
organisms, occurs in numerous structural forms, and functions to bind ATP or
GTP (Saraste et al., 1990; Walker et al., 1982). The NBS includes a highly
conserved "P-loop" that functions in phosphate binding, followed by
additional conserved motifs that are also involved in nucleotide binding
(Traut, 1994).
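As a rough illustration of how a P-loop can be located in a predicted
protein, the Python sketch below scans a sequence for the general Walker A
consensus [AG]-x(4)-G-K-[ST]. That consensus comes from the general
nucleotide-binding literature (it is PROSITE pattern PS00017) and is not the
definition used to build this database; the function name and the example
fragment are hypothetical.

```python
import re

# General Walker A / P-loop consensus: [AG]-x(4)-G-K-[ST].
# Assumption: this generic pattern stands in for a plant-specific P-loop
# profile; it is not the definition used to build this database.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein):
    """Return (0-based start, matched residues) for each P-loop-like site."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(protein.upper())]


# Made-up fragment, for illustration only.
example = "MDSLVGMGGVGKTTLAR"
print(find_p_loops(example))
```

A pattern this short will also match unrelated proteins, so a hit is best
read as a candidate to check against the other conserved NBS motifs.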
This database focuses on the sub-class of NBS sequences which contain a
characteristic set of conserved motifs (van der Biezen and Jones, 1998;
Meyers et al., 1999).
The LRR (leucine-rich repeat) in the plant R-genes is of variable length.
This database covers only one class of LRRs found in plant R-genes, the
'cytoplasmic LRR' that is found in conjunction with the NBS. The LRR regions
typically consist of 10 to more than 40 repeats of a highly variable motif
of ~24 amino acids; the core signature of this repeated motif is of the form
LxxLxxLxLxxLxx(N/C/T)x(x)LxxIPxx
(L = leucine or another aliphatic residue, x = any residue; reviewed in
Jones and Jones, 1997).
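The core signature above can be turned into a regular expression. The Python
sketch below is one possible translation, not the method used for this
database. It assumes that "leucine or other aliphatic residues" means A, I,
L or V, and that I and P in the signature are literal residues; the function
names are hypothetical.

```python
import re

# Assumption: "leucine or other aliphatic residues" is read as A, I, L or V.
ALIPHATIC = "[AILV]"


def consensus_to_regex(consensus):
    """Translate the consensus notation used in the text into a regex string."""
    parts = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "(":
            close = consensus.index(")", i)
            group = consensus[i + 1:close]
            if group == "x":
                parts.append(".?")  # (x): optional residue
            else:
                parts.append("[" + group.replace("/", "") + "]")  # (N/C/T)
            i = close + 1
            continue
        if ch == "L":
            parts.append(ALIPHATIC)  # L: leucine or other aliphatic residue
        elif ch == "x":
            parts.append(".")  # x: any residue
        else:
            parts.append(ch)  # literal residue such as I or P
        i += 1
    return "".join(parts)


LRR_CONSENSUS = "LxxLxxLxLxxLxx(N/C/T)x(x)LxxIPxx"
LRR_REGEX = re.compile(consensus_to_regex(LRR_CONSENSUS))


def find_lrr_repeats(protein):
    """Return (0-based start, matched residues) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in LRR_REGEX.finditer(protein.upper())]
```

Because the repeats are highly variable, a strict pattern like this one will
miss many real repeats; a profile-based search is likely to be more
sensitive.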
One of our research goals is to define the different classes of
NBS-LRR-encoding genes, and ultimately to determine the function of these
genes and how they are related to defense. Sequence analysis of the
complete Arabidopsis genome has identified more than 160 NBS-LRR-encoding
genes (see also Meyers et al., 1999). Approximately two-thirds of these
genes also encode a "Toll/Interleukin-1 receptor homology" domain at the
N-terminus of the predicted protein (the "TIR" class), while the remaining
one-third encode a coiled-coil motif at the N-terminus (the "non-TIR" class).
Phylogenetic analysis demonstrates that known resistance genes are present
in most of the major clades.
Candidate NBS-encoding sequences were selected from the Arabidopsis genomic
sequence, starting with the sequences identified by Meyers et al. (1999).
Additional and more diverse NBS-encoding sequences in the Arabidopsis genome
were identified by BLAST and HMM/Pfam analysis. These sequences were then
compared against the translated complete Arabidopsis genomic sequence by
FASTA analysis (ktup=1). The twelve best matches for each gene (or all
matches with a score of e-24 or better) were extracted from the MIPS database
and compiled into a single file. Redundant sequences were removed. The
complete non-redundant dataset of predicted protein sequences was placed into
a MySQL database.
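The selection step described above can be sketched in Python as follows.
This is only one reading of the procedure: it assumes that "e-24 or better"
refers to an expectation value of 1e-24 or lower, that a match is kept if it
is among the twelve best for its query or passes that threshold, and that
redundancy means the same sequence being found by more than one query. The
tuple layout and the function name are hypothetical.

```python
from collections import defaultdict

TOP_N = 12        # "twelve best matches for each gene"
E_CUTOFF = 1e-24  # assumption: "e-24 or better" is an E-value threshold


def select_hits(hits):
    """Select matches from (query_id, subject_id, evalue) tuples.

    Keeps a subject if it ranks among the TOP_N best matches of any query,
    or if its E-value is at or below E_CUTOFF. Returns a sorted list of
    unique subject identifiers.
    """
    by_query = defaultdict(list)
    for query_id, subject_id, evalue in hits:
        by_query[query_id].append((evalue, subject_id))

    selected = set()  # a set drops subjects found by more than one query
    for ranked in by_query.values():
        ranked.sort()  # lowest (best) E-value first
        for rank, (evalue, subject_id) in enumerate(ranked):
            if rank < TOP_N or evalue <= E_CUTOFF:
                selected.add(subject_id)
    return sorted(selected)
```

Collecting identifiers in a set handles only exact duplicates; removing
near-identical sequences would need a separate comparison step.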
This database contains all of the known NBS-LRR-encoding genes in
Arabidopsis as of December 14, 2000. With the completion of the genome, we
are finalizing the data set; please check back, as we will post these data
shortly. In addition, we are studying the expression patterns and transcript
structure of these genes, and expect to add this information to the
database in the coming months.
RGenes Data in MySQL database
RGenes distribution over chromosomes
Sequence Alignment of NBS region
RGenes Phylogeny
NBS Domain Relationship in Arabidopsis Genome
NBS, P450, PK-LRR Clustering in Arabidopsis Genome
NBS, P450, PK-LRR clustering on the background of duplication events in Arabidopsis genome
HMM profiles of Arabidopsis NBS domain