NAR Molecular Biology Database Collection entry number 130
There are 2,122 patient entries in this ninth edition of the database compared with 1,918 patients last year. Besides point mutations, these show 155 short (defined as fewer than 30 nucleotides) deletions or additions or both, made up of 117 deletions, 29 additions and nine examples involving both additions and deletions. This year we have included a separate table listing gross deletions/insertions in the gene as well as a listing of known polymorphisms. There are also 24 double mutations and one triple mutation. Twenty-one patients are known to have developed inhibitors, two are somatic mosaics and 33 are females (either affected or nonsymptomatic carriers). Of the 2,122 patients listed, 753 show unique molecular events probably causing the disease, while the remainder are repeats. Many of these repeats occur at CG doublets and involve a CG to TG or CA change. As discussed before, such sites are believed to be genuine "hotspots" for mutation. However, it is now becoming clear that the high number of repeat observations at some CG doublets, particularly those causing mild disease (e.g. at 31,008), is caused, at least in part, by founder effects. A founder effect is also responsible for the many repeats of a mutation at residue 31,311 that is not part of a CG doublet. The database attempts to offer a view of the spectrum of mutations causing haemophilia B that is as accurate as possible, and this is helped by the fact that about one third of all mutations have been detected as a result of full population studies. However, some bias cannot be completely avoided. There is an over-representation of mutations causing severe haemophilia, as these tend to be the first analysed and the most likely to come to notice. We also expect under-representation of double mutants, as not all laboratories have done "complete" gene screens.
The distribution of mutants according to protein domains and control regions within the gene shows that mutations have been detected in all regions except the poly(A) site. As expected, mutations are under-represented in exons a and f, reflecting the lack of importance of most of their amino acids (prepeptide and activation peptide), whereas they are over-represented in exons d and h, illustrating the importance of these exons (calcium-binding EGF and catalytic domain, respectively). Perhaps surprisingly, mutations are also under-represented in exon g, suggesting weaker structural constraints on its amino acid sequence. The present list contains 455 different amino acid substitutions (probably detrimental). One hundred and twenty-eight residues of factor IX show two or more amino acid substitutions and 93 show only one. Mutations at nine of the 12 carboxyglutamyl residues have now been detected, confirming their critical role in the function of factor IX. Amino acid substitutions have also been found for every one of the 22 cysteines of circulating factor IX, so mutations have been found that compromise each of the disulphide bridges of the mature protein, confirming the importance of these structures. This year 132 missense mutations were added, of which 23 are new. This year's database is now available on the World Wide Web (http://www.umds.ac.uk/molgen/haemBdatabase.htm), but is too large to publish as a printed table. Copies on disk (in both text and database format) can be obtained from the nearest country coordinator if internet access is not possible.
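The counts quoted in this entry can be cross-checked with simple arithmetic. The sketch below is only an illustrative tally, not part of the database itself. The variable names are hypothetical, and the only inputs are the figures stated in the text above.

```python
# Illustrative tally of figures quoted in the entry; all names are hypothetical.
patients_this_year = 2122   # patient entries in the ninth edition
patients_last_year = 1918   # patient entries in the previous edition
unique_events = 753         # patients with unique molecular events

short_deletions = 117       # short deletions (fewer than 30 nucleotides)
short_additions = 29        # short additions
short_both = 9              # events involving both additions and deletions

residues_multi = 128        # residues with two or more substitutions
residues_single = 93        # residues with exactly one substitution

# Short deletions/additions: the three classes should sum to the stated 155.
short_total = short_deletions + short_additions + short_both

# Patient entries added since the previous edition.
new_entries = patients_this_year - patients_last_year

# Entries that repeat a previously observed molecular event.
repeat_entries = patients_this_year - unique_events

# Share of patient entries that carry a unique molecular event.
unique_fraction = unique_events / patients_this_year

# Factor IX residues with at least one recorded substitution.
residues_with_substitutions = residues_multi + residues_single

print(short_total)                   # 155
print(new_entries)                   # 204
print(repeat_entries)                # 1369
print(f"{unique_fraction:.1%}")      # 35.5%
print(residues_with_substitutions)   # 221
```

On these figures, roughly two thirds of the patient entries are repeat observations, which is why the founder-effect and CG-doublet explanations matter for interpreting the mutation spectrum.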
Category: Human Genes and Diseases
Subcategory: Gene-, system- or disease-specific databases
Go to the abstract in the NAR 1998 Database Issue.