UniProt Knowledgebase
Swiss-Prot Protein Knowledgebase
TrEMBL Protein Database

Release notes
UniProtKB release 6.0 of 13-Sep-2005

Content

  Introduction
  UniProtKB/Swiss-Prot Protein Knowledgebase release statistics
  UniProtKB/TrEMBL Protein Database release statistics

  Submissions and Updates
  Download information
  Contact
  Citation

  Related documents: UniProtKB user manual, Recent changes, Forthcoming changes.

Introduction

Release 6.0 of the UniProt Knowledgebase is composed of the UniProtKB/Swiss-Prot Protein Knowledgebase release 48.0 and the UniProtKB/TrEMBL Protein Database release 31.0.

More information on these databases can be found in the user manual, under "What is the UniProt Knowledgebase?".


UniProtKB/Swiss-Prot protein knowledgebase release 48.0 statistics

Release 48.0 of 13-Sep-2005 of Swiss-Prot contains 194'317 sequence entries, comprising 70'391'852 amino acids abstracted from 133'723 references.

The growth of the database is summarized below.

Release  Date (MM/YY)  Number of entries  Number of amino acids
2.0 09/86 3'939 900'163
3.0 11/86 4'160 969'641
4.0 04/87 4'387 1'036'010
5.0 09/87 5'205 1'327'683
6.0 01/88 6'102 1'653'982
7.0 04/88 6'821 1'885'771
8.0 08/88 7'724 2'224'465
9.0 11/88 8'702 2'498'140
10.0 03/89 10'008 2'952'613
11.0 07/89 10'856 3'265'966
12.0 10/89 12'305 3'797'482
13.0 01/90 13'837 4'347'336
14.0 04/90 15'409 4'914'264
15.0 08/90 16'941 5'486'399
16.0 11/90 18'364 5'986'949
17.0 02/91 20'024 6'524'504
18.0 05/91 20'772 6'792'034
19.0 08/91 21'795 7'173'785
20.0 11/91 22'654 7'500'130
21.0 03/92 23'742 7'866'596
22.0 05/92 25'044 8'375'696
23.0 08/92 26'706 9'011'391
24.0 12/92 28'154 9'545'427
25.0 04/93 29'955 10'214'020
26.0 07/93 31'808 10'875'091
27.0 10/93 33'329 11'484'420
28.0 02/94 36'000 12'496'420
29.0 06/94 38'303 13'464'008
30.0 10/94 40'292 14'147'368
31.0 02/95 43'470 15'335'248
32.0 11/95 49'340 17'385'503
33.0 02/96 52'205 18'531'384
34.0 10/96 59'021 21'210'389
35.0 11/97 69'113 25'083'768
36.0 07/98 74'019 26'840'295
37.0 12/98 77'977 28'268'293
38.0 07/99 80'000 29'085'965
39.0 05/00 86'593 31'411'114
40.0 10/01 101'602 37'315'215
41.0 02/03 122'564 44'986'459
42.0 10/03 135'850 50'046'799
43.0 03/04 146'720 54'093'154
44.0 07/04 153'871 56'608'159
45.0 10/04 163'235 59'631'787
46.0 02/05 168'297 61'443'278
47.0 05/05 181'577 65'746'672
48.0 09/05 194'317 70'391'852
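
As an illustration of how the rows of the table above can be read programmatically, the following minimal sketch parses the apostrophe-grouped counts and derives the growth between the last two releases. The names `Release`, `parse_count` and `parse_row` are hypothetical helpers introduced here; they are not part of any UniProt tool.

```python
from typing import NamedTuple


class Release(NamedTuple):
    version: str
    date: str       # MM/YY, as printed in the table
    entries: int
    residues: int   # number of amino acids


def parse_count(text: str) -> int:
    """Convert an apostrophe-grouped count such as 194'317 to an int."""
    return int(text.replace("'", ""))


def parse_row(line: str) -> Release:
    """Split one whitespace-separated table row into typed fields."""
    version, date, entries, residues = line.split()
    return Release(version, date, parse_count(entries), parse_count(residues))


# The last two rows of the growth table, copied verbatim.
rows = [
    "47.0 05/05 181'577 65'746'672",
    "48.0 09/05 194'317 70'391'852",
]
previous, current = (parse_row(row) for row in rows)

entry_growth = (current.entries - previous.entries) / previous.entries * 100
mean_length = current.residues / current.entries

print(f"Entries grew by {entry_growth:.1f}% between releases "
      f"{previous.version} and {current.version}")
print(f"Mean sequence length in release {current.version}: {mean_length:.0f}")
```

Under these assumptions the sketch gives roughly 7% growth in entries and a mean length of about 362 amino acids, in line with the figures quoted elsewhere in these notes.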

In rare cases, Swiss-Prot entries are removed. Deleted entries are almost exclusively Open Reading Frames (ORFs) that were wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins are not real, we remove them from Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProtKB/Swiss-Prot but have since been deleted from the database.


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

Our efforts to annotate human sequence entries as completely as possible gave rise to the HPI project, and the bacterial model organisms became the focus of the HAMAP project. The current status of the model organisms that are not covered by these two projects is as follows:

Organism        Database cross-references  Index file    Number of sequences
A.thaliana      TAIR                       arath.txt     3'609
C.albicans      None yet                   calbican.txt  390
C.elegans       Wormpep                    celegans.txt  2'667
D.discoideum    DictyBase                  dicty.txt     325
D.melanogaster  FlyBase                    fly.txt       2'273
M.musculus      MGD                        mgdtosp.txt   9'933
S.cerevisiae    SGD                        yeast.txt     5'139
S.pombe         GeneDB_SPombe              pombe.txt     2'840

UniProtKB/Swiss-Prot release statistics

1.  INTRODUCTION

Release 48.0 of 13-Sep-2005 of Swiss-Prot contains 194'317 sequence entries,
comprising 70'391'852 amino acids abstracted from 133'723 references. 

Since release 47, 11'963 sequences have been added, the sequence data of
1'095 existing entries has been updated, and the annotations of
93'692 entries have been revised. The number of entries has increased by 7%.

The growth of the database is summarized in the release table earlier in these notes.


2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 7.83   Gln (Q) 3.94   Leu (L) 9.64   Ser (S) 6.85
   Arg (R) 5.35   Glu (E) 6.63   Lys (K) 5.93   Thr (T) 5.42
   Asn (N) 4.18   Gly (G) 6.94   Met (M) 2.37   Trp (W) 1.15
   Asp (D) 5.32   His (H) 2.28   Phe (F) 4.00   Tyr (Y) 3.06
   Cys (C) 1.53   Ile (I) 5.92   Pro (P) 4.83   Val (V) 6.72

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.01


   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Thr, Arg, Asp, Pro, Asn, Phe,
   Gln, Tyr, Met, His, Cys, Trp
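
   The ordering in 2.2 follows directly from the percentages in 2.1. The
   sketch below, which is only an illustration and not the procedure used to
   build the release, copies the twenty values into a dictionary
   (`composition` is a name chosen here) and sorts them by frequency.

```python
# Percentages copied from section 2.1 (standard amino acids only).
composition = {
    "Ala": 7.83, "Gln": 3.94, "Leu": 9.64, "Ser": 6.85,
    "Arg": 5.35, "Glu": 6.63, "Lys": 5.93, "Thr": 5.42,
    "Asn": 4.18, "Gly": 6.94, "Met": 2.37, "Trp": 1.15,
    "Asp": 5.32, "His": 2.28, "Phe": 4.00, "Tyr": 3.06,
    "Cys": 1.53, "Ile": 5.92, "Pro": 4.83, "Val": 6.72,
}

# Most frequent first; no two values are equal, so the order is unambiguous.
ranked = sorted(composition, key=composition.get, reverse=True)

print(", ".join(ranked))
```

   The printed list starts with Leu, Ala, Gly and ends with Cys, Trp, matching
   the classification given above.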


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of Swiss-Prot: 9'479

   The twenty most represented species account for 66639 sequences, or
   34.3 % of the total number of entries.


   3.1 Table of the frequency of occurrence of species

        Species represented 1x: 4552
                            2x: 1489
                            3x:  734
                            4x:  476
                            5x:  320
                            6x:  281
                            7x:  197
                            8x:  156
                            9x:  138
                           10x:   78
                       11- 20x:  382
                       21- 50x:  287
                       51-100x:  111
                         >100x:  278


   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      12860  Homo sapiens (Human)
       2       9933  Mus musculus (Mouse)
       3       5139  Saccharomyces cerevisiae (Baker's yeast)
       4       4846  Escherichia coli
       5       4570  Rattus norvegicus (Rat)
       6       3609  Arabidopsis thaliana (Mouse-ear cress)
       7       2840  Schizosaccharomyces pombe (Fission yeast)
       8       2814  Bacillus subtilis
       9       2667  Caenorhabditis elegans
      10       2273  Drosophila melanogaster (Fruit fly)
      11       1782  Methanococcus jannaschii
      12       1772  Haemophilus influenzae
      13       1758  Escherichia coli O157:H7
      14       1653  Bos taurus (Bovine)
      15       1512  Salmonella typhimurium
      16       1438  Escherichia coli O6
      17       1404  Shigella flexneri
      18       1403  Mycobacterium tuberculosis
      19       1230  Gallus gallus (Chicken)
      20       1136  Mycobacterium bovis
      21       1106  Salmonella typhi
      22       1029  Pseudomonas aeruginosa
      23       1001  Xenopus laevis (African clawed frog)
      24        983  Sus scrofa (Pig)
      25        964  Synechocystis sp. (strain PCC 6803)
      26        964  Archaeoglobus fulgidus
      27        823  Rhizobium meliloti (Sinorhizobium meliloti)
      28        810  Vibrio cholerae
      29        809  Yersinia pestis
      30        770  Oryctolagus cuniculus (Rabbit)
      31        746  Aquifex aeolicus
      32        694  Pasteurella multocida
      33        687  Mycoplasma pneumoniae
      34        661  Pongo pygmaeus (Orangutan)
      35        652  Vibrio parahaemolyticus
      36        644  Streptomyces coelicolor
      37        632  Bacillus halodurans
      38        621  Mycobacterium leprae
      39        608  Treponema pallidum
      40        603  Canis familiaris (Dog)
      41        599  Vibrio vulnificus
      42        591  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      43        588  Staphylococcus aureus (strain N315)
      44        587  Anabaena sp. (strain PCC 7120)
      45        583  Methanobacterium thermoautotrophicum
      46        578  Vibrio vulnificus (strain YJ016)
      47        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
      48        571  Staphylococcus aureus (strain MW2)
      49        566  Oryza sativa (Rice)
      50        563  Helicobacter pylori (Campylobacter pylori)
      51        562  Buchnera aphidicola subsp. Schizaphis graminum
      52        546  Pseudomonas putida (strain KT2440)
      53        546  Rickettsia prowazekii
      54        544  Helicobacter pylori J99 (Campylobacter pylori J99)
      55        541  Pseudomonas syringae pv. tomato
      56        531  Bacillus anthracis
      57        528  Lactococcus lactis subsp. lactis (Streptococcus lactis)
      58        528  Staphylococcus epidermidis
      59        524  Bradyrhizobium japonicum
      60        523  Brachydanio rerio (Zebrafish) (Danio rerio)
      61        521  Zea mays (Maize)
      62        517  Ralstonia solanacearum (Pseudomonas solanacearum)
      63        513  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      64        512  Listeria monocytogenes
      65        507  Buchnera aphidicola subsp. Baizongia pistaciae
      66        506  Listeria innocua
      67        500  Rhizobium loti (Mesorhizobium loti)
      68        493  Xanthomonas campestris pv. campestris
      69        493  Neisseria meningitidis serogroup B
      70        490  Neisseria meningitidis serogroup A
      71        488  Photorhabdus luminescens subsp. laumondii
      72        486  Mycoplasma genitalium
      73        485  Clostridium acetobutylicum
      74        475  Caulobacter crescentus
      75        467  Thermotoga maritima
      76        462  Staphylococcus aureus (strain MRSA252)
      77        461  Staphylococcus aureus (strain MSSA476)
      78        459  Shewanella oneidensis
      79        458  Bacillus cereus (strain ATCC 14579 / DSM 31)
      80        456  Xanthomonas axonopodis pv. citri
      81        453  Streptococcus pneumoniae
      82        451  Pan troglodytes (Chimpanzee)
      83        447  Xylella fastidiosa
      84        441  Deinococcus radiodurans
      85        440  Listeria monocytogenes serotype 4b (strain F2365)
      86        437  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
      87        436  Pyrococcus horikoshii
      88        431  Chlamydia trachomatis
      89        431  Pyrococcus abyssi
      90        430  Methanosarcina acetivorans
      91        426  Halobacterium salinarium (Halobacterium halobium)
      92        423  Brucella melitensis
      93        423  Brucella suis
      94        422  Clostridium perfringens
      95        421  Corynebacterium glutamicum (Brevibacterium flavum)
      96        419  Oceanobacillus iheyensis
      97        419  Haemophilus ducreyi
      98        418  Borrelia burgdorferi (Lyme disease spirochete)
      99        418  Neurospora crassa
     100        417  Mimivirus
     101        412  Chlamydia pneumoniae (Chlamydophila pneumoniae)
     102        410  Methanosarcina mazei (Methanosarcina frisia)
     103        404  Rhizobium sp. (strain NGR234)
     104        402  Chlamydia muridarum
     105        399  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     106        399  Yersinia pseudotuberculosis
     107        398  Pyrococcus furiosus
     108        390  Thermoanaerobacter tengcongensis
     109        390  Candida albicans (Yeast)
     110        389  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
     111        388  Lactobacillus plantarum
     112        385  Campylobacter jejuni
     113        384  Ovis aries (Sheep)
     114        383  Sulfolobus solfataricus
     115        375  Streptococcus mutans
     116        372  Synechococcus elongatus (Thermosynechococcus elongatus)
     117        370  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     118        369  Nicotiana tabacum (Common tobacco)
     119        367  Rickettsia conorii
     120        366  Streptococcus pyogenes serotype M1
     121        363  Streptococcus pyogenes serotype M6
     122        363  Bordetella pertussis
     123        361  Enterococcus faecalis (Streptococcus faecalis)
     124        360  Chromobacterium violaceum
     125        360  Streptococcus pyogenes serotype M18
     126        359  Bordetella parapertussis
     127        359  Streptococcus pyogenes serotype M3
     128        359  Streptomyces avermitilis
     129        346  Chlorobium tepidum
     130        338  Aeropyrum pernix
     131        338  Staphylococcus aureus
     132        332  Methanopyrus kandleri
     133        330  Leptospira interrogans
     134        329  Corynebacterium efficiens
     135        329  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
     136        328  Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
     137        325  Dictyostelium discoideum (Slime mold)
     138        319  Leptospira interrogans serogroup Icterohaemorrhagiae serovar copenhageni
     139        313  Bacillus cereus (strain ATCC 10987)
     140        313  Nitrosomonas europaea
     141        313  Pisum sativum (Garden pea)
     142        309  Staphylococcus aureus (strain COL)
     143        309  Sulfolobus tokodaii
     144        307  Kluyveromyces lactis (Yeast)
     145        297  Streptococcus agalactiae serotype V
     146        297  Streptococcus agalactiae serotype III
     147        297  Thermoplasma acidophilum
     148        296  Gloeobacter violaceus
     149        294  Photobacterium profundum (Photobacterium sp. (strain SS9))
     150        294  Ashbya gossypii (Yeast) (Eremothecium gossypii)
     151        285  Triticum aestivum (Wheat)
     152        280  Synechococcus sp. (strain WH8102)
     153        280  Fusobacterium nucleatum subsp. nucleatum
     154        279  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
     155        278  Pseudomonas putida
     156        273  Prochlorococcus marinus (strain MIT 9313)
     157        273  Hordeum vulgare (Barley)
     158        270  Lycopersicon esculentum (Tomato)
     159        268  Cavia porcellus (Guinea pig)
     160        268  Bacteriophage T4
     161        268  Glycine max (Soybean)
     162        267  Macaca mulatta (Rhesus macaque)
     163        265  Rhodopseudomonas palustris
     164        265  Prochlorococcus marinus
     165        264  Pyrobaculum aerophilum
     166        262  Coxiella burnetii
     167        261  Thermoplasma volcanium
     168        257  Solanum tuberosum (Potato)
     169        257  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
     170        256  Clostridium tetani
     171        254  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
     172        254  Vaccinia virus (strain Copenhagen) (VACV)
     173        253  Candida glabrata (Yeast) (Torulopsis glabrata)
     174        252  Acinetobacter sp. (strain ADP1)
     175        251  Emericella nidulans (Aspergillus nidulans)
     176        250  Bacteroides thetaiotaomicron
     177        249  Bacillus thuringiensis subsp. konkukian
     178        246  Salmonella paratyphi-a
     179        245  Spinacia oleracea (Spinach)
     180        245  Wolinella succinogenes
     181        242  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
     182        239  Mycobacterium paratuberculosis
     183        235  Bacillus stearothermophilus
     184        233  Wigglesworthia glossinidia brevipalpis
     185        231  Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)
     186        228  Equus caballus (Horse)
     187        227  Chlamydophila caviae
     188        227  Bifidobacterium longum
     189        223  Geobacter sulfurreducens
     190        221  Rhodopirellula baltica
     191        220  Porphyra purpurea
     192        219  Porphyromonas gingivalis (Bacteroides gingivalis)
     193        219  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     194        217  Corynebacterium diphtheriae
     195        216  Chlamydomonas reinhardtii
     196        214  Helicobacter hepaticus
     197        213  Methanococcus maripaludis
     198        212  Bacillus clausii (strain KSM-K16)
     199        211  Bacillus cereus (strain ZK)
     200        210  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
     201        209  Klebsiella pneumoniae
     202        209  Thermus thermophilus (strain HB27 / ATCC BAA-163 / DSM 7039)
     203        203  Haloarcula marismortui (Halobacterium marismortui)
     204        202  Mannheimia succiniciproducens (strain MBEL55E)
     205        202  Yarrowia lipolytica (Candida lipolytica)
     206        200  Vaccinia virus (strain Western Reserve / WR) (VACV)
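
   The share of the first twenty species quoted at the start of section 3 can
   be reproduced from the first twenty rows of the table above. The following
   is a minimal sketch with the counts copied by hand; the variable names are
   illustrative only.

```python
# Frequencies of the twenty most represented species (rows 1 to 20 of 3.2).
top_twenty = [
    12860, 9933, 5139, 4846, 4570, 3609, 2840, 2814, 2667, 2273,
    1782, 1772, 1758, 1653, 1512, 1438, 1404, 1403, 1230, 1136,
]
total_entries = 194317  # entries in Swiss-Prot release 48.0

covered = sum(top_twenty)
share = covered / total_entries * 100

print(f"{covered} sequences, {share:.1f} % of all entries")
```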


   
   3.3  Taxonomic distribution of the sequences

   Kingdom        sequences (% of the database)
    Archaea            9783 (  5%)
    Bacteria          89394 ( 46%)
    Eukaryota         85403 ( 44%)
    Viruses            9737 (  5%)


   Within Eukaryota:

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  12861 ( 15%)           (  7%)
     Other Mammalia         25396 ( 30%)           ( 13%)
     Other Vertebrata        7582 (  9%)           (  4%)
     Viridiplantae          13805 ( 16%)           (  7%)
     Fungi                  12450 ( 15%)           (  6%)
     Insecta                 4391 (  5%)           (  2%)
     Nematoda                2971 (  3%)           (  2%)
     Other                   5947 (  7%)           (  3%)
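
   The percentages in 3.3 are whole-number roundings of each count divided by
   the relevant total. The sketch below recomputes them as a consistency
   check; `percent` is a hypothetical helper, and plain rounding to the
   nearest integer is an assumption that happens to reproduce every value
   printed above.

```python
def percent(count: int, total: int) -> int:
    """Share of `total`, rounded to a whole percent."""
    return round(count / total * 100)


kingdoms = {"Archaea": 9783, "Bacteria": 89394, "Eukaryota": 85403, "Viruses": 9737}
eukaryota = {
    "Human": 12861, "Other Mammalia": 25396, "Other Vertebrata": 7582,
    "Viridiplantae": 13805, "Fungi": 12450, "Insecta": 4391,
    "Nematoda": 2971, "Other": 5947,
}

database_total = sum(kingdoms.values())   # all Swiss-Prot entries
eukaryota_total = kingdoms["Eukaryota"]   # denominator for the second table

for name, count in kingdoms.items():
    print(f"{name:<18}{count:>7} ({percent(count, database_total):>3}%)")

for name, count in eukaryota.items():
    print(f"{name:<18}{count:>7} ({percent(count, eukaryota_total):>3}%)"
          f" ({percent(count, database_total):>3}%)")
```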


4.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    4028             1001-1100     1637
                 51- 100   13893             1101-1200     1126
                101- 150   19808             1201-1300      842
                151- 200   18963             1301-1400      639
                201- 250   19439             1401-1500      495
                251- 300   16591             1501-1600      309
                301- 350   17240             1601-1700      230
                351- 400   15608             1701-1800      177
                401- 450   12090             1801-1900      189
                451- 500   10072             1901-2000      154
                501- 550    7724             2001-2100       95
                551- 600    5156             2101-2200      148
                601- 650    4379             2201-2300      119
                651- 700    3103             2301-2400       84
                701- 750    2629             2401-2500       64
                751- 800    2190                 >2500      480
                801- 850    1774
                851- 900    1891
                901- 950    1339
                951-1000    1108


   The average sequence length in Swiss-Prot is 362 amino acids.

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  SYNE1_HUMAN (Q8NF91):  8797 amino acids.


5.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of Swiss-Prot: 1618


   5.1 Table of the frequency of journal citations

        Journals cited 1x:  577
                       2x:  226
                       3x:  114
                       4x:   77
                       5x:   55
                       6x:   36
                       7x:   33
                       8x:   36
                       9x:   21
                      10x:   15
                  11- 20x:  124
                  21- 50x:  132
                  51-100x:   56
                    >100x:  116


   5.2  List of the most cited journals in Swiss-Prot

   No.   Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        12470   Journal of Biological Chemistry
    2         6211   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4207   Journal of Bacteriology
    4         3922   Gene
    5         3880   Nucleic Acids Research
    6         3338   Biochemical and Biophysical Research Communications
    7         3249   FEBS Letters
    8         2906   Biochemistry
    9         2799   European Journal of Biochemistry
   10         2756   The EMBO Journal
   11         2524   Nature
   12         2458   Biochimica et Biophysica Acta
   13         2228   Journal of Molecular Biology
   14         2134   Molecular and Cellular Biology
   15         2118   Genomics
   16         2018   Cell
   17         1619   Biochemical Journal
   18         1490   Science
   19         1337   Molecular Microbiology
   20         1251   Plant Molecular Biology
   21         1241   Molecular and General Genetics
   22         1020   Journal of Cell Biology
   23         1013   Journal of Biochemistry
   24          963   Virology
   25          961   Human Molecular Genetics
   26          894   Nature Genetics
   27          854   Journal of Virology
   28          837   Genes and Development
   29          778   The American Journal of Human Genetics
   30          766   Oncogene
   31          757   Plant Physiology
   32          735   Human Mutation
   33          669   Infection and Immunity
   34          668   Journal of Immunology
   35          638   Structure
   36          636   Archives of Biochemistry and Biophysics
   37          626   Yeast
   38          625   Development
   39          578   Journal of General Virology
   40          561   Genetics
   41          559   Microbiology
   42          517   FEMS Microbiology Letters
   43          507   Nature Structural Biology
   44          473   Blood
   45          457   Human Genetics
   46          452   Current Genetics
   47          410   Molecular Biology of the Cell
   48          396   Applied and Environmental Microbiology
   49          394   The Plant Cell
   50          390   Molecular and Biochemical Parasitology
   51          384   Journal of Clinical Investigation
   52          383   Developmental Biology
   53          374   Cancer Research
   54          370   Mammalian Genome
   55          367   Journal of Cell Science
   56          361   Protein Science
   57          358   Mechanisms of Development
   58          356   Molecular Endocrinology
   59          346   Neuron
   60          344   Acta Crystallographica, Section D
   61          340   Immunogenetics
   62          331   The Journal of Experimental Medicine
   63          327   Journal of Molecular Evolution
   64          326   The Plant Journal
   65          316   DNA and Cell Biology
   66          315   Journal of Neuroscience
   67          314   Molecular Cell
   68          298   Endocrinology
   69          283   Biological Chemistry Hoppe-Seyler
   70          279   DNA Sequence
   71          276   Journal of Neurochemistry
   72          268   Current Biology
   73          259   The Journal of Clinical Endocrinology and Metabolism
   74          255   Molecular Biology and Evolution
   75          247   Brain Research. Molecular Brain Research
   76          240   Journal of General Microbiology
   77          240   Bioscience, Biotechnology, and Biochemistry
   78          239   Toxicon
   79          238   American Journal of Physiology
   80          227   Cytogenetics and Cell Genetics
   81          216   Comparative Biochemistry and Physiology
   82          214   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
   83          198   Antimicrobial Agents and Chemotherapy
   84          189   Molecular Pharmacology
   85          181   Journal of Investigative Dermatology
   86          179   Proteins
   87          173   Journal of Medical Genetics
   88          163   Peptides
   89          161   DNA Research
   90          158   DNA
   91          157   Molecular Plant-Microbe Interactions
   92          155   Genome Research
   93          154   Virus Research
   94          154   American Journal of Medical Genetics
   95          152   Tissue Antigens
   96          151   Plant and Cell Physiology
   97          144   Biochimie
   98          144   European Journal of Immunology
   99          143   Biology of Reproduction
  100          138   Bioorganicheskaia Khimiia
  101          137   Molecular and Cellular Endocrinology
  102          135   Hemoglobin
  103          119   Archives of Microbiology
  104          119   Insect Biochemistry and Molecular Biology
  105          119   Molecular Phylogenetics and Evolution
  106          118   Agricultural and Biological Chemistry
  107          117   Experimental Cell Research
  108          112   Journal of Human Genetics
  109          112   Annals of Neurology
  110          110   Nature Cell Biology
  111          110   European Journal of Human Genetics
  112          109   General and Comparative Endocrinology
  113          108   Neurology
  114          106   RNA
  115          104   Diabetes
  116          103   Developmental Dynamics


6.  STATISTICS FOR SOME LINE TYPES

The following table gives, for selected Swiss-Prot line types, the total
number of lines, the number of entries containing at least one such line,
and the average number of lines per entry.

                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                     378807              1.95
   Journal                          335752    181666    1.73
   Submitted to EMBL/GenBank/DDBJ    40044     34264    0.21
   Submitted to Swiss-Prot             671       668   <0.01
   Plant Gene Register                 510       498   <0.01
   Book citation                       494       482   <0.01
   Unpublished observations            469       465   <0.01
   Submitted to other databases        353       345   <0.01
   Thesis                              301       299   <0.01
   Patent                              124       122   <0.01
   Unpublished results                  83        81   <0.01
   Worm Breeder's Gazette                6         6   <0.01

Comments (CC)                       732470              3.77
   SIMILARITY                       208863    174553    1.07
   FUNCTION                         130272    127160    0.67
   SUBCELLULAR LOCATION              97545     97545    0.50
   CATALYTIC ACTIVITY                69879     65211    0.36
   SUBUNIT                           64500     64500    0.33
   PATHWAY                           35314     32279    0.18
   COFACTOR                          27402     24680    0.14
   TISSUE SPECIFICITY                20576     20576    0.11
   PTM                               13541     11921    0.07
   MISCELLANEOUS                     12757     11828    0.07
   DOMAIN                             9667      8549    0.05
   ALTERNATIVE PRODUCTS               7803      7803    0.04
   CAUTION                            7087      6298    0.04
   INDUCTION                          5374      5374    0.03
   INTERACTION                        4966      4966    0.03
   DEVELOPMENTAL STAGE                4928      4928    0.03
   DISEASE                            3066      2236    0.02
   ENZYME REGULATION                  2770      2770    0.01
   MASS SPECTROMETRY                  1881      1597    0.01
   DATABASE                           1413      1322    0.01
   BIOPHYSICOCHEMICAL PROPERTIES      1117      1117    0.01
   POLYMORPHISM                        509       496   <0.01
   RNA EDITING                         401       401   <0.01
   ALLERGEN                            387       387   <0.01
   TOXIC DOSE                          277       276   <0.01
   BIOTECHNOLOGY                       117       117   <0.01
   PHARMACEUTICAL                       58        58   <0.01

Features (FT)                      1082233              5.57
   TRANSMEM                         123597     26975    0.64
   METAL                             77253     19387    0.40
   CONFLICT                          71568     24920    0.37
   TOPO_DOM                          62341     12741    0.32
   TURN                              62287      4652    0.32
   CARBOHYD                          61962     15580    0.32
   STRAND                            57083      4155    0.29
   DISULFID                          57006     15630    0.29
   DOMAIN                            55803     29830    0.29
   ACT_SITE                          45107     26532    0.23
   HELIX                             44973      4509    0.23
   REPEAT                            42352      6078    0.22
   VARIANT                           34916      6771    0.18
   CHAIN                             31363     25429    0.16
   NP_BIND                           26977     18970    0.14
   MOD_RES                           25644     13037    0.13
   REGION                            22431     11354    0.12
   BINDING                           20862     11717    0.11
   SIGNAL                            19975     19973    0.10
   COMPBIAS                          18938     10358    0.10
   VARSPLIC                          16016      6973    0.08
   MUTAGEN                           12686      3279    0.07
   ZN_FING                           12442      4999    0.06
   SITE                              12122      6874    0.06
   NON_TER                           10993      8349    0.06
   MOTIF                             10543      7622    0.05
   INIT_MET                           8934      8858    0.05
   PROPEP                             6568      5481    0.03
   DNA_BIND                           5748      5387    0.03
   LIPID                              5732      3787    0.03
   COILED                             5564      3389    0.03
   PEPTIDE                            4022      1851    0.02
   TRANSIT                            3372      3342    0.02
   CA_BIND                            2334       941    0.01
   NON_CONS                           1096       526    0.01
   CROSSLNK                           1001       761    0.01
   UNSURE                              417       170   <0.01
   SE_CYS                              205       139   <0.01

Cross-references (DR)              2038749             10.49
   InterPro                         399585    178352    2.06
   EMBL                             371472    186572    1.91
   Pfam                             235069    172308    1.21
   PROSITE                          175229    108545    0.90
   GO                                95530     27032    0.49
   PIR                               93479     86872    0.48
   PRINTS                            74849     58177    0.39
   HSSP                              73924     73924    0.38
   TIGRFAMs                          72463     67688    0.37
   HAMAP                             66305     66201    0.34
   ProDom                            52161     50186    0.27
   SMART                             48269     36731    0.25
   PANTHER                           47077     44568    0.24
   Ensembl                           36475     36473    0.19
   PDB                               30616      8392    0.16
   SMR                               26242     26242    0.14
   TIGR                              19090     18555    0.10
   PIRSF                             15949     15699    0.08
   HGNC                              12075     12018    0.06
   MIM                               11065      9064    0.06
   MGI                                9693      9659    0.05
   IntAct                             7064      7064    0.04
   SGD                                5192      5129    0.03
   GermOnline                         4926      4876    0.03
   EcoGene                            4225      4223    0.02
   EchoBASE                           4159      4127    0.02
   MEROPS                             3861      3746    0.02
   H-InvDB                            3676      3658    0.02
   TAIR                               3675      3603    0.02
   RGD                                3231      3228    0.02
   WormPep                            3097      2666    0.02
   FlyBase                            2883      2852    0.01
   GeneDB_Spombe                      2872      2838    0.01
   TRANSFAC                           2782      2494    0.01
   SubtiList                          2757      2756    0.01
   WormBase                           2738      2661    0.01
   Gramene                            1890      1883    0.01
   StyGene                            1467      1464    0.01
   TubercuList                        1431      1395    0.01
   SWISS-2DPAGE                       1155      1155    0.01
   GeneFarm                           1059      1053    0.01
   ListiList                          1019      1011    0.01
   Reactome                            992       992    0.01
   Leproma                             625       621   <0.01
   ZFIN                                516       509   <0.01
   PhotoList                           488       488   <0.01
   MaizeDB                             426       421   <0.01
   HIV                                 370       365   <0.01
   REBASE                              367       362   <0.01
   OGP                                 367       367   <0.01
   ECO2DBASE                           351       299   <0.01
   DictyBase                           326       324   <0.01
   AGD                                 300       294   <0.01
   SagaList                            298       297   <0.01
   LegioList                           286       286   <0.01
   GlycoSuiteDB                        283       283   <0.01
   PHCI-2DPAGE                         239       239   <0.01
   MypuList                            175       175   <0.01
   Aarhus/Ghent-2DPAGE                 128        98   <0.01
   Siena-2DPAGE                        103       103   <0.01
   HSC-2DPAGE                           85        85   <0.01
   COMPLUYEAST-2DPAGE                   59        59   <0.01
   PhosSite                             54        54   <0.01
   PMMA-2DPAGE                          52        52   <0.01
   Maize-2DPAGE                         39        39   <0.01
   Rat-heart-2DPAGE                     28        28   <0.01
   ANU-2DPAGE                           16        16   <0.01

Number of explicitly cross-referenced databases: 69
Number of implicitly cross-referenced databases: 31


7.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in Swiss-Prot: 208469

Total number of entries encoded on a plastid: 64  
Total number of entries encoded on a mitochondrion: 3334
Total number of entries encoded on a plasmid: 3046

Number of fragments: 8504
Number of additional sequences encoded on splice variants: 12128


UniProtKB/TrEMBL protein database release 31.0 statistics


1.  INTRODUCTION

Release 31.0 of 13-Sep-2005 of UniProtKB/TrEMBL has been produced in sync
with UniProtKB/Swiss-Prot release 48.0 and EMBL/DDBJ/GenBank nucleotide
sequence database release 83 and its updates up to 19-Aug-2005. It contains
2'055'517 sequence entries, comprising 680'464'593 amino acids.

405'513 sequences have been added since release 30. This represents an 
increase of 27%.

The document delac_tr.txt lists all accession numbers that were previously
present in TrEMBL but have now been deleted from the database. Most deletions
result from the deletion of the corresponding CDS in the source nucleotide
sequence databases EMBL-Bank/DDBJ/GenBank. In addition, some entries are
recognised to be open reading frames (ORFs) that were wrongly predicted to
code for proteins. When there is enough evidence that these hypothetical
proteins are not real, we remove them from TrEMBL.
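
As an illustration, a local list of accession numbers could be checked
against delac_tr.txt with a short script. The Python sketch below is only a
minimal example. It assumes that the file holds one accession number per
line, possibly mixed with free-text header lines, and that accession numbers
have the six-character form seen in this release (for example Q16047). The
function names and the sample lines are hypothetical.

```python
import re

# Assumption: six-character accession numbers, i.e. one of O, P or Q,
# a digit, three letters or digits, and a final digit (e.g. Q16047).
ACCESSION = re.compile(r"^[OPQ][0-9][A-Z0-9]{3}[0-9]$")


def read_deleted_accessions(lines):
    """Collect accession numbers from an iterable of text lines.

    Lines that are not a bare accession number (headers, separators,
    blank lines) are skipped.
    """
    deleted = set()
    for line in lines:
        token = line.strip()
        if ACCESSION.match(token):
            deleted.add(token)
    return deleted


def find_deleted(my_accessions, deleted):
    """Return the accessions of interest that were deleted, sorted."""
    return sorted(set(my_accessions) & deleted)


# Hypothetical file content, for illustration only.
sample_lines = ["Deleted accession numbers", "-----", "", "Q9ZZZ1", "O12345"]
deleted = read_deleted_accessions(sample_lines)
print(find_deleted(["Q9ZZZ1", "P00001"], deleted))
```

With a real copy of the file, the lines would come from an open file
object instead of the sample list.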


2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 7.85   Gln (Q) 3.87   Leu (L) 9.72   Ser (S) 7.14
   Arg (R) 5.38   Glu (E) 6.08   Lys (K) 5.51   Thr (T) 5.71
   Asn (N) 4.48   Gly (G) 6.87   Met (M) 2.38   Trp (W) 1.35
   Asp (D) 5.13   His (H) 2.27   Phe (F) 4.11   Tyr (Y) 3.11
   Cys (C) 1.49   Ile (I) 5.96   Pro (P) 4.94   Val (V) 6.48

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.06


   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Ser, Gly, Val, Glu, Ile, Thr, Lys, Arg, Asp, Pro, Asn, Phe,
   Gln, Tyr, Met, His, Cys, Trp
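
   The ordering in 2.2 follows directly from the percentages in 2.1. A
   minimal Python sketch of that derivation is given here. The values are
   copied from the table above, and the function name is hypothetical.

```python
# Percentages copied from table 2.1 (the 20 standard amino acids only;
# the ambiguity codes B, Z and X are left out).
composition = {
    "Ala": 7.85, "Gln": 3.87, "Leu": 9.72, "Ser": 7.14,
    "Arg": 5.38, "Glu": 6.08, "Lys": 5.51, "Thr": 5.71,
    "Asn": 4.48, "Gly": 6.87, "Met": 2.38, "Trp": 1.35,
    "Asp": 5.13, "His": 2.27, "Phe": 4.11, "Tyr": 3.11,
    "Cys": 1.49, "Ile": 5.96, "Pro": 4.94, "Val": 6.48,
}


def rank_by_frequency(table):
    """Return residue names ordered from most to least frequent."""
    return sorted(table, key=table.get, reverse=True)


ranking = rank_by_frequency(composition)
print(", ".join(ranking))
```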


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of 
   UniProtKB/TrEMBL: 95545

   The first twenty species represent 571629 sequences: 27.1 % of the
   total number of entries.


   3.1 Table of the frequency of occurrence of species

        Species represented 1x:46664
                            2x:18167
                            3x: 9165
                            4x: 4827
                            5x: 2821
                            6x: 2208
                            7x: 1484
                            8x: 1222
                            9x: 1005
                           10x:  763
                       11- 20x: 3497
                       21- 50x: 1926
                       51-100x:  775
                         >100x: 1021
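
   A table of this shape could be produced from per-species entry counts
   roughly as sketched below in Python. This is an illustrative sketch, not
   the procedure used to build the release. The function names and the
   sample counts are hypothetical; the published row values are copied from
   the table above and are only used to check that they add up to the
   species total given at the start of this section.

```python
from collections import Counter

# Multi-value rows of table 3.1 as (low, high) bounds.
RANGES = [(11, 20), (21, 50), (51, 100)]


def occurrence_label(n):
    """Map an entry count n (>= 1) to its row label in table 3.1."""
    if n <= 10:
        return f"{n}x"
    for low, high in RANGES:
        if low <= n <= high:
            return f"{low}-{high}x"
    return ">100x"


def occurrence_table(entries_per_species):
    """Count how many species fall into each row of the table."""
    return Counter(occurrence_label(n) for n in entries_per_species.values())


# Hypothetical per-species counts, for illustration only.
sample = {"species A": 1, "species B": 1, "species C": 7,
          "species D": 35, "species E": 138508}
table = occurrence_table(sample)
print(table["1x"], table["7x"], table["21-50x"], table[">100x"])

# Consistency check on the published rows: they sum to the species total.
published = [46664, 18167, 9165, 4827, 2821, 2208, 1484,
             1222, 1005, 763, 3497, 1926, 775, 1021]
assert sum(published) == 95545
```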


   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1     138508  Human immunodeficiency virus 1
       2      58027  Homo sapiens (Human)
       3      49342  Oryza sativa (japonica cultivar-group)
       4      39688  Arabidopsis thaliana (Mouse-ear cress)
       5      39144  Mus musculus (Mouse)
       6      27998  Tetraodon nigroviridis (Green puffer)
       7      25252  Drosophila melanogaster (Fruit fly)
       8      25184  Hepatitis C virus
       9      20341  Caenorhabditis elegans
      10      20090  Trypanosoma cruzi
      11      15223  Anopheles gambiae str. PEST
      12      14672  Plasmodium chabaudi
      13      14614  Dictyostelium discoideum (Slime mold)
      14      13522  Brachydanio rerio (Zebrafish) (Danio rerio)
      15      13197  Caenorhabditis briggsae
      16      11765  Plasmodium berghei
      17      11636  Gibberella zeae PH-1
      18      11543  Xenopus laevis (African clawed frog)
      19      11007  Magnaporthe grisea 70-15
      20      10876  Neurospora crassa
      21       9872  Aspergillus fumigatus Af293
      22       9826  Rattus norvegicus (Rat)
      23       9676  Schistosoma japonicum (Blood fluke)
      24       9474  Aspergillus nidulans FGSC A4
      25       9168  Candida albicans SC5314
      26       9092  Entamoeba histolytica HM-1:IMSS
      27       8990  Hepatitis B virus
      28       8349  uncultured bacterium
      29       8212  Leishmania major
      30       8122  Bradyrhizobium japonicum
      31       8063  Solibacter usitatus Ellin6076
      32       7801  Plasmodium yoelii yoelii
      33       7663  Burkholderia vietnamiensis G4
      34       7563  Streptomyces coelicolor
      35       7349  Streptomyces avermitilis
      36       7236  Escherichia coli
      37       7178  Rhizobium loti (Mesorhizobium loti)
      38       7050  Rhodopirellula baltica
      39       7049  Burkholderia cenocepacia HI2424
      40       6994  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      41       6545  Pseudomonas aeruginosa
      42       6531  Cryptococcus neoformans var. neoformans B-3501A
      43       6498  Ustilago maydis 521
      44       6456  Burkholderia cenocepacia AU 1054
      45       6433  Ralstonia eutropha JMP134
      46       6399  Yarrowia lipolytica (Candida lipolytica)
      47       6394  Giardia lamblia ATCC 50803
      48       6243  Bacillus anthracis
      49       6180  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
      50       6124  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
      51       5905  Bacillus cereus G9241
      52       5848  Cryptococcus neoformans var. neoformans JEC21
      53       5757  Nocardia farcinica
      54       5701  Burkholderia pseudomallei (Pseudomonas pseudomallei)
      55       5694  Rhizobium meliloti (Sinorhizobium meliloti)
      56       5661  Crocosphaera watsonii
      57       5644  Polaromonas sp. JS666
      58       5556  Anabaena sp. (strain PCC 7120)
      59       5507  Bacillus cereus (strain ATCC 10987)
      60       5474  Gallus gallus (Chicken)
      61       5429  Trypanosoma brucei
      62       5421  Bacillus cereus (strain ZK)
      63       5226  Plasmodium falciparum (isolate 3D7)
      64       5193  Yersinia pestis
      65       5183  Helicobacter pylori (Campylobacter pylori)
      66       5131  Kluyveromyces lactis (Yeast)
      67       5074  Pseudomonas syringae pv. syringae (strain B728a)
      68       5055  Photobacterium profundum (Photobacterium sp. (strain SS9))
      69       5043  Candida glabrata (Yeast) (Torulopsis glabrata)
      70       5042  Pseudomonas syringae pv. phaseolicola 1448A
      71       4959  Pseudomonas syringae pv. tomato
      72       4938  Azotobacter vinelandii AvOP
      73       4918  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
      74       4872  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
      75       4865  Escherichia coli O157:H7
      76       4837  Bacillus thuringiensis subsp. konkukian
      77       4796  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
      78       4782  Bacillus cereus (strain ATCC 14579 / DSM 31)
      79       4767  Pseudomonas putida (strain KT2440)
      80       4747  Streptococcus pneumoniae
      81       4744  Bacteroides fragilis
      82       4730  Ralstonia solanacearum (Pseudomonas solanacearum)
      83       4610  Burkholderia mallei (Pseudomonas mallei)
      84       4593  Bacteroides thetaiotaomicron
      85       4583  Rhodopseudomonas palustris
      86       4563  Xanthomonas oryzae pv. oryzae
      87       4552  Leptospira interrogans
      88       4546  Oryza sativa (Rice)
      89       4533  Frankia sp. CcI3
      90       4518  Arthrobacter sp. FB24
      91       4515  Salmonella choleraesuis
      92       4456  Ashbya gossypii (Yeast) (Eremothecium gossypii)
      93       4412  Vibrio vulnificus (strain YJ016)
      94       4390  Vibrio parahaemolyticus
      95       4381  Azoarcus sp. (strain EbN1)
      96       4352  Mycobacterium tuberculosis
      97       4310  Anaeromyxobacter dehalogenans 2CP-C
      98       4237  Xanthomonas campestris pv. campestris (strain 8004)
      99       4213  Bacteroides fragilis (strain ATCC 25285 / NCTC 9343)
     100       4180  Mycobacterium paratuberculosis
     101       4159  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
     102       4155  Dechloromonas aromatica RCB
     103       4127  Shewanella oneidensis
     104       4119  Silicibacter pomeroyi
     105       4116  Gloeobacter violaceus
     106       4106  Theileria parva
     107       4099  Pongo pygmaeus (Orangutan)
     108       4081  Photorhabdus luminescens subsp. laumondii
     109       4075  Plasmodium falciparum
     110       4056  Corynebacterium glutamicum (Brevibacterium flavum)
     111       4051  Chromobacterium violaceum
     112       4051  Cryptosporidium parvum
     113       4046  Methanosarcina acetivorans
     114       4037  Haloarcula marismortui (Halobacterium marismortui)
     115       4027  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
     116       4023  Vibrio vulnificus
     117       4007  Cryptosporidium hominis
     118       4006  Salmonella typhi


   3.3  Taxonomic distribution of the sequences

   Kingdom        sequences (% of the database)
    Archaea           50509 (  3%)
    Bacteria         804377 ( 37%)
    Eukaryota        914970 ( 43%)
    Viruses          333442 ( 18%)
    Other              1078 ( <1%)

   Within Eukaryota:

   Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  58027 (  6%)           (  3%)
     Other Mammalia         98381 ( 11%)           (  5%)
     Other Vertebrata      128012 ( 14%)           (  6%)
     Viridiplantae         185692 ( 20%)           (  9%)
     Fungi                 135546 ( 15%)           (  6%)
     Insecta                89476 ( 10%)           (  4%)
     Nematoda               36401 (  4%)           (  2%)
     Other                 183435 ( 20%)           (  9%)



4.  SEQUENCE SIZE

   4.1  Distribution of the sequences by size (excluding fragments)

              From   To  Number             From   To   Number
                  1-  50   26909             1001-1100    13065
                 51- 100  126492             1101-1200     9306
                101- 150  159806             1201-1300     6932
                151- 200  147708             1301-1400     4548
                201- 250  149510             1401-1500     3797
                251- 300  138902             1501-1600     2629
                301- 350  135080             1601-1700     2119
                351- 400  109439             1701-1800     1745
                401- 450   86159             1801-1900     1340
                451- 500   75059             1901-2000     1143
                501- 550   58434             2001-2100      861
                551- 600   41682             2101-2200     1010
                601- 650   32186             2201-2300      823
                651- 700   24992             2301-2400      645
                701- 750   21448             2401-2500      481
                751- 800   18053             >2500         4181
                801- 850   14994
                851- 900   13392
                901- 950   10107
                951-1000    8026
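
   The size classes above are 50 residues wide up to 1000 residues, 100
   residues wide from 1001 to 2500, and a single open class beyond that. A
   minimal Python sketch of this binning is given here. The function names
   are hypothetical, and the caller is assumed to have removed fragments
   beforehand. The sample lengths are illustrative.

```python
from collections import Counter


def size_bin(length):
    """Return the table 4.1 size class for a sequence length (>= 1)."""
    if length > 2500:
        return ">2500"
    # Classes are 50 wide up to 1000 residues and 100 wide above that.
    width = 50 if length <= 1000 else 100
    start = (length - 1) // width * width + 1
    return f"{start}-{start + width - 1}"


def size_distribution(lengths):
    """Count sequences per size class (fragments already excluded)."""
    return Counter(size_bin(n) for n in lengths)


# Illustrative lengths only, chosen to hit the class boundaries.
lengths = [4, 50, 51, 1000, 1001, 2500, 34350]
distribution = size_distribution(lengths)
for label, count in distribution.items():
    print(label, count)
```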


 
   4.2  Longest and shortest sequences

   The shortest sequence is Q16047_HUMAN:     4 amino acids.
   The longest sequence is  Q8WZ42_HUMAN: 34350 amino acids.


5.  STATISTICS FOR SOME LINE TYPES

The following table gives, for selected TrEMBL line types, the total number
of lines, the number of entries with at least one such line, and the average
number of lines per entry.

                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                    2964602              1.41
   Journal                         1700868   1464012    0.81
   Submitted to EMBL/GenBank/DDBJ  1219962    912710    0.58
   Thesis                             4784      4732   <0.01
   Book citation                      4076      4032   <0.01
   Submitted to other databases        440       432   <0.01
   Other                             34472     20641    0.02

Comments (CC)                      1056686              0.50
   CAUTION                          323137    323137    0.15
   SIMILARITY                       237377    234839    0.11
   FUNCTION                         131897    117967    0.06
   SUBCELLULAR LOCATION             108462    108460    0.05
   CATALYTIC ACTIVITY               105716     91616    0.05
   SUBUNIT                           65915     65915    0.03
   COFACTOR                          42117     42117    0.02
   PATHWAY                           32883     32502    0.02
   MISCELLANEOUS                      3629      3619   <0.01
   INTERACTION                        3468      3468   <0.01
   DOMAIN                             1951      1592   <0.01
   MASS SPECTROMETRY                   119        63   <0.01
   ALLERGEN                             15        15   <0.01

Features (FT)                      1162978              0.55
   NON_TER                         1088124    650363    0.52
   CHAIN                             42871     25628    0.02
   SIGNAL                            31423     30474    0.01
   TRANSIT                             560       556   <0.01

Cross-references (DR)             14786754              7.02
   GO                              4294849   1243800    2.04
   InterPro                        2765243   1431739    1.31
   EMBL                            2446971   2099223    1.16
   Pfam                            1748797   1332161    0.83
   PROSITE                         1016814    646081    0.48
   PRINTS                           431666    357696    0.21
   SMART                            352928    268332    0.17
   HSSP                             286785    286508    0.14
   SMR                              277524    277496    0.13
   ProDom                           222934    214109    0.11
   TIGRFAMs                         205592    190426    0.10
   PIR                              196746    161117    0.09
   Ensembl                          117293    117293    0.06
   TIGR                              91544     85531    0.04
   Gramene                           58354     58319    0.03
   PANTHER                           53906     53896    0.03
   PIRSF                             40276     39473    0.02
   MGI                               35859     33668    0.02
   FlyBase                           22134     22084    0.01
   WormPep                           19260     19178    0.01
   WormBase                          19250     19178    0.01
   TAIR                              17779     17718    0.01
   ZFIN                              10704     10700    0.01
   MEROPS                             8295      8031   <0.01
   IntAct                             5715      5715   <0.01
   LegioList                          5607      5577   <0.01
   ListiList                          4796      4779   <0.01
   AGD                                4416      4416   <0.01
   PhotoList                          4192      4068   <0.01
   HGNC                               3538      3538   <0.01
   PDB                                2968      1762   <0.01
   TubercuList                        2557      2551   <0.01
   RGD                                2331      2316   <0.01
   GeneDB_Spombe                      2063      2057   <0.01
   SagaList                           1796      1702   <0.01
   SGD                                1323      1321   <0.01
   TRANSFAC                            989       977   <0.01
   Leproma                             982       981   <0.01
   DictyBase                           979       979   <0.01
   MypuList                            607       603   <0.01
   REBASE                              125       120   <0.01
   PHCI-2DPAGE                         108       108   <0.01
   ANU-2DPAGE                           70        70   <0.01
   SWISS-2DPAGE                         63        63   <0.01
   Reactome                             20        20   <0.01
   PMMA-2DPAGE                           3         3   <0.01
   Siena-2DPAGE                          2         2   <0.01
   COMPLUYEAST-2DPAGE                    1         1   <0.01

   
Number of explicitly cross-referenced databases: 69

6.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in TrEMBL: 216643

Total number of entries encoded on Plastid; Chloroplast: 42172
Total number of entries encoded on Mitochondrion: 108892
Total number of entries encoded on Plastid; Cyanelle: 7
Total number of entries encoded on Plastid; Apicoplast: 142
Total number of entries encoded on Plastid; Non-photosynthetic plastid: 198
Total number of entries encoded on Plastid: 1833
Total number of entries encoded on Plasmid: 37058

Number of fragments: 652514



Submissions and Updates

We welcome feedback from our users. We would especially appreciate your notifying us if you find that sequences belonging to your field of expertise are missing from the database. We also would like to be notified about annotations to be updated, if, for example, the function of a protein has been clarified or if new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProtKB and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail:


Download information

Bi-Weekly releases

The latest UniProt Knowledgebase data are available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data are supplemented by a file containing the sequences of all additional splice isoforms annotated in UniProtKB/Swiss-Prot. This data set is documented in the file ftp://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/complete/README.varsplic

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated four times per year) in flatfile format. Previous UniProtKB/Swiss-Prot and UniProtKB/TrEMBL releases are archived under ftp://ftp.uniprot.org/pub/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on CD-ROM from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address: /
WWW server: http://www.ebi.ac.uk/


Swiss Institute of Bioinformatics (SIB)
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 702 50 50
Fax: (+41 22) 702 58 58
Electronic mail address:
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3900 Reservoir Road, NW
Box 571455
Washington, DC 20057-1455
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address:
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication, please use the following reference:

Bairoch A., Apweiler R., Wu C.H., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Natale D.A., O'Donovan C., Redaschi N., Yeh L.S., The Universal Protein Resource (UniProt), Nucleic Acids Res. 33: D154-D159 (2005).