UniProt Knowledgebase
Swiss-Prot Protein Knowledgebase
TrEMBL Protein Database

Release notes
UniProtKB release 7.0 of 7-Feb-2006

Content

  Introduction
  UniProtKB/Swiss-Prot Protein Knowledgebase release statistics
  UniProtKB/TrEMBL Protein Database release statistics

  Submissions and Updates
  Download information
  Contact
  Citation

  Related documents: UniProtKB user manual, Recent changes, Forthcoming changes.

Introduction

Release 7.0 of the UniProt Knowledgebase is composed of the UniProtKB/Swiss-Prot Protein Knowledgebase release 49.0 and the UniProtKB/TrEMBL Protein Database release 32.0.

More information on these databases can be found in the user manual under "What is the UniProt Knowledgebase?".


UniProtKB/Swiss-Prot protein knowledgebase release 49.0 statistics

Release 49.0 of 07-Feb-2006 of Swiss-Prot contains 207'132 sequence entries, comprising 75'438'310 amino acids abstracted from 139'151 references.

The growth of the database is summarized below.

Release  Date   Number of entries  Number of amino acids
2.0      09/86              3'939                900'163
3.0      11/86              4'160                969'641
4.0      04/87              4'387              1'036'010
5.0      09/87              5'205              1'327'683
6.0      01/88              6'102              1'653'982
7.0      04/88              6'821              1'885'771
8.0      08/88              7'724              2'224'465
9.0      11/88              8'702              2'498'140
10.0     03/89             10'008              2'952'613
11.0     07/89             10'856              3'265'966
12.0     10/89             12'305              3'797'482
13.0     01/90             13'837              4'347'336
14.0     04/90             15'409              4'914'264
15.0     08/90             16'941              5'486'399
16.0     11/90             18'364              5'986'949
17.0     02/91             20'024              6'524'504
18.0     05/91             20'772              6'792'034
19.0     08/91             21'795              7'173'785
20.0     11/91             22'654              7'500'130
21.0     03/92             23'742              7'866'596
22.0     05/92             25'044              8'375'696
23.0     08/92             26'706              9'011'391
24.0     12/92             28'154              9'545'427
25.0     04/93             29'955             10'214'020
26.0     07/93             31'808             10'875'091
27.0     10/93             33'329             11'484'420
28.0     02/94             36'000             12'496'420
29.0     06/94             38'303             13'464'008
30.0     10/94             40'292             14'147'368
31.0     02/95             43'470             15'335'248
32.0     11/95             49'340             17'385'503
33.0     02/96             52'205             18'531'384
34.0     10/96             59'021             21'210'389
35.0     11/97             69'113             25'083'768
36.0     07/98             74'019             26'840'295
37.0     12/98             77'977             28'268'293
38.0     07/99             80'000             29'085'965
39.0     05/00             86'593             31'411'114
40.0     10/01            101'602             37'315'215
41.0     02/03            122'564             44'986'459
42.0     10/03            135'850             50'046'799
43.0     03/04            146'720             54'093'154
44.0     07/04            153'871             56'608'159
45.0     10/04            163'235             59'631'787
46.0     02/05            168'297             61'443'278
47.0     05/05            181'577             65'746'672
48.0     09/05            194'317             70'391'852
49.0     02/06            207'132             75'438'310
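
The counts in this table use an apostrophe as the thousands separator, so they need a small conversion step before any arithmetic. The following Python sketch is only an illustration: the helper names `parse_count` and `parse_row` are hypothetical, and the two rows are the last two lines of the table above. It derives the net number of entries added between releases 48.0 and 49.0 and the corresponding relative growth.

```python
def parse_count(text: str) -> int:
    """Convert an apostrophe-grouped count such as "207'132" to an int."""
    return int(text.replace("'", ""))


def parse_row(line: str) -> dict:
    """Split one table row into release, date, entries and amino acids."""
    release, date, entries, residues = line.split()
    return {
        "release": release,
        "date": date,
        "entries": parse_count(entries),
        "amino_acids": parse_count(residues),
    }


# Last two rows of the growth table (whitespace-separated columns).
rows = [
    "48.0 09/05 194'317 70'391'852",
    "49.0 02/06 207'132 75'438'310",
]

previous, current = (parse_row(row) for row in rows)

# Net change in the number of entries, and growth relative to release 48.0.
added = current["entries"] - previous["entries"]
growth_pct = 100 * added / previous["entries"]

print(f"{added} entries added, {growth_pct:.1f}% growth")
```

Because `parse_row` splits on any run of whitespace, it should work on the rows whether or not the columns are padded for alignment.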

In rare cases, Swiss-Prot entries are removed. Deleted entries are almost exclusively Open Reading Frames (ORFs) that have been wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins are not real, we remove them from Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProtKB/Swiss-Prot but have since been deleted from the database.


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

From our efforts to annotate human sequence entries as completely as possible arose the HPI project, and the bacterial model organisms became the focus of the HAMAP project. Here is the current status of the model organisms which are not covered by these two projects:

Organism        Database cross-references  Index file    Number of sequences
A.thaliana      TAIR                       arath.txt                   3'957
C.albicans      None yet                   calbican.txt                  479
C.elegans       Wormpep                    celegans.txt                2'784
D.discoideum    DictyBase                  dicty.txt                     325
D.melanogaster  FlyBase                    fly.txt                     2'338
M.musculus      MGD                        mgdtosp.txt                10'523
S.cerevisiae    SGD                        yeast.txt                   5'271
S.pombe         GeneDB_SPombe              pombe.txt                   2'945

UniProtKB/Swiss-Prot release statistics


1.  INTRODUCTION

Release 49.0 of 07-Feb-2006 of UniProtKB/Swiss-Prot contains 207'132 sequence entries,
comprising 75'438'310 amino acids abstracted from 139'151 references.

12'815 sequences have been added since release 48, the sequence data of 991
existing entries has been updated, and the annotations of all entries have been
revised. The number of entries has thus increased by about 7%.

2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 7.83   Gln (Q) 3.95   Leu (L) 9.64   Ser (S) 6.86
   Arg (R) 5.35   Glu (E) 6.64   Lys (K) 5.93   Thr (T) 5.42
   Asn (N) 4.18   Gly (G) 6.93   Met (M) 2.38   Trp (W) 1.15
   Asp (D) 5.32   His (H) 2.29   Phe (F) 4.00   Tyr (Y) 3.06
   Cys (C) 1.52   Ile (I) 5.91   Pro (P) 4.83   Val (V) 6.71

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00


   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Thr, Arg, Asp, Pro, Asn, Phe,
   Gln, Tyr, Met, His, Cys, Trp
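
   The ordering in 2.2 follows directly from the percentages in 2.1. As a
   minimal sketch (the variable names are illustrative, and the values are
   copied from the table in 2.1), sorting the composition in descending order
   reproduces the classification:

```python
# Composition in percent, copied from table 2.1 (three-letter codes).
composition = {
    "Ala": 7.83, "Gln": 3.95, "Leu": 9.64, "Ser": 6.86,
    "Arg": 5.35, "Glu": 6.64, "Lys": 5.93, "Thr": 5.42,
    "Asn": 4.18, "Gly": 6.93, "Met": 2.38, "Trp": 1.15,
    "Asp": 5.32, "His": 2.29, "Phe": 4.00, "Tyr": 3.06,
    "Cys": 1.52, "Ile": 5.91, "Pro": 4.83, "Val": 6.71,
}

# Rank the amino acids from most to least frequent.
ranking = sorted(composition, key=composition.get, reverse=True)

print(", ".join(ranking))
```

   No two amino acids share the same percentage in this release, so the
   ranking does not depend on how ties are broken.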


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/Swiss-Prot: 9731

   The first twenty species represent 69270 sequences:  33.4 % of the total
   number of entries.
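
   The share quoted above can be rechecked from the first twenty rows of
   table 3.2 and the total number of entries given in the introduction. This
   is a small sketch with illustrative variable names, not part of the
   release tooling:

```python
# Frequencies of the twenty most represented species (table 3.2, rows 1-20).
top_twenty = [
    13433, 10523, 5271, 4865, 4849, 3957, 2945, 2824, 2784, 2338,
    1796, 1789, 1782, 1772, 1549, 1476, 1444, 1405, 1323, 1145,
]

total_entries = 207132  # entries in UniProtKB/Swiss-Prot release 49.0

covered = sum(top_twenty)
share = 100 * covered / total_entries

print(f"{covered} sequences, {share:.1f} % of all entries")
```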


   3.1 Table of the frequency of occurrence of species

        Species represented 1x: 4676
                            2x: 1525
                            3x:  748
                            4x:  488
                            5x:  319
                            6x:  287
                            7x:  197
                            8x:  159
                            9x:  140
                           10x:   78
                       11- 20x:  404
                       21- 50x:  308
                       51-100x:  110
                         >100x:  292


   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      13433  Homo sapiens (Human)
       2      10523  Mus musculus (Mouse)
       3       5271  Saccharomyces cerevisiae (Baker's yeast)
       4       4865  Rattus norvegicus (Rat)
       5       4849  Escherichia coli
       6       3957  Arabidopsis thaliana (Mouse-ear cress)
       7       2945  Schizosaccharomyces pombe (Fission yeast)
       8       2824  Bacillus subtilis
       9       2784  Caenorhabditis elegans
      10       2338  Drosophila melanogaster (Fruit fly)
      11       1796  Escherichia coli O157:H7
      12       1789  Bos taurus (Bovine)
      13       1782  Methanococcus jannaschii
      14       1772  Haemophilus influenzae
      15       1549  Salmonella typhimurium
      16       1476  Escherichia coli O6
      17       1444  Shigella flexneri
      18       1405  Mycobacterium tuberculosis
      19       1323  Gallus gallus (Chicken)
      20       1145  Mycobacterium bovis
      21       1141  Salmonella typhi
      22       1121  Xenopus laevis (African clawed frog)
      23       1057  Pseudomonas aeruginosa
      24       1022  Sus scrofa (Pig)
      25        967  Archaeoglobus fulgidus
      26        966  Synechocystis sp. (strain PCC 6803)
      27        929  Pongo pygmaeus (Orangutan)
      28        846  Vibrio cholerae
      29        844  Yersinia pestis
      30        836  Rhizobium meliloti (Sinorhizobium meliloti)
      31        784  Oryctolagus cuniculus (Rabbit)
      32        748  Aquifex aeolicus
      33        724  Oryza sativa (Rice)
      34        711  Pasteurella multocida
      35        687  Vibrio parahaemolyticus
      36        687  Mycoplasma pneumoniae
      37        657  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      38        654  Staphylococcus aureus (strain N315)
      39        650  Streptomyces coelicolor
      40        643  Bacillus halodurans
      41        639  Staphylococcus aureus (strain MW2)
      42        636  Staphylococcus aureus (strain COL)
      43        634  Staphylococcus aureus (strain MSSA476)
      44        633  Staphylococcus aureus (strain MRSA252)
      45        633  Vibrio vulnificus
      46        627  Canis familiaris (Dog)
      47        624  Mycobacterium leprae
      48        619  Brachydanio rerio (Zebrafish) (Danio rerio)
      49        613  Vibrio vulnificus (strain YJ016)
      50        608  Treponema pallidum
      51        596  Anabaena sp. (strain PCC 7120)
      52        585  Methanobacterium thermoautotrophicum
      53        572  Buchnera aphidicola subsp. Acyrthosiphon pisum
      54        565  Pseudomonas putida (strain KT2440)
      55        565  Helicobacter pylori (Campylobacter pylori)
      56        562  Buchnera aphidicola subsp. Schizaphis graminum
      57        560  Pseudomonas syringae pv. tomato
      58        550  Bacillus anthracis
      59        548  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
      60        547  Rickettsia prowazekii
      61        547  Staphylococcus epidermidis (strain ATCC 12228)
      62        546  Helicobacter pylori J99 (Campylobacter pylori J99)
      63        542  Bradyrhizobium japonicum
      64        536  Lactococcus lactis subsp. lactis (Streptococcus lactis)
      65        529  Ralstonia solanacearum (Pseudomonas solanacearum)
      66        526  Zea mays (Maize)
      67        526  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      68        525  Listeria monocytogenes
      69        525  Photorhabdus luminescens subsp. laumondii
      70        519  Listeria innocua
      71        513  Rhizobium loti (Mesorhizobium loti)
      72        508  Xanthomonas campestris pv. campestris
      73        507  Buchnera aphidicola subsp. Baizongia pistaciae
      74        505  Neisseria meningitidis serogroup B
      75        502  Neisseria meningitidis serogroup A
      76        495  Clostridium acetobutylicum
      77        493  Shewanella oneidensis
      78        492  Pan troglodytes (Chimpanzee)
      79        490  Neurospora crassa
      80        486  Mycoplasma genitalium
      81        486  Caulobacter crescentus
      82        479  Candida albicans (Yeast)
      83        477  Bacillus cereus (strain ATCC 14579 / DSM 31)
      84        473  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
      85        470  Thermotoga maritima
      86        470  Xanthomonas axonopodis pv. citri
      87        464  Streptococcus pneumoniae
      88        458  Xylella fastidiosa
      89        455  Yersinia pseudotuberculosis
      90        455  Listeria monocytogenes serotype 4b (strain F2365)
      91        449  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
      92        446  Deinococcus radiodurans
      93        440  Mimivirus
      94        440  Pyrococcus horikoshii
      95        440  Haemophilus ducreyi
      96        436  Brucella melitensis
      97        436  Methanosarcina acetivorans
      98        435  Oceanobacillus iheyensis
      99        435  Pyrococcus abyssi
     100        435  Brucella suis
     101        433  Corynebacterium glutamicum (Brevibacterium flavum)
     102        433  Clostridium perfringens
     103        432  Chlamydia trachomatis
     104        432  Halobacterium salinarium (Halobacterium halobium)
     105        427  Kluyveromyces lactis (Yeast)
     106        419  Borrelia burgdorferi (Lyme disease spirochete)
     107        416  Methanosarcina mazei (Methanosarcina frisia)
     108        415  Ashbya gossypii (Yeast) (Eremothecium gossypii)
     109        413  Chlamydia pneumoniae (Chlamydophila pneumoniae)
     110        411  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     111        408  Nicotiana tabacum (Common tobacco)
     112        408  Pyrococcus furiosus
     113        404  Rhizobium sp. (strain NGR234)
     114        403  Chlamydia muridarum
     115        400  Thermoanaerobacter tengcongensis
     116        396  Lactobacillus plantarum
     117        391  Campylobacter jejuni
     118        389  Ovis aries (Sheep)
     119        389  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     120        387  Sulfolobus solfataricus
     121        386  Streptococcus mutans
     122        384  Synechococcus elongatus (Thermosynechococcus elongatus)
     123        384  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
     124        379  Chromobacterium violaceum
     125        379  Streptococcus pyogenes serotype M1
     126        377  Streptococcus pyogenes serotype M6
     127        377  Bordetella pertussis
     128        376  Bordetella parapertussis
     129        376  Enterococcus faecalis (Streptococcus faecalis)
     130        374  Streptococcus pyogenes serotype M18
     131        373  Rickettsia conorii
     132        373  Streptococcus pyogenes serotype M3
     133        370  Candida glabrata (Yeast) (Torulopsis glabrata)
     134        369  Staphylococcus aureus
     135        368  Streptomyces avermitilis
     136        353  Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
     137        352  Chlorobium tepidum
     138        342  Aeropyrum pernix
     139        341  Corynebacterium efficiens
     140        340  Bacillus cereus (strain ATCC 10987)
     141        338  Methanopyrus kandleri
     142        337  Photobacterium profundum (Photobacterium sp. (strain SS9))
     143        336  Leptospira interrogans
     144        328  Nitrosomonas europaea
     145        326  Leptospira interrogans serogroup Icterohaemorrhagiae serovar copenhageni
     146        325  Dictyostelium discoideum (Slime mold)
     147        323  Salmonella paratyphi-a
     148        319  Emericella nidulans (Aspergillus nidulans)
     149        317  Sulfolobus tokodaii
     150        316  Pisum sativum (Garden pea)
     151        313  Streptococcus agalactiae serotype III
     152        310  Streptococcus agalactiae serotype V
     153        305  Gloeobacter violaceus
     154        303  Thermoplasma acidophilum
     155        302  Lycopersicon esculentum (Tomato)
     156        295  Yarrowia lipolytica (Candida lipolytica)
     157        293  Triticum aestivum (Wheat)
     158        292  Synechococcus sp. (strain WH8102)
     159        289  Fusobacterium nucleatum subsp. nucleatum
     160        287  Prochlorococcus marinus (strain MIT 9313)
     161        287  Rhodopseudomonas palustris
     162        284  Prochlorococcus marinus
     163        283  Bacillus thuringiensis subsp. konkukian
     164        281  Macaca mulatta (Rhesus macaque)
     165        281  Acinetobacter sp. (strain ADP1)
     166        280  Pseudomonas putida
     167        278  Sulfolobus acidocaldarius
     168        276  Hordeum vulgare (Barley)
     169        274  Coxiella burnetii
     170        271  Cavia porcellus (Guinea pig)
     171        269  Pyrobaculum aerophilum
     172        269  Glycine max (Soybean)
     173        268  Bacteriophage T4
     174        268  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
     175        267  Thermoplasma volcanium
     176        265  Clostridium tetani
     177        261  Solanum tuberosum (Potato)
     178        259  Bacteroides thetaiotaomicron
     179        258  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     180        258  Rhodopirellula baltica
     181        257  Mycobacterium paratuberculosis
     182        254  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
     183        254  Vaccinia virus (strain Copenhagen) (VACV)
     184        254  Wolinella succinogenes
     185        249  Bacillus clausii (strain KSM-K16)
     186        248  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
     187        246  Spinacia oleracea (Spinach)
     188        244  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     189        244  Bacillus cereus (strain ZK / E33L)
     190        243  Mannheimia succiniciproducens (strain MBEL55E)
     191        242  Wigglesworthia glossinidia brevipalpis
     192        240  Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)
     193        238  Geobacter sulfurreducens
     194        237  Bifidobacterium longum
     195        235  Bacillus stearothermophilus
     196        234  Corynebacterium diphtheriae
     197        232  Equus caballus (Horse)
     198        231  Chlamydophila caviae
     199        229  Porphyromonas gingivalis (Bacteroides gingivalis)
     200        225  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
     201        224  Burkholderia mallei (Pseudomonas mallei)
     202        224  Helicobacter hepaticus
     203        224  Methanococcus maripaludis
     204        221  Methylococcus capsulatus
     205        220  Porphyra purpurea
     206        219  Thermus thermophilus (strain HB27 / ATCC BAA-163 / DSM 7039)
     207        217  Haloarcula marismortui (Halobacterium marismortui)
     208        216  Chlamydomonas reinhardtii
     209        212  Zymomonas mobilis
     210        212  Synechococcus sp. (strain PCC 6301) (Anacystis nidulans)
     211        209  Klebsiella pneumoniae
     212        209  Leifsonia xyli subsp. xyli
     213        205  Blochmannia floridanus
     214        204  Geobacillus kaustophilus
     215        203  Nocardia farcinica
     216        200  Vaccinia virus (strain Western Reserve / WR) (VACV)


   3.3  Taxonomic distribution of the sequences


   Kingdom        sequences (% of the database)
    Archaea           10124 (  5%)
    Bacteria          96390 ( 47%)
    Eukaryota         90758 ( 44%)
    Viruses            9860 (  5%)


   Within Eukaryota:


    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  13434 ( 15%)           (  6%)
     Other Mammalia         27101 ( 30%)           ( 13%)
     Other Vertebrata        8051 (  9%)           (  4%)
     Viridiplantae          14694 ( 16%)           (  7%)
     Fungi                  13810 ( 15%)           (  7%)
     Insecta                 4492 (  5%)           (  2%)
     Nematoda                3133 (  3%)           (  2%)
     Other                   6043 (  7%)           (  3%)
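
   The two percentage columns above use different denominators: the number of
   eukaryotic sequences and the number of entries in the complete database.
   The sketch below (illustrative names, counts copied from the tables in 3.3)
   recomputes both. Percentages are rounded to whole numbers, which is why the
   four kingdom values add up to 101 rather than 100.

```python
TOTAL_ENTRIES = 207132  # entries in release 49.0

kingdoms = {
    "Archaea": 10124,
    "Bacteria": 96390,
    "Eukaryota": 90758,
    "Viruses": 9860,
}

eukaryota = {
    "Human": 13434,
    "Other Mammalia": 27101,
    "Other Vertebrata": 8051,
    "Viridiplantae": 14694,
    "Fungi": 13810,
    "Insecta": 4492,
    "Nematoda": 3133,
    "Other": 6043,
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to a whole number."""
    return round(100 * part / whole)


# Share of each kingdom in the complete database.
kingdom_pct = {name: percent(n, TOTAL_ENTRIES) for name, n in kingdoms.items()}

# For each eukaryotic category: (% of Eukaryota, % of the complete database).
eukaryota_pct = {
    name: (percent(n, kingdoms["Eukaryota"]), percent(n, TOTAL_ENTRIES))
    for name, n in eukaryota.items()
}

for name, (of_euk, of_all) in eukaryota_pct.items():
    print(f"{name:<18}{of_euk:>3}% {of_all:>3}%")
```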


4.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    4139             1001-1100     1796
                 51- 100   14793             1101-1200     1183
                101- 150   21093             1201-1300      898
                151- 200   20181             1301-1400      675
                201- 250   20675             1401-1500      557
                251- 300   17760             1501-1600      335
                301- 350   18308             1601-1700      241
                351- 400   16671             1701-1800      198
                401- 450   13067             1801-1900      196
                451- 500   10877             1901-2000      162
                501- 550    8282             2001-2100      104
                551- 600    5613             2101-2200      153
                601- 650    4693             2201-2300      132
                651- 700    3352             2301-2400       89
                701- 750    2810             2401-2500       70
                751- 800    2326             >2500          521
                801- 850    1932
                851- 900    2052
                901- 950    1565
                951-1000    1200


   The average sequence length in UniProtKB/Swiss-Prot is 364 amino acids.

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  SYNE1_HUMAN (Q8NF91):  8797 amino acids.
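
   The average length follows from the totals in the introduction
   (75'438'310 amino acids in 207'132 entries). As an additional, purely
   illustrative calculation, the size table can be scanned cumulatively to
   find the bin that contains the median length. Note the assumption: the
   table excludes fragments, whereas the average covers all entries, so the
   two figures are not computed over the same set of sequences. The names
   `bins` and `median_bin` are hypothetical.

```python
TOTAL_AMINO_ACIDS = 75438310
TOTAL_ENTRIES = 207132

# Average sequence length over all entries of the release.
average_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES

# (from, to, number) copied from the size table; None marks the ">2500" bin.
bins = [
    (1, 50, 4139), (51, 100, 14793), (101, 150, 21093), (151, 200, 20181),
    (201, 250, 20675), (251, 300, 17760), (301, 350, 18308), (351, 400, 16671),
    (401, 450, 13067), (451, 500, 10877), (501, 550, 8282), (551, 600, 5613),
    (601, 650, 4693), (651, 700, 3352), (701, 750, 2810), (751, 800, 2326),
    (801, 850, 1932), (851, 900, 2052), (901, 950, 1565), (951, 1000, 1200),
    (1001, 1100, 1796), (1101, 1200, 1183), (1201, 1300, 898), (1301, 1400, 675),
    (1401, 1500, 557), (1501, 1600, 335), (1601, 1700, 241), (1701, 1800, 198),
    (1801, 1900, 196), (1901, 2000, 162), (2001, 2100, 104), (2101, 2200, 153),
    (2201, 2300, 132), (2301, 2400, 89), (2401, 2500, 70), (2501, None, 521),
]


def median_bin(histogram):
    """Return (from, to) of the first bin where the running count reaches half."""
    total = sum(count for _, _, count in histogram)
    running = 0
    for low, high, count in histogram:
        running += count
        if 2 * running >= total:
            return low, high
    raise ValueError("empty histogram")


print(f"average length: {average_length:.0f} amino acids")
print("median bin:", median_bin(bins))
```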


5.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1662


   5.1 Table of the frequency of journal citations

        Journals cited 1x:  586
                       2x:  225
                       3x:  128
                       4x:   81
                       5x:   56
                       6x:   39
                       7x:   31
                       8x:   41
                       9x:   18
                      10x:   16
                  11- 20x:  116
                  21- 50x:  145
                  51-100x:   59
                    >100x:  121


   5.2  List of the most cited journals in UniProtKB/Swiss-Prot

   Nb    Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        13103   Journal of Biological Chemistry
    2         6437   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4277   Journal of Bacteriology
    4         3988   Gene
    5         3925   Nucleic Acids Research
    6         3442   Biochemical and Biophysical Research Communications
    7         3316   FEBS Letters
    8         3026   Biochemistry
    9         2866   The EMBO Journal
   10         2842   European Journal of Biochemistry
   11         2599   Nature
   12         2504   Biochimica et Biophysica Acta
   13         2294   Journal of Molecular Biology
   14         2257   Molecular and Cellular Biology
   15         2158   Genomics
   16         2069   Cell
   17         1667   Biochemical Journal
   18         1560   Science
   19         1379   Molecular Microbiology
   20         1277   Plant Molecular Biology
   21         1251   Molecular and General Genetics
   22         1080   Journal of Cell Biology
   23         1034   Journal of Biochemistry
   24         1006   Virology
   25          989   Human Molecular Genetics
   26          956   Journal of Virology
   27          946   Nature Genetics
   28          899   Genes and Development
   29          828   Plant Physiology
   30          815   Oncogene
   31          807   The American Journal of Human Genetics
   32          747   Human Mutation
   33          699   Journal of Immunology
   34          693   Infection and Immunity
   35          664   Structure
   36          663   Development
   37          652   Archives of Biochemistry and Biophysics
   38          641   Yeast
   39          616   Journal of General Virology
   40          608   Genetics
   41          573   Microbiology
   42          530   FEMS Microbiology Letters
   43          520   Nature Structural Biology
   44          490   Blood
   45          465   Human Genetics
   46          462   The Plant Cell
   47          456   Current Genetics
   48          455   Molecular Biology of the Cell
   49          408   Applied and Environmental Microbiology
   50          404   Cancer Research
   51          403   Developmental Biology
   52          395   Journal of Clinical Investigation
   53          393   Molecular and Biochemical Parasitology
   54          391   Journal of Cell Science
   55          381   Mammalian Genome
   56          379   Protein Science
   57          378   Mechanisms of Development
   58          375   Neuron
   59          375   The Plant Journal
   60          367   Molecular Endocrinology
   61          362   Acta Crystallographica, Section D
   62          358   Molecular Cell
   63          354   The Journal of Experimental Medicine
   64          346   Immunogenetics
   65          340   Journal of Neuroscience
   66          331   Journal of Molecular Evolution
   67          321   Endocrinology
   68          320   DNA and Cell Biology
   69          304   Current Biology
   70          294   Journal of Neurochemistry
   71          286   DNA Sequence
   72          283   Biological Chemistry Hoppe-Seyler
   73          270   American Journal of Physiology
   74          267   Molecular Biology and Evolution
   75          266   The Journal of Clinical Endocrinology and Metabolism
   76          260   Bioscience, Biotechnology, and Biochemistry
   77          258   Brain Research. Molecular Brain Research
   78          243   Toxicon
   79          241   Journal of General Microbiology
   80          238   Cytogenetics and Cell Genetics
   81          221   Comparative Biochemistry and Physiology
   82          214   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
   83          207   Antimicrobial Agents and Chemotherapy
   84          201   Proteins
   85          196   Molecular Pharmacology
   86          186   Journal of Investigative Dermatology
   87          186   Journal of Medical Genetics
   88          170   DNA Research
   89          170   Peptides
   90          166   Plant and Cell Physiology
   91          162   Molecular Plant-Microbe Interactions
   92          162   Virus Research
   93          161   Genome Research
   94          159   Biology of Reproduction
   95          158   DNA
   96          152   Tissue Antigens
   97          151   European Journal of Immunology
   98          146   Biochimie
   99          141   Molecular and Cellular Endocrinology
  100          139   American Journal of Medical Genetics
  101          138   Bioorganicheskaia Khimiia
  102          135   Hemoglobin
  103          128   Experimental Cell Research
  104          127   Nature Cell Biology
  105          126   Archives of Microbiology
  106          124   Annals of Neurology
  107          124   Molecular Phylogenetics and Evolution
  108          121   Neurology
  109          120   Insect Biochemistry and Molecular Biology
  110          118   Agricultural and Biological Chemistry
  111          117   European Journal of Human Genetics
  112          113   Journal of Human Genetics
  113          113   Immunity
  114          113   RNA
  115          112   General and Comparative Endocrinology
  116          111   Developmental Dynamics
  117          106   Diabetes
  118          103   Molecular Reproduction and Development
  119          103   Molecular Immunology
  120          103   Planta
  121          102   Genes to Cells
  122          100   Journal of Protein Chemistry


6.  STATISTICS FOR SOME LINE TYPES

The following table gives, for selected UniProtKB/Swiss-Prot line types, the
total number of lines, the number of entries with at least one such line, and
the average number of lines per entry (the total divided by the 207'132 entries
in this release).

                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                     407878              1.97
   Journal                          361072    192976    1.74
   Submitted to EMBL/GenBank/DDBJ    43559     37255    0.21
   Submitted to Swiss-Prot             726       723   <0.01
   Unpublished observations            567       563   <0.01
   Book citation                       547       535   <0.01
   Plant Gene Register                 519       507   <0.01
   Submitted to other databases        388       380   <0.01
   Thesis                              341       339   <0.01
   Patent                              131       129   <0.01
   Unpublished results                  22        22   <0.01
   Worm Breeder's Gazette                6         6   <0.01

Comments (CC)                       801430              3.87
   SIMILARITY                       225736    186967    1.09
   FUNCTION                         140862    137032    0.68
   SUBCELLULAR LOCATION             106708    106708    0.52
   CATALYTIC ACTIVITY                75237     69754    0.36
   SUBUNIT                           71489     71489    0.35
   PATHWAY                           39782     34536    0.19
   COFACTOR                          30534     27405    0.15
   TISSUE SPECIFICITY                21716     21716    0.10
   MISCELLANEOUS                     17969     16342    0.09
   PTM                               15424     13217    0.07
   DOMAIN                            10600      9263    0.05
   ALTERNATIVE PRODUCTS               8509      8509    0.04
   CAUTION                            7869      6989    0.04
   INDUCTION                          5801      5801    0.03
   DEVELOPMENTAL STAGE                5218      5218    0.03
   INTERACTION                        4956      4956    0.02
   DISEASE                            3190      2329    0.02
   ENZYME REGULATION                  3089      3089    0.01
   MASS SPECTROMETRY                  2070      1747    0.01
   DATABASE                           1562      1406    0.01
   BIOPHYSICOCHEMICAL PROPERTIES      1303      1303    0.01
   POLYMORPHISM                        531       519   <0.01
   ALLERGEN                            406       406   <0.01
   RNA EDITING                         403       403   <0.01
   TOXIC DOSE                          280       278   <0.01
   BIOTECHNOLOGY                       125       125   <0.01
   PHARMACEUTICAL                       61        61   <0.01

Features (FT)                      1502318              7.25
   CHAIN                            210458    203970    1.02
   STRAND                           147090      7249    0.71
   TRANSMEM                         135865     29419    0.66
   TURN                              95995      7364    0.46
   METAL                             88482     21513    0.43
   CONFLICT                          75912     26372    0.37
   TOPO_DOM                          70026     14125    0.34
   HELIX                             67093      7146    0.32
   CARBOHYD                          64943     16393    0.31
   DISULFID                          64555     16785    0.31
   DOMAIN                            62427     33891    0.30
   ACT_SITE                          48527     28416    0.23
   REPEAT                            45273      6546    0.22
   VARIANT                           37244      7394    0.18
   BINDING                           30308     14568    0.15
   MOD_RES                           29797     14761    0.14
   NP_BIND                           28495     20106    0.14
   REGION                            27780     14613    0.13
   SIGNAL                            20949     20947    0.10
   COMPBIAS                          20505     11351    0.10
   VARSPLIC                          17662      7652    0.09
   MUTAGEN                           14751      3693    0.07
   ZN_FING                           14118      5500    0.07
   MOTIF                             12527      8696    0.06
   SITE                              10950      6128    0.05
   NON_TER                           10844      8278    0.05
   INIT_MET                           9461      9384    0.05
   PROPEP                             6855      5725    0.03
   COILED                             6420      3939    0.03
   DNA_BIND                           6094      5691    0.03
   LIPID                              6044      3972    0.03
   PEPTIDE                            5845      3574    0.03
   TRANSIT                            3637      3603    0.02
   CA_BIND                            2417       978    0.01
   CROSSLNK                           1210       942    0.01
   NON_CONS                           1120       523    0.01
   UNSURE                              418       170   <0.01
   SE_CYS                              221       155   <0.01

Cross-references (DR)              2236664             10.80
   InterPro                         423342    189133    2.04
   EMBL                             395184    199200    1.91
   Pfam                             248875    182310    1.20
   PROSITE                          188497    116608    0.91
   GO                                97279     27620    0.47
   PIR                               94760     88562    0.46
   PRINTS                            78657     61371    0.38
   TIGRFAMs                          76795     71754    0.37
   HSSP                              76069     76069    0.37
   HAMAP                             71745     71631    0.35
   BioCyc                            67849     62817    0.33
   SMART                             58049     44253    0.28
   ProDom                            54565     52510    0.26
   PANTHER                           48143     45588    0.23
   Ensembl                           38163     38153    0.18
   PDB                               30838      8497    0.15
   SMR                               26812     26812    0.13
   TIGR                              20204     19648    0.10
   PIRSF                             17045     16795    0.08
   LinkHub                           14271     14271    0.07
   HGNC                              12793     12737    0.06
   MIM                               11422      9364    0.06
   MGI                               10357     10318    0.05
   IntAct                             6588      6588    0.03
   SGD                                5328      5263    0.03
   GermOnline                         4926      4880    0.02
   RGD                                4605      4602    0.02
   EcoGene                            4225      4223    0.02
   EchoBASE                           4159      4127    0.02
   TAIR                               3998      3926    0.02
   MEROPS                             3958      3837    0.02
   H-InvDB                            3676      3658    0.02
   WormPep                            3260      2782    0.02
   GeneDB_Spombe                      2978      2943    0.01
   FlyBase                            2967      2920    0.01
   WormBase                           2859      2781    0.01
   TRANSFAC                           2811      2522    0.01
   SubtiList                          2766      2765    0.01
   Gramene                            2092      2084    0.01
   StyGene                            1505      1502    0.01
   TubercuList                        1433      1397    0.01
   GeneFarm                           1305      1299    0.01
   SWISS-2DPAGE                       1166      1166    0.01
   ListiList                          1045      1037    0.01
   Reactome                            998       998   <0.01
   Leproma                             627       624   <0.01
   ZFIN                                613       606   <0.01
   PhotoList                           525       525   <0.01
   MaizeDB                             432       427   <0.01
   AGD                                 421       415   <0.01
   HIV                                 370       365   <0.01
   OGP                                 369       369   <0.01
   REBASE                              352       348   <0.01
   ECO2DBASE                           351       299   <0.01
   LegioList                           334       334   <0.01
   DictyBase                           326       324   <0.01
   SagaList                            314       313   <0.01
   GlycoSuiteDB                        282       282   <0.01
   PHCI-2DPAGE                         239       239   <0.01
   MypuList                            181       181   <0.01
   Aarhus/Ghent-2DPAGE                 128        98   <0.01
   Siena-2DPAGE                        103       103   <0.01
   HSC-2DPAGE                           85        85   <0.01
   PhosSite                             64        62   <0.01
   COMPLUYEAST-2DPAGE                   59        59   <0.01
   PMMA-2DPAGE                          52        52   <0.01
   Rat-heart-2DPAGE                     28        28   <0.01
   PptaseDB                             27        27   <0.01
   ANU-2DPAGE                           20        20   <0.01

Number of explicitly cross-referenced databases: 70
Number of implicitly cross-referenced databases: 29


7.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 216069

Total number of entries encoded on a Mitochondrion: 3397
Total number of entries encoded on a Plasmid: 3073
Total number of entries encoded on a Plastid: 20
Total number of entries encoded on a Plastid; Apicoplast: 2
Total number of entries encoded on a Plastid; Chloroplast: 5174
Total number of entries encoded on a Plastid; Cyanelle: 145
Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 86

Number of fragments: 8433
Number of additional sequences encoded on splice variants: 13333



UniProtKB/TrEMBL protein database release 32.0 statistics


1.  INTRODUCTION

Release 32.0 of 07-February-2006 of UniProtKB/TrEMBL has been produced in sync
with UniProtKB/Swiss-Prot release 49 and with release 85 of the
EMBL/DDBJ/GenBank nucleotide sequence database plus its updates up to
30-January-2006. It contains 2'605'574 sequence entries comprising
838'379'783 amino acids.
 

In the document delac_tr.txt, you will find a list of all accession numbers
which were previously present in UniProtKB/TrEMBL, but which have now been
deleted from the database. Most deletions are due to the deletion of the
corresponding CDS in the source nucleotide sequence databases
EMBL-Bank/DDBJ/GenBank. In addition, some entries are recognised to be open
reading frames (ORFs) that have been wrongly predicted to code for proteins.
When there is enough evidence that these hypothetical proteins are not real,
we remove them from UniProtKB/TrEMBL.


2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 8.20   Gln (Q) 3.87   Leu (L) 9.82   Ser (S) 6.97
   Arg (R) 5.50   Glu (E) 6.06   Lys (K) 5.32   Thr (T) 5.67
   Asn (N) 4.32   Gly (G) 6.99   Met (M) 2.39   Trp (W) 1.34
   Asp (D) 5.18   His (H) 2.26   Phe (F) 4.06   Tyr (Y) 3.05
   Cys (C) 1.42   Ile (I) 5.93   Pro (P) 4.91   Val (V) 6.58

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.05


   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Ser, Val, Glu, Ile, Thr, Arg, Lys, Asp, Pro, Asn, Phe,
   Gln, Tyr, Met, His, Cys, Trp
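
The ordering in 2.2 can be reproduced by sorting the percentages of 2.1 in
descending order. The sketch below is only an illustration: the dictionary
name and layout are assumptions, and the values are copied by hand from the
table above.

```python
# Percent composition copied from section 2.1 (standard residues only).
composition = {
    "Ala": 8.20, "Gln": 3.87, "Leu": 9.82, "Ser": 6.97,
    "Arg": 5.50, "Glu": 6.06, "Lys": 5.32, "Thr": 5.67,
    "Asn": 4.32, "Gly": 6.99, "Met": 2.39, "Trp": 1.34,
    "Asp": 5.18, "His": 2.26, "Phe": 4.06, "Tyr": 3.05,
    "Cys": 1.42, "Ile": 5.93, "Pro": 4.91, "Val": 6.58,
}

# Sort residue names by descending percentage. All twenty values are
# distinct, so the resulting order is unambiguous.
ranking = sorted(composition, key=composition.get, reverse=True)

print(", ".join(ranking))
```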


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of
   UniProtKB/TrEMBL: 103997

   The first twenty species represent 604592 sequences: 23.2 % of the
   total number of entries.


   3.1 Table of the frequency of occurrence of species

        Species represented 1x:49711
                            2x:19812
                            3x: 9920
                            4x: 5494
                            5x: 3115
                            6x: 2428
                            7x: 1675
                            8x: 1380
                            9x: 1119
                           10x:  923
                       11- 20x: 4231
                       21- 50x: 2134
                       51-100x:  860
                         >100x: 1195
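
As a consistency check, the counts in table 3.1 should add up to the total
number of species given at the start of section 3. A minimal sketch, with
hypothetical variable names and the values copied from the table above:

```python
# Number of species per occurrence class, copied from table 3.1.
species_by_occurrence = {
    "1x": 49711, "2x": 19812, "3x": 9920, "4x": 5494, "5x": 3115,
    "6x": 2428, "7x": 1675, "8x": 1380, "9x": 1119, "10x": 923,
    "11-20x": 4231, "21-50x": 2134, "51-100x": 860, ">100x": 1195,
}

# Total number of species, to compare with the figure quoted in section 3.
total_species = sum(species_by_occurrence.values())

# Share of species that are represented by a single sequence only.
singleton_share = species_by_occurrence["1x"] / total_species

print(total_species)
print(f"{singleton_share:.1%}")
```

The total agrees with the 103997 species reported for this release.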


   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1     146096  Human immunodeficiency virus 1
       2      57551  Homo sapiens (Human)
       3      57096  Oryza sativa (japonica cultivar-group)
       4      51339  Mus musculus (Mouse)
       5      42093  Arabidopsis thaliana (Mouse-ear cress)
       6      28014  Tetraodon nigroviridis (Green puffer)
       7      26030  Hepatitis C virus
       8      25255  Drosophila melanogaster (Fruit fly)
       9      20339  Caenorhabditis elegans
      10      20120  Trypanosoma cruzi
      11      15136  Anopheles gambiae str. PEST
      12      14669  Plasmodium chabaudi
      13      14617  Dictyostelium discoideum (Slime mold)
      14      13851  Brachydanio rerio (Zebrafish) (Danio rerio)
      15      13144  Caenorhabditis briggsae
      16      12337  Xenopus laevis (African clawed frog)
      17      12181  Aspergillus oryzae
      18      11767  Plasmodium berghei
      19      11748  Gibberella zeae (Fusarium graminearum)
      20      11209  uncultured bacterium
      21      10803  Neurospora crassa
      22      10435  Hepatitis B virus (HBV)
      23      10158  Aspergillus fumigatus (Sartorya fumigata)
      24       9889  Rattus norvegicus (Rat)
      25       9739  Trypanosoma brucei
      26       9693  Schistosoma japonicum (Blood fluke)
      27       9405  Aspergillus nidulans FGSC A4
      28       9090  Entamoeba histolytica HM-1:IMSS
      29       9050  Candida albicans SC5314
      30       8102  Bradyrhizobium japonicum
      31       8063  Solibacter usitatus Ellin6076
      32       7937  Frankia sp. EAN1pec
      33       7800  Plasmodium yoelii yoelii
      34       7740  Escherichia coli
      35       7715  Burkholderia sp. (strain 383) (Burkholderia cepacia 
      36       7663  Burkholderia vietnamiensis G4
      37       7559  Streptomyces coelicolor
      38       7432  Bradyrhizobium sp. BTAi1
      39       7341  Streptomyces avermitilis
      40       7165  Rhizobium loti (Mesorhizobium loti)
      41       7085  Leishmania major
      42       7049  Burkholderia cenocepacia HI2424
      43       7013  Rhodopirellula baltica
      44       6979  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      45       6752  Hahella chejuensis KCTC 2396
      46       6567  Pseudomonas aeruginosa
      47       6562  Bos taurus (Bovine)
      48       6526  Burkholderia ambifaria AMMD
      49       6505  Cryptococcus neoformans (Filobasidiella neoformans)
      50       6475  Cryptococcus neoformans var. neoformans B-3501A
      51       6456  Burkholderia cenocepacia AU 1054
      52       6451  Ustilago maydis 521
      53       6408  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
      54       6394  Giardia lamblia ATCC 50803
      55       6329  Burkholderia pseudomallei (strain 1710b)
      56       6316  Ralstonia metallidurans (strain CH34)
      57       6310  Yarrowia lipolytica (Candida lipolytica)
      58       6228  Bacillus anthracis
      59       6129  Bacillus thuringiensis serovar israelensis ATCC 35646
      60       6084  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
      61       6079  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
      62       5905  Bacillus cereus G9241
      63       5737  Nocardia farcinica
      64       5728  Pseudomonas fluorescens (strain PfO-1)
      65       5707  Rhizobium meliloti (Sinorhizobium meliloti)
      66       5686  Burkholderia pseudomallei (Pseudomonas pseudomallei)
      67       5661  Crocosphaera watsonii
      68       5646  Polaromonas sp. JS666
      69       5638  Anabaena variabilis (strain ATCC 29413)
      70       5593  Gallus gallus (Chicken)
      71       5561  Burkholderia thailandensis E264
      72       5550  Anabaena sp. (strain PCC 7120)
      73       5494  Bacillus cereus (strain ATCC 10987)
      74       5394  Bacillus cereus (strain ZK / E33L)
      75       5312  Chimpanzee immunodeficiency virus (SIV-cpz) 
      76       5288  Helicobacter pylori (Campylobacter pylori)
      77       5245  Pseudomonas putida F1
      78       5234  Plasmodium falciparum
      79       5223  Plasmodium falciparum (isolate 3D7)
      80       5153  Yersinia pestis
      81       5084  Paracoccus denitrificans PD1222
      82       5053  Clostridium beijerincki NCIMB 8052
      83       5050  Streptococcus pneumoniae
      84       5019  Pseudomonas syringae pv. syringae (strain B728a)
      85       5018  Photobacterium profundum (Photobacterium sp. (strain SS9))
      86       5009  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
      87       5005  Kluyveromyces lactis (Yeast)
      88       4971  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
      89       4955  Pseudomonas syringae pv. tomato
      90       4938  Azotobacter vinelandii AvOP
      91       4935  Rhodopseudomonas palustris BisB18
      92       4929  Candida glabrata (Yeast) (Torulopsis glabrata)
      93       4911  Nocardioides sp. JS614
      94       4896  Rhodopseudomonas palustris BisA53
      95       4827  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
      96       4818  Escherichia coli O157:H7
      97       4809  Bacillus thuringiensis subsp. konkukian
      98       4769  Bacillus cereus (strain ATCC 14579 / DSM 31)
      99       4751  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
     100       4748  Pseudomonas putida (strain KT2440)
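
The share of the first twenty species quoted at the start of section 3 can be
recomputed from the first twenty rows of the table above and the number of
entries in this release (2'605'574). The names used below are illustrative
only.

```python
# Frequencies of the twenty most represented species (rows 1-20 of table 3.2).
top_twenty = [
    146096, 57551, 57096, 51339, 42093, 28014, 26030, 25255, 20339, 20120,
    15136, 14669, 14617, 13851, 13144, 12337, 12181, 11767, 11748, 11209,
]

# Number of sequence entries in UniProtKB/TrEMBL release 32.0.
total_entries = 2605574

top_twenty_sequences = sum(top_twenty)
top_twenty_share = 100 * top_twenty_sequences / total_entries

print(top_twenty_sequences)
print(f"{top_twenty_share:.1f} %")
```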


   3.3  Taxonomic distribution of the sequences

   Kingdom        sequences (% of the database)
    Archaea           63173 (  2%)
    Bacteria        1193711 ( 46%)
    Eukaryota        984105 ( 38%)
    Viruses          362130 ( 14%)
    Other              2455 ( <1%)

   Within Eukaryota:

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                      0 (  0%)           (  0%)
     Other Mammalia        174802 ( 18%)           (  7%)
     Other Vertebrata      137413 ( 14%)           (  5%)
     Viridiplantae         204082 ( 21%)           (  8%)
     Fungi                 137160 ( 14%)           (  5%)
     Insecta                98720 ( 10%)           (  4%)
     Nematoda               36538 (  4%)           (  1%)
     Other                 195390 ( 20%)           (  7%)
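
The percentages in 3.3 appear to be the sequence counts divided by the
relevant total and rounded to the nearest integer; that rounding rule is an
assumption, not something stated in these notes. Under that assumption, the
figures can be checked as follows (variable names are hypothetical):

```python
# Sequence counts copied from section 3.3.
kingdoms = {
    "Archaea": 63173, "Bacteria": 1193711, "Eukaryota": 984105,
    "Viruses": 362130, "Other": 2455,
}
eukaryota = {
    "Human": 0, "Other Mammalia": 174802, "Other Vertebrata": 137413,
    "Viridiplantae": 204082, "Fungi": 137160, "Insecta": 98720,
    "Nematoda": 36538, "Other": 195390,
}

total_entries = sum(kingdoms.values())
total_eukaryota = sum(eukaryota.values())

def percent(count, total):
    """Return count as an integer percentage of total (assumed rounding)."""
    return round(100 * count / total)

# "Other" rounds to 0 here; the table prints such small shares as "<1%".
kingdom_percent = {name: percent(n, total_entries) for name, n in kingdoms.items()}
within_eukaryota = {name: percent(n, total_eukaryota) for name, n in eukaryota.items()}
of_database = {name: percent(n, total_entries) for name, n in eukaryota.items()}

for name in eukaryota:
    print(f"{name:18s} {within_eukaryota[name]:3d}% {of_database[name]:3d}%")
```

The kingdom counts add up to the number of entries in the release, and the
eukaryotic categories add up to the Eukaryota count.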


4.  SEQUENCE SIZE

   4.1  Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50   30122             1001-1100    15745
                 51- 100  163087             1101-1200    11054
                101- 150  210030             1201-1300     7947
                151- 200  196359             1301-1400     5191
                201- 250  198812             1401-1500     4323
                251- 300  186329             1501-1600     2996
                301- 350  177963             1601-1700     2360
                351- 400  142131             1701-1800     1985
                401- 450  113612             1801-1900     1498
                451- 500   97281             1901-2000     1280
                501- 550   73289             2001-2100      953
                551- 600   52919             2101-2200     1068
                601- 650   40519             2201-2300      877
                651- 700   31396             2301-2400      695
                701- 750   27159             2401-2500      514
                751- 800   22979             >2500         4643
                801- 850   18351
                851- 900   16327
                901- 950   12048
                951-1000    9446
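
Since table 4.1 excludes fragments, its counts should add up to the number of
entries in the release minus the number of fragments reported in section 6.
The sketch below checks this; the list names are hypothetical and the values
are copied from the table above.

```python
# Counts for the size classes 1-50 up to 951-1000 (left-hand column).
up_to_1000 = [
    30122, 163087, 210030, 196359, 198812, 186329, 177963, 142131, 113612,
    97281, 73289, 52919, 40519, 31396, 27159, 22979, 18351, 16327, 12048, 9446,
]
# Counts for the size classes 1001-1100 up to 2401-2500, then >2500.
above_1000 = [
    15745, 11054, 7947, 5191, 4323, 2996, 2360, 1985, 1498, 1280,
    953, 1068, 877, 695, 514, 4643,
]

total_entries = 2605574   # entries in UniProtKB/TrEMBL release 32.0
fragments = 722286        # "Number of fragments" from section 6

non_fragment_sequences = sum(up_to_1000) + sum(above_1000)

print(non_fragment_sequences)
print(total_entries - fragments)
```

Both values come to 1883288, so the table accounts for every non-fragment
entry.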

 
   4.2  Longest and shortest sequences

   The shortest sequence is Q16047_HUMAN:     4 amino acids.
   The longest sequence is  Q3ASY8_CHLCH: 36805 amino acids.


5.  STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/TrEMBL 
lines, as well as the number of entries with at least one such line, and the
frequency of the lines.
                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                    3900912              1.50
   Journal                         2025956   1662223    0.78
   Submitted to EMBL/GenBank/DDBJ  1829984   1216539    0.70
   Thesis                             4894      4841   <0.01
   Book citation                      4117      4074   <0.01
   Submitted to other databases        435       427   <0.01
   Other                             35526     21675    0.01

Comments (CC)                      1391074              0.53
   CAUTION                          555822    555822    0.21
   SIMILARITY                       287304    283060    0.11
   SUBCELLULAR LOCATION             139002    139002    0.05
   FUNCTION                         135438    134264    0.05
   CATALYTIC ACTIVITY                98620     95511    0.04
   SUBUNIT                           65771     65771    0.03
   COFACTOR                          51801     51334    0.02
   PATHWAY                           44306     42318    0.02
   DOMAIN                             5498      3877   <0.01
   MISCELLANEOUS                      3737      3727   <0.01
   INTERACTION                        3643      3643   <0.01
   MASS SPECTROMETRY                   116        61   <0.01
   ALLERGEN                             16        16   <0.01

Features (FT)                      1334625              0.51
   NON_TER                         1204261    720162    0.46
   SIGNAL                            84376     81398    0.03
   CHAIN                             45421     27275    0.02
   TRANSIT                             567       563   <0.01

Cross-references (DR)             18719573              7.18
   GO                              5741378   1631013    2.20
   InterPro                        3343552   1700220    1.28
   EMBL                            2991301   2596793    1.15
   Pfam                            2100412   1572588    0.81
   PROSITE                         1218339    783259    0.47
   PRINTS                           519797    431928    0.20
   SMART                            394067    312319    0.15
   SMR                              294379    294364    0.11
   BioCyc                           290893    275436    0.11
   TIGRFAMs                         289333    268049    0.11
   HSSP                             282910    282629    0.11
   ProDom                           277265    266338    0.11
   PANTHER                          246734    236085    0.09
   PIR                              195329    159779    0.07
   Ensembl                          111130    111128    0.04
   TIGR                              96019     89923    0.04
   Gramene                           57215     57183    0.02
   PIRSF                             56960     56161    0.02
   MGI                               46996     44473    0.02
   FlyBase                           26906     26866    0.01
   TAIR                              20427     20360    0.01
   WormPep                           19117     19036    0.01
   WormBase                          19116     19036    0.01
   LinkHub                           15357     15357    0.01
   ZFIN                              11986     11982   <0.01
   MEROPS                             8168      7910   <0.01
   IntAct                             5821      5821   <0.01
   LegioList                          5569      5539   <0.01
   ListiList                          4770      4753   <0.01
   AGD                                4295      4295   <0.01
   PhotoList                          4155      4031   <0.01
   PDB                                3162      1872   <0.01
   HGNC                               3063      3063   <0.01
   TubercuList                        2555      2549   <0.01
   RGD                                2144      2132   <0.01
   GeneDB_Spombe                      1963      1957   <0.01
   SagaList                           1780      1686   <0.01
   SGD                                1327      1323   <0.01
   Leproma                             980       979   <0.01
   DictyBase                           979       979   <0.01
   TRANSFAC                            954       942   <0.01
   MypuList                            601       597   <0.01
   REBASE                              124       119   <0.01
   PHCI-2DPAGE                         108       108   <0.01
   ANU-2DPAGE                           65        65   <0.01
   SWISS-2DPAGE                         52        52   <0.01
   Reactome                             14        14   <0.01
   PMMA-2DPAGE                           3         3   <0.01
   Siena-2DPAGE                          2         2   <0.01
   COMPLUYEAST-2DPAGE                    1         1   <0.01

Number of explicitly cross-referenced databases: 70
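
The "Average per entry" column of the table above is consistent with dividing
the total number of lines by the number of entries in the release
(2'605'574). This reading is inferred from the figures rather than stated in
these notes. A short sketch with hypothetical names:

```python
# Total line counts copied from the table in section 5.
line_totals = {
    "References (RL)": 3900912,
    "Comments (CC)": 1391074,
    "Features (FT)": 1334625,
    "Cross-references (DR)": 18719573,
    "GO": 5741378,
}

# Number of sequence entries in UniProtKB/TrEMBL release 32.0.
total_entries = 2605574

# Average number of lines of each type per entry, to two decimals.
average_per_entry = {
    name: f"{count / total_entries:.2f}" for name, count in line_totals.items()
}

for name, average in average_per_entry.items():
    print(f"{name:24s} {average}")
```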


6.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in UniProtKB/TrEMBL: 222640

Total number of entries encoded on a Mitochondrion: 124483
Total number of entries encoded on a Plasmid: 41026
Total number of entries encoded on a Plastid: 2319
Total number of entries encoded on a Plastid; Apicoplast: 125
Total number of entries encoded on a Plastid; Chloroplast: 45103
Total number of entries encoded on a Plastid; Cyanelle: 5
Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 

Number of fragments: 722286


Submissions and Updates

We welcome feedback from our users. We would especially appreciate your notifying us if you find that sequences belonging to your field of expertise are missing from the database. We also would like to be notified about annotations to be updated, if, for example, the function of a protein has been clarified or if new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProtKB and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail:


Download information

Bi-Weekly releases

The latest data of the UniProt Knowledgebase are available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data are further supplemented by a file containing the sequences of all additional splice isoforms annotated in UniProtKB/Swiss-Prot. This data set is documented in the file ftp://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/complete/README.varsplic

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated three times per year) in flatfile format. Previous UniProtKB/Swiss-Prot and UniProtKB/TrEMBL releases are archived under ftp://ftp.uniprot.org/pub/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on CD-ROM from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address:
WWW server: http://www.ebi.ac.uk/


Swiss Institute of Bioinformatics (SIB)
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 702 50 50
Fax: (+41 22) 702 58 58
Electronic mail address:
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3900 Reservoir Road, NW
Box 571455
Washington, DC 20057-1455
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address:
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication, please use the following reference:

Wu C.H., Apweiler R., Bairoch A., Natale D.A., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Mazumder R., O'Donovan C., Redaschi N., Suzek B. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 34: D187-D191 (2006).