UniProt Knowledgebase
Swiss-Prot Protein Knowledgebase
TrEMBL Protein Database

Release notes
UniProtKB release 10.0 of 06-Mar-2007

Content

  Introduction
  UniProtKB/Swiss-Prot Protein Knowledgebase release statistics
  UniProtKB/TrEMBL Protein Database release statistics

  Submissions and Updates
  Download information
  Contact
  Citation

  Related documents: UniProtKB user manual, Recent changes, Forthcoming changes.

Introduction

Release 10.0 of the UniProt Knowledgebase is composed of the UniProtKB/Swiss-Prot Protein Knowledgebase release 52.0 and the UniProtKB/TrEMBL Protein Database release 35.0.

More information on these databases can be found in the user manual section "What is the UniProt Knowledgebase?".


UniProtKB/Swiss-Prot protein knowledgebase release 52.0 statistics

Release 52.0 of 06-Mar-07 of UniProtKB/Swiss-Prot contains 261'513 sequence entries, comprising 95'638'062 amino acids abstracted from 153'035 references.

The growth of the database is summarized below.

Release  Date   Number of entries  Number of amino acids
-------  -----  -----------------  ---------------------
    2.0  09/86              3'939                900'163
    3.0  11/86              4'160                969'641
    4.0  04/87              4'387              1'036'010
    5.0  09/87              5'205              1'327'683
    6.0  01/88              6'102              1'653'982
    7.0  04/88              6'821              1'885'771
    8.0  08/88              7'724              2'224'465
    9.0  11/88              8'702              2'498'140
   10.0  03/89             10'008              2'952'613
   11.0  07/89             10'856              3'265'966
   12.0  10/89             12'305              3'797'482
   13.0  01/90             13'837              4'347'336
   14.0  04/90             15'409              4'914'264
   15.0  08/90             16'941              5'486'399
   16.0  11/90             18'364              5'986'949
   17.0  02/91             20'024              6'524'504
   18.0  05/91             20'772              6'792'034
   19.0  08/91             21'795              7'173'785
   20.0  11/91             22'654              7'500'130
   21.0  03/92             23'742              7'866'596
   22.0  05/92             25'044              8'375'696
   23.0  08/92             26'706              9'011'391
   24.0  12/92             28'154              9'545'427
   25.0  04/93             29'955             10'214'020
   26.0  07/93             31'808             10'875'091
   27.0  10/93             33'329             11'484'420
   28.0  02/94             36'000             12'496'420
   29.0  06/94             38'303             13'464'008
   30.0  10/94             40'292             14'147'368
   31.0  02/95             43'470             15'335'248
   32.0  11/95             49'340             17'385'503
   33.0  02/96             52'205             18'531'384
   34.0  10/96             59'021             21'210'389
   35.0  11/97             69'113             25'083'768
   36.0  07/98             74'019             26'840'295
   37.0  12/98             77'977             28'268'293
   38.0  07/99             80'000             29'085'965
   39.0  05/00             86'593             31'411'114
   40.0  10/01            101'602             37'315'215
   41.0  02/03            122'564             44'986'459
   42.0  10/03            135'850             50'046'799
   43.0  03/04            146'720             54'093'154
   44.0  07/04            153'871             56'608'159
   45.0  10/04            163'235             59'631'787
   46.0  02/05            168'297             61'443'278
   47.0  05/05            181'577             65'746'672
   48.0  09/05            194'317             70'391'852
   49.0  02/06            207'132             75'438'310
   50.0  05/06            222'289             81'585'146
   51.0  10/06            241'242             88'541'632
   52.0  03/07            261'513             95'638'062
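
As a worked example of reading the growth table, the following sketch computes the net change between its last two rows (releases 51.0 and 52.0). The function name `net_growth` and the dictionary layout are illustrative assumptions and do not belong to any UniProt tool.

```python
# Last two rows of the growth table: release -> (entries, amino acids).
RELEASES = {
    "51.0": (241_242, 88_541_632),
    "52.0": (261_513, 95_638_062),
}


def net_growth(old, new):
    """Return the absolute and the percentage change between two counts."""
    delta = new - old
    return delta, 100.0 * delta / old


# Compare the entry counts, then the amino acid counts.
entries_delta, entries_pct = net_growth(RELEASES["51.0"][0], RELEASES["52.0"][0])
residues_delta, residues_pct = net_growth(RELEASES["51.0"][1], RELEASES["52.0"][1])

print(f"entries:     +{entries_delta} ({entries_pct:.1f} %)")
print(f"amino acids: +{residues_delta} ({residues_pct:.1f} %)")
```

These are net figures: because entries are occasionally removed, the net increase in entries need not equal the number of sequences added in a release.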

In rare cases, UniProtKB/Swiss-Prot entries are removed. Deleted entries are almost exclusively Open Reading Frames (ORFs) that have been wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins are not real, we remove them from UniProtKB/Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProtKB/Swiss-Prot but have since been deleted from the database.


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

Our efforts to annotate human sequence entries as completely as possible gave rise to the HPI project, and the bacterial model organisms became the focus of the HAMAP project. The current status of the model organisms that are not covered by these two projects is as follows:

Organism        Database cross-references  Index file    Number of sequences
--------------  -------------------------  ------------  -------------------
A.thaliana      TAIR                       arath.txt                   5'065
C.albicans      None yet                   calbican.txt                  604
C.elegans       Wormpep                    celegans.txt                3'081
D.discoideum    DictyBase                  dicty.txt                     350
D.melanogaster  FlyBase                    fly.txt                     2'588
M.musculus      MGD                        mgdtosp.txt                12'408
S.cerevisiae    SGD                        yeast.txt                   6'239
S.pombe         GeneDB_SPombe              pombe.txt                   3'217

UniProtKB/Swiss-Prot release statistics

1.  INTRODUCTION

Release 52.0 of 06-Mar-07 of UniProtKB/Swiss-Prot contains 261513 sequence entries,
comprising 95638062 amino acids abstracted from 153035 references. 

20329 sequences have been added since release 51.0, the sequence data of
11364 existing entries has been updated and the annotations of
196464 entries have been revised.


2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 7.87   Gln (Q) 3.96   Leu (L) 9.65   Ser (S) 6.84
   Arg (R) 5.42   Glu (E) 6.66   Lys (K) 5.92   Thr (T) 5.41
   Asn (N) 4.13   Gly (G) 6.95   Met (M) 2.39   Trp (W) 1.13
   Asp (D) 5.34   His (H) 2.29   Phe (F) 3.95   Tyr (Y) 3.02
   Cys (C) 1.50   Ile (I) 5.91   Pro (P) 4.82   Val (V) 6.73

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00


   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Arg, Thr, Asp, Pro, Asn, Gln,
   Phe, Tyr, Met, His, Cys, Trp
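
   The ordering in 2.2 can be reproduced from the percentages in 2.1 by
   sorting in descending order. A minimal sketch follows; the dictionary name
   `COMPOSITION` is an illustrative assumption, and the values are copied
   from the table in 2.1.

```python
# Percent composition from section 2.1 (three-letter code -> percent).
COMPOSITION = {
    "Ala": 7.87, "Gln": 3.96, "Leu": 9.65, "Ser": 6.84,
    "Arg": 5.42, "Glu": 6.66, "Lys": 5.92, "Thr": 5.41,
    "Asn": 4.13, "Gly": 6.95, "Met": 2.39, "Trp": 1.13,
    "Asp": 5.34, "His": 2.29, "Phe": 3.95, "Tyr": 3.02,
    "Cys": 1.50, "Ile": 5.91, "Pro": 4.82, "Val": 6.73,
}

# Sort residue names from most to least frequent.
ranked = sorted(COMPOSITION, key=COMPOSITION.get, reverse=True)

print(", ".join(ranked))
```

   No two residues share the same percentage in this release, so the
   resulting order is unambiguous.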


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/Swiss-Prot: 10849

   The first twenty species represent 80696 sequences:  30.9 % of the total
   number of entries.
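
   As a check on this figure, the sketch below sums the frequencies of the
   twenty most represented species (the first twenty rows of table 3.2) and
   expresses the sum as a share of all entries. The variable names are
   illustrative assumptions.

```python
# Total number of entries in release 52.0.
TOTAL_ENTRIES = 261_513

# Frequencies of the twenty most represented species (table 3.2, rows 1-20).
TOP_TWENTY = [
    15945, 12710, 6163, 5864, 4978, 4931, 3420, 3176, 3006, 2849,
    2485, 1883, 1782, 1780, 1774, 1665, 1626, 1585, 1550, 1524,
]

top_sum = sum(TOP_TWENTY)
share = 100.0 * top_sum / TOTAL_ENTRIES

print(f"{top_sum} sequences, {share:.1f} % of all entries")
```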


   3.1 Table of the frequency of occurrence of species

        Species represented 1x: 5201
                            2x: 1664
                            3x:  815
                            4x:  532
                            5x:  349
                            6x:  318
                            7x:  218
                            8x:  183
                            9x:  159
                           10x:   82
                       11- 20x:  433
                       21- 50x:  325
                       51-100x:  169
                         >100x:  401


   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      15945  Homo sapiens (Human)
       2      12710  Mus musculus (Mouse)
       3       6163  Saccharomyces cerevisiae (Baker's yeast)
       4       5864  Rattus norvegicus (Rat)
       5       4978  Arabidopsis thaliana (Mouse-ear cress)
       6       4931  Escherichia coli
       7       3420  Bos taurus (Bovine)
       8       3176  Schizosaccharomyces pombe (Fission yeast)
       9       3006  Caenorhabditis elegans
      10       2849  Bacillus subtilis
      11       2485  Drosophila melanogaster (Fruit fly)
      12       1883  Escherichia coli O157:H7
      13       1782  Methanococcus jannaschii
      14       1780  Xenopus laevis (African clawed frog)
      15       1774  Haemophilus influenzae
      16       1665  Gallus gallus (Chicken)
      17       1626  Salmonella typhimurium
      18       1585  Pongo pygmaeus (Orangutan)
      19       1550  Escherichia coli O6
      20       1524  Shigella flexneri
      21       1416  Mycobacterium tuberculosis
      22       1222  Salmonella typhi
      23       1160  Sus scrofa (Pig)
      24       1158  Mycobacterium bovis
      25       1135  Brachydanio rerio (Zebrafish) (Danio rerio)
      26       1125  Oryza sativa (Rice)
      27       1107  Pseudomonas aeruginosa
      28        976  Synechocystis sp. (strain PCC 6803)
      29        971  Archaeoglobus fulgidus
      30        905  Yersinia pestis
      31        887  Vibrio cholerae
      32        884  Mimivirus
      33        876  Rhizobium meliloti (Sinorhizobium meliloti)
      34        829  Oryctolagus cuniculus (Rabbit)
      35        801  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
      36        756  Aquifex aeolicus
      37        748  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      38        746  Staphylococcus aureus (strain N315)
      39        737  Pasteurella multocida
      40        734  Vibrio parahaemolyticus
      41        730  Staphylococcus aureus (strain MW2)
      42        728  Staphylococcus aureus (strain COL)
      43        726  Staphylococcus aureus (strain MSSA476)
      44        724  Staphylococcus aureus (strain MRSA252)
      45        687  Mycoplasma pneumoniae
      46        686  Streptomyces coelicolor
      47        681  Canis familiaris (Dog)
      48        677  Vibrio vulnificus
      49        673  Bacillus halodurans
      50        658  Vibrio vulnificus (strain YJ016)
      51        632  Mycobacterium leprae
      52        629  Anabaena sp. (strain PCC 7120)
      53        618  Staphylococcus epidermidis (strain ATCC 12228)
      54        618  Pseudomonas syringae pv. tomato
      55        617  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
      56        617  Neurospora crassa
      57        617  Yersinia pseudotuberculosis
      58        612  Pseudomonas putida (strain KT2440)
      59        612  Bacillus anthracis
      60        611  Treponema pallidum
      61        606  Candida albicans (Yeast)
      62        605  Ashbya gossypii (Yeast) (Eremothecium gossypii)
      63        601  Photorhabdus luminescens subsp. laumondii
      64        587  Methanobacterium thermoautotrophicum
      65        581  Bradyrhizobium japonicum
      66        575  Rickettsia prowazekii
      67        574  Helicobacter pylori (Campylobacter pylori)
      68        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
      69        571  Kluyveromyces lactis (Yeast) (Candida sphaerica)
      70        570  Ralstonia solanacearum (Pseudomonas solanacearum)
      71        568  Pan troglodytes (Chimpanzee)
      72        568  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      73        562  Buchnera aphidicola subsp. Schizaphis graminum
      74        561  Salmonella paratyphi-a
      75        556  Lactococcus lactis subsp. lactis (Streptococcus lactis)
      76        556  Rhizobium loti (Mesorhizobium loti)
      77        556  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
      78        555  Helicobacter pylori J99 (Campylobacter pylori J99)
      79        554  Zea mays (Maize)
      80        549  Listeria monocytogenes
      81        541  Listeria innocua
      82        540  Xanthomonas campestris pv. campestris
      83        537  Shewanella oneidensis
      84        537  Bacillus cereus (strain ATCC 14579 / DSM 31)
      85        530  Neisseria meningitidis serogroup A
      86        530  Neisseria meningitidis serogroup B
      87        518  Candida glabrata (Yeast) (Torulopsis glabrata)
      88        517  Clostridium acetobutylicum
      89        516  Caulobacter crescentus (Caulobacter vibrioides)
      90        507  Buchnera aphidicola subsp. Baizongia pistaciae
      91        506  Xanthomonas axonopodis pv. citri
      92        491  Streptococcus pneumoniae
      93        488  Thermotoga maritima
      94        483  Mycoplasma genitalium
      95        482  Oceanobacillus iheyensis
      96        481  Listeria monocytogenes serotype 4b (strain F2365)
      97        481  Xylella fastidiosa
      98        474  Brucella suis
      99        473  Salmonella choleraesuis
     100        473  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
     101        472  Brucella melitensis
     102        472  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
     103        470  Photobacterium profundum (Photobacterium sp. (strain SS9))
     104        466  Haemophilus ducreyi
     105        465  Deinococcus radiodurans
     106        457  Methanosarcina acetivorans
     107        453  Corynebacterium glutamicum (Brevibacterium flavum)
     108        453  Clostridium perfringens
     109        449  Rickettsia conorii
     110        446  Pyrococcus horikoshii
     111        445  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     112        441  Bacillus cereus (strain ATCC 10987)
     113        441  Pyrococcus abyssi
     114        440  Halobacterium salinarium (Halobacterium halobium)
     115        440  Bordetella pertussis
     116        437  Methanosarcina mazei (Methanosarcina frisia)
     117        435  Chromobacterium violaceum
     118        435  Chlamydia trachomatis
     119        432  Bordetella parapertussis
     120        432  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     121        431  Emericella nidulans (Aspergillus nidulans)
     122        424  Borrelia burgdorferi (Lyme disease spirochete)
     123        424  Yarrowia lipolytica (Candida lipolytica)
     124        422  Thermoanaerobacter tengcongensis
     125        421  Nicotiana tabacum (Common tobacco)
     126        421  Pyrococcus furiosus
     127        420  Lactobacillus plantarum
     128        418  Synechococcus elongatus (Thermosynechococcus elongatus)
     129        416  Chlamydia pneumoniae (Chlamydophila pneumoniae)
     130        415  Streptococcus pyogenes serotype M6
     131        414  Ovis aries (Sheep)
     132        412  Campylobacter jejuni
     133        412  Enterococcus faecalis (Streptococcus faecalis)
     134        411  Streptococcus mutans
     135        410  Streptomyces avermitilis
     136        406  Chlamydia muridarum
     137        406  Rhizobium sp. (strain NGR234)
     138        404  Bacillus thuringiensis subsp. konkukian
     139        397  Sulfolobus solfataricus
     140        396  Streptococcus pyogenes serotype M1
     141        394  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     142        394  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
     143        391  Streptococcus pyogenes serotype M18
     144        390  Streptococcus pyogenes serotype M3
     145        381  Acinetobacter sp. (strain ADP1)
     146        380  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     147        376  Shigella sonnei (strain Ss046)
     148        373  Bacillus cereus (strain ZK / E33L)
     149        372  Chlorobium tepidum
     150        371  Rhodopseudomonas palustris
     151        371  Nitrosomonas europaea
     152        369  Corynebacterium efficiens
     153        368  Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
     154        367  Vibrio fischeri (strain ATCC 700601 / ES114)
     155        366  Bacillus clausii (strain KSM-K16)
     156        360  Rickettsia bellii (strain RML369-C)
     157        356  Methanopyrus kandleri
     158        356  Mannheimia succiniciproducens (strain MBEL55E)
     159        355  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
     160        354  Staphylococcus haemolyticus (strain JCSC1435)
     161        354  Gloeobacter violaceus
     162        353  Burkholderia mallei (Pseudomonas mallei)
     163        352  Shigella boydii serotype 4 (strain Sb227)
     164        351  Leptospira interrogans
     165        349  Staphylococcus saprophyticus subsp. saprophyticus 
     166        349  Rickettsia felis (Rickettsia azadi)
     167        348  Aeropyrum pernix
     168        346  Streptococcus agalactiae serotype III
     169        343  Streptococcus agalactiae serotype V
     170        341  Leptospira interrogans serogroup Icterohaemorrhagiae serovar copenhageni
     171        340  Solanum tuberosum (Potato)
     172        339  Shigella dysenteriae serotype 1 (strain Sd197)
     173        339  Pisum sativum (Garden pea)
     174        337  Methylococcus capsulatus
     175        337  Dictyostelium discoideum (Slime mold)
     176        337  Synechococcus sp. (strain WH8102)
     177        334  Sulfolobus tokodaii
     178        332  Rickettsia typhi
     179        332  Prochlorococcus marinus (strain MIT 9313)
     180        332  Glycine max (Soybean)
     181        331  Prochlorococcus marinus
     182        331  Geobacillus kaustophilus
     183        323  Mycobacterium paratuberculosis
     184        322  Staphylococcus aureus
     185        320  Rhodopirellula baltica
     186        317  Idiomarina loihiensis
     187        316  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
     188        316  Geobacter sulfurreducens
     189        316  Thermoplasma acidophilum
     190        315  Synechococcus sp. (strain ATCC 27144 / PCC 6301 / SAUG 1402/1) 
     191        314  Pseudomonas syringae pv. syringae (strain B728a)
     192        312  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
     193        311  Aspergillus fumigatus (Sartorya fumigata)
     194        311  Fusobacterium nucleatum subsp. nucleatum
     195        309  Coxiella burnetii
     196        307  Triticum aestivum (Wheat)
     197        303  Macaca mulatta (Rhesus macaque)
     198        300  Brucella abortus
     199        299  Azoarcus sp. (strain EbN1)
     200        296  Nocardia farcinica
     201        296  Synechococcus sp. (strain PCC 7942) (Anacystis nidulans R2)
     202        296  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
     203        295  Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)
     204        294  Wolinella succinogenes
     205        293  Zymomonas mobilis
     206        292  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
     207        292  Bacteroides thetaiotaomicron
     208        291  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
     209        290  Sulfolobus acidocaldarius
     210        287  Clostridium tetani
     211        285  Symbiobacterium thermophilum
     212        285  Pseudomonas putida
     213        285  Silicibacter pomeroyi
     214        285  Pyrobaculum aerophilum
     215        285  Legionella pneumophila subsp. pneumophila 
     216        284  Haemophilus influenzae (strain 86-028NP)
     217        284  Xanthomonas oryzae pv. oryzae
     218        283  Hordeum vulgare (Barley)
     219        282  Neisseria gonorrhoeae (strain ATCC 700825 / FA 1090)
     220        282  Cavia porcellus (Guinea pig)
     221        281  Legionella pneumophila (strain Paris)
     222        281  Thermoplasma volcanium
     223        279  Legionella pneumophila (strain Lens)
     224        279  Pseudomonas fluorescens (strain PfO-1)
     225        277  Corynebacterium diphtheriae
     226        273  Thermus thermophilus (strain HB27 / ATCC BAA-163 / DSM 7039)
     227        273  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
     228        269  Spinacia oleracea (Spinach)
     229        268  Bacteriophage T4
     230        267  Burkholderia sp. (strain 383) (Burkholderia cepacia 
     231        262  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
     232        261  Helicobacter hepaticus
     233        261  Methanococcus maripaludis
     234        260  Wigglesworthia glossinidia brevipalpis
     235        259  Haloarcula marismortui (Halobacterium marismortui)
     236        259  Equus caballus (Horse)
     237        259  Bifidobacterium longum
     238        258  Xanthomonas campestris pv. campestris (strain 8004)
     239        257  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
     240        257  Staphylococcus aureus (strain NCTC 8325)
     241        256  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
     242        255  Leifsonia xyli subsp. xyli
     243        254  Vaccinia virus (strain Copenhagen) (VACV)
     244        253  Gluconobacter oxydans (Gluconobacter suboxydans)
     245        252  Dechloromonas aromatica (strain RCB)
     246        251  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
     247        251  Porphyromonas gingivalis (Bacteroides gingivalis)
     248        249  Bartonella henselae (Rochalimaea henselae)
     249        248  Staphylococcus aureus (strain bovine RF122)
     250        247  Campylobacter jejuni (strain RM1221)
     251        244  Chlamydophila caviae
     252        243  Bacteroides fragilis
     253        243  Desulfotalea psychrophila
     254        240  Blochmannia floridanus
     255        238  Lactobacillus johnsonii
     256        237  Cryptococcus neoformans (Filobasidiella neoformans)
     257        237  Propionibacterium acnes
     258        236  Bartonella quintana (Rochalimaea quintana)
     259        235  Bacillus stearothermophilus (Geobacillus stearothermophilus)
     260        235  Burkholderia pseudomallei (strain 1710b)
     261        234  Pseudoalteromonas haloplanktis (strain TAC 125)
     262        232  Xanthomonas campestris pv. vesicatoria (strain 85-10)
     263        232  Nitrosococcus oceani (strain ATCC 19707 / NCIMB 11848)
     264        230  Thiobacillus denitrificans (strain ATCC 25259)
     265        229  Brucella abortus (strain 2308)
     266        225  Gorilla gorilla gorilla (Lowland gorilla)
     267        224  Chlamydomonas reinhardtii
     268        221  Bdellovibrio bacteriovorus
     269        220  Francisella tularensis subsp. tularensis
     270        220  Ustilago maydis (Smut fungus)
     271        220  Streptococcus thermophilus (strain ATCC BAA-250 / LMG 18311)
     272        219  Staphylococcus aureus (strain USA300)
     273        218  Streptococcus thermophilus (strain CNRZ 1066)
     274        217  Porphyra purpurea
     275        217  Escherichia coli (strain UTI89 / UPEC)
     276        213  Psychrobacter arcticum
     277        212  Klebsiella pneumoniae
     278        211  Felis silvestris catus (Cat)
     279        210  Pelobacter carbinolicus (strain DSM 2380 / Gra Bd 1)
     280        210  Nitrobacter winogradskyi (strain Nb-255 / ATCC 25391)
     281        208  Cricetulus griseus (Chinese hamster)
     282        208  Treponema denticola
     283        207  Porphyra yezoensis
     284        207  Rhodospirillum rubrum (strain ATCC 11170 / NCIB 8255)
     285        206  Lactobacillus acidophilus
     286        204  Caenorhabditis briggsae
     287        201  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
     288        201  Mesocricetus auratus (Golden hamster)
     289        200  Vaccinia virus (strain Western Reserve / WR) (VACV)


   
   3.3  Taxonomic distribution of the sequences


   Kingdom        sequences (% of the database)
    Archaea           10908 (  4%)
    Bacteria         127559 ( 49%)
    Eukaryota        112139 ( 43%)
    Viruses           10907 (  4%)


   Within Eukaryota:

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  15946 ( 14%)           (  6%)
     Other Mammalia         34810 ( 31%)           ( 13%)
     Other Vertebrata       10293 (  9%)           (  4%)
     Viridiplantae          18768 ( 17%)           (  7%)
     Fungi                  16974 ( 15%)           (  6%)
     Insecta                 4794 (  4%)           (  2%)
     Nematoda                3451 (  3%)           (  1%)
     Other                   7103 (  6%)           (  3%)
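
   The two percentage columns above can be recomputed from the sequence
   counts. The sketch below assumes that the percentages are rounded to the
   nearest integer, which is consistent with the published values; the
   helper name `pct` is hypothetical.

```python
# Total number of entries in release 52.0.
TOTAL = 261_513

# Sequence counts per category within Eukaryota, as listed above.
EUKARYOTA = {
    "Human": 15946,
    "Other Mammalia": 34810,
    "Other Vertebrata": 10293,
    "Viridiplantae": 18768,
    "Fungi": 16974,
    "Insecta": 4794,
    "Nematoda": 3451,
    "Other": 7103,
}


def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to an integer."""
    return round(100 * part / whole)


# The category counts add up to the Eukaryota total of the kingdom table.
euk_total = sum(EUKARYOTA.values())

for name, count in EUKARYOTA.items():
    print(f"{name:<17}{count:>6} ({pct(count, euk_total):>3}%) ({pct(count, TOTAL):>3}%)")
```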


4.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    5020             1001-1100     2190
                 51- 100   19316             1101-1200     1469
                101- 150   27430             1201-1300     1178
                151- 200   26015             1301-1400      973
                201- 250   26277             1401-1500      809
                251- 300   22407             1501-1600      399
                301- 350   22622             1601-1700      321
                351- 400   20653             1701-1800      276
                401- 450   16334             1801-1900      243
                451- 500   14148             1901-2000      205
                501- 550   10709             2001-2100      127
                551- 600    7271             2101-2200      180
                601- 650    6284             2201-2300      177
                651- 700    4277             2301-2400      113
                701- 750    3615             2401-2500       92
                751- 800    2907             >2500          679
                801- 850    2438
                851- 900    2548
                901- 950    1920
                951-1000    1531


   The average sequence length in UniProtKB/Swiss-Prot is 365 amino acids.

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  TITIN_HUMAN (Q8WZ42): 34350 amino acids.


5.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1756


   5.1 Table of the frequency of journal citations

        Journals cited 1x:  618
                       2x:  236
                       3x:  133
                       4x:   89
                       5x:   67
                       6x:   47
                       7x:   35
                       8x:   35
                       9x:   34
                      10x:   16
                  11- 20x:  130
                  21- 50x:  149
                  51-100x:   70
                    >100x:  129


   5.2  List of the most cited journals in UniProtKB/Swiss-Prot

   Nb    Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        14640   Journal of Biological Chemistry
    2         7008   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4469   Journal of Bacteriology
    4         4188   Gene
    5         4053   Nucleic Acids Research
    6         3779   Biochemical and Biophysical Research Communications
    7         3520   FEBS Letters
    8         3259   Biochemistry
    9         3220   The EMBO Journal
   10         2924   European Journal of Biochemistry
   11         2785   Nature
   12         2709   Molecular and Cellular Biology
   13         2646   Biochimica et Biophysica Acta
   14         2497   Journal of Molecular Biology
   15         2292   Genomics
   16         2256   Cell
   17         1818   Biochemical Journal
   18         1718   Science
   19         1483   Molecular Microbiology
   20         1346   Plant Molecular Biology
   21         1286   Journal of Virology
   22         1268   Molecular and General Genetics
   23         1267   Journal of Cell Biology
   24         1113   Virology
   25         1088   Human Molecular Genetics
   26         1064   Journal of Biochemistry
   27         1051   Nature Genetics
   28         1046   Genes and Development
   29          942   Oncogene
   30          939   Plant Physiology
   31          909   The American Journal of Human Genetics
   32          822   Human Mutation
   33          781   Journal of Immunology
   34          775   Development
   35          765   Infection and Immunity
   36          741   Genetics
   37          731   Structure
   38          691   Yeast
   39          685   Archives of Biochemistry and Biophysics
   40          656   Journal of General Virology
   41          652   Molecular Biology of the Cell
   42          619   Microbiology
   43          576   Blood
   44          557   FEMS Microbiology Letters
   45          553   The Plant Cell
   46          541   Nature Structural Biology
   47          505   Molecular Cell
   48          497   Human Genetics
   49          492   Journal of Cell Science
   50          489   Current Genetics
   51          486   Cancer Research
   52          475   Developmental Biology
   53          452   Mechanisms of Development
   54          443   The Plant Journal
   55          434   Applied and Environmental Microbiology
   56          426   Protein Science
   57          422   Neuron
   58          418   Journal of Clinical Investigation
   59          417   Mammalian Genome
   60          417   Acta Crystallographica, Section D
   61          409   Current Biology
   62          402   Molecular and Biochemical Parasitology
   63          393   Journal of Neuroscience
   64          390   Molecular Endocrinology
   65          380   The Journal of Experimental Medicine
   66          370   Immunogenetics
   67          345   Journal of Molecular Evolution
   68          338   DNA and Cell Biology
   69          335   Journal of Neurochemistry
   70          333   Endocrinology
   71          317   DNA Sequence
   72          315   Toxicon
   73          302   The Journal of Clinical Endocrinology and Metabolism
   74          300   American Journal of Physiology
   75          291   Molecular Biology and Evolution
   76          286   Biological Chemistry Hoppe-Seyler
   77          285   Brain Research. Molecular Brain Research
   78          281   Bioscience, Biotechnology, and Biochemistry
   79          249   Cytogenetics and Cell Genetics
   80          242   Comparative Biochemistry and Physiology
   81          242   Journal of General Microbiology
   82          238   Proteins
   83          224   Journal of Medical Genetics
   84          220   Peptides
   85          218   Antimicrobial Agents and Chemotherapy
   86          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
   87          215   Molecular Pharmacology
   88          202   Journal of Investigative Dermatology
   89          194   Biology of Reproduction
   90          191   Genome Research
   91          189   Plant and Cell Physiology
   92          189   Nature Cell Biology
   93          183   DNA Research
   94          180   Molecular Plant-Microbe Interactions
   95          180   Virus Research
   96          175   European Journal of Immunology
   97          172   Experimental Cell Research
   98          164   RNA
   99          160   Biochimie
  100          158   Tissue Antigens
  101          158   DNA
  102          156   Neurology
  103          152   Molecular and Cellular Endocrinology
  104          151   Developmental Dynamics
  105          149   Molecular Phylogenetics and Evolution
  106          149   Hemoglobin
  107          147   American Journal of Medical Genetics
  108          145   Bioorganicheskaia Khimiia
  109          140   Archives of Microbiology
  110          140   Annals of Neurology
  111          138   Genes to Cells
  112          138   European Journal of Human Genetics
  113          134   Insect Biochemistry and Molecular Biology
  114          132   Journal of Human Genetics
  115          129   Immunity
  116          128   Planta
  117          123   Animal Genetics
  118          123   Developmental Cell
  119          121   Molecular Reproduction and Development
  120          118   Agricultural and Biological Chemistry
  121          118   General and Comparative Endocrinology
  122          117   Diabetes
  123          111   Molecular Immunology
  124          109   Glycobiology
  125          109   Investigative Ophthalmology and Visual Science
  126          107   The New England Journal of Medicine
  127          106   Journal of Protein Chemistry
  128          102   Molecular and Cellular Neuroscience
  129          101   Archives of Virology
  130          100   British Journal of Haematology


6.  STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot
line types, the number of entries with at least one such line, and the
average number of such lines per entry.
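
The "Average per entry" column in the table that follows appears to be the total number of lines divided by the total number of entries in the release (261513), not by the number of entries that carry such a line. This reading is inferred from the published figures rather than stated in the notes, so the sketch below is an assumption; the function name `average_per_entry` is hypothetical.

```python
# Total number of entries in release 52.0.
TOTAL_ENTRIES = 261_513


def average_per_entry(total_lines, entries=TOTAL_ENTRIES):
    """Format the average number of lines per entry as the table does.

    Averages that would round to 0.00 are shown as "<0.01".
    """
    average = total_lines / entries
    if average < 0.005:
        return "<0.01"
    return f"{average:.2f}"


# A few rows of the table: line type -> total number of lines.
for label, total in [
    ("References (RL)", 514754),
    ("Comments (CC)", 1058365),
    ("Features (FT)", 1744500),
    ("POLYMORPHISM", 586),
]:
    print(f"{label:<20}{average_per_entry(total):>6}")
```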

                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                     514754              1.97
   Journal                          448577    234379    1.72
   Submitted to EMBL/GenBank/DDBJ    62118     54649    0.24
   Submitted to Swiss-Prot            1146      1128   <0.01
   Submitted to other databases        640       626   <0.01
   Unpublished observations            637       631   <0.01
   Book citation                       578       566   <0.01
   Plant Gene Register                 537       525   <0.01
   Thesis                              380       378   <0.01
   Patent                              135       133   <0.01
   Worm Breeder's Gazette                6         6   <0.01

Comments (CC)                      1058365              4.05
   SIMILARITY                       297431    239032    1.14
   FUNCTION                         184967    178423    0.71
   SUBCELLULAR LOCATION             143499    143499    0.55
   SUBUNIT                           98812     98812    0.38
   CATALYTIC ACTIVITY                98305     90234    0.38
   PATHWAY                           52267     44460    0.20
   COFACTOR                          39248     35113    0.15
   TISSUE SPECIFICITY                24970     24970    0.10
   MISCELLANEOUS                     22037     19810    0.08
   PTM                               21043     17067    0.08
   DOMAIN                            15822     13637    0.06
   ALTERNATIVE PRODUCTS              11342     11342    0.04
   CAUTION                           10296      9071    0.04
   INTERACTION                        7093      7093    0.03
   INDUCTION                          7087      7087    0.03
   DEVELOPMENTAL STAGE                6212      6212    0.02
   ENZYME REGULATION                  4520      4520    0.02
   DISEASE                            3600      2604    0.01
   WEB RESOURCE                       3448      2894    0.01
   MASS SPECTROMETRY                  2693      2242    0.01
   BIOPHYSICOCHEMICAL PROPERTIES      1653      1653    0.01
   POLYMORPHISM                        586       570   <0.01
   RNA EDITING                         477       477   <0.01
   ALLERGEN                            419       419   <0.01
   TOXIC DOSE                          324       319   <0.01
   BIOTECHNOLOGY                       141       141   <0.01
   PHARMACEUTICAL                       73        73   <0.01

Features (FT)                      1744500              6.67
   CHAIN                            266207    257979    1.02
   TRANSMEM                         167633     36823    0.64
   METAL                            111155     27262    0.43
   STRAND                            92324      8678    0.35
   CONFLICT                          90312     31283    0.35
   HELIX                             88592      9112    0.34
   DOMAIN                            86017     47759    0.33
   TOPO_DOM                          85194     17293    0.33
   CARBOHYD                          76138     19104    0.29
   DISULFID                          74123     18930    0.28
   BINDING                           64883     22914    0.25
   ACT_SITE                          62217     35914    0.24
   MOD_RES                           58057     23003    0.22
   REPEAT                            54932      8200    0.21
   VARIANT                           44600      9160    0.17
   NP_BIND                           40877     29321    0.16
   REGION                            38954     20753    0.15
   COMPBIAS                          25530     14709    0.10
   VAR_SEQ                           24590     10681    0.09
   SIGNAL                            24413     24403    0.09
   TURN                              23826      7391    0.09
   MUTAGEN                           19304      4726    0.07
   ZN_FING                           18962      7399    0.07
   MOTIF                             18018     11888    0.07
   SITE                              15906      9034    0.06
   INIT_MET                          10975     10975    0.04
   NON_TER                           10704      8215    0.04
   COILED                             9800      6360    0.04
   PROPEP                             7982      6739    0.03
   LIPID                              7548      4882    0.03
   DNA_BIND                           6921      6424    0.03
   PEPTIDE                            6644      4151    0.03
   TRANSIT                            4550      4504    0.02
   CA_BIND                            2693      1111    0.01
   CROSSLNK                           2024      1381    0.01
   NON_CONS                           1175       523   <0.01
   UNSURE                              469       183   <0.01
   SE_CYS                              251       182   <0.01

Cross-references (DR)              3650964             13.96
   InterPro                         617838    238799    2.36
   EMBL                             492611    253089    1.88
   GO                               339578    137455    1.30
   Pfam                             325839    231011    1.25
   PROSITE                          246106    150243    0.94
   KEGG                             180404    162878    0.69
   GenomeReviews                    150533    134388    0.58
   HAMAP                            103797    103678    0.40
   TIGRFAMs                         101433     95037    0.39
   PIR                               98446     91898    0.38
   PRINTS                            93227     73725    0.36
   HSSP                              80523     80523    0.31
   SMART                             76697     58494    0.29
   BioCyc                            72324     66863    0.28
   ProDom                            70565     68226    0.27
   UniGene                           59630     54887    0.23
   Gene3D                            59342     52618    0.23
   Ensembl                           51212     51212    0.20
   GermOnline                        42029     41413    0.16
   PANTHER                           41377     41043    0.16
   PDB                               38554     10498    0.15
   SMR                               35989     35989    0.14
   ArrayExpress                      35710     35710    0.14
   RZPD-ProtExp                      27256     12772    0.10
   TIGR                              23892     23273    0.09
   PIRSF                             20894     20636    0.08
   LinkHub                           17639     17639    0.07
   HGNC                              15422     15355    0.06
   IntAct                            13399     13399    0.05
   MIM                               12802     10405    0.05
   MGI                               12583     12537    0.05
   DIP                                8824      8774    0.03
   SGD                                6235      6148    0.02
   CYGD                               6223      6134    0.02
   RGD                                5692      5689    0.02
   MEROPS                             5364      5058    0.02
   TAIR                               5039      4947    0.02
   EcoGene                            4311      4308    0.02
   EchoBASE                           4158      4126    0.02
   H-InvDB                            3677      3659    0.01
   WormPep                            3617      3002    0.01
   WormBase                           3270      3188    0.01
   FlyBase                            3234      3110    0.01
   GeneDB_Spombe                      3209      3174    0.01
   TRANSFAC                           2878      2584    0.01
   SubtiList                          2790      2789    0.01
   Gramene                            2789      2789    0.01
   Reactome                           2706      1545    0.01
   GeneFarm                           1831      1812    0.01
   DrugBank                           1826       502    0.01
   StyGene                            1579      1575    0.01
   HPA                                1486      1324    0.01
   TubercuList                        1444      1408    0.01
   SWISS-2DPAGE                       1179      1179   <0.01
   ZFIN                               1120      1108   <0.01
   ListiList                          1091      1083   <0.01
   REPRODUCTION-2DPAGE                 829       829   <0.01
   Leproma                             635       632   <0.01
   AGD                                 611       605   <0.01
   PhotoList                           601       601   <0.01
   LegioList                           560       560   <0.01
   MaizeGDB                            442       437   <0.01
   OGP                                 377       376   <0.01
   REBASE                              364       358   <0.01
   HIV                                 361       351   <0.01
   PeroxiBase                          361       350   <0.01
   ECO2DBASE                           351       299   <0.01
   SagaList                            347       346   <0.01
   DictyBase                           340       337   <0.01
   GlycoSuiteDB                        282       282   <0.01
   PHCI-2DPAGE                         241       241   <0.01
   MypuList                            192       192   <0.01
   DOSAC-COBS-2DPAGE                   149       147   <0.01
   Aarhus/Ghent-2DPAGE                 128        98   <0.01
   Siena-2DPAGE                        103       103   <0.01
   HSC-2DPAGE                           85        85   <0.01
   PhosSite                             70        70   <0.01
   Cornea-2DPAGE                        67        67   <0.01
   COMPLUYEAST-2DPAGE                   59        59   <0.01
   euHCVdb                              55        44   <0.01
   PMMA-2DPAGE                          52        52   <0.01
   PptaseDB                             29        29   <0.01
   Rat-heart-2DPAGE                     28        28   <0.01
   ANU-2DPAGE                           22        22   <0.01

Number of explicitly cross-referenced databases: 85
Number of implicitly cross-referenced databases: 26
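
The "Average per entry" column appears to be the total number of lines divided
by the number of entries in the release (261513 for UniProtKB/Swiss-Prot 52.0),
with very small values shown as "<0.01". The Python sketch below reproduces a
few rows under that assumption. The function name average_per_entry and the
0.005 cut-off are illustrative choices, not part of any UniProt tool.

```python
# Sketch: recompute the "Average per entry" column of the table above.
# Assumption: average = total number of lines / total entries in the release.

SWISSPROT_ENTRIES = 261513  # entries in UniProtKB/Swiss-Prot release 52.0


def average_per_entry(total_lines, n_entries=SWISSPROT_ENTRIES):
    """Return the average formatted like the table ("1.72" or "<0.01")."""
    value = total_lines / n_entries
    # Values that would round to 0.00 are displayed as "<0.01" (assumed rule).
    if value < 0.005:
        return "<0.01"
    return f"{value:.2f}"


# A few totals copied from the table above.
for name, total in [("Journal", 448577), ("InterPro", 617838), ("Thesis", 380)]:
    print(f"{name:10s} {total:8d} {average_per_entry(total):>6s}")
```

The same function can be applied to the UniProtKB/TrEMBL table by passing the
TrEMBL entry count as n_entries.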


7.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 237138

Total number of entries encoded on a Mitochondrion: 4306
Total number of entries encoded on a Plasmid: 3295
Total number of entries encoded on a Plastid: 26
Total number of entries encoded on a Plastid; Apicoplast: 6
Total number of entries encoded on a Plastid; Chloroplast: 8037
Total number of entries encoded on a Plastid; Cyanelle: 145
Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 91

Number of fragments: 8360
Number of additional sequences produced by alternative splicing, initiation or promoter usage: 18319


UniProtKB/TrEMBL protein database release 35.0 statistics


1.  INTRODUCTION

Release 35.0 of 06-Mar-2007 of UniProtKB/TrEMBL contains 3874166 sequence entries
comprising 1260291226 amino acids.

696753 sequences have been added since release 34; the sequence data of
10513 existing entries has been updated, and the annotations of
2124731 entries have been revised. This represents an increase of 23%.


2.  AMINO ACID COMPOSITION

   2.1  Composition in percent for the complete database

   Ala (A) 8.37   Gln (Q) 4.00   Leu (L) 9.83   Ser (S) 6.86
   Arg (R) 5.52   Glu (E) 6.04   Lys (K) 5.28   Thr (T) 5.62
   Asn (N) 4.30   Gly (G) 6.98   Met (M) 2.39   Trp (W) 1.33
   Asp (D) 5.23   His (H) 2.23   Phe (F) 4.06   Tyr (Y) 3.04
   Cys (C) 1.37   Ile (I) 5.96   Pro (P) 4.85   Val (V) 6.60

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.05

   2.2  Classification of the amino acids by their frequency

   Leu, Ala, Gly, Ser, Val, Glu, Ile, Thr, Arg, Lys, Asp, Pro, Asn, Phe,
   Gln, Tyr, Met, His, Cys, Trp
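
The ordering in 2.2 follows from sorting the percentages in 2.1 in descending
order. A minimal Python sketch, with the values copied from table 2.1 and
variable names chosen only for illustration:

```python
# Percentages copied from table 2.1 (UniProtKB/TrEMBL release 35.0).
composition = {
    "Ala": 8.37, "Gln": 4.00, "Leu": 9.83, "Ser": 6.86,
    "Arg": 5.52, "Glu": 6.04, "Lys": 5.28, "Thr": 5.62,
    "Asn": 4.30, "Gly": 6.98, "Met": 2.39, "Trp": 1.33,
    "Asp": 5.23, "His": 2.23, "Phe": 4.06, "Tyr": 3.04,
    "Cys": 1.37, "Ile": 5.96, "Pro": 4.85, "Val": 6.60,
}

# Most frequent residue first; there are no ties in this release.
ranking = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranking))
```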


3.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/TrEMBL: 127380

   The first twenty species represent 763609 sequences:  19.7 % of the
   total number of entries.


   3.1 Table of the frequency of occurrence of species

        Species represented 1x:58527
                            2x:23921
                            3x:12255
                            4x: 6910
                            5x: 4051
                            6x: 2999
                            7x: 2175
                            8x: 1781
                            9x: 1381
                           10x: 1525
                       11- 20x: 6496
                       21- 50x: 2672
                       51-100x: 1081
                         >100x: 1606

   3.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1     177135  Human immunodeficiency virus 1
       2      71810  Oryza sativa (japonica cultivar-group)
       3      53146  Homo sapiens (Human)
       4      52403  Mus musculus (Mouse)
       5      50189  Trichomonas vaginalis G3
       6      45187  Arabidopsis thaliana (Mouse-ear cress)
       7      39844  Paramecium tetraurelia
       8      35448  Hepatitis C virus
       9      28040  Tetraodon nigroviridis (Green puffer)
      10      27313  Tetrahymena thermophila SB210
      11      26501  Drosophila melanogaster (Fruit fly)
      12      20214  Caenorhabditis elegans
      13      20166  Trypanosoma cruzi
      14      18711  Medicago truncatula (Barrel medic)
      15      18430  Brachydanio rerio (Zebrafish) (Danio rerio)
      16      17188  uncultured bacterium
      17      16864  Aedes aegypti (Yellowfever mosquito)
      18      16432  Phaeosphaeria nodorum SN15
      19      14666  Plasmodium chabaudi
      20      13922  Hepatitis B virus (HBV)
      21      13557  Aspergillus niger
      22      13415  Anopheles gambiae str. PEST
      23      13082  Dictyostelium discoideum AX4
      24      13074  Caenorhabditis briggsae
      25      12674  Xenopus laevis (African clawed frog)
      26      12032  Aspergillus oryzae
      27      11780  Plasmodium berghei
      28      11650  Gibberella zeae (Fusarium graminearum)
      29      10980  Chaetomium globosum CBS 148.51
      30      10662  Neurospora crassa
      31      10403  Neosartorya fischeri  (Aspergillus fischerianus 
      32      10393  Aspergillus terreus NIH2624
      33      10278  Coccidioides immitis RS
      34      10084  Drosophila pseudoobscura (Fruit fly)
      35      10006  Aspergillus fumigatus (Sartorya fumigata)
      36       9719  Schistosoma japonicum (Blood fluke)
      37       9640  Emericella nidulans (Aspergillus nidulans)
      38       9446  Trypanosoma brucei
      39       9343  Candida albicans (Yeast)
      40       9232  Rattus norvegicus (Rat)
      41       9113  Aspergillus clavatus NRRL 1
      42       9089  Entamoeba histolytica HM-1:IMSS
      43       8994  Rhodococcus sp. (strain RHA1)
      44       8811  Escherichia coli
      45       8512  Stigmatella aurantiaca DW4/3-1
      46       8436  Burkholderia xenovorans (strain LB400)
      47       8249  Microscilla marina ATCC 23134
      48       8244  Bos taurus (Bovine)
      49       8097  Bradyrhizobium japonicum
      50       7975  Ostreococcus tauri
      51       7937  Frankia sp. EAN1pec
      52       7834  Burkholderia phymatum STM815
      53       7808  Plasmodium yoelii yoelii
      54       7761  Solibacter usitatus (strain Ellin6076)
      55       7663  Burkholderia vietnamiensis G4
      56       7524  Streptomyces coelicolor
      57       7490  Helicobacter pylori (Campylobacter pylori)
      58       7461  Burkholderia cenocepacia MC0-3
      59       7449  Burkholderia sp. (strain 383) (Burkholderia cepacia 
      60       7432  Bradyrhizobium sp. BTAi1
      61       7310  Burkholderia phytofirmans PsJN
      62       7300  Streptomyces avermitilis
      63       7207  Myxococcus xanthus (strain DK 1622)
      64       7139  Rhizobium loti (Mesorhizobium loti)
      65       7113  Leishmania major
      66       7042  Hepatitis C virus subtype 1b
      67       6996  Burkholderia ambifaria MC40-6
      68       6994  Rhizobium leguminosarum bv. viciae (strain 3841)
      69       6952  Rhodopirellula baltica
      70       6921  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
      71       6882  Burkholderia cenocepacia (strain HI2424)
      72       6792  Pseudomonas aeruginosa
      73       6708  Frankia alni (strain ACN14a)
      74       6679  Psychroflexus torquis ATCC 700755
      75       6597  Mycobacterium smegmatis (strain ATCC 700084 / mc(2)155)
      76       6592  Burkholderia cepacia (strain ATCC 53795 / AMMD)
      77       6566  Hahella chejuensis (strain KCTC 2396)
      78       6564  Burkholderia multivorans ATCC 17616
      79       6511  Ralstonia eutropha  (Cupriavidus necator 
      80       6488  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
      81       6471  Ustilago maydis (Smut fungus)
      82       6420  Plasmodium falciparum
      83       6398  Cryptococcus neoformans (Filobasidiella neoformans)
      84       6394  Giardia lamblia ATCC 50803
      85       6363  Cryptococcus neoformans var. neoformans B-3501A
      86       6337  Sinorhizobium medicae WSM419
      87       6313  Burkholderia cenocepacia (strain AU 1054)
      88       6272  Stappia aggregata IAM 12614
      89       6269  Oryza sativa (Rice)
      90       6267  Simian immunodeficiency virus (isolate CPZ GAB1) (SIV-cpz) 
      91       6186  Yarrowia lipolytica (Candida lipolytica)
      92       6181  Bacillus anthracis
      93       6176  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
      94       6154  Ralstonia metallidurans (strain CH34 / ATCC 43123 / DSM 2839)
      95       6129  Bacillus thuringiensis serovar israelensis ATCC 35646
      96       6110  Lyngbya sp. PCC 8106
      97       6095  Burkholderia pseudomallei (strain 1710b)
      98       6003  Delftia acidovorans SPH-1
      99       5962  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     100       5909  Mycobacterium vanbaalenii (strain DSM 7251 / PYR-1)
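
The figure given at the start of this section for the first twenty species can
be rechecked from the table above. The Python sketch below sums the first
twenty frequencies and expresses them as a share of the 3874166 entries in
this release; the variable names are illustrative.

```python
# Frequencies of the twenty most represented species, copied from table 3.2.
top20 = [
    177135, 71810, 53146, 52403, 50189, 45187, 39844, 35448, 28040, 27313,
    26501, 20214, 20166, 18711, 18430, 17188, 16864, 16432, 14666, 13922,
]
total_entries = 3874166  # UniProtKB/TrEMBL release 35.0

subtotal = sum(top20)
share = 100 * subtotal / total_entries
print(subtotal, f"{share:.1f} %")
```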

   3.3  Taxonomic distribution of the sequences

   Kingdom        sequences (% of the database)
    Archaea           85628 (  2%)
    Bacteria        1953096 ( 50%)
    Eukaryota       1353357 ( 35%)
    Viruses          438444 ( 12%)
    Other              3639 ( <1%)



   Within Eukaryota:

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  53146 (  4%)           (  1%)
     Other Mammalia        128207 (  9%)           (  3%)
     Other Vertebrata      166359 ( 12%)           (  4%)
     Viridiplantae         277650 ( 21%)           (  7%)
     Fungi                 221421 ( 16%)           (  6%)
     Insecta               140469 ( 10%)           (  4%)
     Nematoda               36997 (  3%)           (  1%)
     Other                 329108 ( 24%)           (  8%)
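
The two percentage columns of the "Within Eukaryota" table can be recomputed
from the counts. The sketch below assumes that the first column is relative to
the sum of the eight categories and the second to the 3874166 entries of the
release, both rounded to whole percent. This assumption agrees with the values
printed above, but it is not stated in the release notes themselves.

```python
TOTAL_ENTRIES = 3874166  # UniProtKB/TrEMBL release 35.0

# Counts copied from the "Within Eukaryota" table.
eukaryota = {
    "Human": 53146,
    "Other Mammalia": 128207,
    "Other Vertebrata": 166359,
    "Viridiplantae": 277650,
    "Fungi": 221421,
    "Insecta": 140469,
    "Nematoda": 36997,
    "Other": 329108,
}

euk_total = sum(eukaryota.values())

for category, n in eukaryota.items():
    pct_euk = 100 * n / euk_total      # share of Eukaryota
    pct_all = 100 * n / TOTAL_ENTRIES  # share of the complete database
    print(f"{category:18s}{n:8d} ({pct_euk:3.0f}%) ({pct_all:3.0f}%)")
```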



4.  SEQUENCE SIZE

   Repartition of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50   49629             1001-1100    23412
                 51- 100  256397             1101-1200    16751
                101- 150  325180             1201-1300    11671
                151- 200  309661             1301-1400     7811
                201- 250  311786             1401-1500     6409
                251- 300  298194             1501-1600     4703
                301- 350  279147             1601-1700     3695
                351- 400  220660             1701-1800     3063
                401- 450  179845             1801-1900     2267
                451- 500  153062             1901-2000     1925
                501- 550  111613             2001-2100     1566
                551- 600   81746             2101-2200     1583
                601- 650   61473             2201-2300     1247
                651- 700   47710             2301-2400     1057
                701- 750   41866             2401-2500      846
                751- 800   37368                 >2500     7206
                801- 850   27923
                851- 900   24629
                901- 950   17965
                951-1000   14061




   The average sequence length in UniProtKB/TrEMBL is   325 amino acids.

   The shortest sequence is Q96AT0_HUMAN:     4 amino acids.
   The longest sequence is  Q3ASY8_CHLCH: 36805 amino acids.
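
The size figures above can be cross-checked against other numbers in these
notes. The Python sketch below makes two assumptions: that the average length
is the total number of amino acids divided by the number of entries, and that
"excluding fragments" means the size table covers all entries minus the
"Number of fragments" reported under MISCELLANEOUS STATISTICS. Both agree with
the published values.

```python
total_entries = 3874166      # entries in UniProtKB/TrEMBL release 35.0
total_residues = 1260291226  # amino acids in the release
fragments = 929039           # "Number of fragments" (miscellaneous statistics)

# Average sequence length, rounded to whole residues.
average_length = total_residues / total_entries
print(round(average_length))

# Counts per size class copied from the table above (1-50 up to >2500).
size_bins = [
    49629, 256397, 325180, 309661, 311786, 298194, 279147, 220660, 179845,
    153062, 111613, 81746, 61473, 47710, 41866, 37368, 27923, 24629, 17965,
    14061, 23412, 16751, 11671, 7811, 6409, 4703, 3695, 3063, 2267, 1925,
    1566, 1583, 1247, 1057, 846, 7206,
]

# The size classes should cover every non-fragment entry exactly once.
print(sum(size_bins) == total_entries - fragments)
```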



5.  STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/TrEMBL lines,
as well as the number of entries with at least one such line, and the
frequency of the lines.

                                   Total    Number of  Average
Line type / subtype                number   entries    per entry
---------------------------------  -------- ---------  ---------

References (RL)                    5698366              1.47
   Submitted to EMBL/GenBank/DDBJ  3001879   2155019    0.77
   Journal                         2633713   2185824    0.68
   Thesis                             6111      6059   <0.01
   Book citation                      4173      4128   <0.01
   Submitted to other databases        281       275   <0.01
   Other                             52209     34730    0.01

Comments (CC)                      1596760              0.41
   CAUTION                          764048    764048    0.20
   SIMILARITY                       288557    283163    0.07
   SUBCELLULAR LOCATION             134969    134969    0.03
   FUNCTION                         121188    115775    0.03
   CATALYTIC ACTIVITY               110804    100343    0.03
   SUBUNIT                           83645     83645    0.02
   COFACTOR                          59953     59716    0.02
   PATHWAY                           19869     16688    0.01
   DOMAIN                             5538      5051   <0.01
   INTERACTION                        4527      4527   <0.01
   MISCELLANEOUS                      3652      3652   <0.01
   ALLERGEN                              6         6   <0.01
   MASS SPECTROMETRY                     4         4   <0.01

Features (FT)                      1871141              0.48
   NON_TER                         1551259    926827    0.40
   CHAIN                            190187    160684    0.05
   SIGNAL                           129160    129160    0.03
   TRANSIT                             535       535   <0.01

Cross-references (DR)             27737959              7.16
   GO                              6176751   2117700    1.59
   InterPro                        5183167   2359043    1.34
   EMBL                            4421091   3866114    1.14
   Pfam                            2961494   2202542    0.76
   PROSITE                         1630918   1054799    0.42
   GenomeReviews                   1149944   1105731    0.30
   KEGG                             874421    836997    0.23
   Gene3D                           758975    651836    0.20
   PRINTS                           661334    551431    0.17
   SMART                            562318    439627    0.15
   TIGRFAMs                         417608    385499    0.11
   SMR                              398503    398493    0.10
   ProDom                           390620    372224    0.10
   BioCyc                           281152    266236    0.07
   HSSP                             272626    272224    0.07
   UniGene                          253722    234448    0.07
   PANTHER                          239476    237162    0.06
   PIR                              185394    150287    0.05
   TIGR                             153089    146665    0.04
   RZPD-ProtExp                     114853     36208    0.03
   ArrayExpress                     101817    101712    0.03
   Ensembl                           93999     93997    0.02
   PIRSF                             87590     86698    0.02
   Gramene                           71013     71013    0.02
   MGI                               44956     43553    0.01
   HGNC                              38055     38004    0.01
   euHCVdb                           30120     30120    0.01
   FlyBase                           24842     24806    0.01
   TAIR                              19580     19520    0.01
   WormPep                           19308     19223   <0.01
   WormBase                          19065     18982   <0.01
   LinkHub                           13923     13923   <0.01
   ZFIN                              12974     12972   <0.01
   DictyBase                         12926     12926   <0.01
   MEROPS                            11947     11509   <0.01
   LegioList                          5345      5315   <0.01
   IntAct                             5246      5246   <0.01
   ListiList                          4724      4707   <0.01
   PDB                                4407      2648   <0.01
   AGD                                4096      4096   <0.01
   PhotoList                          4081      3957   <0.01
   RGD                                4044      3711   <0.01
   REBASE                             3697      3672   <0.01
   TubercuList                        2545      2539   <0.01
   DIP                                2487      2482   <0.01
   GeneDB_Spombe                      1779      1766   <0.01
   SagaList                           1749      1655   <0.01
   Leproma                             972       971   <0.01
   PeroxiBase                          902       901   <0.01
   TRANSFAC                            881       870   <0.01
   MypuList                            590       586   <0.01
   SGD                                 407       406   <0.01
   CYGD                                133       130   <0.01
   PHCI-2DPAGE                         106       106   <0.01
   ANU-2DPAGE                           63        63   <0.01
   Reactome                             49        36   <0.01
   REPRODUCTION-2DPAGE                  40        40   <0.01
   SWISS-2DPAGE                         39        39   <0.01
   PMMA-2DPAGE                           3         3   <0.01
   Siena-2DPAGE                          2         2   <0.01
   COMPLUYEAST-2DPAGE                    1         1   <0.01

Number of explicitly cross-referenced databases: 85
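
Readers who want to process the table above programmatically can parse its
rows with a regular expression. The sketch below is one possible approach and
not an official parser. The pattern ROW and the function parse_rows are
hypothetical names, and the pattern has only been tried on the rows shown. As
a consistency check it confirms that the subtype totals of the References
block add up to the block total.

```python
import re

# Rows copied from the References block of the table above.
table = """\
References (RL)                    5698366              1.47
   Submitted to EMBL/GenBank/DDBJ  3001879   2155019    0.77
   Journal                         2633713   2185824    0.68
   Thesis                             6111      6059   <0.01
   Book citation                      4173      4128   <0.01
   Submitted to other databases        281       275   <0.01
   Other                             52209     34730    0.01
"""

# indent, name, total, optional number of entries, average (may be "<0.01")
ROW = re.compile(r"^(\s*)(.+?)\s+(\d+)(?:\s+(\d+))?\s+(<?\d+\.\d+)\s*$")


def parse_rows(text):
    """Turn table lines into dictionaries; indented rows are subtypes."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip blank lines and rulers
        indent, name, total, entries, average = match.groups()
        rows.append({
            "name": name,
            "subtype": bool(indent),
            "total": int(total),
            "entries": int(entries) if entries else None,
            "average": average,
        })
    return rows


rows = parse_rows(table)
block_total = rows[0]["total"]
subtype_total = sum(row["total"] for row in rows if row["subtype"])
print(block_total, subtype_total, block_total == subtype_total)
```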


6.  MISCELLANEOUS STATISTICS

Total number of distinct authors cited in UniProtKB/TrEMBL: 245473

Total number of entries encoded on a Mitochondrion: 156079
Total number of entries encoded on a Plasmid: 62697
Total number of entries encoded on a Plastid: 3559
Total number of entries encoded on a Plastid; Apicoplast: 179
Total number of entries encoded on a Plastid; Chloroplast: 53902
Total number of entries encoded on a Plastid; Cyanelle: 7
Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 181

Number of fragments: 929039


Submissions and Updates

We welcome feedback from our users. We would especially appreciate your notifying us if you find that sequences belonging to your field of expertise are missing from the database. We would also like to be notified about annotations that need updating, for example if the function of a protein has been clarified or if new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProtKB and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail: datasubs@ebi.ac.uk


Download information

Bi-Weekly releases

The latest data of the UniProt Knowledgebase is available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data is further supplemented by a file containing the sequences of all additional alternative isoforms annotated in UniProtKB/Swiss-Prot. This data set is documented in the file ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/README.varsplic
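
As an illustration of working with the flatfile format, the Python sketch
below tallies cross-reference (DR) lines by database, in the spirit of the
line type statistics given earlier in these notes. It relies on the flatfile
conventions described in the user manual: each line starts with a two-letter
line code and each entry ends with a "//" line. The function name
count_cross_references is hypothetical, and the sample entry uses made-up
placeholder identifiers rather than real database records.

```python
from collections import Counter


def count_cross_references(lines):
    """Tally DR lines by database name and count entries (ended by "//")."""
    per_database = Counter()
    n_entries = 0
    for line in lines:
        if line.startswith("//"):
            n_entries += 1
        elif line.startswith("DR   "):
            # The database name is the first ";"-separated field.
            database = line[5:].split(";", 1)[0].strip()
            per_database[database] += 1
    return n_entries, per_database


# Made-up fragment for demonstration only; identifiers are placeholders.
sample = """\
ID   EXAMPLE_ENTRY
DR   EMBL; X00000; -; -.
DR   EMBL; X00001; -; -.
DR   InterPro; IPR000000; -.
//
""".splitlines()

n_entries, counts = count_cross_references(sample)
for database, total in counts.most_common():
    print(database, total, f"{total / n_entries:.2f}")
```

To run this on a downloaded flatfile, pass an open file object for the local
copy instead of the sample lines.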

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated 3 times per year) in flatfile format. Previous UniProtKB/Swiss-Prot and UniProtKB/TrEMBL releases are archived under ftp://ftp.uniprot.org/pub/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on DVD from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address: datalib@ebi.ac.uk / swissprot@ebi.ac.uk
WWW server: http://www.ebi.ac.uk/


Swiss Institute of Bioinformatics (SIB)
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 379 50 50
Fax: (+41 22) 379 58 58
Electronic mail address: swiss-prot@expasy.org
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3300 Whitehaven St., Suite 1200
Washington, DC 20008
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address: pirmail@georgetown.edu
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication, please use the following reference:

The UniProt Consortium
"The Universal Protein Resource (UniProt)"
Nucleic Acids Res. 35:D193-D197(2007) doi:10.1093/nar/gkl929