CATH / Gene3D v4.2

95 million protein domains classified into 6,119 superfamilies

22-23 July 2020: The CATH website experienced some technical issues during this period as a result of a power outage. Everything should now be working as expected. Apologies for the inconvenience.
CATH v4.3 is nearly here! This has taken longer than anticipated because of issues relating to the recent lockdown, but we are still working hard to generate all the associated data for this upcoming release. In the meantime, you can get the very latest classification information from our daily updates. The core classification files for CATH v4.2 are available to download.

3D Structure

Find out what 3D structure your protein adopts

Protein Evolution

Learn about a particular protein family and how it evolved

Protein Function

Investigate the function of your protein

Conserved Sites

Look at protein sites that are highly conserved and implicated in function

Download Data

Download data files and query CATH via webservices

Learn more

Find out how CATH is created and maintained, how to link to CATH and more

What is CATH-Gene3D?

CATH is a classification of protein structures downloaded from the Protein Data Bank. We group protein domains into superfamilies when there is sufficient evidence they have diverged from a common ancestor.

Gene3D uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases. This allows us to add further annotations, such as functional information and active site residues, to the CATH-Gene3D database.

If you have any questions, comments or suggestions, please get in touch via Twitter, ask a question in our online forum, or visit our support page.

Latest Release Statistics

                  CATH-Plus 4.2.0    CATH (daily snapshot)
PDB Release       17-05-2017
Domains           434857             536769
Superfamilies     6119               6631
Annotated PDBs    131091             164635

Gene3D v16
Protein Sequences: 52,073,853
CATH Domain Predictions: 95,665,487
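
The release figures above can be compared with a little arithmetic. The following is a minimal Python sketch, not part of CATH or Gene3D. The function and variable names are hypothetical. The numbers are copied from the statistics listed here, and the comparison assumes that the CATH-Plus 4.2.0 and daily snapshot columns count the same kinds of entities.

```python
# Figures copied from the release statistics on this page.
# All names below are illustrative; none belong to an official CATH API.
CATH_PLUS_4_2 = {"domains": 434_857, "superfamilies": 6_119, "annotated_pdbs": 131_091}
DAILY_SNAPSHOT = {"domains": 536_769, "superfamilies": 6_631, "annotated_pdbs": 164_635}
GENE3D_V16 = {"protein_sequences": 52_073_853, "domain_predictions": 95_665_487}


def percent_growth(old: int, new: int) -> float:
    """Relative increase from old to new, as a percentage."""
    return (new - old) / old * 100.0


def growth_table(release: dict, snapshot: dict) -> dict:
    """Percentage growth for every statistic present in the release."""
    return {key: percent_growth(release[key], snapshot[key]) for key in release}


def predictions_per_sequence(stats: dict) -> float:
    """Mean number of predicted CATH domains per protein sequence."""
    return stats["domain_predictions"] / stats["protein_sequences"]


if __name__ == "__main__":
    for name, pct in growth_table(CATH_PLUS_4_2, DAILY_SNAPSHOT).items():
        print(f"{name}: +{pct:.1f}% in the daily snapshot")
    print(f"Gene3D v16: {predictions_per_sequence(GENE3D_V16):.2f} domain predictions per sequence")
```

Run as a script, this prints the relative growth of each statistic between the 4.2.0 release and the daily snapshot, followed by the average number of Gene3D domain predictions per sequence.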

Citing this resource

If you find the information in this resource useful, please consider using the following citations:

CATH: expanding the horizons of structure-based functional annotations for genome sequences
Ian Sillitoe, Natalie Dawson, Tony E Lewis, Sayoni Das, Jonathan G Lees, Paul Ashford, Adeyelu Tolulope, Harry M Scholes, Ilya Senatorov, Andra Bujan, Fatima Ceballos Rodriguez-Conde, Benjamin Dowling, Janet Thornton, Christine A Orengo.
Nucleic Acids Res. 2019 Jan
Gene3D: Extensive prediction of globular domains in proteins.
Lewis TE, Sillitoe I, Dawson N, Lam SD, Clarke T, Orengo CA, Lees JG.
Nucleic Acids Res. 2018 Jan


The CATH and Gene3D resources have enjoyed generous funding from a number of research councils.

BBSRC, MRC, NIH, Wellcome, ERC