IUPred2 has been tested using the latest CAID dataset, as well as a custom dataset consisting of proteins that are fully annotated at the residue level (more than 95% of the sequence has an order or disorder annotation). This dataset can be downloaded from here.
Dataset | AUC |
CAID | 0.743 |
Fully annotated dataset | 0.761 |
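As an illustration of how an AUC value such as those above can be obtained from per-residue scores, here is a minimal sketch using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the input layout (one score and one 0/1 label per annotated residue) are assumptions made for this example and do not describe the actual evaluation pipeline.

```python
def roc_auc(scores, labels):
    """Rank-based ROC AUC (hypothetical helper, not part of IUPred).

    scores: predicted disorder scores, one per annotated residue
            (higher means more likely disordered)
    labels: 1 for residues annotated as disordered, 0 for ordered
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both ordered and disordered residues are required")

    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Mann-Whitney U statistic of the positives, normalized to [0, 1].
    u_pos = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_pos / (n_pos * n_neg)


# Toy input: four residues, two ordered and two disordered.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

Residues without an order or disorder annotation would have to be dropped before calling such a function, which is one reason a dataset with near-complete annotation is convenient.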
The performance of the second version of IUPred has been tested using customized datasets. Intrinsically disordered protein regions (IDRs) were taken from the DisProt database as the positive testing dataset. Only IDRs with a length of at least 9 residues were included.
The negative dataset comprises protein regions that are known to represent independently folding, stable monomeric units encompassing a single domain according to CATH definitions. These structures were collected from the PDB and were filtered to include only one protein region from each UniRef90 sequence cluster. Structures with flexible residues, as evidenced by highly dissimilar NMR models or missing X-ray coordinates, were removed.
The positive and negative datasets used for benchmarking can be downloaded from here and here, respectively.
Dataset | Number of residues | Number of proteins |
Positive (DisProt) | 84,479 | 1,195 |
Negative (monomeric structures) | 178,957 | 1,095 |
Using the above benchmark sets, IUPred2 can be characterized with the following binary classifier measures:
Sensitivity (True Positive Rate) | 61.85% |
Specificity (True Negative Rate) | 94.03% |
Precision* | 91.20% |
*As the value of precision depends heavily on the relative sizes of the positive and negative datasets, the dataset sizes were scaled to be equal to obtain an unbiased measure.
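The effect of scaling the two datasets to equal size can be sketched as follows. This is one plausible reading of the footnote, not a description of the actual evaluation code, and the function names are hypothetical. With equally sized sets, precision depends only on sensitivity and specificity; with the values in the table it evaluates to about 91.2%, in line with the reported figure.

```python
def balanced_precision(sensitivity, specificity):
    """Precision when positive and negative sets are scaled to equal size.

    With equal set sizes, true positives are proportional to the
    sensitivity and false positives to (1 - specificity).
    """
    false_positive_rate = 1.0 - specificity
    return sensitivity / (sensitivity + false_positive_rate)


def unscaled_precision(sensitivity, specificity, n_pos, n_neg):
    """Precision computed on the raw residue counts, for comparison."""
    true_positives = sensitivity * n_pos
    false_positives = (1.0 - specificity) * n_neg
    return true_positives / (true_positives + false_positives)


# Sensitivity and specificity as reported in the table above.
print(f"{balanced_precision(0.6185, 0.9403):.2%}")  # 91.20%

# Residue counts of the positive and negative datasets listed earlier.
# The larger negative set yields more false positives and a lower value.
print(f"{unscaled_precision(0.6185, 0.9403, 84_479, 178_957):.2%}")
```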
Benchmarking ANCHOR2
The performance of ANCHOR2 was tested using the recently published DIBS database as the positive testing dataset. DIBS represents the largest currently available set of experimentally verified IDRs capable of forming ordered structures upon binding to protein domains. Only entries not used in the training of ANCHOR2 were used in testing.
For negative testing, the same monomeric single-domain protein dataset was used as for testing IUPred2, but structures with up to 20% flexible residues were allowed. Only entries not used in the training of ANCHOR2 were used in testing. Furthermore, ANCHOR2 was also evaluated on a set of flexible linkers that are disordered but are known to lack a primary binding function.
To get a fuller picture of the efficiency of ANCHOR2 on sequence sets with different compositions, two auxiliary datasets were also considered. The first is composed of disordered regions from DisProt; the second is a collection of random (decoy) segments from the human proteome, excluding transmembrane regions, structured Pfam domains, and extracellular proteins. Both datasets are expected to contain disordered binding regions, albeit to a significantly lower extent than DIBS.
The positive, negative, and auxiliary datasets used for benchmarking can be downloaded from here.
Dataset | Number of residues | Number of proteins |
Positive (DIBS) | 2,135 | 140 |
Negative (monomeric structures) | 583,033 | 3,320 |
Negative (flexible linkers) | 5,425 | 389 |
Auxiliary (DisProt) | 79,049 | 1,042 |
Auxiliary (decoy) | 76,860 | 5,040 |
Using the above benchmark sets, ANCHOR2 can be characterized with the following binary classifier measures.
As ANCHOR often specifically identifies only strongly binding sub-regions inside larger binding regions, segment-based sensitivity was also calculated. In this case, a binding region was considered found if it incorporates at least one ANCHOR-identified region, regardless of possible differences in length:
Measure | Residue-based metrics | Segment-based metrics |
Sensitivity (True Positive Rate) | 62.67% | 69.29% |
Specificity (True Negative Rate on ordered monomers) | 98.26% | - |
Specificity (True Negative Rate on flexible linkers) | 94.58% | - |
Fraction of predicted binding residues in auxiliary DisProt dataset | 50.00% | - |
Fraction of predicted binding residues in auxiliary decoy dataset | 10.93% | - |
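The difference between the residue-based and segment-based sensitivities in the table can be illustrated with a short sketch. The function names and the data layout (a per-residue list of booleans plus half-open, 0-based region coordinates) are assumptions made for this example. It also simplifies the criterion by counting an annotated region as found when at least one predicted residue falls inside it, whereas a stricter reading would require an entire predicted region to lie within the annotated one.

```python
def residue_sensitivity(predicted, regions):
    """Fraction of annotated binding residues that are predicted as binding.

    predicted: list of bools, one per residue (True = predicted binding)
    regions:   list of (start, end) annotated binding regions,
               0-based and half-open (hypothetical layout)
    """
    total = sum(end - start for start, end in regions)
    found = sum(sum(predicted[start:end]) for start, end in regions)
    return found / total


def segment_sensitivity(predicted, regions):
    """Fraction of annotated regions containing at least one predicted residue."""
    hits = sum(1 for start, end in regions if any(predicted[start:end]))
    return hits / len(regions)


# Toy protein of 20 residues with a single short predicted stretch.
predicted = [False] * 20
predicted[3:6] = [True, True, True]

# Two annotated binding regions; only the first overlaps the prediction.
regions = [(2, 10), (12, 18)]

print(residue_sensitivity(predicted, regions))  # 3 of 14 annotated residues
print(segment_sensitivity(predicted, regions))  # 0.5
```

In this toy case the first region counts as fully found at the segment level even though only three of its eight residues are predicted, so the segment-based value exceeds the residue-based one. The same ordering appears in the table above.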
As there are currently no comprehensive datasets collecting a large number of experimentally verified examples for the other types of context-dependent IDRs targeted by IUPred3, rigorous testing of these features is not yet possible. Accordingly, these features are marked as ‘Experimental’. However, a number of select examples are available in the How to use section.