************************************************************************
********** REPORT OF PROTEIN ANALYSIS  by the WHAT IF program **********
************************************************************************

Date : 2010-07-21
This report was created by WHAT IF version WHATCHECK 8.0

This document contains a report of findings by the WHAT IF program
during the analysis of a PDB-file. Each reported fact has
an assigned severity, one of:

error  : severe errors encountered during the analyses. Items marked as errors
         are considered severe problems requiring immediate attention.
warning: Either less severe problems or uncommon structural features. These
         still need special attention.
note   : Statistical values, plots, or other verbose results of tests and
         analyses that have been performed.

If alternate conformations are present, only the first is evaluated. Hydrogen
atoms are only included if explicitly requested, and even then they are not
used in all checks. The software functions less well for non-canonical amino
acids and exotic ligands than for the 20 canonical residues and canonical
nucleic acids.

Some remarks regarding the output:

Residues/atoms in tables are normally given in a few parts:

A number. This is the internal sequence number of the residue used
    by WHAT IF. The first residues in the file get number 1, 2, etc.
The residue type. Normally this is a three letter amino acid type.
The sequence number, between brackets. This is the residue number
    as it was given in the input file. It can be followed by the insertion
    code.
The chain identifier. A single character. If no chain identifier
    was given in the input file, this will be a minus sign or a blank.
A model number. If no model number exists, like in most X-ray files,
    this will be a blank or occasionally a minus sign.
In case an atom is part of the output, the atom will be listed using
    the PDB nomenclature for type and identifier.
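
As a rough illustration of the layout described above, the sketch below
splits one table row into its parts. The regular expression and the name
parse_row are assumptions made for this example and are not part of WHAT IF.
In particular, the character directly after the bracketed number is read
here as the insertion code (or its placeholder), and both the chain and the
model field are assumed to be non-blank.

```python
import re

# Hypothetical pattern for rows such as:
#   "  37 GLY   (  37-)  A  -   N    CA    1.52    4.2"
ROW = re.compile(
    r"^\s*(\d+)\s+(\w{3,4})\s+\(\s*(-?\d+)(.)\)\s+(\S)\s+(\S)\s*(.*)$"
)


def parse_row(line):
    """Return a dict of fields, or None if the line is not a table row."""
    m = ROW.match(line)
    if m is None:
        return None
    internal, restype, seqnum, icode, chain, model, rest = m.groups()
    return {
        "internal_number": int(internal),  # WHAT IF internal numbering
        "residue_type": restype,           # e.g. GLY, or DGUA for DNA
        "sequence_number": int(seqnum),    # number from the input file
        "insertion_code": icode,
        "chain": chain,
        "model": model,
        "extra": rest.split(),             # atom names, values, flags
    }
```

Rows in which the chain or model field is blank would need a fixed-column
parser instead; this sketch does not attempt that.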

To indicate the normality of a score, the score
   may be expressed as a Z-value or Z-score. This is just the number
   of standard deviations that the score deviates from the expected
   value.  A property of Z-values is that the root-mean-square of a
   group of Z-values (the RMS Z-value) is expected to be 1.0. Z-values
   above 4.0 and below -4.0 are very uncommon. If a Z-score is used
   in WHAT IF, the accompanying text will explain how the expected
   value and standard deviation were obtained.
The names of nucleic acids are DGUA, DTHY, OCYT, OADE, etc. The first
   character is a D or O for DNA or RNA respectively.
   This is done to circumvent ambiguities in the many old PDB files in which
   DNA and RNA were both called A, C, G, and T.
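
The Z-value and RMS Z-value mentioned above can be written down in a few
lines. This is a minimal sketch of the definitions only; the function names
are chosen for this example, and the expected value and standard deviation
must come from whatever reference set the individual check describes.

```python
import math


def z_score(value, expected, sigma):
    """Number of standard deviations that value lies from expected."""
    return (value - expected) / sigma


def rms_z(z_values):
    """Root-mean-square of a group of Z-values (expected to be near 1.0)."""
    return math.sqrt(sum(z * z for z in z_values) / len(z_values))
```

An RMS Z-value well below 1.0 suggests that the values are more tightly
clustered than the reference distribution, and a value well above 1.0
suggests a wider spread.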




========================================================================
==== Compound code /pdb/pdb1crn.ent                                 ====
========================================================================
 
# 1 # Warning: Class of conventional cell differs from CRYST1 cell
The crystal class of the conventional cell is different from the
crystal class of the cell given on the CRYST1 card. If the new
class is supported by the coordinates this is an indication of a
wrong space group assignment.
 
 The CRYST1 cell dimensions
    A    =  40.960  B   =  18.650  C    =  22.520
    Alpha=  90.000  Beta=  90.770  Gamma=  90.000
 
 Dimensions of a reduced cell
    A    =  18.650  B   =  22.520  C    =  40.960
    Alpha=  89.230  Beta=  90.000  Gamma=  90.000
 
 Dimensions of the conventional cell
    A    =  18.650  B   =  22.520  C    =  40.960
    Alpha=  90.770  Beta=  90.000  Gamma=  90.000
 
 Transformation to conventional cell
  0.000000  1.000000  0.000000
  0.000000  0.000000 -1.000000
 -1.000000  0.000000  0.000000
 
Crystal class of the cell: MONOCLINIC
 
Crystal class of the conventional CELL: ORTHORHOMBIC
 
Space group name: P 1 21 1
 
Bravais type of conventional cell is: P
     1 -   10   THR THR CYS CYS PRO  SER ILE VAL ALA ARG
    11 -   20   SER ASN PHE ASN VAL  CYS ARG LEU PRO GLY
    21 -   30   THR PRO GLU ALA ILE  CYS ALA THR TYR THR
    31 -   40   GLY CYS ILE ILE ILE  PRO GLY ALA THR CYS
    41 -   46   PRO GLY ASP TYR ALA  ASN
 
Content of the SOUP. See the writeup for an explanation.
Protein .................... : 1
Drug, ligand or co-factor .. : 0
DNA or RNA ................. : 0
Single atom entity ......... : 1
(Groups of) water .......... : 0
Drug with known topology ... : 0
Sugar or sugar-like ........ : 0
Residues with alternate atom : 0
 
 Molecule      Range              Type              Set name
     1    1 (    1)   46 (   46)A Protein           pdb1crn.ent           1
     2   47 (   46)   47 (   46)A N O2 <-    46     pdb1crn.ent           4
 
# 2 # Error: Negated value in scale matrix
One or more of the values of the scale matrix are wrong.

Possible cause: Comparison with the matrix derived from the CRYST1
card reveals that values have been inverted in sign.
 
 SCALE matrices (as given and from CRYST1)
  0.024414  0.000000 -0.000328     0.024414  0.000000  0.000328
  0.000000  0.053619  0.000000     0.000000  0.053619  0.000000
  0.000000  0.000000  0.044409     0.000000  0.000000  0.044409
No potential non-crystallographically symmetric pairs detected
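
The comparison made in check 2 can be reproduced with the standard
orthogonal-to-fractional matrix for a cell with a along x and b in the xy
plane. The sketch below is not WHAT IF code. It assumes that this standard
convention applies to the file, and the function name is hypothetical.

```python
import math


def scale_from_cell(a, b, c, alpha, beta, gamma):
    """Orthogonal-to-fractional matrix from cell lengths and angles (deg)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # v is the cell volume divided by a*b*c
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [1.0 / a, -cg / (a * sg), (ca * cg - cb) / (a * v * sg)],
        [0.0, 1.0 / (b * sg), (cb * cg - ca) / (b * v * sg)],
        [0.0, 0.0, sg / (c * v)],
    ]


# CRYST1 cell as printed under check 1
derived = scale_from_cell(40.960, 18.650, 22.520, 90.0, 90.770, 90.0)

# Element (1,3) as given in the file; equal magnitude, opposite sign
given_s13 = -0.000328
sign_flipped = abs(given_s13 + derived[0][2]) < 1e-6
```

With these inputs the derived matrix agrees with the right-hand matrix
printed above to the six decimals shown, and only the (1,3) element differs
from the matrix in the file, by its sign.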
 
# 3 # Warning: Conventional cell is pseudo-cell
The extra symmetry that would be implied by the transition to the
previously mentioned conventional cell has not been observed. It must be
concluded that the crystal lattice has pseudo-symmetry.
 
# 4 # Note: Matthews coefficient OK

The Matthews coefficient [REF] is defined as the crystal volume per
unit of protein molecular weight, in cubic Angstroms per Dalton. Normal
values are between 1.5 (tightly packed, little room for solvent) and 4.0
(loosely packed, much space for solvent). Some very loosely packed
structures can get values a bit higher than that.
 
 Molecular weight of all polymer chains: 4722.412
 Volume of the Unit Cell V= 17201.699
 Cell multiplicity: 2
 Matthews coefficient for observed atoms Vm= 1.821
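
The reported value follows from the three numbers printed above it. A
minimal sketch, assuming the cell multiplicity is the number of copies of
the polymer chains in the unit cell (the function name is chosen for this
example):

```python
def matthews_coefficient(cell_volume, molecular_weight, multiplicity):
    """Unit cell volume (A^3) per Dalton of protein in the cell."""
    return cell_volume / (molecular_weight * multiplicity)


# Values taken from the report lines above
vm = matthews_coefficient(17201.699, 4722.412, 2)
```

Rounded to three decimals, vm equals the Vm printed in the report.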
 
# 5 # Note: No atoms with high occupancy detected at special positions
Either there were no atoms at special positions, or all atoms at
special positions have adequately reduced occupancies.
An atom is considered to be located at a special position if it is
within 0.3 Angstrom from one of its own symmetry copies. See also the
next check...
 
# 6 # Note: All atoms are sufficiently far away from symmetry axes
None of the atoms in the structure is closer than 0.77 Angstrom to
a proper symmetry axis.
 
# 7 # Note: Ligand topologies OK
The topology could be determined for all ligands (or there are no ligands
for which a topology is needed, in which case there is absolutely no
problem, of course). That is good because it means that all ligands can
be included in the hydrogen bond optimization and related options.
 
# 8 # Note: No strange inter-chain connections detected
No covalent bonds have been detected between molecules with
non-identical chain identifiers.
 
# 9 # Note: No duplicate atom names in ligands
All atom names in ligands seem adequately unique.
 
# 10 # Note: No mixed usage of alternate atom problems detected
Either this structure does not contain alternate atoms, or they have not
been mixed up, or the errors have remained unnoticed.
 
# 11 # Note: In all cases the primary alternate atom was used
WHAT IF saw no need to make any alternate atom corrections (which means they
are all correct, or there aren't any).
 
# 12 # Note: No overlapping non-alternates detected
Either this structure does not contain overlapping non-alternate atoms,
or they are all correct, or the errors have remained unnoticed.
 
# 13 # Note: No residues detected inside ligands
Either this structure does not contain ligands with amino acid groups
inside it, or their naming is proper (enough).
 
# 14 # Note: No attached groups interfere with hydrogen bond calculations
It seems there are no sugars, lipids, etc., bound (very close) to
atoms that otherwise could form hydrogen bonds.
 
# 15 # Note: All residues have a complete backbone.
No residues have missing backbone atoms.
 
# 16 # Note: No probable atoms with zero occupancy detected.
Either there are no atoms with zero occupancy, or they are not present in
the file, or their positions are sufficiently improbable to warrant a
zero occupancy.
 
# 17 # Note: No crippling errors.
Problems can exist that make it impossible to continue the validation.
WHAT IF seems not to have encountered any of these.
 
# 18 # Note: Non-canonicals
WHAT IF has not detected any non-canonical residue that it doesn't
understand, or there are no non-canonical residues in the PDB file.
 
# 19 # Note: Content of the PDB file as interpreted by WHAT IF
Content of the PDB file as interpreted by WHAT IF.
WHAT IF has read your PDB file, and stored it internally in
what is called 'the soup'. The content of this soup is listed here.
An extensive explanation of all frequently used WHAT IF output formats
can be found at http://swift.cmbi.ru.nl/. Look under output formats.
A course on reading this 'Molecules' table is part of the WHAT_CHECK
web pages [REF].
 
     1     1 (    1)    46 (   46) A Protein             pdb1crn.ent
     2    47 (   46)    47 (   46) A N O2 <-    46       pdb1crn.ent
 
# 20 # Note: Ramachandran plot
In this Ramachandran plot x-signs represent glycines, squares represent
prolines, and plus-signs represent the other residues. If too many
plus-signs fall outside the contoured areas then the molecule is poorly
refined (or worse). Proline can only occur in the narrow region around
phi=-60 that also falls within the other contour islands.

In a colour picture, the residues that are part of a helix are
shown in blue, strand residues in red.  "Allowed" regions for
helical residues are drawn in blue, for strand residues in red, and
for all other residues in green.
A full explanation of the Ramachandran plot together with a series of
examples can be found at the WHAT_CHECK website [REF].
 
In the TeX file, a plot has been inserted here
 
 Chain identifier: A
 
# 21 # Note: Secondary structure
This is the secondary structure according to DSSP. Only helix (H),
overwound or 3/10-helix (3), strand (S), turn (T) and coil (blank)
are shown [REF]. All DSSP related information can be found at
http://swift.cmbi.ru.nl/gv/dssp/.
This is not really a structure validation option, but a very scattered
secondary structure (i.e. many strands of only a few residues length,
many Ts inside helices, etc) tends to indicate a poor structure. A full
explanation of the DSSP secondary structure determination program
together with a series of examples can be found at the WHAT_CHECK
website [REF].
 
 Secondary structure assignment
                     10        20        30        40
                      |         |         |         |
    1 -   46 TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
(   1)-(  46) SS TTHHHHHHHHHHHTTT  HHHHHHHHT SS TTT   333
 
 
 
 
 
# 22 # Note: No rounded coordinates detected
No significant rounding of atom coordinates has been detected.
 
# 23 # Note: No artificial side chains detected
No artificial side-chain positions characterized by chi-1=0.00 or
chi-1=180.00 have been detected.
 
# 24 # Note: No missing atoms detected in residues
All expected atoms are present in residues. This validation option has
not looked at 'things' that can or should be attached to the elementary
building blocks (amino acids, nucleotides). Even the C-terminal oxygens
are treated separately.
 
# 25 # Note: No C-terminal nitrogen detected
The PDB indicates that a residue is not the true C-terminus
by including only the backbone N of the next residue. This has not been
observed in this PDB file.
 
# 26 # Note: Test capping of (pseudo) C-termini
No extra capping groups were found on pseudo C-termini. This can imply
that no pseudo C-termini are present.
 
# 27 # Note: Proper C-terminal capping groups found
All (presumably) real C-termini either contain a proper capping group (OXT,
or something else), or they are followed by a single Nitrogen, indicating
that the rest of the chain is invisible.
 
# 28 # Note: No OXT found in the middle of chains
No OXT groups were found in the middle of protein chains.
 
# 29 # Note: Weights checked OK
All atomic occupancy factors ('weights') fall in the 0.0 - 1.0 range.
 
# 30 # Note: Normal distribution of occupancy values

The distribution of the occupancy values in this file seems 'normal'.

Be aware that this evaluation is merely the result of comparing this
file with about 500 well-refined high-resolution files in the PDB. If
this file has much higher or much lower resolution than the PDB files
used in WHAT IF's training set, non-normal values might very well be
perfectly fine, or normal values might actually be not so normal.
So, this check is actually more an indicator and certainly not a check
in which I have great confidence.
 
# 31 # Note: All occupancies seem to add up to 0.0 - 1.0.
In principle, the occupancies of all alternates of one atom should add up
to a value between 0.0 and 1.0. A value of 0.0 is used for a missing atom
(i.e. an atom not seen in the electron density).
Obviously, there is nothing terribly wrong when a few occupancies
add up to a bit more than 1.0, because the mathematics of refinement
allow for that. However, if it happens often, it seems worth evaluating
this in light of the refinement protocol used.
 
# 32 # Note: Average B-factor OK
The average B-factor of buried atoms is within expected values for
a room-temperature X-ray study.
 
Average B-factor for buried atoms :  5.509
 
Crystal temperature : -1.000
 
# 33 # Warning: More than 5 percent of buried atoms have a low B-factor
For normal protein structures, no more than about 1 percent of the
B factors of buried atoms is below 5.0. The fact that this value is
much higher in the current structure could be a signal of overrefined
B-factors, restraints or constraints to too-low values, misuse of the
B-factor field in the PDB file, or a scaling problem. If the
average B factor is low too, it is probably a low temperature
structure determination.
 
Percentage of buried atoms with B less than 5 :  49.12
 
# 34 # Note: B-factor plot
The average atomic B-factor per residue is plotted as function of
the residue number.
 
In the TeX file, a plot has been inserted here
 
 Chain identifier: A
 
# 35 # Note: Introduction to the nomenclature section.
Nomenclature problems seem, at first, rather unimportant. After all, who
cares if we call the delta atoms in leucine delta 2 and delta 1 rather than
the other way around? Chemically speaking, it makes no difference. But
structures have not been solved and deposited just for chemists to look at
them. Most times a structure is used, it is by software in a bioinformatics
lab. And if that software compares structures in which one uses C delta 1
and 2 and the other uses C delta 2 and 1, then that comparison will fail.
Also, we recalculate all structures every so many years to make sure that
everybody can always get access to the best coordinates that can be
obtained from the (your?) experimental data. These recalculations will be
troublesome if there are nomenclature problems.

Several nomenclature problems actually are worse than that. At the
WHAT_CHECK website [REF] you can get an overview of the importance of all
nomenclature problems that we list.
 
# 36 # Note: Valine nomenclature OK
No errors were detected in valine nomenclature.
 
# 37 # Note: Threonine nomenclature OK
No errors were detected in threonine nomenclature.
 
# 38 # Note: Isoleucine nomenclature OK
No errors were detected in isoleucine nomenclature.
 
# 39 # Note: Leucine nomenclature OK
No errors were detected in leucine nomenclature.
 
# 40 # Note: Arginine nomenclature OK
No errors were detected in arginine nomenclature.
 
# 41 # Note: Tyrosine torsion conventions OK
No errors were detected in tyrosine torsion angle conventions.
 
# 42 # Note: Phenylalanine torsion conventions OK
No errors were detected in phenylalanine torsion angle conventions.
 
# 43 # Note: Aspartic acid torsion conventions OK
No errors were detected in aspartic acid torsion angle conventions.
 
# 44 # Note: Glutamic acid torsion conventions OK
No errors were detected in glutamic acid torsion angle conventions.
 
# 45 # Note: Phosphate group names OK
No errors were detected in phosphate group naming conventions.
 
# 46 # Note: Heavy atom naming OK
No errors were detected in the atom names for non-hydrogen atoms. Please
be aware that the PDB wants us to deliberately make some nomenclature errors;
especially in non-canonical amino acids.
 
# 47 # Note: Chain names are OK
All chain names assigned to polymer molecules are unique, and all
residue numbers are strictly increasing within each chain.
 
# 48 # Warning: Unusual bond lengths
The bond lengths listed in the table below were found to deviate
more than 4 sigma from standard bond lengths (both standard values
and sigmas for amino acid residues have been taken from Engh and
Huber [REF], for DNA they were taken from Parkinson et al [REF]). In
the table below for each unusual bond the bond length and the
number of standard deviations it differs from the normal value is
given.

Atom names starting with "-" belong to the previous residue in the
chain. If the second atom name is "-SG*", the disulphide bridge has
a deviating length.
 
  37 GLY   (  37-)  A  -   N    CA    1.52    4.2
 
# 49 # Note: Normal bond length variability
Bond lengths were found to deviate normally from the standard bond
lengths (values for Protein residues were taken from Engh and Huber
[REF], for DNA/RNA from Parkinson et al [REF]).
 
 RMS Z-score for bond lengths: 1.124
 RMS-deviation in bond distances: 0.024
 
# 50 # Warning: Possible cell scaling problem
Comparison of bond distances with Engh and Huber [REF] standard
values for protein residues and Parkinson et al [REF] values for
DNA/RNA shows a significant systematic deviation. It could be that
the unit cell used in refinement was not accurate enough. The
deformation matrix given below gives the deviations found: the
three numbers on the diagonal represent the relative corrections
needed along the A, B and C cell axis. These values are 1.000 in a
normal case, but have significant deviations here (significant at
the 99.99 percent confidence level)

There are a number of different possible causes for the
discrepancy.  First the cell used in refinement can be different
from the best cell calculated. Second, the value of the wavelength
used for a synchrotron data set can be miscalibrated. Finally, the
discrepancy can be caused by a dataset that has not been corrected
for significant anisotropic thermal motion.

Please note that the proposed scale matrix has NOT been restrained
to obey the space group symmetry. This is done on purpose. The
distortions can give you an indication of the accuracy of the
determination.

If you intend to use the result of this check to change the cell dimension
of your crystal, please read the extensive literature on this topic first.
This check depends on the wavelength, the cell dimensions, and on the
standard bond lengths and bond angles used by your refinement software.
 
 Unit Cell deformation matrix
  0.987188  0.000293 -0.008811
  0.000293  0.991730 -0.001286
 -0.008811 -0.001286  0.992544
 Proposed new scale matrix
  0.024730 -0.000007 -0.000111
 -0.000015  0.054066  0.000070
  0.000399  0.000058  0.044746
 With corresponding cell
    A    =  40.437  B   =  18.496  C    =  22.348
    Alpha=  90.148  Beta=  90.254  Gamma=  89.965
 
 The CRYST1 cell dimensions
    A    =  40.960  B   =  18.650  C    =  22.520
    Alpha=  90.000  Beta=  89.230  Gamma=  90.000
 
 Variance: 144.753
 (Under-)estimated Z-score: 8.867
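
One way to see how the deformation matrix relates to the "corresponding
cell" is to apply it to the Cartesian cell vectors and measure the resulting
lengths and angles. The sketch below is an assumption about the procedure,
not WHAT IF's own code. It uses the cell exactly as printed under this check
(Beta= 89.230) and the usual convention of a along x and b in the xy plane.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian cell vectors, a along x and b in the xy plane."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cx = c * cb
    cy = c * (ca - cb * cg) / sg
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return [(a, 0.0, 0.0), (b * cg, b * sg, 0.0), (cx, cy, cz)]


def apply(matrix, vec):
    """Multiply a 3x3 matrix (list of rows) with a 3-vector."""
    return tuple(sum(row[i] * vec[i] for i in range(3)) for row in matrix)


def length(v):
    return math.sqrt(sum(x * x for x in v))


def angle(u, v):
    """Angle between two vectors in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(dot / (length(u) * length(v))))


def deformed_cell(cell, deformation):
    """Return (a, b, c, alpha, beta, gamma) after applying the deformation."""
    va, vb, vc = (apply(deformation, v) for v in cell_vectors(*cell))
    return (length(va), length(vb), length(vc),
            angle(vb, vc), angle(va, vc), angle(va, vb))


# Numbers copied from the report above
deformation = [
    [0.987188, 0.000293, -0.008811],
    [0.000293, 0.991730, -0.001286],
    [-0.008811, -0.001286, 0.992544],
]
new_cell = deformed_cell((40.960, 18.650, 22.520, 90.0, 89.230, 90.0),
                         deformation)
```

For these numbers the result agrees with the printed "corresponding cell"
to within rounding of the printed matrix elements.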
 
# 51 # Warning: Unusual bond angles
The bond angles listed in the table below were found to deviate
more than 4 sigma from standard bond angles (both standard values
and sigma for protein residues have been taken from Engh and Huber
[REF], for DNA/RNA from Parkinson et al [REF]).  In the table below
for each strange angle the bond angle and the number of standard
deviations it differs from the standard values is given. Please
note that disulphide bridges are neglected. Atoms starting with "-"
belong to the previous residue in the sequence.
 
   1 THR   (   1-)  A  -   CA   CB   OG1 103.06   -4.4
   1 THR   (   1-)  A  -   CG2  CB   OG1 117.41    4.1
  12 ASN   (  12-)  A  -   ND2  CG   OD1 127.61    5.0
  14 ASN   (  14-)  A  -   ND2  CG   OD1 128.63    6.0
  39 THR   (  39-)  A  -   CA   CB   OG1 103.59   -4.0
  45 ALA   (  45-)  A  -   N    CA   CB  103.98   -4.3
 
# 52 # Note: Normal bond angle variability
Bond angles were found to deviate normally from the mean standard
bond angles (normal values for protein residues were taken from
Engh and Huber [REF], for DNA/RNA from Parkinson et al [REF]). The
RMS Z-score given below is expected to be around 1.0 for a normally
restrained data set, and this is indeed observed for very high
resolution X-ray structures.
 
 RMS Z-score for bond angles: 1.285
 RMS-deviation in bond angles: 2.346
 
# 53 # Note: Residue hand error(s)
You are asking for a hand-check. WHAT IF has over the course of this
session perhaps corrected the handedness of atoms in several residues.
These residues are listed here. You had better check these by hand.
 
# 54 # Note: Chirality OK
All protein atoms have proper chirality.
The average deviation= 1.384
 
# 55 # Note: Improper dihedral angle distribution OK
The RMS Z-score for all improper dihedrals in the structure is within
normal ranges.
 
 Improper dihedral RMS Z-score : 1.110
 
# 56 # Note: Tau angles OK
All of the tau angles of amino acids that actually have a tau angle
fall within expected RMS deviations.
 
# 57 # Note: Normal tau angle deviations
The RMS Z-score for the tau angles in the structure falls within the
normal range that we guess to be 0.5 - 1.5.
Be aware, we determined the tau normal distributions from 500
high-resolution X-ray structures, rather than from CSD data, so we cannot
be 100 percent certain about these numbers.
 
 Tau angle RMS Z-score : 1.287
 
# 58 # Note: Side chain planarity OK
All of the side chains of residues that have a planar group are
planar within expected RMS deviations.
 
# 59 # Note: Atoms connected to aromatic rings OK
All of the atoms that are connected to planar aromatic rings in side
chains of amino-acid residues are in the plane within expected RMS
deviations.
Since there is no DNA and no protein with hydrogens, no uncalibrated
planarity check was performed.
 
# 60 # Warning: Unusual PRO puckering amplitudes
The proline residues listed in the table below have a puckering
amplitude that is outside of normal ranges. Puckering parameters
were calculated by the method of Cremer and Pople [REF]. Normal PRO
rings have a puckering amplitude Q between 0.20 and 0.45
Angstrom. If Q is lower than 0.20 Angstrom for a PRO residue, this
could indicate disorder between the two different normal ring forms
(with C-gamma below and above the ring, respectively). If Q is
higher than 0.45 Angstrom something could have gone wrong during the
refinement. Be aware that this is a warning with a low confidence level.
See: Who checks the checkers? Four validation tools applied to eight
atomic resolution structures [REF]
 
  19 PRO   (  19-)  A  -   0.13 LOW
  36 PRO   (  36-)  A  -   0.04 LOW
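
The flag in the last column follows directly from the limits quoted in the
text. A minimal sketch: the LOW label is taken from the table above, while
the HIGH and OK labels and the function name are assumptions made for this
example.

```python
def classify_pucker(q, low=0.20, high=0.45):
    """Classify a Cremer-Pople puckering amplitude Q (in Angstrom)."""
    if q < low:
        return "LOW"    # possible disorder between the two ring forms
    if q > high:
        return "HIGH"   # possibly a refinement problem
    return "OK"
```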
 
# 61 # Note: PRO puckering phases OK
Puckering phases for all PRO residues are normal
You will see two tables.

The first table gives a per-residue Z-score.  These scores give an
impression of how `normal' the torsion angles in protein residues
are. All torsion angles except omega are used for calculating a
`normality' score. Average values and standard deviations were
obtained from the residues in the WHAT IF database. These are used
to calculate Z-scores.  A residue with a Z-score of below -2.0 is
poor, and a score of less than -3.0 is worrying.  For such residues
more than one torsion angle is in a highly unlikely position.

The second table will list all residues with strange backbone angles.
An explanation of why the residue was listed is also included.

 
# 62 # Note: Torsion angles OK
All individual residues have normal overall torsion angle scores.
 
# 63 # Note: Backbone torsion angles OK
All individual residues have normal backbone torsion angles.
 Ramachandran Z-score : -0.307
 
# 64 # Note: Ramachandran Z-score OK
The score expressing how well the backbone conformations of all residues
are corresponding to the known allowed areas in the Ramachandran plot is
within expected ranges for well-refined structures.
 
 Ramachandran Z-score : -0.307
Omega average and std. deviation= 180.420 3.769
Significant deviations from expected 5.5!!!
 
# 65 # Warning: Omega angles too tightly restrained
The omega angles for trans-peptide bonds in a structure are
expected to give a gaussian distribution with the average around
+178 degrees and a standard deviation around 5.5 degrees. These
expected values were obtained from very accurately determined
structures.  Many protein structures are too tightly restrained.
This seems to be the case with the current structure too, as the
observed standard deviation is below 4.0 degrees.
 
 Standard deviation of omega values : 3.769
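
The test behind this warning can be sketched as follows. The omega values
in the example are made up for illustration and are not taken from this
structure. The use of the sample standard deviation and the mapping of
angles onto 0 - 360 degrees (so that trans values near 180 do not wrap
around) are assumptions. Only the 4.0 degree limit comes from the text.

```python
import statistics


def omega_statistics(omegas_deg, limit=4.0):
    """Return (mean, std, too_tight) for trans omega angles in degrees."""
    # Map e.g. -179 to 181 so that values on both sides of 180 average well
    mapped = [w % 360.0 for w in omegas_deg]
    mean = statistics.mean(mapped)
    std = statistics.stdev(mapped)
    return mean, std, std < limit


# Hypothetical omega angles, for illustration only
mean, std, too_tight = omega_statistics([178.0, -179.0, 180.0, 176.0, -177.0])
```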
 chi-1/chi-2 correlation Z-score : -0.900
 
# 66 # Note: chi-1/chi-2 angle correlation Z-score OK
The score expressing how well the chi-1/chi-2 angles of all residues
are corresponding to the populated areas in the database is
within expected ranges for well-refined structures.
 
 chi-1/chi-2 correlation Z-score : -0.900
 
# 67 # Note: Backbone oxygen evaluation OK
All residues for which the local backbone conformation could be
found in the WHAT IF database have a normal backbone oxygen
position.
 
# 68 # Note: Rotamers checked OK
None of the residues that have a normal backbone environment have
abnormal rotamers.
 
# 69 # Warning: Unusual backbone conformations
For the residues listed in the table below, the backbone formed by
itself and two neighbouring residues on either side is in a
conformation that is not seen very often in the database of solved
protein structures.  The number given in the table is the number of
similar backbone conformations in the database with the same amino
acid in the centre.

For this check, backbone conformations are compared with database
structures using C-alpha superpositions with some restraints on the
backbone oxygen positions.

A residue mentioned in the table can be part of a strange loop, or
there might be something wrong with it or its directly surrounding
residues. There are a few of these in every protein, but in any
case it is worth looking at!
 
  19 PRO   (  19-)  A  -     0
   5 PRO   (   5-)  A  -     1
  30 THR   (  30-)  A  -     1
  44 TYR   (  44-)  A  -     1
   4 CYS   (   4-)  A  -     2
  32 CYS   (  32-)  A  -     2
  36 PRO   (  36-)  A  -     2
  38 ALA   (  38-)  A  -     2
 
# 70 # Note: Backbone conformation Z-score OK
The backbone conformation analysis gives a score that is normal
for well refined protein structures.
 
 Backbone conformation Z-score : 0.109
 
# 71 # Note: No Van der Waals overlaps
All interatomic distances (including symmetry transformations) have
been verified. No unusual contacts were found.  No pair of atoms
has an unusual short contact distance.
 
# 72 # Note: Inside/Outside residue distribution normal
The distribution of residue types over the inside and the outside of the
protein is normal.
 
inside/outside RMS Z-score : 1.012
 
# 73 # Note: Inside/Outside RMS Z-score plot
The Inside/Outside distribution normality RMS Z-score over a 15
residue window is plotted as function of the residue number. High
areas in the plot (above 1.5) indicate unusual inside/outside
patterns.
 
In the TeX file, a plot has been inserted here
 
 Chain identifier: A
 
# 74 # Note: Packing environment OK
None of the individual amino acid residues has a bad packing environment.
 
# 75 # Note: No series of residues with bad packing environment
There are no stretches of three or more residues each having a quality
control score worse than -4.0.
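
The search for such stretches amounts to finding runs of consecutive
residues below a cutoff. The sketch below is one possible way to do this.
The function name and the example scores are hypothetical, and only the
cutoff of -4.0 and the minimum length of three come from the text.

```python
def bad_stretches(scores, threshold=-4.0, min_length=3):
    """Return (first, last) index pairs of runs with score < threshold."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i          # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # A run may extend to the end of the chain
    if start is not None and len(scores) - start >= min_length:
        runs.append((start, len(scores) - 1))
    return runs
```

The same function with threshold=-1.75 and min_length=4 would correspond to
the limits quoted for the second generation packing check.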
 
# 76 # Note: Structural average packing environment OK
The structural average quality control value is within normal ranges.
 
 
Average for range     1 -   46 :  -0.435
 
# 77 # Note: Quality value plot
The quality value smoothed over a 10 residue window is plotted as
function of the residue number. Low areas in the plot (below
-2.0) indicate "unusual" packing.
 
In the TeX file, a plot has been inserted here
 
 Chain identifier: A
----Residue-------      State    AllAll    BB-BB    BB-SC    SC-BB    SC-SC
---------------------------------------------------------------------------
============================================================
 All   contacts    : Average = -0.057 Z-score =  -0.34
 BB-BB contacts    : Average =  0.038 Z-score =   0.19
 BB-SC contacts    : Average = -0.110 Z-score =  -0.91
 SC-BB contacts    : Average = -0.036 Z-score =  -0.17
 SC-SC contacts    : Average = -0.135 Z-score =  -0.62
============================================================
 
# 78 # Note: Second generation packing environment OK
None of the individual amino acid residues has a bad packing environment.
 
# 79 # Note: No series of residues with abnormal new packing environment
There are no stretches of four or more residues each having a quality
control Z-score worse than -1.75.
 
# 80 # Note: Structural average packing Z-score OK
The structural average for the second generation quality control
value is within normal ranges.
 
 All   contacts    : Average = -0.057 Z-score =  -0.34
 BB-BB contacts    : Average =  0.038 Z-score =   0.19
 BB-SC contacts    : Average = -0.110 Z-score =  -0.91
 SC-BB contacts    : Average = -0.036 Z-score =  -0.17
 SC-SC contacts    : Average = -0.135 Z-score =  -0.62
 
# 81 # Note: Second generation quality Z-score plot
The second generation quality Z-score smoothed over a 10 residue window
is plotted as function of the residue number. Low areas in the plot (below
-1.3) indicate "unusual" packing.
 
In the TeX file, a plot has been inserted here
 
 Chain identifier: A
Since there are no waters, the water check has been skipped.
O2 located for residue 46
Number of donors: 64
Number of H-atoms: 73
Number of donor groups: 2
Symmetry related molecules will be taken into account
Calculating accessibilities and coordinates
Number of positive ions : 0
Finding possible acceptors for all donors...
Total number of potential acceptors: 52
Locating affected donors for all ambiguities...
Number of donors affected by ambiguities: 9
Initializing group penalty for all donors 64
Total log10 N solved by Cutting.....: 0.000
Total log10 N solved by Threshold Acc: 0.000
Total log10 N solved by Brute Force.: 13.655
Of these only TA is a heuristic method.
Total number of positions evaluated 316
 
# 82 # Note: HIS, ASN, GLN side chains OK
All of the side chain conformations of Histidine, Asparagine and
Glutamine residues were found to be optimal for hydrogen bonding.
 
# 83 # Warning: Buried unsatisfied hydrogen bond donors
The buried hydrogen bond donors listed in the table below have a
hydrogen atom that is not involved in a hydrogen bond in the
optimized hydrogen bond network.

Hydrogen bond donors that are buried inside the protein normally
use all of their hydrogens to form hydrogen bonds within the
protein. If there are any non hydrogen bonded buried hydrogen bond
donors in the structure they will be listed here. In very good
structures the number of listed atoms will tend to zero.

Waters are not listed by this option.
 
   6 SER   (   6-)  A  -   N
   8 VAL   (   8-)  A  -   N
  45 ALA   (  45-)  A  -   N
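
Table rows such as those above follow the residue/atom layout explained at
the start of this report: internal number, residue type, bracketed sequence
number with insertion code, chain identifier, model number and atom name.
A minimal parsing sketch follows. It assumes that "-" stands for "no
insertion code" or "no model", and that chain and model are single
non-blank characters; rows with a blank chain or model field are not
handled. ResidueAtom and parse_record are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ResidueAtom(NamedTuple):
    internal_number: int   # WHAT IF internal sequence number
    residue_type: str      # three letter residue type
    sequence_number: int   # residue number from the input file
    insertion_code: str    # "-" is assumed to mean "none"
    chain: str
    model: str             # "-" is assumed to mean "no model number"
    atom: str


_RECORD = re.compile(
    r"^\s*(\d+)\s+(\w{3})\s+\(\s*(-?\d+)(.)\)\s+(\S)\s+(\S)\s+(\S+)\s*$"
)


def parse_record(line: str) -> Optional[ResidueAtom]:
    """Parse one table row; return None if the line is not a residue row."""
    m = _RECORD.match(line)
    if m is None:
        return None
    num, res, seq, ins, chain, model, atom = m.groups()
    return ResidueAtom(int(num), res, int(seq), ins, chain, model, atom)
```
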
Acceptor does not accept :    3 CYS  (   3-) A  -   SG
Acceptor does not accept :    4 CYS  (   4-) A  -   SG
Acceptor does not accept :   26 CYS  (  26-) A  -   SG
Acceptor does not accept :   32 CYS  (  32-) A  -   SG
--Potential donor :   30 THR  (  30-) A  -   OG1
Acceptor does not accept :   44 TYR  (  44-) A  -   O
 
# 84 # Note: Buried hydrogen bond acceptors OK
All buried polar side-chain hydrogen bond acceptors are involved in a
hydrogen bond in the optimized hydrogen bond network.
 
# 85 # Note: Content of the PDB file as interpreted by WHAT IF
WHAT IF has read your PDB file, and stored it internally in
what is called 'the soup'. The content of this soup is listed here.
An extensive explanation of all frequently used WHAT IF output formats
can be found at http://swift.cmbi.ru.nl/. Look under output formats.
A course on reading this 'Molecules' table is part of the WHAT_CHECK
web pages [REF].
 
     1     1 (    1)    46 (   46) A Protein             pdb1crn.ent
     2    47 (   46)    47 (   46) A N O2 <-    46       pdb1crn.ent
 
# 86 # Warning: No crystallisation information
No, or very inadequate, crystallisation information was observed upon
reading the PDB file header records. This information should be available
in the form of a series of REMARK 280 lines. Without this information a
few things, such as checking ions in the structure, cannot be performed
optimally.
 
# 87 # Note: No ions (of a type we can validate) in structure
Since there are no ions in the structure of a type we can validate, this
check will not be executed.
SOUP contains no water:
 
Content of the SOUP. See the writeup for an explanation.
Protein .................... : 1
Drug, ligand or co-factor .. : 0
DNA or RNA ................. : 0
Single atom entity ......... : 1
(Groups of) water .......... : 0
Drug with known topology ... : 0
Sugar or sugar-like ........ : 0
Residues with alternate atom : 0
 
 Molecule      Range              Type              Set name
     1    1 (    1)   46 (   46)A Protein           pdb1crn.ent           1
     2   47 (   46)   47 (   46)A N O2 <-    46     pdb1crn.ent           4
 
# 88 # Note: Summary report for users of a structure
This is an overall summary of the quality of the structure as
compared with current reliable structures. This summary is most
useful for biologists seeking a good structure to use for modelling
calculations.

The second part of the table mostly gives an impression of how well
the model conforms to common refinement restraint values. The
first part of the table shows a number of restraint-independent
quality indicators.
 
 Structure Z-scores, positive is better than average:
  1st generation packing quality :   0.163
  2nd generation packing quality :  -0.344
  Ramachandran plot appearance   :  -0.307
  chi-1/chi-2 rotamer normality  :  -0.900
  Backbone conformation          :   0.109
 
 RMS Z-scores, should be close to 1.0:
  Bond lengths                   :   1.124
  Bond angles                    :   1.285
  Omega angle restraints         :   0.685 (tight)
  Side chain planarity           :   0.842
  Improper dihedral distribution :   1.110
  Inside/Outside distribution    :   1.012
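
An RMS Z-score is the root mean square of the individual Z-scores, where
each Z-score is the deviation of an observed value from its target value
divided by the expected standard deviation. If the deviations follow the
expected distribution, the result is close to 1.0; clearly lower values
suggest tight restraints, as flagged for the omega angles above. The
sketch below illustrates the arithmetic only; the function name rms_z and
the example numbers are hypothetical and not taken from this structure.

```python
import math


def rms_z(observed, ideal, sigma):
    """Root mean square of (observed - ideal) / sigma over all items."""
    z = [(o - i) / s for o, i, s in zip(observed, ideal, sigma)]
    return math.sqrt(sum(v * v for v in z) / len(z))


# Hypothetical values: two quantities, each exactly one standard
# deviation away from its target, give an RMS Z-score of 1.0.
example = rms_z([1.0, 3.0], [2.0, 2.0], [1.0, 1.0])
```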
 
# 89 # Note: Summary report for depositors of a structure
This is an overall summary of the quality of the X-ray structure as
compared with structures solved at similar resolutions. This summary
can be useful for a crystallographer to see if the structure makes
the best possible use of the data.
Warning. This table works well for structures solved in the resolution
range of the structures in the WHAT IF database, which is presently
(summer 2008) mainly 1.1 - 1.3 Angstrom. The further the resolution of
your file deviates from this range the more meaningless this table
becomes.

The second part of the table mostly gives an impression of how well
the model conforms to common refinement restraint values. The
first part of the table shows a number of restraint-independent
quality indicators, which have been calibrated against structures
of similar resolution.
 
 Resolution found in PDB file        :   1.50
 
 Structure Z-scores, positive is better than average:
  1st generation packing quality :   0.5
  2nd generation packing quality :  -0.8
  Ramachandran plot appearance   :  -0.8
  chi-1/chi-2 rotamer normality  :  -1.4
  Backbone conformation          :  -0.4
 
 RMS Z-scores, should be close to 1.0:
  Bond lengths                   :   1.124
  Bond angles                    :   1.285
  Omega angle restraints         :   0.685 (tight)
  Side chain planarity           :   0.842
  Improper dihedral distribution :   1.110
  Inside/Outside distribution    :   1.012
==============


WHAT IF
    G.Vriend,
      WHAT IF: a molecular modelling and drug design program,
    J. Mol. Graph. 8, 52--56 (1990).

WHAT_CHECK (verification routines from WHAT IF)
    R.W.W.Hooft, G.Vriend, C.Sander and E.E.Abola,
      Errors in protein structures
    Nature 381, 272 (1996).
    (see also http://swift.cmbi.ru.nl/gv/whatcheck for a course and extra inform

Bond lengths and angles, protein residues
    R.Engh and R.Huber,
      Accurate bond and angle parameters for X-ray protein structure
      refinement,
    Acta Crystallogr. A47, 392--400 (1991).

Bond lengths and angles, DNA/RNA
    G.Parkinson, J.Vojtechovsky, L.Clowney, A.T.Bruenger and H.Berman,
      New parameters for the refinement of nucleic acid-containing structures
    Acta Crystallogr. D52, 57--64 (1996).

DSSP
    W.Kabsch and C.Sander,
      Dictionary of protein secondary structure: pattern
      recognition of hydrogen bond and geometrical features
    Biopolymers 22, 2577--2637 (1983).

Hydrogen bond networks
    R.W.W.Hooft, C.Sander and G.Vriend,
      Positioning hydrogen atoms by optimizing hydrogen bond networks in
      protein structures
    PROTEINS, 26, 363--376 (1996).

Matthews' Coefficient
    B.W.Matthews
      Solvent content of Protein Crystals
    J. Mol. Biol. 33, 491--497 (1968).

Protein side chain planarity
    R.W.W. Hooft, C. Sander and G. Vriend,
      Verification of protein structures: side-chain planarity
    J. Appl. Cryst. 29, 714--716 (1996).

Puckering parameters
    D.Cremer and J.A.Pople,
      A general definition of ring puckering coordinates
    J. Am. Chem. Soc. 97, 1354--1358 (1975).

Quality Control
    G.Vriend and C.Sander,
      Quality control of protein models: directional atomic
      contact analysis,
    J. Appl. Cryst. 26, 47--60 (1993).

Ramachandran plot
    G.N.Ramachandran, C.Ramakrishnan and V.Sasisekharan,
      Stereochemistry of Polypeptide Chain Conformations
    J. Mol. Biol. 7, 95--99 (1963).

Symmetry Checks
    R.W.W.Hooft, C.Sander and G.Vriend,
      Reconstruction of symmetry related molecules from protein
      data bank (PDB) files
    J. Appl. Cryst. 27, 1006--1009 (1994).

Ion Checks
    I.D.Brown and K.K.Wu,
      Empirical Parameters for Calculating Cation-Oxygen Bond Valences
    Acta Cryst. B32, 1957--1959 (1976).

    M.Nayal and E.Di Cera,
      Valence Screening of Water in Protein Crystals Reveals Potential Na+
      Binding Sites
    J. Mol. Biol. 256, 228--234 (1996).

    P.Mueller, S.Koepke and G.M.Sheldrick,
      Is the bond-valence method able to identify metal atoms in protein
      structures?
    Acta Cryst. D59, 32--37 (2003).

Checking checks
    K.Wilson, C.Sander, R.W.W.Hooft, G.Vriend, et al.
      Who checks the checkers
    J. Mol. Biol. 276, 417--436 (1998).
