Setting a deviation level
Adjust the deviation level to control how similar files must be to be grouped together. A deviation of 0, or close to 0, requires the files to be very similar; a higher number allows more variation.
For most purposes, start with a deviation level of 30; this should indicate how varied your files are. You may adjust the value after the first analysis.
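To make the effect of the deviation level concrete, here is a minimal sketch of threshold-based grouping. It is an illustration only and not the product's actual algorithm: the `deviation` and `cluster` functions, the 0 to 100 scale, and the use of Python's `difflib` for text similarity are all assumptions.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def deviation(a: str, b: str) -> float:
    """Hypothetical score from 0 to 100: 0 means the two texts are identical."""
    return (1.0 - SequenceMatcher(None, a, b).ratio()) * 100.0


def cluster(docs: dict[str, str], max_deviation: float) -> list[list[str]]:
    """Greedy grouping: each file joins the first cluster whose
    representative (its first member) is within max_deviation."""
    clusters: list[list[str]] = []
    for name, text in docs.items():
        for group in clusters:
            if deviation(docs[group[0]], text) <= max_deviation:
                group.append(name)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([name])
    return clusters
```

In this sketch, a `max_deviation` of 0 groups only files with identical text, while a larger value lets files with more differences share a cluster.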
Select Save and cluster to start the analysis.
A Cluster window confirms how many files will be analysed. Select Cluster to continue.
All Word and PDF documents on your site will be analysed. You can choose to ignore a selection of documents after they are added to the list of clusters.
As the analysis may take some time, a status message is displayed in the top navigation bar.
If you need to stop or change the analysis, select Cancel in the Clusters window. You may then change the deviation and restart the process.
When the results are ready, a list of documents is shown.
Click on the arrow next to a clustered document to see the files in that cluster, each with a reason and a percentage score for the match:
The top-level 'representative' file is often the signed contract.
Contracts and templates must be in Word, PDF or TXT format. If a Word document is compared with its PDF equivalent, the match score may be lower than expected (that is, more changes are reported), because differences in the underlying file formats may be counted as changes to the text.
If a cluster includes an Exact match, then it is likely that the files are identical. This allows you to manage duplicate files.
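If you want to confirm outside the application that files flagged as an exact match really are byte-for-byte identical, one option is to compare content hashes. The sketch below is an assumption-laden illustration: `find_duplicates` is a hypothetical helper, and the product may define an exact match differently (for example, by comparing extracted text rather than raw bytes).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(paths):
    """Group file paths whose contents have the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        by_digest[digest].append(str(p))
    # Keep only the groups that contain more than one file.
    return [group for group in by_digest.values() if len(group) > 1]
```

Each returned group lists files with identical contents, so all but one file in a group are candidates for removal.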
Select Save and Cluster again to include any documents added to the Files module since the last analysis.
This also removes any manual changes, except for documents that have been ignored (see below), which will not be included in any future clusters.