Consistency Check (Near-Duplicates)

Consistency Check for Responsiveness, Privilege, Work-Product & Confidentiality

This technical note explains how to identify documents that potentially should be marked responsive, privileged, work product or confidential, based on computer identification of near-duplicates.


Near-duplicates contain contextually similar sentences or subject matter, but are not exact matches. They are significantly similar versions of documents that differ by, for example, a few words, sentences or paragraphs. Our Near Duplication technology identifies and groups documents that are at least 50% similar in text content.
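To make the 50% threshold concrete, here is a minimal sketch of one common way to measure text similarity and form groups. It is an illustration only, not a description of how our Near Duplication technology works internally: the Jaccard measure on three-word shingles, the function names and the sample documents are all assumptions made for this example.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word sequences in text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles common to both sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5):
    """docs maps a document id to its text; returns groups (sets of ids) of 2+ documents."""
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    for a, b in combinations(docs, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), set()).add(doc_id)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical sample documents: A and B differ by one word, C is unrelated.
docs = {
    "A": "The quarterly report is due on Friday and must be sent to the client",
    "B": "The quarterly report is due on Monday and must be sent to the client",
    "C": "Lunch menu for the cafeteria this week",
}
near_dup_groups = group_near_duplicates(docs)
```

In this sketch, documents A and B share 9 of 15 distinct shingles, so they fall in one group at the 50% threshold, while C stays ungrouped. Comparing every pair is fine for a small sample but would be slow on a full case.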

Checking for Document Consistency after Applying Near Duplication

The steps below show you how to check for Responsiveness, Privilege, Work-Product & Confidentiality:

Step 1: Apply Near Duplication to the documents in the case

From 'Case>Add Case Documents', click the 'Calculate Groups' button. This feature only marks near-duplicates; it does not delete them (no batch title selection is necessary). Once started, all existing files with similar content are grouped across the entire case (e.g. 'Group 1', 'Group 2', etc.). Take care when re-running near duplication, as different files may be identified as near-duplicates in subsequent runs. Only Account Admin Users can apply near duplication (although all users can view the results).

Step 2: Go to the Browse page

Open the Browse page, go to the 'Fields>Show Fields' section and select the following headers: Extension, Subject, Senders, Receivers, Date Time Sent, Near Dup Group, Responsive, Work Product and Privilege. 

Step 3: In the 'Filter>Select Filter' section, apply filters on 'Extension = MSG' and 'Near Dup Group>Show Near Dup Groups'. This will display all emails within their groupings. Also, filter on 'Date Sent' before (a date outside of range) and 'Date After'.

Step 4: Save and share the filters you've applied by using the 'Filter Quick Links' feature (e.g. 'Email Threads'). Saved and shared filters let you return to specific filters for further review and can be viewed by other users in the case.

Step 5: If you want to narrow down the results and show only one specific Near Dup Group, you can also filter by 'Near Dup Group No.', for example 3472.

Step 6: Sort by 'Master Date' and 'Near Dup Group No.' to view emails within multiple groups.

Step 7: Analyze and resolve, if appropriate, any inconsistencies in coding within a group.
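If you work from an exported copy of the document list, the filter, sort and comparison in Steps 3 to 7 can be sketched in a few lines of Python. This is a hedged illustration, not a feature of the platform: it assumes each document is a dictionary keyed by the column headers selected in Step 2, and the 'Doc ID' field, the function names and the sample values are hypothetical.

```python
from collections import defaultdict

# Coding fields to compare within each group (headers from Step 2).
CODING_FIELDS = ("Responsive", "Privilege", "Work Product")


def emails_in_groups(records):
    """Keep MSG files that belong to a group, sorted by group and then by date."""
    emails = [
        r for r in records
        if (r.get("Extension") or "").upper() == "MSG" and r.get("Near Dup Group")
    ]
    return sorted(emails, key=lambda r: (r["Near Dup Group"], r.get("Master Date") or ""))


def find_inconsistent_groups(records, fields=CODING_FIELDS):
    """Return {group: {field: distinct values}} for groups with conflicting coding."""
    values = defaultdict(lambda: defaultdict(set))
    for rec in records:
        group = rec.get("Near Dup Group")
        if not group:
            continue
        for field in fields:
            values[group][field].add(rec.get(field) or "")
    report = {}
    for group, by_field in values.items():
        conflicts = {f: sorted(v) for f, v in by_field.items() if len(v) > 1}
        if conflicts:
            report[group] = conflicts
    return report


# Hypothetical sample export; dates are assumed to be ISO-formatted strings.
records = [
    {"Doc ID": "D1", "Extension": "MSG", "Near Dup Group": "3472",
     "Master Date": "2020-01-02", "Responsive": "Yes", "Privilege": "No", "Work Product": "No"},
    {"Doc ID": "D2", "Extension": "MSG", "Near Dup Group": "3472",
     "Master Date": "2020-01-01", "Responsive": "No", "Privilege": "No", "Work Product": "No"},
    {"Doc ID": "D3", "Extension": "MSG", "Near Dup Group": "3480",
     "Master Date": "2020-01-03", "Responsive": "Yes", "Privilege": "Yes", "Work Product": "No"},
    {"Doc ID": "D4", "Extension": "PDF", "Near Dup Group": "3480",
     "Master Date": "2020-01-04", "Responsive": "No", "Privilege": "Yes", "Work Product": "No"},
]

emails = emails_in_groups(records)
inconsistent = find_inconsistent_groups(emails)
```

Here group 3472 is reported because its two emails carry different 'Responsive' values. A reported group is only a prompt for review: near-duplicates can legitimately be coded differently, which is why Step 7 says to resolve inconsistencies only if appropriate.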

Email Threading

After applying Near Duplication to the documents in the data set, you can detect and work with similar emails that are part of the same email chain together. This page will help you identify related email families by content in order to identify all the emails in a group, detect missing emails, and give you the option to keep only the relevant final email messages that need to be reviewed.

How to Identify Large Near Dup Document Groupings

For more information please visit our technical page.
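As a rough illustration of the idea, large groupings can be found in an exported document list by counting how many documents share each group. This is a sketch under assumptions, not the procedure described on the technical page: the record layout, the function name and the size cut-off are hypothetical.

```python
from collections import Counter


def large_groups(records, min_size=10):
    """Return (group, document count) pairs for groups with at least min_size documents."""
    counts = Counter(r["Near Dup Group"] for r in records if r.get("Near Dup Group"))
    # most_common() lists the biggest groups first.
    return [(group, n) for group, n in counts.most_common() if n >= min_size]


# Hypothetical sample export: three documents in 'Group 1', one in 'Group 2', one ungrouped.
sample = [
    {"Doc ID": "D1", "Near Dup Group": "Group 1"},
    {"Doc ID": "D2", "Near Dup Group": "Group 1"},
    {"Doc ID": "D3", "Near Dup Group": "Group 2"},
    {"Doc ID": "D4", "Near Dup Group": "Group 1"},
    {"Doc ID": "D5", "Near Dup Group": ""},
]
biggest = large_groups(sample, min_size=3)
```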

Mass Tagging Near-Duplicates

You can review and tag multiple documents detected in the same near-duplicate group.
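The effect of mass tagging can be pictured with a short sketch that applies one coding value to every document in a group. It operates on an in-memory copy of an exported document list and does not call the platform; the record layout, field names and function name are assumptions for illustration.

```python
def tag_group(records, group, field, value):
    """Set field to value on every record in the given group; return how many were updated."""
    updated = 0
    for rec in records:
        if rec.get("Near Dup Group") == group:
            rec[field] = value
            updated += 1
    return updated


# Hypothetical sample export with mixed 'Privilege' coding in group 3472.
documents = [
    {"Doc ID": "D1", "Near Dup Group": "3472", "Privilege": "Yes"},
    {"Doc ID": "D2", "Near Dup Group": "3472", "Privilege": "No"},
    {"Doc ID": "D3", "Near Dup Group": "3480", "Privilege": "No"},
]
changed = tag_group(documents, "3472", "Privilege", "Yes")
```

Only the two documents in group 3472 are touched; the document in group 3480 keeps its original coding.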

Further Assistance

We also offer Project Management and Technical Services. If engaged, these teams can support your near-duplicate efforts by helping to execute specific requests for document identification. Please contact your sales rep or our Support Center if needed.