Mass Tagging Near-Duplicates

This technical note shows you how to review and tag all the documents identified as near-duplicates together, which speeds up your document review.

The steps below show you how to apply mass tagging to near-duplicates:

Step 1: Run or update Near Duplication on the documents in the case

From 'Case>Add Case Documents', click the 'Calculate Groups' button. This feature only marks near-duplicates; it does not delete them (no batch title selection is necessary). Once started, it groups all existing files with similar content across the entire case (e.g. 'Group 1', 'Group 2', etc.). Take care when re-running near duplication, because different files may be identified as near-duplicates in subsequent runs. Only Account Admin Users can apply near duplication, although all users can view the results.

Step 2: To narrow the results down to one specific near-duplicate group, filter by 'Near Dup Group No.' (for example, 3472).

Step 3: Copy or save a report with the group numbers of the larger groups
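
If the saved report is a CSV file, the larger groups can also be picked out offline. The following is a minimal sketch only, not a feature of the product: it assumes the report has one row per document and a column named 'Near Dup Group No.' (the 'Doc ID' column, the sample rows and the function name `largest_groups` are hypothetical).

```python
import csv
import io
from collections import Counter


def largest_groups(report_file, group_column="Near Dup Group No.", top_n=5):
    """Return (group number, document count) pairs, largest groups first."""
    reader = csv.DictReader(report_file)
    counts = Counter()
    for row in reader:
        # Documents that belong to no group have a blank value; skip them.
        group = (row.get(group_column) or "").strip()
        if group:
            counts[group] += 1
    return counts.most_common(top_n)


# Hypothetical report contents, used only to illustrate the idea.
sample_report = io.StringIO(
    "Doc ID,Near Dup Group No.\n"
    "D001,3472\n"
    "D002,3472\n"
    "D003,118\n"
    "D004,3472\n"
    "D005,\n"
    "D006,118\n"
    "D007,9\n"
)

top = largest_groups(sample_report, top_n=2)
for group, size in top:
    print(f"Group {group}: {size} documents")
```

With a real export, pass an open file instead of the in-memory sample, for example `open("report.csv", newline="")`, and adjust the column name to match the report header.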

Step 4: Go to the Browse page

Open the Browse page, go to the 'Fields>Show Fields' section and select the following headers: Date, Pages, Words, Responsiveness, Privilege, Work-Product & Confidentiality

Step 5: Filter on 'Group' for each group you want to review, and code the groups one at a time (Filter>Select Filter>Group).

Step 6: Tagging using 'Multi Doc Edit'

If the documents in a group are sufficiently similar for mass tagging, select all the documents in the group and use 'Multi Doc Edit' to apply the Responsive, Privilege, WP, Confidentiality and other tags, as appropriate. The same can be done with Custom Doc Fields.

Modifying Document Coded Data & Tag Multiple files (Modify Multiple Columns Simultaneously)

Depending on the size of the near-duplicate groups and how familiar you are with Excel's complex formulas and data management features for 'Quality Assurance & Control' during self-loading of metadata, we recommend the Coding in Excel & Upload Metadata feature.

How to Identify Large Near-Duplicate Document Groups

For more information please visit our technical page.

Email Threading

Detect and work with similar emails that are part of the same email chain after applying Near Duplication to the documents in the data set. This page will help you identify related email families by content so that you can find all the emails in a group, detect missing emails, and keep only the relevant final email messages that need to be reviewed.

Consistency Check (Near-Duplicates)

Identify documents that potentially should be marked responsive, privileged, work-product or confidential, based on computer identification of near-duplicates.
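
The idea behind such a check can be illustrated outside the application. The sketch below is an assumption about the general approach, not the product's actual implementation: it flags near-duplicate groups whose documents carry different values for the same tag. The field names ('group', 'privilege'), the sample records and the function name `inconsistent_groups` are hypothetical.

```python
from collections import defaultdict


def inconsistent_groups(documents, tag_field):
    """Return {group: set of tag values} for groups tagged inconsistently."""
    values_by_group = defaultdict(set)
    for doc in documents:
        group = doc.get("group")
        if not group:
            continue  # document is not part of any near-duplicate group
        # Treat a missing or empty tag as its own value so gaps are flagged.
        values_by_group[group].add(doc.get(tag_field) or "(untagged)")
    return {g: vals for g, vals in values_by_group.items() if len(vals) > 1}


# Hypothetical coded documents, used only to illustrate the idea.
docs = [
    {"id": "D001", "group": "Group 1", "privilege": "Privileged"},
    {"id": "D002", "group": "Group 1", "privilege": ""},
    {"id": "D003", "group": "Group 2", "privilege": "Not Privileged"},
    {"id": "D004", "group": "Group 2", "privilege": "Not Privileged"},
    {"id": "D005", "group": None, "privilege": "Privileged"},
]

flagged = inconsistent_groups(docs, "privilege")
for group, values in flagged.items():
    print(group, sorted(values))
```

Here only 'Group 1' is flagged, because one of its documents is tagged and the other is not. A flagged group is a prompt for human review, not proof that a tag is wrong.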

Further Assistance

We also offer Project Management and Technical Services which, if engaged, can support your near-duplicate efforts by helping to execute specific requests for document identification. Please contact your sales rep or our Support Center if needed.