Effectively managing ever-growing data stores to meet internal and external compliance requirements is more difficult than ever due to the increased volume of data and diversity of data sources. To continue to meet legal, business, and regulatory compliance challenges, businesses must be able to keep and protect important information and quickly find what’s relevant. To be prepared for investigations or for internal or external litigation, your organization needs to preserve relevant data for analysis and review.

The Advanced eDiscovery feature in the Office 365 Security & Compliance Center can help you quickly and cost-effectively locate, identify, and retrieve relevant information. There is no need to move content to a separate archive to store, index, and process it. Office 365 Advanced eDiscovery uses machine learning, predictive coding, and text analytics to analyze unstructured data sets and focus on the data that’s most relevant to your search criteria. Advanced eDiscovery provides the following capabilities:

  • Organizes content into themes using predictive coding and text analytics intelligence, so that reviewers and investigators can quickly browse and discover key content.
  • Includes the Relevance application, which can be trained by an attorney familiar with the litigation at hand to automatically identify content that’s relevant to the case. This helps reduce the scope of the data for review and thus saves review time.
  • Organizes content by reconstructing email threads, identifying exact and near-duplicates, and intelligently batching content for review. These structures optimize the review process, replacing a traditional linear review with one in which reviewers work through documents in closely related, tightly structured groups.
  • Extracts text from image files or objects within the files using optical character recognition (OCR). In many eDiscovery cases, most unindexed content is in image files. This functionality allows image-based text to be analyzed by Advanced eDiscovery analytics, reducing the amount of manual remediation work required to analyze image files.

Figure: Diagram showing the flow of information through the eDiscovery process. Information governance and eDiscovery in Office 365 help you filter a huge volume of data down to what’s most relevant for your needs.
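
To make the idea of near-duplicate grouping more concrete, here is a minimal sketch in Python. It is not the algorithm Advanced eDiscovery uses, which this post does not describe. It assumes a simple, commonly used stand-in: word shingles compared with Jaccard similarity. The function names, the three-word shingle size, and the 0.5 threshold are all illustrative assumptions.

```python
from typing import List, Set


def shingles(text: str, k: int = 3) -> Set[tuple]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set[tuple], b: Set[tuple]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedily place each document in the first group whose pivot it resembles.

    The first document of each group acts as its pivot (an assumption made
    for this sketch). Returns lists of document indexes.
    """
    groups: List[List[int]] = []
    pivots: List[Set[tuple]] = []
    for idx, doc in enumerate(docs):
        sig = shingles(doc)
        for g, pivot in enumerate(pivots):
            if jaccard(sig, pivot) >= threshold:
                groups[g].append(idx)
                break
        else:
            # No existing pivot is similar enough, so start a new group.
            groups.append([idx])
            pivots.append(sig)
    return groups


# Made-up sample documents, for illustration only.
docs = [
    "Please review the attached contract before Friday",
    "Please review the attached contract before Monday",
    "Lunch menu for the quarterly team offsite",
]
groups = group_near_duplicates(docs)
print(groups)
```

In this sample, the first two messages differ by one word and end up in the same group, so a reviewer could read one and skim the other. The third message forms its own group.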

The intelligent analytics in Advanced eDiscovery typically yield the following results:

  • Grouping near-duplicate content reduces review time and cost by between 20% and 30%.
  • Grouping emails in threads reduces review time and cost by between 40% and 60%.
  • Using the Relevance application reduces review time and cost by between 60% and 90%.
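
As a rough worked example of what these ranges mean for a budget, the sketch below applies each range to a hypothetical baseline review cost. The baseline figure is invented, and each technique is treated on its own. The figures above do not say how the savings combine, so the sketch does not try to combine them.

```python
from typing import Dict, Tuple


def remaining_cost(baseline: float, low_pct: int, high_pct: int) -> Tuple[float, float]:
    """Return (cost after the smallest saving, cost after the largest saving)."""
    after_low = baseline * (100 - low_pct) / 100
    after_high = baseline * (100 - high_pct) / 100
    return after_low, after_high


# Savings ranges quoted in the list above, in percent.
SAVINGS: Dict[str, Tuple[int, int]] = {
    "near-duplicates": (20, 30),
    "email threads": (40, 60),
    "relevance": (60, 90),
}

baseline = 100_000  # hypothetical review budget, in any currency

results = {
    name: remaining_cost(baseline, low, high)
    for name, (low, high) in SAVINGS.items()
}
for name, (after_low, after_high) in results.items():
    print(f"{name}: between {after_high:,.0f} and {after_low:,.0f} remaining")
```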

Once you’re done with your analysis, you can export your reduced data set from Advanced eDiscovery. All the content can be downloaded to a local workstation or copied to another location. The content is exported with associated HTML representations and text files to help review the output with third-party tools, if required. The export package includes a load file in CSV format. The CSV includes the metadata of the exported content and all the analytics metadata needed to organize the content, such as near-duplicates, threads, and themes.
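
Because the load file is plain CSV, it can be processed with ordinary tooling. The sketch below shows one way to regroup exported documents by thread or by near-duplicate set with Python's standard `csv` module. The column names (`doc_id`, `thread_id`, `near_dup_set`) and the sample rows are hypothetical placeholders. Check the header row of your own export for the actual field names.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Made-up sample standing in for an exported load file.
# Real exports will have different (and many more) columns.
SAMPLE_LOAD_FILE = """\
doc_id,subject,thread_id,near_dup_set
D1,Contract draft,T1,N1
D2,RE: Contract draft,T1,N1
D3,Offsite lunch,T2,N2
"""


def group_by_column(csv_text: str, column: str, id_column: str = "doc_id") -> Dict[str, List[str]]:
    """Map each distinct value in `column` to the document IDs that share it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        groups[row[column]].append(row[id_column])
    return dict(groups)


threads = group_by_column(SAMPLE_LOAD_FILE, "thread_id")
near_dups = group_by_column(SAMPLE_LOAD_FILE, "near_dup_set")
print(threads)
print(near_dups)
```

For a real export, you would read the file from disk with `open(path, newline="")` and pass the file object to `csv.DictReader` instead of wrapping a string.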

Content Search and eDiscovery-related activities that are performed in the Security & Compliance Center, or run with the corresponding PowerShell cmdlets, are logged in the Office 365 audit log. An event is logged whenever an administrator, or anyone else who’s assigned eDiscovery permissions, performs one of these tasks.

PowerShell scripts can also be used to run searches and export files. Using PowerShell scripts to automate the process is recommended.

Conclusion

I hope this blog post has helped clarify how Office 365 Advanced eDiscovery can help you reduce data volume and review cost. Microsoft is investing heavily to expand eDiscovery capabilities to help customers achieve more. If you’d like to learn more, feel free to contact us at [email protected].