Using the Microsoft Office 365 Endpoint

Learn how to ingest data from Microsoft Office 365 using Nuix RESTful Service.

Before you begin, ensure that your Azure tenant has been configured as described in the previous Microsoft Office 365 topics.

When processing data into a Nuix case, many configurable parameters let you tailor what data is retrieved, how it is processed, where it is stored, and which resources perform the work. This topic covers only the endpoint parameters related to retrieving data from Office 365.

For details about all other configurable parameters for the /cases/{caseId}/evidence/ms365 endpoint, view the reference documentation.

Use the details in this topic to ingest data from the following Microsoft Office 365 applications:

  • Microsoft Teams
  • Microsoft Exchange
  • Microsoft SharePoint
  • Microsoft OneDrive for Business

To get started, first create a case, then submit a POST request to the /cases/{caseId}/evidence/ms365 endpoint.
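
As a rough sketch of that request in Python, the snippet below only constructs the POST and does not send it. The helper name build_ms365_request is hypothetical; the host, case ID, and header names are taken from the curl example later in this topic, and the token and the empty target are placeholders.

```python
import json
import urllib.request


def build_ms365_request(base_url, case_id, auth_token, body):
    """Return an unsent POST request for /cases/{caseId}/evidence/ms365.

    Hypothetical helper for illustration; it is not part of any Nuix client library.
    """
    url = "{}/cases/{}/evidence/ms365".format(base_url.rstrip("/"), case_id)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "nuix-auth-token": auth_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_ms365_request(
    "http://127.0.0.1:8080/nuix-restful-service/svc",
    "36c83aabe9d74808ac1d47300909b016",
    "<authToken>",  # placeholder token
    {"processingProfile": "Default", "target": {}},  # placeholder target
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would submit it. The target parameters are described in the sections that follow.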

The following example performs a basic ingestion from Office 365, retrieving the Teams chats and Exchange mailbox of a single user for the two-year period from 2019-01-01 to 2021-01-01.

Ingestion examples

cURL

curl -L -X POST 'http://127.0.0.1:8080/nuix-restful-service/svc/cases/36c83aabe9d74808ac1d47300909b016/evidence/ms365' \
-H 'nuix-auth-token: 974cd40d-a5b8-45b1-936f-9f3745a783be' \
-H 'Content-Type: application/json' \
--data-raw '{
    "processingProfile": "Default",
    "target": {
        "tenantId": "<tenantId>",
        "clientId": "<clientId>",
        "clientSecret": "<clientSecret>",
        "from": "2019-01-01",
        "to": "2021-01-01",
        "username": "user@email.com",
        "password": "pAsSwOrD",
        "userPrincipalNames": [
            "user@email.com"
        ],
        "mailboxRetrievals": [ "MAILBOX"],
        "retrievals": [
            "USERS_CHATS", "USERS_EMAILS"
        ]
    }
}'

Python

from org.joda.time import DateTime
import sys

casepath="<YourCasePath>"
casename="Ms365Python"
clientId="<YourClientID>"
tenantId="<YourTenantID>"
clientSecret="<YourClientSecret>"
username="<YourUsername>"
password="<YourPassword>"
current_case = utilities.caseFactory.create(casepath + casename)

######## START_OF_MS365_INGESTION_HERE #########
location = utilities.getLocationFactory().makeMicrosoftGraphLocation(
    tenantId,
    clientId,
    clientSecret,
    username, 
    password)

location = location.withTeamNames(['Managers'])
location = location.withStartDate(DateTime(2020, 1, 1, 0, 0, 0, 0))
location = location.withEndDate(DateTime(2021, 9, 13, 0, 0, 0, 0))
location = location.withRetrievals(["TEAMS_CHANNELS", "TEAMS_CALENDARS", "USERS_CHATS", "USERS_CONTACTS", "USERS_CALENDARS", "USERS_EMAILS", "ORG_CONTACTS", "SHAREPOINT"])
location = location.withMailboxRetrievals([])

print("Validating location...")
# Optional: validate the location before processing starts.
validationResult = utilities.getLocationFactory().validateLocation(location, None)
print("Validation Result: " + validationResult.toString())
print("Starting Processing...")

processor = current_case.createProcessor()
evidenceContainer = processor.newEvidenceContainer("scriptEvidence")
evidenceContainer.addLocation(location)
evidenceContainer.save()
processor.process()
current_case.close()
print("...Processing Complete!")

################################################

Ruby

java_import org.joda.time.DateTime

casepath="<YourCasePath>"
casename="Ms365Ruby"
clientId="<YourClientID>"
tenantId="<YourTenantID>"
clientSecret="<YourClientSecret>"
username="<YourUsername>"
password="<YourPassword>"
current_case = utilities.caseFactory.create(casepath + casename)

######## START_OF_MS365_INGESTION_HERE #########
location = utilities.getLocationFactory.makeMicrosoftGraphLocation(
    tenantId,
    clientId,
    clientSecret.to_java().toCharArray(),
    username, 
    password.to_java().toCharArray())

location = location.withTeamNames(['Managers'])
location = location.withStartDate(DateTime.new(2020, 1, 1, 0, 0, 0, 0))
location = location.withEndDate(DateTime.new(2021, 9, 13, 0, 0, 0, 0))
location = location.withRetrievals(["TEAMS_CHANNELS", "TEAMS_CALENDARS", "USERS_CHATS", "USERS_CONTACTS", "USERS_CALENDARS", "USERS_EMAILS", "ORG_CONTACTS", "SHAREPOINT"])
location = location.withMailboxRetrievals([])

puts "Validating location..."
# Optional: validate the location before processing starts.
validationResult = utilities.getLocationFactory().validateLocation(location, nil)
puts "Validation Result: " + validationResult.to_s
puts "Starting Processing..."

processor = current_case.createProcessor()
evidenceContainer = processor.newEvidenceContainer("scriptEvidence")
evidenceContainer.addLocation(location)
evidenceContainer.save()
processor.process()
current_case.close()
puts "...Processing Complete!"

################################################

JavaScript

var DateTime = Java.type("org.joda.time.DateTime");

var casepath="<YourCasePath>";
var casename="Ms365Js";
var clientId="<YourClientID>";
var tenantId="<YourTenantID>";
var clientSecret="<YourClientSecret>";
var username="<YourUsername>";
var password="<YourPassword>";
var current_case = utilities.caseFactory.create(casepath + casename);

/********* START_OF_MS365_INGESTION_HERE ************/
var location = utilities.getLocationFactory().makeMicrosoftGraphLocation(
    tenantId,
    clientId,
    clientSecret.split(''),
    username, 
    password.split(''));

location = location.withTeamNames(['Managers']);
location = location.withStartDate(new DateTime(2020, 1, 1, 0, 0, 0, 0));
location = location.withEndDate(new DateTime(2021, 9, 13, 0, 0, 0, 0));
location = location.withRetrievals(["TEAMS_CHANNELS", "TEAMS_CALENDARS", "USERS_CHATS", "USERS_CONTACTS", "USERS_CALENDARS", "USERS_EMAILS", "ORG_CONTACTS", "SHAREPOINT"]);
location = location.withMailboxRetrievals([]);

print("Validating location...");
// Optional: validate the location before processing starts.
validationResult = utilities.getLocationFactory().validateLocation(location, null);
print("Validation Result: " + validationResult.toString());
print("Starting Processing...");

processor = current_case.createProcessor();
evidenceContainer = processor.newEvidenceContainer("scriptEvidence");
evidenceContainer.addLocation(location);
evidenceContainer.save();
processor.process();
current_case.close();
print("...Processing Complete!");

/***************************************/

Response

{
    "done": true,
    "cancelled": false,
    "result": true,
    "token": "974cd40d-a5b8-45b1-936f-9f3745a783be",
    "functionKey": "58abc476-c7d0-44c5-b71c-4abd2982212f",
    "progress": 0,
    "total": 0,
    "percentComplete": null,
    "updatedOn": 1627314002180,
    "status": null,
    "statusId": null,
    "requestTime": 1627313971596,
    "startTime": 1627313971596,
    "finishTime": 1627314004551,
    "caseId": "36c83aabe9d74808ac1d47300909b016",
    "caseName": "MSOffice365",
    "hasSuccessfullyCompleted": true,
    "friendlyName": "Evidence Ingestion Function",
    "caseLocation": "C:\\ProgramData\\Nuix\\NuixCases\\MSOffice365",
    "requestor": "nuix",
    "licenseShortName": "enterprise-workstation",
    "action": "AsyncBulkIngestionFunction",
    "options": {
        "reloadQuery": null,
        "processorSettings": {
            "processText": null,
            "processLooseFileContents": null,
            "processForensicImages": null,
            "analysisLanguage": null,
            "stopWords": null,
            "stemming": null,
            "enableExactQueries": null,
            "extractNamedEntities": null,
            "extractNamedEntitiesFromText": null,
            "extractNamedEntitiesFromProperties": null,
            "extractNamedEntitiesFromTextStripped": null,
            "extractNamedEntitiesFromTextCommunications": null,
            "extractShingles": null,
            "processTextSummaries": null,
            "calculateSSDeepFuzzyHash": null,
            "calculatePhotoDNARobustHash": null,
            "detectFaces": null,
            "classifyImagesWithDeepLearning": null,
            "imageClassificationModelUrl": null,
            "extractFromSlackSpace": null,
            "carveFileSystemUnallocatedSpace": null,
            "carveUnidentifiedData": null,
            "carvingBlockSize": null,
            "recoverDeletedFiles": null,
            "extractEndOfFileSlackSpace": null,
            "smartProcessRegistry": null,
            "identifyPhysicalFiles": null,
            "createThumbnails": null,
            "skinToneAnalysis": null,
            "calculateAuditedSize": null,
            "storeBinary": null,
            "maxStoredBinarySize": null,
            "maxDigestSize": null,
            "digests": [],
            "addBccToEmailDigests": null,
            "addCommunicationDateToEmailDigests": null,
            "reuseEvidenceStores": null,
            "processFamilyFields": null,
            "hideEmbeddedImmaterialData": null,
            "reportProcessingStatus": null,
            "enableCustomProcessing": [],
            "performOcr": null,
            "ocrProfileName": null,
            "createPrintedImage": null,
            "imagingProfileName": null,
            "exportMetadata": null,
            "metadataExportProfileName": null,
            "workerItemCallback": null,
            "workerItemCallbacks": null,
            "traversalScope": null,
            "namedEntities": null
        },
        "evidence": [
            {
                "guid": null,
                "name": null,
                "customMetadata": null,
                "encoding": null,
                "custodian": null,
                "timeZone": null,
                "description": null,
                "locale": null,
                "files": null,
                "exchangeMailboxes": null,
                "s3Buckets": null,
                "sqlServers": null,
                "oracleServers": null,
                "enterpriseVaults": null,
                "sharepointSites": null,
                "mailStores": null,
                "loadFiles": null,
                "centeraClusters": null,
                "splitFiles": null,
                "dropboxes": null,
                "sshServers": null,
                "twitterLocations": null,
                "documentumServers": null,
                "ms365Locations": [
                    {
                        "type": "microsoft_graph",
                        "username": "************",
                        "password": "************",
                        "tenantId": "<tenantId>",
                        "clientId": "<clientId>",
                        "clientSecret": "************",
                        "certificateStorePath": null,
                        "certificateStorePassword": null,
                        "from": 1546315200000,
                        "to": 1625630400000,
                        "teamNames": null,
                        "userPrincipalNames": [
                            "user@email.com"
                        ],
                        "retrievals": [
                            "USERS_CHATS",
                            "USERS_EMAILS"
                        ],
                        "mailboxRetrievals": [
                            "MAILBOX"
                        ],
                        "versionFilters": null
                    }
                ]
            }
        ],
        "localWorkerCount": 2,
        "repositories": [],
        "parallelProcessingSettings": {
            "workerCount": null,
            "workerMemory": null,
            "workerTemp": null,
            "brokerMemory": null,
            "workerBrokerAddress": null,
            "useRemoteWorkers": false,
            "embedBroker": true
        },
        "rescanEvidenceRepositories": false,
        "loadProcessingJob": {
            "casePath": "C:\\ProgramData\\Nuix\\NuixCases\\MSOffice365",
            "jobGuid": "b9f4933c-96b5-4bc1-9d48-23f1046da79e",
            "processingMode": "Load",
            "startDate": 1627313976482,
            "workerCount": 2,
            "finished": true,
            "paused": false,
            "masterAddress": "192.168.56.1",
            "bytesProcessed": 0,
            "itemsProcessed": 2,
            "jobSizeTotalBytes": 0
        }
    },
    "participatingInCaseFunctionQueue": true,
    "processedBy": "72670a5e-f0e1-4046-8b2e-22a42fe2ac9a",
    "errorMsg": null
}
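
The response fields can be condensed into a short summary. The sketch below assumes, based on the magnitude of the values, that the time fields are milliseconds since the Unix epoch. The helper summarize_function_status is hypothetical and not part of the Nuix API.

```python
from datetime import datetime, timezone


def summarize_function_status(status):
    """Condense an ingestion function status document into a few fields."""

    def to_utc(epoch_ms):
        # Assumption: timestamps are milliseconds since the Unix epoch.
        return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)

    return {
        "succeeded": bool(status["done"] and status["hasSuccessfullyCompleted"]),
        "error": status["errorMsg"],
        "duration_seconds": (status["finishTime"] - status["startTime"]) / 1000,
        "finished_at": to_utc(status["finishTime"]).isoformat(),
    }


# Values copied from the example response above.
example = {
    "done": True,
    "hasSuccessfullyCompleted": True,
    "errorMsg": None,
    "startTime": 1627313971596,
    "finishTime": 1627314004551,
}
summary = summarize_function_status(example)
print(summary)
```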

The complete list of target parameters for this endpoint is described below.

Credential Parameters

The following parameters are used to authenticate with the Microsoft Azure tenant.

Parameter | Required | Description
tenantId | Yes | The ID of the Azure Active Directory (AAD) tenant where the Azure authentication application is registered.
clientId | Yes | The application ID of the registered authentication application.
clientSecret | Yes | The authentication key string associated with the Application (Client) ID. Note: Use this parameter if not authenticating with a private key certificate.
certificateStorePath | Yes | The location of a PKCS#12-based private key certificate. Note: Use this parameter if not authenticating with a client secret.
certificateStorePassword | Yes | The password associated with the PKCS#12-based private key certificate.
username | No | The username of a Microsoft Teams user to retrieve calendar data from. Note: Required only if the TEAMS_CALENDARS retrieval parameter has been defined.
password | No | The password associated with a Microsoft Teams user.
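
The rules in this table can be checked before a request is submitted. The following is a minimal sketch, assuming the target is held as a Python dictionary; check_credentials is a hypothetical helper, and treating clientSecret and certificateStorePath as alternatives is a reading of the notes above rather than confirmed endpoint behavior.

```python
def check_credentials(target):
    """Return a list of problems found in the credential part of a target."""
    problems = []
    for name in ("tenantId", "clientId"):
        if not target.get(name):
            problems.append("missing " + name)

    # Assumption: a client secret or a certificate is enough on its own.
    has_secret = bool(target.get("clientSecret"))
    has_certificate = bool(target.get("certificateStorePath"))
    if not (has_secret or has_certificate):
        problems.append("provide clientSecret or certificateStorePath")
    if has_certificate and not target.get("certificateStorePassword"):
        problems.append("certificateStorePath needs certificateStorePassword")

    # TEAMS_CALENDARS is the only retrieval documented as needing a user login.
    if "TEAMS_CALENDARS" in target.get("retrievals", []):
        if not (target.get("username") and target.get("password")):
            problems.append("TEAMS_CALENDARS needs username and password")
    return problems


print(check_credentials({"tenantId": "<tenantId>", "clientId": "<clientId>"}))
```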

Scoping Parameters

The following parameters can be used to narrow the scope of data ingested into your case.

Parameter | Required | Description
from | Yes | Retrieve only the items with a Last Modified date on or after the specified date. If a Last Modified date does not exist, the item’s Creation Date will be used.
to | Yes | Retrieve only the items with a Last Modified date on or before the specified date.
teamNames | No | Scope the ingestion to specific Microsoft Teams by specifying a semicolon-separated list of Team names.
userPrincipalNames | No | Scope the ingestion to specific users by specifying a semicolon-separated list of Office 365 User Principal Names. Note: A User Principal Name (UPN) is the name of a Windows Active Directory system user in the format of an email address. For example: john.doe@domain.com.

The from and to date filters apply only to the following Office 365 item types:

  • Exchange mail messages
  • Teams chat messages
  • Calendar events (filtered by scheduled event start time)
  • Resources from SharePoint and OneDrive for Business
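
The from and to semantics can be illustrated for a single item. This sketch is only a model of the documented filter, not the service's implementation: in_date_scope is a hypothetical function, and applying the Creation Date fallback to the to bound as well as the from bound is an assumption. Calendar events, which are filtered by scheduled start time, are not modeled here.

```python
from datetime import date


def in_date_scope(start, end, last_modified=None, created=None):
    """Model the documented from/to filter for one item (both bounds inclusive)."""
    # Fall back to the creation date when no Last Modified date exists.
    effective = last_modified if last_modified is not None else created
    if effective is None:
        # The documentation does not say how undated items are handled,
        # so this sketch refuses to guess.
        raise ValueError("item has neither a Last Modified nor a Creation Date")
    return start <= effective <= end


start, end = date(2019, 1, 1), date(2021, 1, 1)
print(in_date_scope(start, end, last_modified=date(2020, 6, 1)))
print(in_date_scope(start, end, created=date(2021, 1, 2)))
```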

Retrieval Parameters

The following retrieval parameter options are available to specify which Office 365 data is collected.

Retrievals | Service | Retrieves…
TEAMS_CHANNELS | Teams | Channel data such as chat messages and attachments from all public channels within a team. Note: User emoji reactions to chat messages are included as child items of the parent message.
TEAMS_CALENDARS | Teams | Teams calendar data from all teams that a specific user is a member of. Note: If selected, the username and password of a Microsoft Teams user must be provided in order for their calendar data to be retrieved.
USERS_CHATS | Teams | Chat messages from one-on-one or group conversations that take place outside of a public channel. Note: See Private Chat Limitations for additional information on how user chats are retrieved.
USERS_CALENDARS | Teams | Individual user calendar data from the members of a team.
USERS_CONTACTS | Exchange | Personal contacts that have been saved by individual users.
USERS_EMAILS | Exchange | Individual user mailboxes. Tip: Select Extract from mailbox slack space when configuring your data processing settings to also retrieve deleted and other lower-level slack space items from the mailbox.
ORG_CONTACTS | Exchange | Contacts created by an administrator that are shared to all users in an organization. Also known as Mail Contacts within Exchange.
SHAREPOINT | SharePoint | Data from SharePoint sites, subsites, lists, and users. Note: The retrieval of SharePoint list attachments is not supported.
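
To see which services a given set of retrieval options touches, the table can be expressed as a lookup. This is a small sketch; RETRIEVAL_SERVICES and services_for are hypothetical names, and the mapping simply restates the Service column above.

```python
# Retrieval option -> service, as listed in the table above.
RETRIEVAL_SERVICES = {
    "TEAMS_CHANNELS": "Teams",
    "TEAMS_CALENDARS": "Teams",
    "USERS_CHATS": "Teams",
    "USERS_CALENDARS": "Teams",
    "USERS_CONTACTS": "Exchange",
    "USERS_EMAILS": "Exchange",
    "ORG_CONTACTS": "Exchange",
    "SHAREPOINT": "SharePoint",
}


def services_for(retrievals):
    """Return the sorted list of services covered by the given retrieval options."""
    unknown = [name for name in retrievals if name not in RETRIEVAL_SERVICES]
    if unknown:
        raise ValueError("unknown retrieval option(s): " + ", ".join(unknown))
    return sorted({RETRIEVAL_SERVICES[name] for name in retrievals})


print(services_for(["USERS_CHATS", "USERS_EMAILS"]))
```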

Private Chat Limitations

Due to restrictions in how private chat data is retrieved using the Microsoft Graph API, certain limitations exist when determining the recipients of specific messages within a chat. Limitations include:

  • Chat/conversation IDs are only generated when a new chat is initiated. The generated ID is persisted for the entire life of the chat, regardless of whether participants are later added or removed.
  • The To communication metadata property, which identifies the participants of a chat, represents only the list of participants that are included in the chat at the time of ingestion. Due to this limitation, the following must be considered:
    • New participants who are added to an existing conversation are identified within the metadata as a chat recipient. However, based on the Microsoft Teams chat history settings selected when adding the new participant, they may or may not have been granted access to view messages that were sent prior to them joining the chat.
    • Participants who left a chat after receiving messages will not be identified within the metadata as a recipient at all.

Workaround:

To establish a complete view of recipients and verify who received messages from a private chat conversation, the specified private conversation must be examined from the /Users/<userName>/Chats directory of each suspected participant. For example:

To verify if John saw a specific private chat message, the messages located within /Users/John/Chats must be examined. Looking at the same chat but from a different participant, /Users/Jane/Chats, may not properly confirm if John saw a specific message because Jane’s chat history may be incomplete and reflect only a portion of the conversation due to the limitations previously noted.
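
The workaround amounts to checking each suspected participant's own chat directory for the message in question. The sketch below illustrates that reasoning with made-up data: the message IDs and the confirmed_recipients helper are hypothetical, and the dictionary stands in for whatever you find under each /Users/<userName>/Chats directory.

```python
def confirmed_recipients(message_id, chat_views):
    """Return the users whose own chat directory contains the given message.

    chat_views maps a user name to the set of message IDs found under
    /Users/<userName>/Chats for that user.
    """
    return sorted(
        user for user, message_ids in chat_views.items() if message_id in message_ids
    )


# Hypothetical data: Jane's view of the chat holds only part of the conversation.
views = {
    "John": {"msg-1", "msg-2", "msg-3"},
    "Jane": {"msg-2", "msg-3"},
}
print(confirmed_recipients("msg-1", views))
```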

Mailbox Retrieval Parameters

The following mailbox retrieval parameter options are available to specify which Exchange mailbox data is collected.

Mailbox Retrievals | Required | Retrieves…
MAILBOX | No | All current mailbox data, including recoverables such as deletions and purges.
DELETIONS | No | Items that have been deleted from any folder and placed in the Deleted items default folder.
RECOVERABLE_ITEMS | No | Items that have been deleted from the Deleted items default folder.
PURGES | No | Deleted items that have been marked to be removed from the mailbox database.
ARCHIVE | No | All archived mailbox data, including archived recoverables such as deletions and purges.
ARCHIVE_DELETIONS | No | Archived items that have been deleted from any folder and placed in the Deleted items default folder.
ARCHIVE_RECOVERABLE_ITEMS | No | Archived items that have been deleted from the Deleted items default folder.
ARCHIVE_PURGES | No | Archived deleted items that have been marked to be removed from the mailbox database.
PUBLIC_FOLDERS | No | Shared folders within an organization.
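
Because MAILBOX and ARCHIVE are described as including their recoverable items, a selection can be reviewed for unknown or overlapping options. This is a sketch under that reading of the table: the INCLUDED_IN mapping is an assumption, not confirmed service behavior, and review_mailbox_retrievals is a hypothetical helper.

```python
MAILBOX_RETRIEVALS = {
    "MAILBOX", "DELETIONS", "RECOVERABLE_ITEMS", "PURGES",
    "ARCHIVE", "ARCHIVE_DELETIONS", "ARCHIVE_RECOVERABLE_ITEMS", "ARCHIVE_PURGES",
    "PUBLIC_FOLDERS",
}

# Assumed reading of the table: the broad options already cover the narrow ones.
INCLUDED_IN = {
    "MAILBOX": {"DELETIONS", "RECOVERABLE_ITEMS", "PURGES"},
    "ARCHIVE": {"ARCHIVE_DELETIONS", "ARCHIVE_RECOVERABLE_ITEMS", "ARCHIVE_PURGES"},
}


def review_mailbox_retrievals(selected):
    """Reject unknown options and return those already covered by a broader one."""
    chosen = set(selected)
    unknown = sorted(chosen - MAILBOX_RETRIEVALS)
    if unknown:
        raise ValueError("unknown mailbox retrieval(s): " + ", ".join(unknown))
    redundant = set()
    for broad, narrow in INCLUDED_IN.items():
        if broad in chosen:
            redundant |= narrow & chosen
    return sorted(redundant)


print(review_mailbox_retrievals(["MAILBOX", "PURGES", "ARCHIVE_PURGES"]))
```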

Version Filter Parameters

The following version filter parameters can be configured to also include item versions during ingestion. With these parameters, you can choose to ingest only the latest version of an item, a selection of recent versions, or all available versions.

Version filters apply to all files and attachments retrieved from Microsoft Office 365 in which versions are maintained, including data from SharePoint, OneDrive for Business, and Teams.

If the following parameters are left undefined, only the latest version of a file or attachment is retrieved.

Version Filters | Required | Description
versionRetrievalEnabled | No | Set to true to retrieve all versions of an item. Set to false (default) to retrieve only the latest version of an item.
versionRetrievalLimit | No | Specify an integer value to limit the number of retrieved versions to a specific number of recent versions. Requires versionRetrievalEnabled to be set to true. A value of -1 will retrieve all available versions.
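
How the two settings interact can be modeled as a small function. This is a sketch of the documented semantics only: versions_retrieved and its argument names are hypothetical, the enabled and limit arguments stand in for the two parameters above, and rejecting a limit of zero or less (other than -1) is an assumption.

```python
def versions_retrieved(available, enabled=False, limit=None):
    """Return how many versions of an item would be ingested.

    available: number of versions that exist at the source (at least 1).
    enabled:   stands in for the version retrieval flag.
    limit:     stands in for the version limit; -1 or None means no limit.
    """
    if available < 1:
        raise ValueError("an item has at least one version")
    if not enabled:
        if limit is not None:
            raise ValueError("a limit requires version retrieval to be enabled")
        return 1  # only the latest version
    if limit is None or limit == -1:
        return available
    if limit < 1:
        raise ValueError("limit must be a positive integer or -1")
    return min(available, limit)


print(versions_retrieved(5), versions_retrieved(5, True), versions_retrieved(5, True, 3))
```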

Version Metadata

When item versions are included in a case, the following version related metadata is applied to all applicable items:

Metadata | Value | Description
Flag: versioned | boolean | Indicates that multiple versions of the selected item exist and have been ingested.
Flag: current_version | boolean | Indicates if an item is the most recent version. Determined at the time of ingestion.
Property: Version Age | integer | Identifies the version of the selected item in relation to the latest known version. A value of 0 represents the latest version. Each increment of 1 indicates a subsequent older version. Warning: If the data within a case is updated by reloading the source data or scanning for new child items, the Version Age of an item will change if newer versions of that item are found.
Property: Version Group ID | string | A unique identifier that is used to link all versioned items that originate from the same item. The Version Group ID value is derived from the original item’s ID value.
Property: Version Value | integer | Identifies the exact version of the selected item based on source data. Note: Microsoft 365 uses a basic sequential numbering method (1.0, 2.0, 3.0) when assigning version values. Other data sources may assign non-readable values, such as a hash, when defining this value.
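
The relationship between these values, and the reason Version Age shifts after a reload, can be shown with a short model. This is an illustrative sketch, not how the product computes metadata; version_metadata is a hypothetical function and the version values are sample data ordered oldest to newest.

```python
def version_metadata(version_values):
    """Derive the documented flags and Version Age for one item's versions.

    version_values must be ordered from oldest to newest.
    """
    newest_index = len(version_values) - 1
    return [
        {
            "Version Value": value,
            "Version Age": newest_index - index,  # 0 is the latest version
            "current_version": index == newest_index,
            "versioned": len(version_values) > 1,
        }
        for index, value in enumerate(version_values)
    ]


before = version_metadata(["1.0", "2.0", "3.0"])
# A reload finds a newer version, so every existing version ages by one.
after = version_metadata(["1.0", "2.0", "3.0", "4.0"])
print(before[2]["Version Age"], after[2]["Version Age"])
```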

Querying Items Based on Version Metadata

The following table contains example queries that can be used after ingestion to locate versioned items based on their metadata:

Query | Returns…
flag:versioned | All items that have multiple versions.
flag:(versioned AND current_version) | The latest version of each item that has multiple versions.
flag:(versioned AND NOT current_version) | All versions of an item except for the latest version.
integer-properties:"Version Age":(0 to 10) | The most recent version (indicated by 0) and the previous 10 versions.
properties:"Version Value:2.0" | All items that have a version value that matches the value specified in the query (2.0).
properties:"Version Group ID:<GROUPIDGUID>" | All versions of an item, including the original, that match the group ID specified in the query.
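
When these queries are generated by a script, small helpers keep the quoting consistent. The functions below are hypothetical and only reproduce the query shapes shown in the table; the group ID in the example is a placeholder.

```python
def _check(value):
    """Reject values that would break out of the quoted part of a query."""
    text = str(value)
    if '"' in text:
        raise ValueError("value must not contain a double quote")
    return text


def version_value_query(value):
    return 'properties:"Version Value:{}"'.format(_check(value))


def version_group_query(group_id):
    return 'properties:"Version Group ID:{}"'.format(_check(group_id))


def version_age_query(oldest_age):
    # Matches ages 0 (latest) through oldest_age, as in the table's example.
    return 'integer-properties:"Version Age":(0 to {})'.format(int(oldest_age))


print(version_value_query("2.0"))
print(version_group_query("<GROUPIDGUID>"))
print(version_age_query(10))
```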