Package com.nuix.superutilities.regex
Class RegexScanner
java.lang.Object
    com.nuix.superutilities.regex.RegexScanner

public class RegexScanner extends java.lang.Object
Class for scanning a series of items with a series of regular expressions.

Constructor Summary
Constructor    Description
RegexScanner()

Method Summary
Modifier and Type    Method
    Description

void    abortScan()
    When running a scan by providing a Consumer callback, this will signal that further scanning should be aborted.
void    addPattern(java.lang.String title, java.lang.String expression)
    Adds a regular expression to be part of the scan with a given title.
protected void    fireProgressUpdated(int value)
    Fires progress update if there is a callback listening.
protected void    fireScanError(RegexScanError error)
    Fires error event if there is a callback listening.
boolean    getCaptureContextualText()
boolean    getCaseSensitive()
int    getContextSize()
static java.lang.String    getContextualSubString(java.lang.CharSequence textSequence, int matchStart, int matchEnd, int contextSize)
java.util.List<java.lang.String>    getCustomMetadataToScan()
boolean    getDotall()
boolean    getMatchNamedEntityValues()
boolean    getMultiline()
java.util.Set<java.lang.String>    getNamedEntityTypes()
java.util.List<PatternInfo>    getPatterns()
java.util.List<java.lang.String>    getPropertiesToScan()
boolean    getScanContent()
boolean    getScanCustomMetadata()
boolean    getScanProperties()
static java.util.Map<java.lang.String,java.lang.String>    getStringCustomMetadata(nuix.Item item, java.util.Set<java.lang.String> specificFields)
    Convenience method for converting the custom metadata fields of an item into a Map<String,String> so that regular expressions may be run against them.
static java.util.Map<java.lang.String,java.lang.String>    getStringProperties(nuix.Item item, java.util.Set<java.lang.String> specificProperties)
    Convenience method for converting the metadata properties of an item into a Map<String,String> so that regular expressions may be run against them.
protected ItemRegexMatchCollection    scanItem(nuix.Item item)
    Scans a single item.
java.util.List<ItemRegexMatchCollection>    scanItems(java.util.Collection<nuix.Item> items)
    Scans a series of items serially (no concurrency).
void    scanItems(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback)
    Scans a series of items, providing each item's matches to callback as they are obtained.
void    scanItemsParallel(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback)
    Scans a series of items, providing each item's matches to callback as they are obtained.
void    scanItemsParallel(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback, int concurrency)
    Scans a series of items, providing each item's matches to callback as they are obtained.
void    setCaptureContextualText(boolean captureContextualText)
void    setCaseSensitive(boolean caseSensitive)
void    setContextSize(int contextSize)
void    setCustomMetadataToScan(java.util.List<java.lang.String> fieldsToScan)
void    setDotall(boolean dotall)
void    setMatchNamedEntityValues(boolean matchNamedEntityValues)
static void    setMaxToStringLength(int maxLength)
    Configures the character count threshold up to which the CharSequence TextObject of an item, obtained from the API, is first converted to a String object before being scanned for regular expression matches.
void    setMultiline(boolean multiline)
void    setNamedEntityTypes(java.util.Collection<java.lang.String> namedEntityTypes)
void    setPatterns(java.util.List<PatternInfo> patterns)
void    setPropertiesToScan(java.util.List<java.lang.String> propertiesToScan)
void    setScanContent(boolean scanContent)
void    setScanCustomMetadata(boolean scanCustomMetadata)
void    setScanProperties(boolean scanProperties)
void    whenErrorOccurs(java.util.function.Consumer<RegexScanError> errorCallback)
    Allows you to provide a callback which will be invoked when an error occurs during scanning.
void    whenProgressUpdated(java.util.function.Consumer<java.lang.Integer> callback)
    Allows you to provide a callback which will be invoked when progress updates occur.
 
Method Detail

setMaxToStringLength
public static void setMaxToStringLength(int maxLength)
Configures the character count threshold up to which the CharSequence TextObject of an item, obtained from the API, is first converted to a String object before being scanned for regular expression matches. Scanning the CharSequence directly may use less memory but perform slower, while scanning the value as a String may perform faster but use more memory.
Parameters:
    maxLength - Maximum text length that should be converted to a String before scanning
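
The trade-off described above can be illustrated with a minimal sketch. It is not the RegexScanner source: the class MaxToStringSketch, its field and both helper methods are hypothetical, and the "at or below the threshold" comparison is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a "convert short text to String first" threshold. */
final class MaxToStringSketch {
    // Stand-in for the value configured through setMaxToStringLength (name assumed).
    static int maxToStringLength = 1024;

    /** Returns a String copy for text within the threshold, the original sequence otherwise. */
    static CharSequence prepareForScan(CharSequence text) {
        // Assumption: text at or below the threshold is copied into a String.
        return text.length() <= maxToStringLength ? text.toString() : text;
    }

    /** Counts regex matches in the prepared text. */
    static int countMatches(Pattern pattern, CharSequence text) {
        Matcher matcher = pattern.matcher(prepareForScan(text));
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        CharSequence text = new StringBuilder("id 42, id 77");
        System.out.println(countMatches(Pattern.compile("\\d+"), text));
    }
}
```

java.util.regex.Pattern accepts any CharSequence, so either branch yields the same matches; only memory use and speed may differ.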
 
whenProgressUpdated
public void whenProgressUpdated(java.util.function.Consumer<java.lang.Integer> callback)
Allows you to provide a callback which will be invoked when progress updates occur.
Parameters:
    callback - Callback to receive progress updates
 
fireProgressUpdated
protected void fireProgressUpdated(int value)
Fires progress update if there is a callback listening.
Parameters:
    value - The progress value
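
The register-then-fire pattern behind whenProgressUpdated and fireProgressUpdated might look like the sketch below. ProgressSketch and its run method are hypothetical, and treating the progress value as a running item count is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of an optional progress callback. */
final class ProgressSketch {
    private Consumer<Integer> progressCallback; // null until a listener registers

    void whenProgressUpdated(Consumer<Integer> callback) {
        this.progressCallback = callback;
    }

    /** Invokes the callback only if one has been registered. */
    void fireProgressUpdated(int value) {
        if (progressCallback != null) {
            progressCallback.accept(value);
        }
    }

    /** Pretends to process itemCount items, reporting the running count after each. */
    void run(int itemCount) {
        for (int i = 1; i <= itemCount; i++) {
            fireProgressUpdated(i);
        }
    }

    public static void main(String[] args) {
        ProgressSketch sketch = new ProgressSketch();
        sketch.run(2); // no listener registered, so nothing is reported
        List<Integer> seen = new ArrayList<>();
        sketch.whenProgressUpdated(seen::add);
        sketch.run(3);
        System.out.println(seen);
    }
}
```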
 
whenErrorOccurs
public void whenErrorOccurs(java.util.function.Consumer<RegexScanError> errorCallback)
Allows you to provide a callback which will be invoked when an error occurs during scanning.
Parameters:
    errorCallback - The callback to be invoked when errors occur
 
fireScanError
protected void fireScanError(RegexScanError error)
Fires error event if there is a callback listening.
Parameters:
    error - The error which occurred
 
addPattern
public void addPattern(java.lang.String title, java.lang.String expression)
Adds a regular expression to be part of the scan with a given title. Creates a new instance of PatternInfo using the values provided.
Parameters:
    title - The associated title
    expression - The Java regular expression string to add
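
To illustrate pairing a title with an expression, here is a self-contained sketch. TitledPattern is a hypothetical stand-in for PatternInfo, whose real fields and methods are not described on this page, and the scan method is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of scanning text with a list of titled expressions. */
final class TitledPatternSketch {
    /** Stand-in for PatternInfo: a title plus a compiled expression. */
    static final class TitledPattern {
        final String title;
        final Pattern pattern;

        TitledPattern(String title, String expression) {
            this.title = title;
            this.pattern = Pattern.compile(expression);
        }
    }

    private final List<TitledPattern> patterns = new ArrayList<>();

    void addPattern(String title, String expression) {
        patterns.add(new TitledPattern(title, expression));
    }

    /** Returns "title: matched text" for every match of every pattern, in pattern order. */
    List<String> scan(CharSequence text) {
        List<String> hits = new ArrayList<>();
        for (TitledPattern p : patterns) {
            Matcher m = p.pattern.matcher(text);
            while (m.find()) {
                hits.add(p.title + ": " + m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TitledPatternSketch sketch = new TitledPatternSketch();
        sketch.addPattern("Year", "\\b(19|20)\\d{2}\\b");
        sketch.addPattern("Email", "[\\w.]+@[\\w.]+");
        System.out.println(sketch.scan("Sent 2019 by a.b@example.com"));
    }
}
```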
 
scanItems
public java.util.List<ItemRegexMatchCollection> scanItems(java.util.Collection<nuix.Item> items)
Scans a series of items serially (no concurrency).
Parameters:
    items - The items to scan
Returns:
    List of matches
 
scanItems
public void scanItems(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback)
Scans a series of items, providing each item's matches to callback as they are obtained. Items are scanned serially (no concurrency).
Parameters:
    items - The items to scan
    callback - Callback which will receive each item's matches as they are obtained.
 
scanItemsParallel
public void scanItemsParallel(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback)
Scans a series of items, providing each item's matches to callback as they are obtained. Items are scanned in parallel using a Java parallel stream.
Parameters:
    items - The items to scan
    callback - Callback which will receive each item's matches as they are obtained.
 
scanItemsParallel
public void scanItemsParallel(java.util.Collection<nuix.Item> items, java.util.function.Consumer<ItemRegexMatchCollection> callback, int concurrency) throws java.lang.Exception
Scans a series of items, providing each item's matches to callback as they are obtained. Items are scanned in parallel using a Java parallel stream. This differs from the method scanItemsParallel(Collection, Consumer) in that this method invokes the parallel stream within a thread pool to allow for controlling how many threads are used.
Parameters:
    items - The items to scan
    callback - Callback which will receive each item's matches as they are obtained.
    concurrency - Number of threads to create in the worker pool that the parallel stream is invoked in
Throws:
    java.lang.Exception - if there is an error
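
Invoking a parallel stream inside a dedicated pool is commonly done with a ForkJoinPool, as in the sketch below. This is an assumption about the general technique, not a copy of the RegexScanner code; BoundedParallelSketch and forEachParallel are hypothetical names. Note that a parallel stream running its work in the pool that submitted it is observed JDK behaviour rather than a documented guarantee.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;

/** Hypothetical sketch of a parallel stream bounded by a dedicated ForkJoinPool. */
final class BoundedParallelSketch {
    static <T> void forEachParallel(Collection<T> items, Consumer<T> callback, int concurrency)
            throws Exception {
        ForkJoinPool pool = new ForkJoinPool(concurrency);
        try {
            // Submitting the stream as a pool task makes its work run on this pool's
            // threads instead of the shared common pool; get() waits for completion.
            pool.submit(() -> items.parallelStream().forEach(callback)).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ConcurrentLinkedQueue<Integer> seen = new ConcurrentLinkedQueue<>();
        forEachParallel(Arrays.asList(1, 2, 3, 4, 5), seen::add, 2);
        System.out.println(seen.size());
    }
}
```

Because the callback may be invoked from several threads at once, it should only touch thread-safe state, which is why the usage above collects into a ConcurrentLinkedQueue.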
 
scanItem
protected ItemRegexMatchCollection scanItem(nuix.Item item)
Scans a single item.
Parameters:
    item - The item to be scanned
Returns:
    The matches for that item
 
getStringProperties
public static java.util.Map<java.lang.String,java.lang.String> getStringProperties(nuix.Item item, java.util.Set<java.lang.String> specificProperties)
Convenience method for converting the metadata properties of an item into a Map<String,String> so that regular expressions may be run against them.
Parameters:
    item - The item from which metadata properties will be pulled
    specificProperties - Set of specific properties to be pulled. If null is provided, all properties will be pulled.
Returns:
    Map of "stringified" metadata properties for the specified item
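
The "stringify with an optional filter" idea can be sketched on a plain map. StringifySketch is hypothetical; the use of String.valueOf and the skipping of null values are assumptions, and the real method may format values such as dates differently.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of converting a property map to strings with an optional key filter. */
final class StringifySketch {
    /** A null specificKeys means "keep every key", mirroring the documented null behaviour. */
    static Map<String, String> stringify(Map<String, Object> values, Set<String> specificKeys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : values.entrySet()) {
            boolean wanted = specificKeys == null || specificKeys.contains(entry.getKey());
            // Assumption: null values are skipped rather than stored as the text "null".
            if (wanted && entry.getValue() != null) {
                result.put(entry.getKey(), String.valueOf(entry.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("Pages", 12);
        properties.put("Author", "jdoe");
        System.out.println(stringify(properties, null));
        System.out.println(stringify(properties, Collections.singleton("Author")));
    }
}
```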
 
getStringCustomMetadata
public static java.util.Map<java.lang.String,java.lang.String> getStringCustomMetadata(nuix.Item item, java.util.Set<java.lang.String> specificFields)
Convenience method for converting the custom metadata fields of an item into a Map<String,String> so that regular expressions may be run against them.
Parameters:
    item - The item from which custom metadata fields will be pulled
    specificFields - Set of specific custom metadata fields to be pulled. If null is provided, all fields will be pulled.
Returns:
    Map of "stringified" custom metadata fields for the specified item
 
getContextualSubString
public static java.lang.String getContextualSubString(java.lang.CharSequence textSequence, int matchStart, int matchEnd, int contextSize)
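
This method carries no description on this page. Judging only from its parameter names, it plausibly returns the match together with up to contextSize characters on either side. The sketch below shows that reading; ContextSketch is hypothetical and the clamping behaviour is an assumption, not confirmed by the source.

```java
/** Hypothetical sketch of extracting a match plus surrounding context. */
final class ContextSketch {
    static String contextualSubString(CharSequence text, int matchStart, int matchEnd, int contextSize) {
        // Assumption: the window is clamped to the bounds of the text.
        int start = Math.max(0, matchStart - contextSize);
        int end = Math.min(text.length(), matchEnd + contextSize);
        return text.subSequence(start, end).toString();
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps";
        // "brown" occupies offsets 10 (inclusive) to 15 (exclusive).
        System.out.println(contextualSubString(text, 10, 15, 6));
    }
}
```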

getScanProperties
public boolean getScanProperties()

setScanProperties
public void setScanProperties(boolean scanProperties)

getScanCustomMetadata
public boolean getScanCustomMetadata()

setScanCustomMetadata
public void setScanCustomMetadata(boolean scanCustomMetadata)

getScanContent
public boolean getScanContent()

setScanContent
public void setScanContent(boolean scanContent)

getCaseSensitive
public boolean getCaseSensitive()

setCaseSensitive
public void setCaseSensitive(boolean caseSensitive)

getMultiline
public boolean getMultiline()

setMultiline
public void setMultiline(boolean multiline)

getDotall
public boolean getDotall()

setDotall
public void setDotall(boolean dotall)
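
The caseSensitive, multiline and dotall settings are undocumented here, but their names match standard java.util.regex.Pattern flags. The sketch below shows how such booleans could be combined into a flag value; FlagSketch and toFlags are hypothetical, and the mapping is an assumption about what these settings control.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of mapping three boolean settings to java.util.regex flags. */
final class FlagSketch {
    static int toFlags(boolean caseSensitive, boolean multiline, boolean dotall) {
        int flags = 0;
        if (!caseSensitive) {
            flags |= Pattern.CASE_INSENSITIVE; // "A" and "a" match each other
        }
        if (multiline) {
            flags |= Pattern.MULTILINE; // ^ and $ also match at line boundaries
        }
        if (dotall) {
            flags |= Pattern.DOTALL; // . also matches line terminators
        }
        return flags;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("^b.c$", toFlags(false, true, false));
        System.out.println(pattern.matcher("a\nBxC\nd").find());
    }
}
```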

getCaptureContextualText
public boolean getCaptureContextualText()

setCaptureContextualText
public void setCaptureContextualText(boolean captureContextualText)

getContextSize
public int getContextSize()

setContextSize
public void setContextSize(int contextSize)

getPatterns
public java.util.List<PatternInfo> getPatterns()

setPatterns
public void setPatterns(java.util.List<PatternInfo> patterns)

getPropertiesToScan
public java.util.List<java.lang.String> getPropertiesToScan()

setPropertiesToScan
public void setPropertiesToScan(java.util.List<java.lang.String> propertiesToScan)

getCustomMetadataToScan
public java.util.List<java.lang.String> getCustomMetadataToScan()

setCustomMetadataToScan
public void setCustomMetadataToScan(java.util.List<java.lang.String> fieldsToScan)

getMatchNamedEntityValues
public boolean getMatchNamedEntityValues()

setMatchNamedEntityValues
public void setMatchNamedEntityValues(boolean matchNamedEntityValues)

getNamedEntityTypes
public java.util.Set<java.lang.String> getNamedEntityTypes()

setNamedEntityTypes
public void setNamedEntityTypes(java.util.Collection<java.lang.String> namedEntityTypes)

abortScan
public void abortScan()
When running a scan by providing a Consumer callback, this will signal that further scanning should be aborted.
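
A cooperative abort signal of this kind is often built on a shared flag that the scan loop checks between items. The sketch below shows that pattern on plain strings; AbortSketch is hypothetical, and resetting the flag at the start of each scan is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical sketch of aborting a callback-driven scan from inside the callback. */
final class AbortSketch {
    private final AtomicBoolean abortRequested = new AtomicBoolean(false);

    void abortScan() {
        abortRequested.set(true);
    }

    void scan(List<String> items, Consumer<String> callback) {
        abortRequested.set(false); // assumption: each scan starts with a cleared flag
        for (String item : items) {
            if (abortRequested.get()) {
                break; // stop before handing out any further items
            }
            callback.accept(item);
        }
    }

    public static void main(String[] args) {
        AbortSketch sketch = new AbortSketch();
        List<String> seen = new ArrayList<>();
        sketch.scan(Arrays.asList("a", "b", "c", "d"), item -> {
            seen.add(item);
            if (item.equals("b")) {
                sketch.abortScan();
            }
        });
        System.out.println(seen);
    }
}
```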
 