Solutions to Empower Analysts

The conventional cycle for gaining intelligence from speech or text entails (a) language experts listening to hundreds of intercepts, (b) language identification, (c) assessing intelligence value through manual, labor-intensive transcription and translation, and (d) finally, intelligence collection and collation. The cycle is not only inefficient, expensive and error-prone; it also introduces unacceptable delays that may render the gathered intelligence of little or no value.

A structured approach to Intelligence gathering would be:-

Take a tour of our ‘technologies bouquet’, which helps analysts process masses of available audio/text data, filter out poor-quality data with no information of interest, quickly and efficiently find information of significance, and act on it in time.

Voice Filtering and Support

Language-, accent- and channel-independent solutions that filter out up to 40 percent of audio that is of poor quality or contains no voice.

Speech Quality Estimation

Measures the quality of speech

Voice Activity Detection

Detects the audio part containing voice

Speaker Diarization

Separates multiple speakers in an audio recording automatically.

Speaker Age Estimation

Estimates the speaker’s age group

Voice Biometrics

Language-, accent- and channel-independent solutions that facilitate accurate identification and search functions.

Powered by state-of-the-art deep neural networks (DNN), the solutions work by comparing the unique characteristics of a human voice (a voiceprint).

Language Identification

Gender Identification

Speaker Identification

Voice Analytics

Cutting-edge, language-dependent solutions for advanced analysis.

Quickly gain actionable insights into audio in the languages for which modules are installed.

Keyword Spotting

Phoneme Recognition

Time Analysis Extraction

Waveform Denoiser

Voice Forensics

A simple, light and portable tool customized for law enforcement and forensic experts.
The tool provides highly accurate speaker identification for criminal investigations and generates a customized analysis report for presentation to courts or forensic experts.

Salient Features:

Input requirements:

Outputs:

Speech Transcription

Fast and accurate transcription over a wide range of languages
Highly accurate technology that can be used for:-

Dissemination/Archive in print:

Preparing a synopsis and 100% accurate annotations of the recordings of interest is very time-consuming. Recordings are often in a foreign language, demanding a language expert, and the accuracy of automatic transcription depends on audio quality. With our solution:-
  • The language of a recording is detected automatically and the speech is transcribed accordingly.
  • Annotations are generated with a confidence level for each word.
  • Words with a low confidence level are highlighted so that an operator can correct them manually.

Pre-processing of audio for translation or any other text-domain analytics of audio intercepts

To meet the requirement for rapid translation of audio from a foreign (source) language into the language of interest (target), we offer solutions that:-

Text Translation

Accurate text translation solutions that work on over 110 languages.
Solutions can be tailored to meet varying user requirements and cover:-
  • Manual Text input.
  • Text & Video Files. 
  • Websites.
  • Automatic Speech Recognition (ASR) systems to translate Voice Data
  • Optical Character Recognition (OCR) systems for image data translation
  • Language identification
  • Domain and topic classification
  • Named-entity recognition.

Voice Filtering and Supporting Technologies.

These technologies assist speech processing by filtering out up to 40% of audio that is of poor quality or contains no voice. They are independent of language, accent and channel.

Speech Quality Estimation

Measures the quality of speech

Voice Activity Detection

Detects the audio part containing voice

Speaker Diarization

Separates multiple speakers in an audio recording automatically.

Speaker Age Estimation

Estimates the speaker’s age group

Voice Biometrics Technologies

Powered by state-of-the-art deep neural networks (DNN), these technologies provide accurate identification and search functions. They are independent of language, accent and channel, and work by comparing the unique characteristics of a human voice (a voiceprint).

Language Identification
  • Detect the spoken language and dialect automatically.
  • Filter the audio for further processing by language-dependent analytics technologies.
  • Well over 70 pre-trained language models provided.
  • Users can easily train the tool for any language of interest, or improve and customize the pre-trained models supplied.
Gender Identification
  • Pre-filter audio files by identifying the speaker’s gender (male/female).
Speaker Identification
  • Search for and recognize a speaker automatically based on the uniqueness of their voice.
  • Recognition by voiceprint comparison against a database of suspects.
  • Suspects’ database can be built and improved dynamically.
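
The voiceprint comparison described above can be illustrated with a minimal Python sketch. It assumes each voiceprint is already available as a fixed-length numeric vector and uses cosine similarity as the comparison measure; the vectors, the threshold of 0.8 and the names `cosine_similarity` and `identify_speaker` are hypothetical and chosen for illustration only, and do not describe the product's actual DNN-based method.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length voiceprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify_speaker(probe, database, threshold=0.8):
    """1:N search: return (name, score) of the best match, or (None, score)
    if even the best match falls below the (assumed) decision threshold."""
    best_name, best_score = None, -1.0
    for name, voiceprint in database.items():
        score = cosine_similarity(probe, voiceprint)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Toy suspects' database (made-up 3-dimensional voiceprints).
suspects = {
    "suspect_a": [0.9, 0.1, 0.3],
    "suspect_b": [0.1, 0.8, 0.5],
}
# The database can be extended at any time, mirroring the dynamic build-up above.
suspects["suspect_c"] = [0.2, 0.2, 0.9]

match, score = identify_speaker([0.85, 0.15, 0.35], suspects)
print(match, round(score, 3))
```

Returning no match below the threshold keeps an unknown speaker from being forced onto the closest database entry.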

Voice Analytics Technologies

Speech content filtered by the supporting and biometrics modules can then be processed by this set of cutting-edge speech technologies for advanced analysis, yielding actionable insights quickly. The technologies are language dependent and rely on the specific language modules installed.

Keyword Spotting
  • Automatically detect specified keywords in speech and discover related audio content.
  • Combines the power of DNN with acoustic-based algorithms to automatically generate pronunciations of the specified keyword.
  • Pronunciations (phonemes) form the basis of the search.
  • Provision to add pronunciation variants for each keyword or phrase.
  • Over 20 Languages supported.
Phoneme Recognition
  • Converts (transcribes) speech recordings and standard orthography into phoneme symbols for further use by the Keyword Spotting module.
  • Lets users correct possible mistakes to further improve the phonemes generated.
Time Analysis Extraction
  • Applicable to 2-channel recordings.
  • Extracts information about conversation flow in an audio recording.
  • Identifies reaction times, cross talk and speaker responses in the two channels.
Waveform Denoiser
  • Removes noise in audio automatically and improves the audibility of speech for analysts.
  • Focused on better audibility to the human ear.
  • Trained on various kinds of noise.
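
The conversation-flow measures named under Time Analysis Extraction (cross talk and reaction times) can be sketched in Python as follows. The sketch assumes that speech segments for each of the two channels are already available as (start, end) times in seconds, for example from a voice activity detector; the segment values and the function names `cross_talk` and `reaction_times` are hypothetical.

```python
def cross_talk(channel_a, channel_b):
    """Return the time spans in which both channels contain speech."""
    overlaps = []
    for a_start, a_end in channel_a:
        for b_start, b_end in channel_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # the two segments genuinely overlap
                overlaps.append((start, end))
    return overlaps

def reaction_times(speaker, responder):
    """For each speaker segment, the gap until the responder next starts
    speaking after that segment has ended (segments without a reply are skipped)."""
    gaps = []
    for _, s_end in speaker:
        later_starts = [r_start for r_start, _ in responder if r_start >= s_end]
        if later_starts:
            gaps.append(round(min(later_starts) - s_end, 3))
    return gaps

# Made-up speech segments (seconds) for the two channels of one call.
channel_a = [(0.0, 2.0), (5.0, 7.0)]
channel_b = [(2.5, 4.0), (6.5, 8.0)]

print(cross_talk(channel_a, channel_b))      # spans where both parties talk
print(reaction_times(channel_a, channel_b))  # how quickly B answers A
print(reaction_times(channel_b, channel_a))  # how quickly A answers B
```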

Voice Forensics Technology

Tools customized for law enforcement and forensic experts, including a simple, light and portable tool that provides highly accurate speaker identification for criminal investigations and generates a customized analysis report for presentation to courts or forensic experts.

Salient Features
  • Independent of language, accent, text and channel
  • 1:1 speaker comparison.
  • 1:N speaker identification for more complex cases.
  • Diarization for ease of working with audio recordings containing multiple speakers.
  • Search/visualization of the same phoneme sequences across audio files through a phoneme recognizer.
  • Measures accuracy on a user’s own data sets for evaluation purposes.
  • Enables waveform editing with tools such as a spectrum panel and voice activity detection.
  • Compatible with the widest possible range of audio sources: GSM/CDMA, 3G, VoIP, landlines, etc.
  • Input requirements:-
    • Signal formats: WAV or RAW (8- or 16-bit linear PCM, A-law or Mu-law), sampled at 8 kHz or higher
    • Minimum speech signal duration for enrollment: 20 seconds
    • Minimum speech signal duration for identification: 3 seconds
  • Output
    • Scoring to a likelihood ratio (LR), log-likelihood ratio (LLR) and verbal presentation of results
    • Graphic presentation of the likelihood ratio (LR)
    • Detailed report output (expert opinion template automatically generated) for presentation of results (to a court or an investigation team)
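
The relationship between the likelihood ratio (LR), the log-likelihood ratio (LLR) and a verbal presentation of results can be sketched in Python. The sketch assumes that comparison scores for same-speaker and different-speaker pairs each follow a Gaussian distribution, that the LLR is taken to base 10, and that the verbal scale uses the cut-offs shown; the distribution parameters, cut-offs and function names are hypothetical and are not taken from the tool itself.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood_ratio(score, same_mean, same_std, diff_mean, diff_std):
    """LR = p(score | same speaker) / p(score | different speakers)."""
    return gaussian_pdf(score, same_mean, same_std) / gaussian_pdf(score, diff_mean, diff_std)

def verbal_statement(llr):
    """Map a base-10 LLR onto an illustrative (assumed) verbal scale."""
    strength = abs(llr)
    if strength < 1:
        label = "weak"
    elif strength < 2:
        label = "moderate"
    elif strength < 4:
        label = "strong"
    else:
        label = "very strong"
    side = "same-speaker" if llr > 0 else "different-speaker"
    return f"{label} support for the {side} hypothesis"

# Made-up score distributions: same-speaker scores centre on 0.8,
# different-speaker scores on 0.2, both with standard deviation 0.1.
lr = likelihood_ratio(0.8, 0.8, 0.1, 0.2, 0.1)
llr = math.log10(lr)
print(f"LR = {lr:.3g}, LLR = {llr:.2f}: {verbal_statement(llr)}")
```

An LR above 1 (LLR above 0) favours the same-speaker hypothesis, and an LR below 1 favours the different-speaker hypothesis.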

Speech Transcription Technology

Speech transcription can serve either as (a) the culmination of the voice analytics process, for dissemination or archiving in print, or (b) a pre-processing step for translation or any other text-domain analytics of audio intercepts.

STT for Dissemination/Archive in print. Preparing a synopsis and 100% accurate annotations of the recordings of interest is very time-consuming. Recordings are often in a foreign language, demanding a language expert, and the accuracy of automatic transcription depends on audio quality. Use our solution to:-

  • Convert speech into plain text automatically
  • Search for the topic of interest instantly. 
  • Quickly annotate the speech content of call recordings by combining Language Identification and Speech to Text technologies.
    • The language of a recording is detected automatically and the speech is transcribed accordingly.
    • Annotations are generated with a confidence level for each word.
    • Words with a low confidence level are highlighted so that an operator can correct them manually.
  • Create a synopsis proposal, if a natural language processing (NLP) layer is implemented. The proposal and the generated annotations are then either sent directly to the operator responsible for that language or run through an offline translation layer first.
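
The confidence-based highlighting described above can be illustrated with a short Python sketch. It assumes the transcription step returns each word together with a confidence value between 0 and 1; the threshold of 0.6, the bracket marking and the function name `highlight_low_confidence` are hypothetical choices for illustration.

```python
def highlight_low_confidence(words, threshold=0.6):
    """Join transcribed words into text, marking uncertain words as [word?]
    so that an operator can find and correct them."""
    marked = []
    for word, confidence in words:
        if confidence < threshold:
            marked.append(f"[{word}?]")
        else:
            marked.append(word)
    return " ".join(marked)

# Made-up transcription output: (word, confidence) pairs.
transcript = [("meet", 0.97), ("at", 0.97), ("harbour", 0.41), ("tonight", 0.88)]
print(highlight_low_confidence(transcript))
```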

STT as a pre-processing step for translation or any other text-domain analytics of audio intercepts.

To meet the requirement for rapid translation of audio from a foreign (source) language into the language of interest (target), we offer solutions that:-

  • Free users from dependence on cloud-based solutions, making transcription cost-effective while ensuring user data security.
  • Support over 90 languages, with high-quality punctuation.
  • Respond faster due to minimal data exchange latency.
  • Cater to varying operational needs through scalable deployment on a local PC or on an organisation’s intranet or extranet.
  • Integrate easily with user applications or services.

Text Translation Technology

Our text translation solutions, which work on over 110 languages, can be tailored to meet varying user requirements, namely:-

  • Cost vis-à-vis Accuracy requirements.
  • Cost vis-à-vis security expectations, with flexible deployment as an on-premise (behind the organization’s firewall) or cloud-based solution.
  • Scalability from a PC-based solution to a high-end on-premise machine translation server, which may be configured and sized according to the number of users, machines, data, etc.
  • Ease of integration into any Information Retrieval (IR) or communication system to facilitate multilingual Information Retrieval and Document Exploitation (DOCEX).
  • Ease of integration with customers’ OSINT and COMINT platforms using standard APIs and partner connectors.
  • Translation of:-
    • Manual Text input.
    • Text & Video Files. 
    • Websites.
  • Integration with:-
    • Automatic Speech Recognition (ASR) systems to translate Voice Data
    • Optical Character Recognition (OCR) systems for image data translation
  • Additional value-added features such as:-
    • Language identification
    • Domain and topic classification
    • Named-entity recognition.