The Collector should consider transcribing collected audio data that includes a human voice in order to create a written record that faithfully reflects the content of the audio data. Collected audio data that includes a human or generated voice may also need to be translated, e.g., when it includes a voice speaking in a language other than that spoken by the personnel who will assess the audio data’s relevance. The Collector may also consider translating the audio into the official language(s) of the intended recipient(s) of the audio data, e.g., certain domestic or international courts or tribunals. The Collector should consider taking steps to ensure the translation is thorough, accurate, and impartial.
Transcriptions and translations form a part of the collected audio’s associated metadata (see BP 17). They should therefore be appropriately included in the relevant audio data file (see BP 11). Notably, if the audio has been anonymised or partially deleted—for example, for privacy reasons per BP 5—the same should be done to the audio’s transcription and/or translation.
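One way to keep a transcription or translation aligned with a redacted recording is to work with timestamped text segments and blank out every segment that overlaps a redacted time span. The following is a minimal sketch under that assumption; the `Segment` type, the `/redacted/` marker, and the function names are hypothetical and are not prescribed by these guidelines.

```python
from dataclasses import dataclass
from typing import List, Tuple

REDACTION_MARKER = "/redacted/"  # hypothetical marker chosen for this sketch


@dataclass(frozen=True)
class Segment:
    """One timestamped piece of a transcription or translation."""
    start: float  # seconds from the start of the audio
    end: float
    text: str


def overlaps(segment: Segment, span: Tuple[float, float]) -> bool:
    """Return True if the segment shares any time with the redacted span."""
    span_start, span_end = span
    return segment.start < span_end and span_start < segment.end


def redact_transcript(
    segments: List[Segment], redacted_spans: List[Tuple[float, float]]
) -> List[Segment]:
    """Return a new transcript in which overlapping segments carry only the marker."""
    result = []
    for seg in segments:
        if any(overlaps(seg, span) for span in redacted_spans):
            result.append(Segment(seg.start, seg.end, REDACTION_MARKER))
        else:
            result.append(seg)
    return result


# Illustrative data only: the audio between 5.0 s and 8.0 s has been redacted.
transcript = [
    Segment(0.0, 4.0, "We left the village at dawn."),
    Segment(4.0, 9.5, "The speaker gives a name and an address."),
    Segment(9.5, 14.0, "The convoy arrived an hour later."),
]
redacted = redact_transcript(transcript, [(5.0, 8.0)])
```

The sketch redacts a whole segment whenever any part of it overlaps a redacted span. This errs on the side of privacy, at the cost of removing some text that was not itself sensitive. The same redacted spans would be applied to both the transcription and any translation.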
Both transcription and translation may be conducted automatically by appropriate software or manually by personnel. The tools and techniques used should be thoroughly documented. Per BP 8, the Collector should assess how likely its transcription or translation tools or techniques are to generate incomplete or inaccurate data, and whether using them poses any risk to the security of the collection effort or to the privacy or security of the data subject. As part of the collection effort’s risk management approach, particular attention should be paid to any use of cloud-based resources by the tool and to the associated risks (see BP 6).
If the voice-containing audio is not wholly or adequately intelligible, the resulting transcription and translation may indicate which sections are unintelligible by, for example, noting /unintelligible/ in the text. Where there are possible or known vulnerabilities, such as an inaccurate transcription or translation, they must be documented along with any measures taken to ameliorate them.[1]
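The marking and documentation described above can be sketched as follows, assuming each segment of the transcription carries a confidence score between 0 and 1, assigned either by a tool or by a human reviewer. The `ScoredSegment` type, the `confidence` field, and the 0.5 threshold are hypothetical choices for illustration; a real collection effort would set and document its own.

```python
from dataclasses import dataclass
from typing import List, Tuple

UNINTELLIGIBLE = "/unintelligible/"


@dataclass(frozen=True)
class ScoredSegment:
    start: float       # seconds from the start of the audio
    end: float
    text: str
    confidence: float  # 0.0 to 1.0; hypothetical score from a tool or reviewer


def render_transcript(
    segments: List[ScoredSegment], min_confidence: float = 0.5
) -> Tuple[str, List[str]]:
    """Return the transcript text and a list of documented issues.

    Segments that are empty or fall below min_confidence are replaced by the
    /unintelligible/ marker, and one issue line is recorded for each of them.
    """
    parts = []
    issues = []
    for seg in segments:
        if seg.confidence < min_confidence or not seg.text.strip():
            parts.append(UNINTELLIGIBLE)
            issues.append(
                f"{seg.start:.1f}-{seg.end:.1f}s: marked unintelligible "
                f"(confidence {seg.confidence:.2f})"
            )
        else:
            parts.append(seg.text.strip())
    return " ".join(parts), issues


# Illustrative data only.
scored = [
    ScoredSegment(0.0, 3.2, "They crossed the bridge", 0.93),
    ScoredSegment(3.2, 5.0, "at mumble", 0.21),
    ScoredSegment(5.0, 8.4, "before the shelling started.", 0.88),
]
text, issues = render_transcript(scored)
```

Here `text` reads "They crossed the bridge /unintelligible/ before the shelling started." and `issues` holds one entry for the span from 3.2 to 5.0 seconds, which can be kept with the rest of the collection documentation.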
Tech Specs & Resources
Certain translation and transcription tools (e.g., Whisper, Otter.ai) can automatically produce a preliminary estimate of whether the audio data is likely to contain a human voice. They may also be used to detect the primary language spoken, transcribe the audio in its original language, and/or translate the audio.
The tool(s) used should support the range of languages likely to be spoken in the audio data collected, as well as the language(s) to which audio would have to be translated.
Note: The use of such a translation and/or transcription tool may involve the transfer of the audio outside of the Collector’s possession and control, which may pose a risk to the privacy and/or security of the collection effort.
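As an illustration of how such preliminary estimates might be read, the sketch below summarises a transcription result shaped like the dictionary returned by the open-source Whisper package: a top-level `language` code and a list of `segments`, each with `start`, `end`, and `no_speech_prob`. Those field names should be checked against the tool and version actually used. The function name, the 0.6 threshold, and the sample values are assumptions made for this example. The code works on a plain dictionary and does not call any tool itself.

```python
from typing import Any, Dict, Iterable


def summarise_result(
    result: Dict[str, Any],
    expected_languages: Iterable[str],
    no_speech_threshold: float = 0.6,
) -> Dict[str, Any]:
    """Summarise detected language and a rough share of voiced audio.

    A segment counts as voiced when its no_speech_prob is below the threshold.
    The share is computed over the segments the tool returned, so it is only a
    preliminary indication and not a measurement of the whole recording.
    """
    segments = result.get("segments", [])
    total = sum(s["end"] - s["start"] for s in segments)
    voiced = sum(
        s["end"] - s["start"]
        for s in segments
        if s["no_speech_prob"] < no_speech_threshold
    )
    language = result.get("language")
    return {
        "language": language,
        "language_expected": language in set(expected_languages),
        "speech_fraction": voiced / total if total > 0 else 0.0,
    }


# Illustrative result only; not the output of a real recording.
sample_result = {
    "language": "fr",
    "segments": [
        {"start": 0.0, "end": 4.0, "no_speech_prob": 0.02},
        {"start": 4.0, "end": 6.0, "no_speech_prob": 0.91},
        {"start": 6.0, "end": 10.0, "no_speech_prob": 0.05},
    ],
}
summary = summarise_result(sample_result, expected_languages=["fr", "ar"])
```

A detected language outside the expected set, or a low `speech_fraction`, is a prompt for human review and not a conclusion about the audio.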
Legal Framework
See section 5.2. on the importance of translations and transcriptions when establishing the relevance of potential evidence.
Applicable Ethical Principles
Accuracy, Impartiality, and Objectivity.
Footnotes
1. This is an extension of the requirement stated in BP 3 that Collector ‘personnel must at all times strive to document the collection effort in a manner that is as consistent, clear, and transparent as possible’: Prosecutor v Ongwen (ICC), Trial Judgment, para. 658; Prosecutor v Ongwen (ICC), Confirmation of Charges, para. 51; Prosecutor v Ongwen (ICC), Transcript, para. 44, lines 8-24; Prosecutor v Tolimir (ICTY), Judgment, para. 64, referring to Prosecutor v Tolimir (ICTY), Transcript, page 5033; Prosecutor v Blagojević and Jokić (ICTY), Decision on Admission of Intercept Materials, para. 21; Prosecutor v Katanga and Chui (ICC), Decision on Bar Table Motion, para. 30. See the discussion in sections 5.2. and 5.3. of the Legal Framework.