How to Perform Content Moderation Over Audio Files with Python


The AssemblyAI API offers a content moderation feature that allows you to perform Content Safety Detection. In the next few sections, we will be using it in a step-by-step tutorial to detect whether a sensitive topic was mentioned in the input file and, if so, when and what was spoken. A few of the topics the API can capture include accidents, disasters, company financials, gambling and hate speech (there are many more than that, so if you are looking for other topics, make sure to check the documentation, which contains a detailed list of the topics covered).

Now let’s get started with an audio file over which we are going to perform speech-to-text while also enabling the content moderation feature, in order to detect whether the audio file mentions any of the sensitive topics supported by the API. Note that if you want to follow this step-by-step guide, you will need an API access token from AssemblyAI, which is absolutely free.

Now that we have the access token for the API, let’s do some required imports and prepare the headers for the requests that we will be sending to the AssemblyAI endpoints.

Importing requests library and preparing the headers for the POST request — Source: Author
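A minimal sketch of this step could look like the following; the placeholder token string is an assumption you should replace with your own key, while the header fields match what AssemblyAI's v2 REST API expects:

```python
import requests

# Replace with your own AssemblyAI API token (placeholder value)
API_TOKEN = "your-api-token-here"

# Headers sent with every request: the token for authentication
# and the content type for JSON request bodies
headers = {
    "authorization": API_TOKEN,
    "content-type": "application/json",
}
```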

Then we need to read in the audio file over which we want to perform speech recognition and content moderation, and upload it to AssemblyAI’s hosting service, which in turn will return the URL that we’ll use in subsequent requests.

Upload the input file to the hosting service of AssemblyAI — Source: Author
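One way to sketch the upload step is shown below. The helper streams the file in chunks so large audio files are not loaded into memory at once; the function and variable names are assumptions, while the upload endpoint is AssemblyAI's documented v2 URL:

```python
import requests

UPLOAD_ENDPOINT = "https://api.assemblyai.com/v2/upload"

def read_audio_file(filename, chunk_size=5_242_880):
    """Yield the audio file in ~5 MB chunks instead of reading it all at once."""
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

def upload_file(filename, api_token):
    """Upload the local audio file and return the hosted upload_url."""
    headers = {"authorization": api_token}
    response = requests.post(
        UPLOAD_ENDPOINT,
        headers=headers,
        data=read_audio_file(filename),
    )
    return response.json()["upload_url"]
```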

Now that we have uploaded the audio file to the hosting service and received back the upload_url from the endpoint, we can proceed and perform the actual speech-to-text task, combining it with sensitive content detection. Note that in order to enable the content moderation feature, we also need to set the content_safety parameter to True.

Performing speech-to-text and sensitive content detection — Source: Author
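A sketch of the transcription request, assuming the upload_url returned by the previous step; passing content_safety as True in the JSON body is what enables content moderation alongside the transcription:

```python
import requests

TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"

def request_transcript(upload_url, api_token):
    """Submit the hosted audio for transcription with content moderation enabled."""
    headers = {
        "authorization": api_token,
        "content-type": "application/json",
    }
    payload = {
        "audio_url": upload_url,
        "content_safety": True,  # enables Content Safety Detection
    }
    response = requests.post(TRANSCRIPT_ENDPOINT, headers=headers, json=payload)
    # The JSON response includes an "id" field that we use for polling below
    return response.json()
```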

The response (shown in the comments above) will include the transcript ID for the transcription that has been made. The final step involves a GET request using that id. Note that we keep sending GET requests until the status of the response is marked as completed (or error):
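The polling loop could be sketched as follows; the function name and the polling interval are assumptions, while the per-transcript endpoint path follows AssemblyAI's v2 API:

```python
import time
import requests

def poll_transcript(transcript_id, api_token, interval_seconds=5):
    """Poll the transcript endpoint until the job is completed (or errors out)."""
    endpoint = f"https://api.assemblyai.com/v2/transcript/{transcript_id}"
    headers = {"authorization": api_token}
    while True:
        response = requests.get(endpoint, headers=headers).json()
        if response["status"] in ("completed", "error"):
            return response
        time.sleep(interval_seconds)  # wait before asking again
```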

Finally, we write the response from the transcription endpoint to a file:

Writing the response from transcription endpoint to a file — Source: Author
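A minimal way to sketch this last step, assuming the response is the JSON dictionary returned by the polling loop and the output filename is a placeholder:

```python
import json

def save_transcript(response, filename="transcript.json"):
    """Write the full JSON response (transcript plus content safety labels) to disk."""
    with open(filename, "w") as f:
        json.dump(response, f, indent=2)
```

The saved file will contain, among other fields, the transcript text and the content safety labels with timestamps for each flagged segment.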
