Use OpenAI's moderation endpoint
Use the moderation endpoint provided by OpenAI to filter out a response if it is classified as belonging to any of the following categories:
- hate
- hate/threatening
- self-harm
- sexual
- sexual/minors
- violence
- violence/graphic
More info in the OpenAI docs: https://platform.openai.com/docs/guides/moderation/overview
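For illustration only, a minimal Python sketch of the filtering described above. It is not the code in this change. The helper names (fetch_moderation, flagged_categories, filter_response) and the placeholder text are hypothetical, and the response shape (a "results" list whose entries carry a "categories" map of booleans) is assumed from the OpenAI moderation guide linked above.

```python
import json
import urllib.request

# Categories listed in this commit message.
BLOCKED_CATEGORIES = {
    "hate", "hate/threatening", "self-harm", "sexual",
    "sexual/minors", "violence", "violence/graphic",
}

# Assumed endpoint URL; check the OpenAI docs before relying on it.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def fetch_moderation(text, api_key):
    """POST text to the moderation endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        MODERATION_URL,
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def flagged_categories(moderation):
    """Return the blocked categories that any result marks as true."""
    hits = set()
    for result in moderation.get("results", []):
        for name, is_hit in result.get("categories", {}).items():
            if is_hit and name in BLOCKED_CATEGORIES:
                hits.add(name)
    return hits


def filter_response(text, moderation, placeholder="[response withheld]"):
    """Hypothetical helper: swap the text for a placeholder when flagged."""
    return placeholder if flagged_categories(moderation) else text
```

A caller would pass the generated response to fetch_moderation and hand the result to filter_response. Keeping the network call separate from the category check lets the check be tested without an API key.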
Bug: T329058