AI Content Moderation
Better Messages offers AI-powered content moderation using OpenAI's moderation API to automatically detect and handle harmful content.
How it works
When enabled, messages are analyzed by OpenAI's moderation API before delivery. If harmful content is detected above the configured confidence threshold, the message is automatically held for moderation review instead of being delivered. Moderators can then review flagged messages in the admin Messages Viewer and approve or reject them. The system can detect multiple categories of harmful content and can also moderate images.
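The threshold check described above can be sketched in a few lines. This is only an illustration, not the plugin's actual code: the function names `flagged_categories` and `decide` are hypothetical, the score dictionary mimics the per-category scores that OpenAI's moderation API returns, and treating a score equal to the threshold as flagged is an assumption.

```python
from typing import Iterable, Mapping


def flagged_categories(
    scores: Mapping[str, float],
    enabled: Iterable[str],
    threshold: float = 0.5,
) -> list[str]:
    """Return the enabled categories whose score reaches the threshold.

    Hypothetical helper: `scores` maps category names to confidence values
    between 0 and 1, in the style of a moderation API response.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    enabled_set = set(enabled)
    return sorted(
        name
        for name, score in scores.items()
        # Assumption: a sub-category such as "hate/threatening" counts
        # toward its parent category "hate".
        if name.split("/", 1)[0] in enabled_set and score >= threshold
    )


def decide(
    scores: Mapping[str, float],
    enabled: Iterable[str],
    threshold: float = 0.5,
) -> str:
    """Return "hold" when any enabled category is flagged, else "deliver"."""
    return "hold" if flagged_categories(scores, enabled, threshold) else "deliver"


example_scores = {"hate": 0.72, "violence": 0.10, "harassment/threatening": 0.55}
print(decide(example_scores, ["hate", "violence"]))  # the "hate" score reaches 0.5
```

Lowering `threshold` makes more messages take the "hold" branch, which matches the trade-off between catching more content and producing more false positives.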
Key capabilities
- Automatic detection of harmful content before message delivery
- Multiple detection categories: hate speech, harassment, sexual content, violence, self-harm, illicit activities
- Configurable confidence threshold (0-1) for flagging sensitivity
- Image moderation in addition to text
- Flagged messages held for moderator review
- Role-based bypass for trusted user roles
- Powered by OpenAI's moderation API
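For readers curious what a moderation request to OpenAI looks like, here is a minimal sketch of a request body for the `POST /v1/moderations` endpoint that combines text with one image. The helper name `build_moderation_payload` is hypothetical, and the model name `omni-moderation-latest` is an assumption; the source does not state which model or request shape Better Messages uses.

```python
import json
from typing import Optional

# OpenAI's moderation endpoint; the request is sent as JSON with a Bearer API key.
MODERATION_ENDPOINT = "https://api.openai.com/v1/moderations"


def build_moderation_payload(
    text: str,
    image_url: Optional[str] = None,
    model: str = "omni-moderation-latest",  # assumption, not confirmed by the plugin docs
) -> dict:
    """Build the JSON body for a moderation request (hypothetical helper)."""
    items = []
    if text:
        items.append({"type": "text", "text": text})
    if image_url:
        # Images are passed by URL (or as a data URL) next to the text.
        items.append({"type": "image_url", "image_url": {"url": image_url}})
    if not items:
        raise ValueError("nothing to moderate: provide text or an image URL")
    return {"model": model, "input": items}


payload = build_moderation_payload("hello there", "https://example.com/photo.png")
print(json.dumps(payload, indent=2))
```

The sketch only builds the payload and sends nothing, so it can be run without an API key.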
How to enable
Navigate to WP Admin → Better Messages → Settings → Moderation, then configure the following options:
- Enable AI Moderation — Turn on AI-powered content moderation
- Action — What happens when content is flagged (hold for review)
- Moderate Images — Also analyze images sent in messages
- Categories — Select which content categories to detect (hate, harassment, sexual, violence, self-harm, illicit)
- Confidence Threshold — How confident the AI must be before flagging, on a scale from 0 to 1 (default 0.5). Lower values catch more content but may produce more false positives.
- Bypass Roles — User roles that are exempt from AI moderation
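The options above can be pictured as a small settings object with a bypass check. The sketch below is illustrative only: `ModerationSettings` and `should_moderate` are hypothetical names rather than plugin APIs, and exempting a user who holds any one of the bypass roles is an assumption.

```python
from dataclasses import dataclass

ALL_CATEGORIES = frozenset(
    {"hate", "harassment", "sexual", "violence", "self-harm", "illicit"}
)


@dataclass(frozen=True)
class ModerationSettings:
    """Hypothetical mirror of the options on the Moderation settings tab."""

    enabled: bool = False
    moderate_images: bool = False
    categories: frozenset = ALL_CATEGORIES
    threshold: float = 0.5  # documented default
    bypass_roles: frozenset = frozenset()

    def __post_init__(self) -> None:
        # The threshold is documented as a value between 0 and 1.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        unknown = set(self.categories) - ALL_CATEGORIES
        if unknown:
            raise ValueError(f"unknown categories: {sorted(unknown)}")


def should_moderate(settings: ModerationSettings, user_roles) -> bool:
    """Return True when a message from this sender should be analyzed."""
    if not settings.enabled:
        return False
    # Assumption: holding any single bypass role exempts the sender.
    return not (settings.bypass_roles & set(user_roles))


settings = ModerationSettings(enabled=True, bypass_roles=frozenset({"administrator"}))
print(should_moderate(settings, ["subscriber"]))
print(should_moderate(settings, ["administrator"]))
```

Validating the threshold and category names when the object is created keeps a bad configuration from silently disabling detection.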
Note: AI Content Moderation requires PHP 8.1 or higher and an OpenAI API key configured in Integrations → OpenAI.