Learn how to configure safeguard warnings so that you are alerted to the use of potentially problematic content.
The safeguard features of Narus create warnings on the Administration Dashboard to alert you whenever a user enters something potentially problematic into an AI model prompt. The feature does not prevent users from sending these prompts; it is designed to allow Admins to monitor usage of the AI models and act accordingly. The alert triggers are configured in the Safeguard screen. To view the prompts that triggered an alert, see the User Logs.
Within the Safeguard screen you can identify words, topics and formats that should generate alerts. You can also set the warning level for each trigger to help you quickly identify the more serious issues.
To get to the Safeguard screen, navigate to the Configuration tab of the Administration Dashboard and select Safeguard.
Topics are a good way to quickly set up alerts. Identifying a specific topic that you want to be alerted to saves you from creating a long list of words that might indicate the topic.
To add a topic, select Add topics.
Enter the name of the topic.
Use the dropdown to choose the warning level assigned to alerts triggered by this topic.
Select Add topic.
The topic is added to the list of warning topics.
Identifying a word means that it will always generate an alert if it is used in a prompt. This can be helpful if you want to be alerted to the use of certain words regardless of the context they are used in.
To add a word, select Add word.
Enter the word.
Use the dropdown to choose the warning level assigned to alerts triggered by this word.
Select Add new word.
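As an illustration only, the following Python sketch shows one way a word trigger could behave: every listed word raises an alert at its assigned warning level, whatever the surrounding context. It is not how Narus is implemented. The trigger words, the level names "high" and "medium", and the function find_word_alerts are all hypothetical, and whole-word, case-insensitive matching is an assumption.

```python
import re

# Hypothetical word triggers mapped to hypothetical warning levels.
WORD_TRIGGERS = {"password": "high", "salary": "medium"}


def find_word_alerts(prompt, triggers=WORD_TRIGGERS):
    """Return (word, level) pairs for each trigger word found in the prompt."""
    alerts = []
    for word, level in triggers.items():
        # Assumption: match whole words only and ignore case.
        pattern = rf"\b{re.escape(word)}\b"
        if re.search(pattern, prompt, flags=re.IGNORECASE):
            alerts.append((word, level))
    return alerts


if __name__ == "__main__":
    print(find_word_alerts("Please reset my Password"))
```

Under these assumptions the word is flagged even in a harmless sentence, which is the trade-off of word triggers compared with topics.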
Alerts can also be generated when a prompt contains data that fits a certain format. This can help to identify whether your users are entering personal data, such as a phone number. You can choose whether to use each of the included standard formats as a trigger. Use the toggles to turn alerts on or off.
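To make the idea of a format trigger concrete, here is a minimal Python sketch that checks a prompt against patterns, each with its own on/off toggle. It is a sketch under stated assumptions, not the patterns Narus uses: the format names, the regular expressions and the function find_format_alerts are hypothetical, and the phone number pattern is deliberately loose.

```python
import re

# Hypothetical format triggers: name -> (toggle state, pattern).
FORMAT_TRIGGERS = {
    # Loose phone pattern: optional "+", then at least nine digits/separators.
    "phone_number": (True, re.compile(r"\+?\d[\d\s().-]{7,}\d")),
    # Toggled off here, so it never raises an alert.
    "email_address": (False, re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
}


def find_format_alerts(prompt, triggers=FORMAT_TRIGGERS):
    """Return the names of enabled formats that appear in the prompt."""
    return [
        name
        for name, (enabled, pattern) in triggers.items()
        if enabled and pattern.search(prompt)
    ]


if __name__ == "__main__":
    print(find_format_alerts("Call me on +44 20 7946 0958"))
```

A format that is toggled off is skipped entirely, so in this sketch a prompt containing an email address produces no alert until that toggle is turned on.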
To remove any of the triggers, select the trash icon next to the trigger.
To change the warning level for any trigger, select the cog icon next to the trigger.
Use the dropdown to select a new warning level.
Select Save.