moderation

Classes:

Name Description
Categories
CategoryScores
Moderation

Categories

Attributes:

Name Type Description
harassment bool

Content that expresses, incites, or promotes harassing language towards any target.

harassment_threatening bool

Harassment content that also includes violence or serious harm towards any target.

hate bool

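Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment.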

hate_threatening bool

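Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.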

self_harm bool

Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.

self_harm_instructions bool

Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts.

self_harm_intent bool

Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders.

sexual bool

Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).

sexual_minors bool

Sexual content that includes an individual who is under 18 years old.

violence bool

Content that depicts death, violence, or physical injury.

violence_graphic bool

Content that depicts death, violence, or physical injury in graphic detail.

harassment instance-attribute

harassment: bool

Content that expresses, incites, or promotes harassing language towards any target.

harassment_threatening class-attribute instance-attribute

harassment_threatening: bool = Field(
    alias="harassment/threatening"
)

Harassment content that also includes violence or serious harm towards any target.

hate instance-attribute

hate: bool

Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment.

hate_threatening class-attribute instance-attribute

hate_threatening: bool = Field(alias='hate/threatening')

Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.

self_harm class-attribute instance-attribute

self_harm: bool = Field(alias='self-harm')

Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.

self_harm_instructions class-attribute instance-attribute

self_harm_instructions: bool = Field(
    alias="self-harm/instructions"
)

Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts.

self_harm_intent class-attribute instance-attribute

self_harm_intent: bool = Field(alias='self-harm/intent')

Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders.

sexual instance-attribute

sexual: bool

Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).

sexual_minors class-attribute instance-attribute

sexual_minors: bool = Field(alias='sexual/minors')

Sexual content that includes an individual who is under 18 years old.

violence instance-attribute

violence: bool

Content that depicts death, violence, or physical injury.

violence_graphic class-attribute instance-attribute

violence_graphic: bool = Field(alias='violence/graphic')

Content that depicts death, violence, or physical injury in graphic detail.
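
The aliases declared on these attributes follow a visible pattern: the aliased name uses `/` and `-` where the Python attribute uses `_`. A minimal sketch of that naming convention, using only the standard library. `alias_to_attribute` is a hypothetical helper written for illustration; it is not part of the library and only mirrors the pattern seen in the aliases documented here.

```python
# Aliases copied from the Field(alias=...) declarations documented above.
ALIASES = [
    "harassment/threatening",
    "hate/threatening",
    "self-harm",
    "self-harm/instructions",
    "self-harm/intent",
    "sexual/minors",
    "violence/graphic",
]


def alias_to_attribute(alias: str) -> str:
    """Map an aliased category name to its Python attribute name.

    Hypothetical helper: it assumes '/' and '-' both become '_',
    which holds for every alias listed on this page.
    """
    return alias.replace("/", "_").replace("-", "_")


for alias in ALIASES:
    print(f"{alias!r} -> {alias_to_attribute(alias)}")
```

Attributes without an alias (`harassment`, `hate`, `sexual`, `violence`) already use the same name in both forms, so the helper leaves them unchanged.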

CategoryScores

Attributes:

Name Type Description
harassment float

The score for the category 'harassment'.

harassment_threatening float

The score for the category 'harassment/threatening'.

hate float

The score for the category 'hate'.

hate_threatening float

The score for the category 'hate/threatening'.

self_harm float

The score for the category 'self-harm'.

self_harm_instructions float

The score for the category 'self-harm/instructions'.

self_harm_intent float

The score for the category 'self-harm/intent'.

sexual float

The score for the category 'sexual'.

sexual_minors float

The score for the category 'sexual/minors'.

violence float

The score for the category 'violence'.

violence_graphic float

The score for the category 'violence/graphic'.

harassment instance-attribute

harassment: float

The score for the category 'harassment'.

harassment_threatening class-attribute instance-attribute

harassment_threatening: float = Field(
    alias="harassment/threatening"
)

The score for the category 'harassment/threatening'.

hate instance-attribute

hate: float

The score for the category 'hate'.

hate_threatening class-attribute instance-attribute

hate_threatening: float = Field(alias='hate/threatening')

The score for the category 'hate/threatening'.

self_harm class-attribute instance-attribute

self_harm: float = Field(alias='self-harm')

The score for the category 'self-harm'.

self_harm_instructions class-attribute instance-attribute

self_harm_instructions: float = Field(
    alias="self-harm/instructions"
)

The score for the category 'self-harm/instructions'.

self_harm_intent class-attribute instance-attribute

self_harm_intent: float = Field(alias='self-harm/intent')

The score for the category 'self-harm/intent'.

sexual instance-attribute

sexual: float

The score for the category 'sexual'.

sexual_minors class-attribute instance-attribute

sexual_minors: float = Field(alias='sexual/minors')

The score for the category 'sexual/minors'.

violence instance-attribute

violence: float

The score for the category 'violence'.

violence_graphic class-attribute instance-attribute

violence_graphic: float = Field(alias='violence/graphic')

The score for the category 'violence/graphic'.
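
Because every attribute of `CategoryScores` is a `float`, a caller may want to apply its own cutoff instead of relying only on the boolean flags. The sketch below shows one way to do that with a plain dictionary standing in for the model. The helper name `categories_above`, the threshold, and the score values are all assumptions made for illustration, as is the reading that a higher score means a stronger match.

```python
from typing import Dict, List, Tuple


def categories_above(
    scores: Dict[str, float], threshold: float
) -> List[Tuple[str, float]]:
    """Return (category, score) pairs with score >= threshold, highest first."""
    hits = [(name, score) for name, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up scores for illustration only; not real model output.
example_scores = {
    "harassment": 0.62,
    "harassment/threatening": 0.08,
    "hate": 0.01,
    "violence": 0.71,
    "violence/graphic": 0.02,
}

# The 0.5 threshold is an arbitrary choice for this sketch.
print(categories_above(example_scores, 0.5))
```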

Moderation

Attributes:

Name Type Description
categories Categories

A list of the categories, and whether they are flagged or not.

category_scores CategoryScores

A list of the categories along with their scores as predicted by the model.

flagged bool

Whether any of the below categories are flagged.

categories instance-attribute

categories: Categories

A list of the categories, and whether they are flagged or not.

category_scores instance-attribute

category_scores: CategoryScores

A list of the categories along with their scores as predicted by the model.

flagged instance-attribute

flagged: bool

Whether any of the below categories are flagged.
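
The three attributes are typically read together: check `flagged` first, then look at `categories` to see which ones triggered. A minimal sketch of that flow follows. `ModerationSketch` is a hypothetical stand-in that uses plain dictionaries in place of the `Categories` and `CategoryScores` models, and the example values are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModerationSketch:
    """Hypothetical stand-in carrying the three documented attributes."""

    categories: Dict[str, bool]
    category_scores: Dict[str, float]
    flagged: bool


def flagged_category_names(result: ModerationSketch) -> List[str]:
    """List the categories marked True, or nothing if the result is not flagged."""
    if not result.flagged:
        return []
    return [name for name, is_flagged in result.categories.items() if is_flagged]


# Made-up values for illustration only; not real model output.
example = ModerationSketch(
    categories={"harassment": False, "violence": True, "violence/graphic": False},
    category_scores={"harassment": 0.12, "violence": 0.83, "violence/graphic": 0.05},
    flagged=True,
)

print(flagged_category_names(example))
```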