
transcriptions

Classes:

AsyncTranscriptions
AsyncTranscriptionsWithRawResponse
AsyncTranscriptionsWithStreamingResponse
Transcriptions
TranscriptionsWithRawResponse
TranscriptionsWithStreamingResponse
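
These classes are typically not constructed directly. A minimal sketch, assuming the standard client layout in which this resource is exposed as client.audio.transcriptions and the API key is read from the OPENAI_API_KEY environment variable:

from openai import AsyncOpenAI, OpenAI

sync_client = OpenAI()        # Transcriptions, reached via sync_client.audio.transcriptions
async_client = AsyncOpenAI()  # AsyncTranscriptions, reached via async_client.audio.transcriptions

transcriptions = sync_client.audio.transcriptions
async_transcriptions = async_client.audio.transcriptions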

AsyncTranscriptions

AsyncTranscriptions(client: AsyncOpenAI)

Methods:

create: Transcribes audio into the input language.
with_raw_response
with_streaming_response

create (async)

create(
    *,
    file: FileTypes,
    model: Union[str, Literal["whisper-1"]],
    language: str | NotGiven = NOT_GIVEN,
    prompt: str | NotGiven = NOT_GIVEN,
    response_format: (
        Literal[
            "json", "text", "srt", "verbose_json", "vtt"
        ]
        | NotGiven
    ) = NOT_GIVEN,
    temperature: float | NotGiven = NOT_GIVEN,
    timestamp_granularities: (
        List[Literal["word", "segment"]] | NotGiven
    ) = NOT_GIVEN,
    extra_headers: Headers | None = None,
    extra_query: Query | None = None,
    extra_body: Body | None = None,
    timeout: float | Timeout | None | NotGiven = NOT_GIVEN
) -> Transcription

Transcribes audio into the input language.

Parameters:

file: FileTypes (required)
    The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model: Union[str, Literal['whisper-1']] (required)
    ID of the model to use. Only whisper-1 (which is powered by our open source Whisper V2 model) is currently available.

language: str | NotGiven (default: NOT_GIVEN)
    The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

prompt: str | NotGiven (default: NOT_GIVEN)
    An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.

response_format: Literal['json', 'text', 'srt', 'verbose_json', 'vtt'] | NotGiven (default: NOT_GIVEN)
    The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.

temperature: float | NotGiven (default: NOT_GIVEN)
    The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

timestamp_granularities: List[Literal['word', 'segment']] | NotGiven (default: NOT_GIVEN)
    The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word or segment. Note: there is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.

extra_headers: Headers | None (default: None)
    Send extra headers.

extra_query: Query | None (default: None)
    Add additional query parameters to the request.

extra_body: Body | None (default: None)
    Add additional JSON properties to the request.

timeout: float | Timeout | None | NotGiven (default: NOT_GIVEN)
    Override the client-level default timeout for this request, in seconds.
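
A minimal usage sketch for the async client, assuming OPENAI_API_KEY is set in the environment and a local file speech.mp3 (both names are illustrative):

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # assumes OPENAI_API_KEY is set in the environment

async def main() -> None:
    # Pass the audio as a file object (not a file name); "speech.mp3" is illustrative.
    with open("speech.mp3", "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            language="en",  # optional ISO-639-1 hint
        )
    print(transcription.text)

asyncio.run(main())

For word- or segment-level timestamps, pass response_format="verbose_json" together with timestamp_granularities=["word"] or ["segment"], as described in the parameter list above.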

with_raw_response

with_raw_response() -> AsyncTranscriptionsWithRawResponse

with_streaming_response

with_streaming_response() -> (
    AsyncTranscriptionsWithStreamingResponse
)
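
A sketch of the raw-response variant, following the library's with_raw_response pattern: create returns the HTTP-level response, and parse() recovers the Transcription that create would otherwise return (the client setup and file name mirror the example above and are illustrative):

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def transcribe_raw() -> None:
    with open("speech.mp3", "rb") as audio_file:
        response = await client.audio.transcriptions.with_raw_response.create(
            model="whisper-1",
            file=audio_file,
        )
    print(response.headers.get("content-type"))  # HTTP-level details
    transcription = response.parse()             # the parsed Transcription object
    print(transcription.text)

asyncio.run(transcribe_raw())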

AsyncTranscriptionsWithRawResponse

AsyncTranscriptionsWithRawResponse(
    transcriptions: AsyncTranscriptions,
)

Attributes:

create instance-attribute

create = async_to_raw_response_wrapper(create)

AsyncTranscriptionsWithStreamingResponse

AsyncTranscriptionsWithStreamingResponse(
    transcriptions: AsyncTranscriptions,
)

Attributes:

create instance-attribute

create = async_to_streamed_response_wrapper(create)
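
A sketch of the streaming-response variant (same illustrative client and file as above): with_streaming_response defers reading the body until it is requested inside the context manager; the exact reader helpers may vary by library version.

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def transcribe_streamed() -> None:
    with open("speech.mp3", "rb") as audio_file:
        async with client.audio.transcriptions.with_streaming_response.create(
            model="whisper-1",
            file=audio_file,
        ) as response:
            print(response.headers.get("content-type"))
            body = await response.read()  # raw response bytes, read on demand
    print(body[:80])

asyncio.run(transcribe_streamed())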

Transcriptions

Transcriptions(client: OpenAI)

Methods:

create: Transcribes audio into the input language.
with_raw_response
with_streaming_response

create

create(
    *,
    file: FileTypes,
    model: Union[str, Literal["whisper-1"]],
    language: str | NotGiven = NOT_GIVEN,
    prompt: str | NotGiven = NOT_GIVEN,
    response_format: (
        Literal[
            "json", "text", "srt", "verbose_json", "vtt"
        ]
        | NotGiven
    ) = NOT_GIVEN,
    temperature: float | NotGiven = NOT_GIVEN,
    timestamp_granularities: (
        List[Literal["word", "segment"]] | NotGiven
    ) = NOT_GIVEN,
    extra_headers: Headers | None = None,
    extra_query: Query | None = None,
    extra_body: Body | None = None,
    timeout: float | Timeout | None | NotGiven = NOT_GIVEN
) -> Transcription

Transcribes audio into the input language.

Parameters:

file: FileTypes (required)
    The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model: Union[str, Literal['whisper-1']] (required)
    ID of the model to use. Only whisper-1 (which is powered by our open source Whisper V2 model) is currently available.

language: str | NotGiven (default: NOT_GIVEN)
    The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

prompt: str | NotGiven (default: NOT_GIVEN)
    An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.

response_format: Literal['json', 'text', 'srt', 'verbose_json', 'vtt'] | NotGiven (default: NOT_GIVEN)
    The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.

temperature: float | NotGiven (default: NOT_GIVEN)
    The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

timestamp_granularities: List[Literal['word', 'segment']] | NotGiven (default: NOT_GIVEN)
    The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word or segment. Note: there is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.

extra_headers: Headers | None (default: None)
    Send extra headers.

extra_query: Query | None (default: None)
    Add additional query parameters to the request.

extra_body: Body | None (default: None)
    Add additional JSON properties to the request.

timeout: float | Timeout | None | NotGiven (default: NOT_GIVEN)
    Override the client-level default timeout for this request, in seconds.
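
A minimal synchronous usage sketch, assuming OPENAI_API_KEY is set in the environment and a local file speech.mp3 (both names are illustrative):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Pass the audio as a file object (not a file name).
with open("speech.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcription.text)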

with_raw_response

with_raw_response() -> TranscriptionsWithRawResponse

with_streaming_response

with_streaming_response() -> (
    TranscriptionsWithStreamingResponse
)
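
A sketch of the synchronous raw- and streaming-response variants, mirroring the async examples above (same illustrative client and file; reader helpers may vary by library version):

from openai import OpenAI

client = OpenAI()

# Raw response: HTTP-level details plus the parsed Transcription via parse().
with open("speech.mp3", "rb") as audio_file:
    raw = client.audio.transcriptions.with_raw_response.create(
        model="whisper-1",
        file=audio_file,
    )
print(raw.headers.get("content-type"))
transcription = raw.parse()

# Streaming response: the body is read on demand inside the context manager.
with open("speech.mp3", "rb") as audio_file:
    with client.audio.transcriptions.with_streaming_response.create(
        model="whisper-1",
        file=audio_file,
    ) as response:
        body = response.read()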

TranscriptionsWithRawResponse

TranscriptionsWithRawResponse(
    transcriptions: Transcriptions,
)

Attributes:

create instance-attribute

create = to_raw_response_wrapper(create)

TranscriptionsWithStreamingResponse

TranscriptionsWithStreamingResponse(
    transcriptions: Transcriptions,
)

Attributes:

create instance-attribute

create = to_streamed_response_wrapper(create)