
postAndExpectFileResponse with chunk support for real time audio streaming #133

Open
PeperMarkreel opened this issue Nov 26, 2023 · 3 comments
Labels: enhancement (New feature or request)

PeperMarkreel commented Nov 26, 2023

Hi,

I'd like to use OpenAI's TTS with chunks so we can start playing the speech before the full audio file has been created.

Is there any information on whether this will be implemented in this Dart library? If not, I was hoping to get some pointers on how to implement it in my own app.

My current approach is to feed the TTS one sentence at a time, but each generated sentence ends with about 500 to 1000 ms of silence, so chaining the resulting audio clips sounds really unnatural. The application is quite time sensitive.
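For context, the chaining approach described above can be sketched with just_audio's ConcatenatingAudioSource. This is a rough sketch, not code from the thread: the clip paths are assumptions, and note that gapless queueing still cannot remove silence that is baked into each clip.

```dart
// Hypothetical sketch: queueing per-sentence TTS clips with just_audio.
// The trailing silence inside each clip is what causes the unnatural gaps.
import 'package:just_audio/just_audio.dart';

Future<void> playSentenceClips(List<String> clipPaths) async {
  final player = AudioPlayer();

  // ConcatenatingAudioSource plays the clips back to back (gapless where
  // the platform supports it), but it cannot trim silence that is part
  // of the audio data itself.
  final playlist = ConcatenatingAudioSource(
    children: [
      for (final path in clipPaths) AudioSource.uri(Uri.file(path)),
    ],
  );

  await player.setAudioSource(playlist);
  await player.play();
}
```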

Edit: link to OpenAI docs

Thanks,
Harmen

anasfik (Owner) commented Feb 21, 2024

Thank you for pointing this out; I will check and get back to you.

@anasfik anasfik self-assigned this Feb 21, 2024
@anasfik anasfik added the enhancement New feature or request label Feb 21, 2024
anasfik (Owner) commented Feb 22, 2024

Since this seems to be audio-playback related, the most I can do is offer a Stream<List<int>> of the speech bytes instead of a file. I made this simple Flutter app with raw code that plays a speech in real time as a stream:

import 'dart:async';
import 'dart:convert';

import 'package:flutter/material.dart';
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http;

// Single-subscription controller that buffers the audio bytes as they
// arrive from the API and feeds them to the audio player.
final StreamController<List<int>> _controller = StreamController<List<int>>();

void main() {
  WidgetsFlutterBinding.ensureInitialized();

  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  MyApp({Key? key}) : super(key: key);
  final player = AudioPlayer();

  MyStreamAudioSource myStreamSource = MyStreamAudioSource();
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: Scaffold(
          body: Center(
        child: ElevatedButton(
          onPressed: () async {
            // Start fetching the speech bytes, then attach the streaming
            // source to the player and begin playback.
            _listenToStream();

            await player.setAudioSource(myStreamSource);
            player.play();
          },
          child: const Text('Press Me'),
        ),
      )),
    );
  }

  // Fetch the speech audio as a streamed HTTP response and feed each
  // chunk into the custom audio source as it arrives.
  void _listenToStream() async {
    final uri = Uri.parse("https://api.openai.com/v1/audio/speech");

    final headers = {
      "Authorization": "Bearer YOUR-KEY",
      "Content-Type": "application/json",
    };

    // Use http.Request + send() (rather than http.post) so the response
    // body is exposed as a byte stream instead of buffered in memory.
    final req = http.Request("POST", uri);

    req.headers.addAll(headers);

    req.body = jsonEncode({
      "model": "tts-1",
      "input": "Hi, I am a somebody. I am testing the audio stream.",
      "voice": "echo",
    });

    final res = await req.send();

    res.stream.listen((List<int> chunk) {
      myStreamSource.addAudioData(chunk);
    });
  }
}

// A just_audio source backed by a byte stream, so playback can start
// before the full speech file has been received.
class MyStreamAudioSource extends StreamAudioSource {
  void addAudioData(List<int> data) {
    _controller.add(data);
  }

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    return StreamAudioResponse(
      // Lengths are unknown up front, since the audio is still streaming in.
      sourceLength: null,
      contentLength: null,
      offset: start ?? 0,
      stream: _controller.stream,
      contentType: 'audio/mpeg',
    );
  }

  // Call once the HTTP stream is done to signal end of audio.
  Future<void> close() async {
    await _controller.close();
  }
}

Note: if you want to run this Flutter code, configure the just_audio package for the platform you are targeting, and set your API key in the Authorization header.

This needs many changes, but as a demo it should reflect what you want to achieve.

I am thinking of exposing a method called createSpeechBytes that returns a Stream<List<int>>, which you can pipe to an audio player in your Flutter app.
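As a sketch of how such a method might be consumed: createSpeechBytes is only proposed above and does not exist in the package yet, so the call site and parameter names below are assumptions for illustration.

```dart
// Hypothetical usage of the proposed createSpeechBytes API -- the method
// and its parameters are assumptions, mirroring the existing TTS options.
final Stream<List<int>> speechBytes = await OpenAI.instance.audio.createSpeechBytes(
  model: "tts-1",
  input: "Hello from a streamed speech response.",
  voice: "echo",
);

// Pipe each chunk into the custom StreamAudioSource from the demo above,
// and close the source once the stream ends.
speechBytes.listen(
  (chunk) => myStreamSource.addAudioData(chunk),
  onDone: () => myStreamSource.close(),
);
```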

@decisionslab2 commented
Thank you, @anasfik. Do you have any timeline for when we can expect this feature to be live in the package?
