1.3.7. Media Server Session

A core part of the Proximie SDK is the management, sending and receiving of media streams (namely audio and video); this functionality is encapsulated in the Proximie::PxMedia namespace. The MediaServerSession class provides the majority of the support for media sharing.

Key features

The media server session’s main role is to share audio and video between participants in a media session.

For video media:

  • Share feeds from one or more local video devices (e.g. a webcam) with others in the session.

  • Receive one or more video feeds from others in the session and display them locally.

For audio media:

  • Share a local audio source (e.g. microphone) feed with others in the session.

  • Receive a remote audio feed from the session that mixes only the other participants’ audio (i.e. without the local source in the mix, thus avoiding echo) and play it back locally.

Feed concepts

The media streaming support in the Proximie SDK is underpinned by GStreamer, a mature open source framework for creating streaming media applications. GStreamer is a very flexible and comprehensive framework, but is also somewhat low-level, and can require some detailed understanding to use well.

The Proximie SDK abstracts much of the complexity away to suit the use cases it is intended for: namely, audio and video streaming for telepresence applications using Proximie Services. This more streamlined model is described next.

Feed components

The most basic feed from the SDK’s point of view has three parts that we call “components”:

  1. A source or input - e.g. a local webcam, or a remote feed being streamed from a media session.

  2. Some kind of encoding - e.g. VP8 or H.264 encoding for a video.

  3. A target for output - e.g. an application window, or sending to a media session to share the feed.

The SDK refers to these elements as “feed components”. When creating a feed, the application decides which components are used to create it; this is based on the requirements of the feed - e.g. to stream a remote feed to the local application, the input component is fixed/implicit (the media feed itself), and the application just decides which output component to use to consume the feed - e.g. an application window.

Depending on platform, media type and component type, there are various components available. The following table shows the base classes used for each component type; you can consult the base class documentation for subclasses that match the type of feed.

Component type   Audio                            Video
Input            AudioInputFeedComponent          VideoInputFeedComponent
Output           AudioOutputFeedComponent         VideoOutputFeedComponent
Encoding         AudioWebRtcEncodeFeedComponent   VideoWebRtcEncodeFeedComponent

Media server session notifications

The MediaServerSession generates various notifications that the application can subscribe to by providing a suitable callback function. Typically the application will set up the notifications as part of the initialisation process of the MediaServerSession, before connecting. The remainder of this section will outline the various notifications supported by the session object.

Creating a MediaServerSession instance

To create a new MediaServerSession object, the application simply needs to provide a ProximieContext.

using namespace Proximie;
using MediaServerSession = PxMedia::MediaServerSession;

auto session = MediaServerSession::create(pxcontext);

The value for pxcontext is created as explained in ProximieContext.

Note that the MediaServerSession is created into a shared pointer. This is because many of the session APIs are asynchronous, and so the lifetime of the object needs to account for “in flight operations” (see Object lifetimes for more details).
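To illustrate why shared ownership helps with in-flight operations, the following is a minimal standalone sketch that uses only the C++ standard library. The TaskQueue and Worker types, and the startAsync() function, are hypothetical stand-ins invented for this illustration; they are not SDK types, and the SDK's internals may differ.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event loop that runs queued work later.
struct TaskQueue {
    std::vector<std::function<void()>> tasks;
    void post(std::function<void()> task) { tasks.push_back(std::move(task)); }
    void runAll() {
        auto pending = std::move(tasks);
        tasks.clear();
        for (auto& task : pending) task();
    }
};

// Hypothetical stand-in for an object with asynchronous operations.
class Worker : public std::enable_shared_from_this<Worker> {
public:
    static std::shared_ptr<Worker> create(std::string name) {
        return std::shared_ptr<Worker>(new Worker(std::move(name)));
    }

    // The queued task captures a shared_ptr to this object, so the object
    // stays alive until the task has run, even if the caller releases its
    // own reference first.
    void startAsync(TaskQueue& queue, std::string& result) {
        auto self = shared_from_this();
        queue.post([self, &result] { result = "done:" + self->name_; });
    }

private:
    explicit Worker(std::string name) : name_(std::move(name)) {}
    std::string name_;
};

// Releases the caller's reference while the operation is still pending.
inline std::string runAndRelease() {
    TaskQueue queue;
    std::string result;
    auto worker = Worker::create("session");
    std::weak_ptr<Worker> observer = worker;
    worker->startAsync(queue, result);
    worker.reset();               // Caller lets go of its reference
    assert(!observer.expired());  // The pending task keeps the object alive
    queue.runAll();               // The operation completes safely
    return result;
}
```

With a raw pointer or a stack object in place of the shared pointer, the queued task in this sketch would run against a destroyed object once the caller released it.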

Connecting to a media session

A media server session needs to connect to a Proximie media session in order to be able to share and receive feeds. To connect to a remote session, you need:

  1. A valid session ID,

  2. Details of the services required to obtain the media session details, in order to connect to the correct Media Server for it,

  3. Required request fields (such as authentication token provider) in order to make certain requests.

using ProximieServices = PxRestApi::ProximieServices;

PxCore::RemoteEnvironment environment;
environment.sessionHost = "live.proximie.net";
environment.globalApi = "my.proximie.net";
environment.regionalApi = "{}.proximie.net";

ProximieServices::RequestFields reqFields;
reqFields.tokenProvider(auth);

auto connectAsync = session->connectToSession(sessionId, environment, reqFields);
auto connected = connectAsync.get();  // Blocking call
if (connected) {
    // Session has connected and the application can send/receive feeds...
} else {
    // Some error occurred starting the connection
    auto error = connected.error();
    // Handle error...
}

Firstly, the environment declares the URLs to access the Proximie services it needs. Note that the values will differ depending on which Proximie environment or tenant you are using.

The session uses its own PxRestApi::ProximieServices instance to access the Proximie services it requires. Check out Proximie REST services for details on providing the token provider object.

Finally, the session needs a valid session ID (sessionId) in order to know which remote Proximie session to connect to. The application typically queries available sessions to choose an appropriate session - see Session selection (remote) for more details.

connectToSession() returns a future that the application can use to wait for the connection to complete. Alternatively, the application can use a notification callback. As mentioned above, before starting a session connection the application will usually set up the notification callbacks for the asynchronous operations it is interested in.

We can use the onSessionConnected() notification to handle session connection logic.

session->onSessionConnected([](const auto& session, error_code error) {
    if (error) {
        // Handle error connecting...
    } else {
        // Session has connected and the application can send/receive feeds...
    }
});

In this case, the call to connectToSession() can ignore the return value and wait for the notification to be called.

Managing feeds

Once the MediaServerSession is connected to a session, you can create and start local and remote feeds.

Using a local video feed

Feed objects are created using the appropriate create factory function for the feed, providing the owning session, general properties, and the required feed components. For an outgoing local feed, we use MediaServerOutgoingVideoFeed::create.

First though, the input device and its parameters are declared as a feed input component object. The application creates an object reflecting the required input device, optionally setting parameters like the frame rate:

PxMedia::VideoInputFeedV4Linux2 video;
video.deviceProperties().device("/dev/video0");
video.videoCapabilities().frameRate(30);

A typical use-case is to see the feed locally as well as sharing, so MediaServerOutgoingVideoFeed creation optionally takes an output feed component, which can be created like this (in this example, outputting to a UDP port):

PxMedia::VideoOutputFeedUdp udp;
udp.port(UDP_PORT);
udp.encoderProperties().bitrate(UDP_BITRATE);

Local feeds have common parameters independent of the device type, e.g. a human-readable label. These are set in a separate properties object which is not dependent on the source/device type.

PxMedia::MediaServerOutgoingVideoFeed::FeedProperties props{"webcam"};

Then, with the properties and feed components, the application can create the feed:

auto created = PxMedia::MediaServerOutgoingVideoFeed::create(session, props, video, udp);
if (created) {
    // When successful, the created value is the feed object
    auto feed = created.value();
} else {
    // Some error occurred creating the video
    auto error = created.error();
    // Handle error...
}
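The check-then-value()/error() idiom used above can be illustrated with a small standalone sketch. The Result template and createLabelled() function below are hypothetical and written only for this illustration; the SDK's actual result type is not shown in this section and may differ (for instance, in which error code type it uses).

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical minimal result type: holds either a value or an error code.
template <typename T>
class Result {
public:
    Result(T value) : value_(std::move(value)) {}
    Result(std::error_code error) : error_(error) {}

    // True when the operation succeeded (no error code is set).
    explicit operator bool() const { return !error_; }
    const T& value() const { return value_; }
    std::error_code error() const { return error_; }

private:
    T value_{};
    std::error_code error_{};
};

// Hypothetical factory that fails when given an empty label.
inline Result<std::string> createLabelled(const std::string& label) {
    if (label.empty()) return std::make_error_code(std::errc::invalid_argument);
    return "feed:" + label;
}

// Usage: test the result before touching value() or error().
inline void resultExample() {
    auto created = createLabelled("webcam");
    assert(created && created.value() == "feed:webcam");
}
```

The point of the pattern is that the caller tests the result first, and only then reads either the value or the error.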

As mentioned above, the local feed output is optional, and can be omitted from the call to create().

If the feed cannot be created, the result will indicate an error. Otherwise, the feed object can be retrieved from the result. As with most SDK objects the feed reference is a shared pointer (in this case to MediaServerOutgoingVideoFeed). Note that the feed is not yet started; the application may set up callbacks with the feed before the feed is started.

feed->onFeedStarted([](const auto& feed) {
    // The feed has successfully started
});
feed->onFeedStopped([](const auto& feed, error_code error) {
    if (error) {
        // The feed has failed...
    } else {
        // The feed was stopped intentionally...
    }
});

Feeds may be stopped either in error (e.g. the internet connection drops) or intentionally (e.g. the application requested that a feed be stopped, or a remote feed was stopped and is no longer being shared).

The feed callbacks are called with a reference to the feed object. Take care not to capture the feed you created in the callback: the feed is held by a shared pointer, so the callback would then hold a reference to the feed itself. Always use the reference passed to the callback to avoid lifetime issues and circular references.
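The circular reference problem can be demonstrated with a small standalone sketch. DemoFeed and its onStarted() function are hypothetical stand-ins written for this illustration, on the assumption that a feed stores the callbacks it is given; they are not SDK types.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a feed that stores a "started" callback.
struct DemoFeed {
    using Callback = std::function<void(const std::shared_ptr<DemoFeed>&)>;
    void onStarted(Callback cb) { callback = std::move(cb); }
    Callback callback;
};

// Returns true if the feed object is destroyed once the application drops
// its own reference.
inline bool releasedAfterReset(bool captureFeed) {
    auto feed = std::make_shared<DemoFeed>();
    std::weak_ptr<DemoFeed> observer = feed;
    if (captureFeed) {
        // Anti-pattern: the feed owns the callback and the callback owns the
        // feed, so the reference count cannot reach zero.
        feed->onStarted([feed](const std::shared_ptr<DemoFeed>&) { (void)feed; });
    } else {
        // Preferred: rely on the reference passed to the callback.
        feed->onStarted([](const std::shared_ptr<DemoFeed>& self) { (void)self; });
    }
    feed.reset();
    const bool released = observer.expired();
    if (auto leaked = observer.lock()) {
        leaked->callback = nullptr;  // Break the cycle so the demo itself does not leak
    }
    return released;
}
```

In this sketch, releasedAfterReset(false) reports that the object was freed, while releasedAfterReset(true) reports that it was kept alive by its own callback.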

Once any event callbacks are set up, the application can start the feed.

auto starting = feed->startFeed();
if (starting) {
    // The feed is starting up, wait for the feed started event...
} else {
    // The feed failed to start with an error
    auto error = starting.error();
}

This process is asynchronous, since the session needs to negotiate with the Media Server and have the feed prepared for sharing. So, after the call to startFeed() returns, the application should respond to a subsequent notification (as just described) to indicate the status of the feed.

Starting a remote feed

Remote feeds can only be started when they have been shared by another media session participant. The application should subscribe to a notification to be informed when feeds are published. The SDK does not attempt to automatically receive remote feeds, since the application can best decide when or whether a new feed should be added, and how.

session->onNewRemoteFeed([](const auto& session, const auto& info) {
    auto newStreamId = info->streamId;  // The stream ID of the new remote feed
    // Create and start an incoming feed for newStreamId...
});

Given a remote feed stream ID in the metadata provided, the application can create a MediaServerIncomingVideoFeed object to receive the feed. When the application wants to add a remote feed, it provides certain feed properties (including the remote feed’s unique stream ID). This is similar to how we created a local video feed, but because the incoming feed will output to a local target (e.g. a window), the feed requires an output component. The values to set on the output component depend on the target of the output, just as the feed input component depended on the type of video source.

This can be created in the same way as we did for the local feed example above (of course if using multiple UDP outputs, you will need to provide different ports).

PxMedia::VideoOutputFeedUdp udp;
udp.port(UDP_PORT);
udp.encoderProperties().bitrate(UDP_BITRATE);
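When several remote feeds each output to UDP, the application needs some way of giving each one a distinct port. One possible approach is sketched below. UdpPortAllocator is a hypothetical application-side helper, not part of the SDK, and the sketch assumes that stream IDs can be represented as strings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical application-side helper: hands out one UDP port per stream ID.
class UdpPortAllocator {
public:
    explicit UdpPortAllocator(std::uint16_t firstPort) : next_(firstPort) {}

    // Returns the port already assigned to streamId, or assigns the next one.
    std::uint16_t portFor(const std::string& streamId) {
        auto it = ports_.find(streamId);
        if (it != ports_.end()) return it->second;
        const std::uint16_t port = next_++;
        ports_.emplace(streamId, port);
        return port;
    }

private:
    std::uint16_t next_;
    std::map<std::string, std::uint16_t> ports_;
};

// Usage: each new stream gets its own port; repeated lookups are stable.
inline void portAllocatorExample() {
    UdpPortAllocator ports(5000);
    assert(ports.portFor("stream-a") == 5000);
    assert(ports.portFor("stream-b") == 5001);
    assert(ports.portFor("stream-a") == 5000);
}
```

The value returned by portFor() could then be passed to the output component's port() setter in place of a fixed UDP_PORT. A production version would also need to release ports when feeds stop and check that a port is actually free.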

Then, the remote feed is created using MediaServerIncomingVideoFeed::create, like the local feed but with the required parameters for remote feeds. In this example we also set a human-readable label based on the stream ID.

auto label = "remote:" + newStreamId;
PxMedia::MediaServerIncomingVideoFeed::FeedProperties props(label, newStreamId);
auto created = PxMedia::MediaServerIncomingVideoFeed::create(session, props, udp);
if (!created) {
    // Handle error creating the remote feed...
    auto error = created.error();
} else {
    // The feed was created successfully...
    auto feed = created.value();
}

Once created, the application can set up any event callbacks for the feed as before (not shown here). Then, the feed can be started.

auto starting = feed->startFeed();
if (!starting) {
    // Handle error starting the remote feed...
    auto error = starting.error();
} else {
    // The feed is starting up, wait for the feed started event...
}

The process is again similar to starting a local feed. The request may fail immediately, which the application needs to check for. Otherwise, the feed is starting and will subsequently start streaming if successful. The feed’s onFeedStarted() is called when successful, or onFeedStopped() with an error if it fails.

Session audio feed

Audio in the SDK is slightly different from video. Currently, there is only a single audio feed available, which is bidirectional. The audio feed comprises:

  1. An audio source, e.g. a local microphone,

  2. The incoming media session audio feed, which is a mix of all incoming audio sources from all participants except the source being sent from the application.

Audio has this dual setup because the application typically wants to play back the received audio without the local audio in the mix, to avoid artefacts (e.g. echo, feedback). Thus, the local audio signal is sent and mixed only into the other participants’ feeds; each participant then receives a feed containing only the other participants and not themselves.

The pattern here is similar to other feeds: there are some common properties for the feed, plus components to define the input (e.g. a microphone) and output (e.g. the device’s speaker). For audio, the application uses a MediaServerTwoWayAudioFeed feed type.
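The idea of a mix that excludes the listener's own source (often called "mix-minus") can be shown with a toy calculation. The mixMinus() function below is a hypothetical illustration operating on one integer sample per participant; it is not how the SDK or the Media Server implements mixing, and it ignores real concerns such as clipping, sample formats and timing.

```cpp
#include <cassert>
#include <vector>

// Computes, for each participant, the sum of every other participant's
// sample. The input holds one sample per participant.
inline std::vector<int> mixMinus(const std::vector<int>& samples) {
    long total = 0;
    for (int s : samples) total += s;
    std::vector<int> out;
    out.reserve(samples.size());
    // Subtracting a participant's own sample from the full mix leaves
    // only the contributions of the others.
    for (int s : samples) out.push_back(static_cast<int>(total - s));
    return out;
}

// Usage: three participants each hear only the other two.
inline void mixMinusExample() {
    const std::vector<int> heard = mixMinus({10, 20, 30});
    assert((heard == std::vector<int>{50, 40, 30}));
}
```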

PxMedia::MediaServerTwoWayAudioFeed::FeedProperties audio{"audio"};

PxMedia::AudioInputFeedAuto mic;
PxMedia::AudioOutputFeedAuto speaker;

auto created = PxMedia::MediaServerTwoWayAudioFeed::create(session, audio, mic, speaker);
if (!created) {
    // Handle error...
} else {
    // The audio feed was created successfully, set up callbacks and start...
    auto feed = created.value();
}

As before, after the application creates the feed, it should set up any event callbacks, and start the feed with startFeed().

Stopping feeds

Feeds are programmatically stopped by calling FeedBase::stopFeed() on the target feed object. When the feed stops, it will call any callback set up using onFeedStopped().

Disconnecting from a session

The media server session should be disconnected from an ongoing Proximie session as part of the session life cycle. This process mirrors the connection process mentioned above, using the disconnectFromSession() member function:

auto disconnectAsync = session->disconnectFromSession();
auto disconnected = disconnectAsync.get();  // Blocking call
if (disconnected) {
    // Session has disconnected, the media server session can be released
    // or start a new session
} else {
    // Some error occurred closing the connection
    auto error = disconnected.error();
    // Handle error...
}

The disconnection process will shut down any running audio and video media feeds. Disconnection is an asynchronous process that returns a future, as the connection process does. The application should wait for completion before freeing the media server session object or attempting to connect to another session.

The media server session also supports a callback mechanism to be informed when the disconnection completes:

session->onSessionDisconnected([](const auto& session, error_code error) {
    if (error) {
        // Handle error disconnecting...
    } else {
        // Session has disconnected, the media server session can be released
        // or start a new session
    }
});

Session error handling

As mentioned above, if an individual feed encounters an error, the feed’s onFeedStopped() notification is called with an error code.

Aside from individual feed errors, the MediaServerSession itself can encounter problems, such as internet connection problems. The application can subscribe to the onSessionError() notification callback to be informed of session errors:

session->onSessionError([](const auto& session, error_code error) {
    // Handle session error...
});

If an error is serious enough that the session has disconnected, the onSessionDisconnected() callback (see above) is called with the error.

Setting encoders

The MediaServerSession uses default encoders for outgoing audio and video (Opus and VP8 respectively). The application may choose alternative encoders (but these need to be compatible with the Media Server being used).

The following example sets an H.264 encoder using the videoEncoder() function:

PxMedia::VideoWebRtcEncodeFeedH264 h264;
h264.encodingProperties().bitrate(H264_BITRATE);
session->videoEncoder(h264);

After creating the encoder object, the application can optionally adjust its properties from their defaults. The encoder is then passed to the session object; note that the session makes a copy of the encoder settings passed.

Feed statistics

Media session feeds support WebRTC statistics reporting. This capability is exposed in all subclasses of MediaServerFeed via its mediaServerFeed() accessor function, which returns an interface for media server feed capabilities, including statistics.

Here, given a feed which is one of the MediaServerFeed concrete classes mentioned above, the application can access the feed statistics.

// Where feed is one of the MediaServerFeed subclasses
feed->mediaServerFeed().feedStatistics([](const auto& stats) {
    // The stats parameter contains the feed reference and the collected stats
    auto feed = stats->feed;
    auto webRtcStats = stats->webRtcStats;
});

See the documentation for WebRtcStats for more details on the stats collected.