Streaming Evolution – File downloading to Adaptive Streaming
This article covers how media content travels from a server to the user. There
are different ways to access the content, and over time many technologies have
evolved to deliver it efficiently.
I have done extensive analysis and research on streaming technologies. As part
of that work, I want to share the details here.
Content download and streaming are the two ways to view media content.
Downloading itself comes in two forms: file download and progressive download.
File Downloading:
File downloading transfers the entire media file from a web server over HTTP or
FTP and saves it in the local device memory. The file is referenced by a
hyperlink. Once downloaded, it can be opened in an appropriate media player
application to display the content. This method is best suited to small files.
For large files, the user has to wait a long time for the download to complete
before viewing the media content.
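To put a rough number on that wait, the sketch below estimates how long a full
download takes at a constant bandwidth. The file sizes and the 10 Mbit/s link
are illustrative assumptions, not measurements, and real links rarely hold a
constant rate.

```python
def download_seconds(file_size_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Return the time needed to fetch the whole file at a constant bandwidth."""
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    return file_size_bytes * 8 / bandwidth_bits_per_second


# Assumed example values: a 5 MB clip and a 700 MB movie on a 10 Mbit/s link.
small_wait = download_seconds(5 * 1000 * 1000, 10_000_000)
large_wait = download_seconds(700 * 1000 * 1000, 10_000_000)
print(f"small file: {small_wait:.0f} s, large file: {large_wait:.0f} s")
```

With these assumed numbers the small clip is ready in a few seconds, while the
large file keeps the user waiting for several minutes before playback can begin.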
Progressive Download
Progressive download lets a file be downloaded and rendered at the same time.
The media file is downloaded from the web server to the client device. As soon
as the download starts, the client invokes the media player, which begins to
play once a sizable amount of data is available in the client playout buffer.
The user can store the complete file and play the content whenever required
without downloading it again from the server. The media player stops playback
on a buffer underrun (the playback rate exceeds the download rate) and resumes
after more data has been downloaded. A buffer overrun can also happen when the
download rate exceeds the playback rate.
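The underrun behaviour described above can be illustrated with a small
simulation. This is a minimal sketch under simplifying assumptions: constant
download and playback rates, a fixed time step, and a hypothetical
start_threshold that decides when playback starts or resumes. Real players use
more elaborate buffering rules.

```python
def simulate_playout(download_rate, playback_rate, total_size,
                     start_threshold, step=1.0):
    """Simulate progressive download and return (stall_count, finish_time).

    Rates are in bytes per second, sizes in bytes, and step in seconds.
    """
    if download_rate <= 0 or playback_rate <= 0 or step <= 0:
        raise ValueError("rates and step must be positive")
    downloaded = 0.0
    played = 0.0
    playing = False
    stalls = 0
    elapsed = 0.0
    while played < total_size:
        # The download keeps running whether or not the player is playing.
        downloaded = min(total_size, downloaded + download_rate * step)
        buffered = downloaded - played
        if not playing:
            # Start or resume once enough data is buffered, or once the
            # whole file has arrived.
            if buffered >= start_threshold or downloaded >= total_size:
                playing = True
        if playing:
            played += min(playback_rate * step, buffered)
            if played < total_size and downloaded - played <= 0:
                # Buffer underrun: the player has consumed everything received.
                playing = False
                stalls += 1
        elapsed += step
    return stalls, elapsed


slow_link = simulate_playout(download_rate=100, playback_rate=200,
                             total_size=1000, start_threshold=200)
fast_link = simulate_playout(download_rate=400, playback_rate=200,
                             total_size=1000, start_threshold=200)
print("slow link (stalls, seconds):", slow_link)
print("fast link (stalls, seconds):", fast_link)
```

When the download rate is below the playback rate the player stalls
repeatedly. When it is above, playback runs without interruption and the
unplayed data simply accumulates in the buffer.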
Progressive download uses HTTP (Hypertext Transfer Protocol) over TCP
(Transmission Control Protocol). TCP is a reliable protocol optimized for
guaranteed delivery, irrespective of file format or size, and it controls the
actual packet transport over the IP network. Packet retransmission, however,
consumes extra bandwidth and time, which limits the real-time end-user
experience. The video representation also stays the same for the entire
duration, regardless of any drop or surge in bandwidth. HTTP web servers keep
the data flowing until the download is complete. The method uses the existing
web infrastructure and does not require any additional firewall or Network
Address Translation (NAT) configuration, both of which are major issues in
RTSP/RTP streaming.
Streaming
Streaming is the process of breaking the data in a file into small packets that
are sent to the end device in a steady, continuous flow, as a stream. As soon
as the first few data packets are received, playback starts, and the rest of
the packets are transferred to the end user's device while the content is
playing. The packets are reassembled at the client side based on their sequence
numbers and time stamps. A short initial playout buffer delay is required to
accumulate a small amount of data in the buffer. The client playout buffer
ensures that playback continues uninterrupted despite variations in the
receive rate and in network delay.
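The reassembly step can be sketched as a small reorder buffer that holds back
packets arriving early and releases them in sequence-number order. This is an
illustrative simplification: the class name is hypothetical, sequence-number
wraparound is ignored, and lost packets are never skipped, which a real player
would have to handle with a timeout.

```python
import heapq


class ReorderBuffer:
    """Release packets in sequence-number order, holding back early arrivals."""

    def __init__(self, first_seq=0):
        self._heap = []
        self._next_seq = first_seq

    def push(self, seq, payload):
        """Store one packet and return every payload that is now in order."""
        if seq < self._next_seq:
            return []  # late duplicate of a packet that was already released
        heapq.heappush(self._heap, (seq, payload))
        ready = []
        while self._heap and self._heap[0][0] <= self._next_seq:
            head_seq, head_payload = heapq.heappop(self._heap)
            if head_seq == self._next_seq:
                ready.append(head_payload)
                self._next_seq += 1
            # Otherwise it is a duplicate of a released packet and is dropped.
        return ready


buffer = ReorderBuffer()
arrivals = [(0, b"a"), (2, b"c"), (1, b"b"), (3, b"d")]  # packet 2 arrives early
played = []
for seq, payload in arrivals:
    played.extend(buffer.push(seq, payload))
print(played)
```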
Application-level protocols sit on top of the transport protocol and are
required to deliver application-specific data and events. At the transport
level, UDP/IP and TCP/IP are used in packet-switched networks to carry the
content. Real Time Streaming Protocol (RTSP), Real-time Transport Protocol
(RTP), HTTP, and Real Time Messaging Protocol (RTMP) are some of the widely
used application protocols for media streaming.
Media streaming is divided into live and on-demand streaming based on where the
content originates. In on-demand streaming, stored, already encoded media
content is delivered to the consumer using a specific set of protocols. In live
streaming, the content is captured on the fly, encoded and transmitted to the
user. Live streaming therefore requires fast processing so that encoding adds
minimal latency.
RTP media streaming over UDP
User Datagram Protocol (UDP) is widely used in packet data networks to stream
multimedia content because of its flexibility and real-time delivery behavior.
RTP streaming over UDP is widely used in low-latency media and entertainment
applications such as streaming, video telephony, video conferencing, set-top
box applications and push-to-talk features. The Real-time Transport Protocol
(RTP) and the RTP Control Protocol (RTCP) are the application protocols used
for payload transmission and control respectively. Real Time Streaming Protocol
(RTSP) over TCP is generally used for session initiation and description, even
though the specification also allows RTSP over UDP. Figure 4 shows the
communication flow between the streaming client and the server.
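As an illustration of what travels in each RTP packet, the sketch below decodes
the 12-byte fixed RTP header (version, flags, payload type, sequence number,
time stamp and SSRC). The sample packet is fabricated for the example, and
header extensions and CSRC entries are not decoded.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed header found at the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit time stamp, 32-bit synchronization source identifier.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Fabricated sample: version 2, dynamic payload type 96, followed by dummy data.
sample = struct.pack("!BBHII", 0x80, 96, 4660, 90000, 0x12345678) + b"payload"
header = parse_rtp_header(sample)
print(header)
```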
The streaming operation is logically divided into three phases:
· Session description and control
· Media payload transportation
· Session quality and feedback
Session Description Protocol (SDP) is a presentation description protocol that
describes the session parameters required for initiation, setup and
negotiation. It provides information such as protocol version, session name,
network connection details, display orientation, media type, port, protocol,
format and session duration. It can be extended to carry new media types and
profile information.
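An SDP description is plain text made of single-letter "type=value" lines, so a
rough parser is short. The sketch below is a simplified illustration, not a
complete SDP implementation: the sample description, its addresses and the
function name are made up for the example.

```python
def parse_sdp(text):
    """Split an SDP body into session-level fields and per-media sections."""
    session = []
    media = []
    current = session
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip blank or malformed lines
        key, value = line[0], line[2:]
        if key == "m":
            # m=<media> <port> <proto> <fmt> ...
            kind, port, proto, *formats = value.split()
            section = {
                "media": kind,
                "port": int(port.split("/")[0]),
                "protocol": proto,
                "formats": formats,
                "fields": [],
            }
            media.append(section)
            current = section["fields"]  # later lines belong to this media
        else:
            current.append((key, value))
    return {"session": session, "media": media}


SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=Example Stream
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
"""

description = parse_sdp(SAMPLE_SDP)
for section in description["media"]:
    print(section["media"], section["port"], section["protocol"], section["formats"])
```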
RTCP packets provide important feedback information to the client and the
server. The role of RTCP is to monitor the transmission and reception
parameters and to convey that information to all the participants in an
ongoing session.
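One piece of feedback an RTCP receiver report carries is the fraction of
packets lost since the previous report, expressed as an 8-bit fixed-point
number (lost packets divided by expected packets, times 256). The sketch below
follows that calculation. The clamp to 255 and the sample counts are
assumptions added for the example.

```python
def fraction_lost(expected_interval: int, received_interval: int) -> int:
    """Return the 8-bit loss fraction for one reporting interval.

    expected_interval is the number of packets the receiver expected since the
    last report, and received_interval is the number that actually arrived.
    """
    lost_interval = expected_interval - received_interval
    if expected_interval == 0 or lost_interval <= 0:
        # Nothing expected, or duplicates made the count non-positive.
        return 0
    # Fixed point with the binary point at the left edge of the 8-bit field.
    # The clamp keeps a total loss inside the 8-bit range.
    return min(255, (lost_interval << 8) // expected_interval)


# Assumed interval: 200 packets expected, 190 received.
value = fraction_lost(200, 190)
print(f"fraction lost field = {value} (about {value / 256:.1%})")
```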
Interleaving RTP/RTCP over RTSP
Even though RTP and RTCP transmission over UDP gives a real-time user
experience, the connection between the streaming client and the server is not
reliable, and UDP packets are often blocked by network firewalls and NAT. As an
alternative, a packet-switched RTSP/RTP streaming solution can run over a
full-duplex TCP connection by interleaving RTP into the RTSP session, provided
that RTSP uses TCP for transport. Typically, RTSP channel number 0 carries the
RTP data stream and channel number 1 carries the RTCP control messages.
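On the wire, each interleaved packet is framed with a "$" byte, a one-byte
channel number and a two-byte big-endian length, followed by the RTP or RTCP
data. The sketch below builds and splits such frames. The function names are
hypothetical, and a real client would also have to handle RTSP text messages
mixed into the same connection.

```python
import struct


def frame_interleaved(channel: int, data: bytes) -> bytes:
    """Wrap one RTP or RTCP packet for transmission on the RTSP TCP connection."""
    if not 0 <= channel <= 255:
        raise ValueError("channel must fit in one byte")
    if len(data) > 0xFFFF:
        raise ValueError("data too long for a 16-bit length field")
    return b"$" + struct.pack("!BH", channel, len(data)) + data


def split_interleaved(stream: bytes):
    """Return ([(channel, data), ...], leftover) for the complete frames found."""
    frames = []
    offset = 0
    while offset + 4 <= len(stream) and stream[offset:offset + 1] == b"$":
        channel, length = struct.unpack_from("!BH", stream, offset + 1)
        end = offset + 4 + length
        if end > len(stream):
            break  # frame not fully received yet
        frames.append((channel, stream[offset + 4:end]))
        offset = end
    return frames, stream[offset:]


# Channel 0 carries RTP and channel 1 carries RTCP in this example.
wire = frame_interleaved(0, b"rtp-data") + frame_interleaved(1, b"rtcp")
frames, leftover = split_interleaved(wire)
print(frames, leftover)
```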
HTTP Tunnelling
HTTP tunneling is another way to let RTP/RTSP data pass through firewalls,
since most systems allow HTTP traffic to traverse them. RTP and RTSP streams
are wrapped into HTTP messages and transported over TCP. The receiver unpacks
the HTTP messages to recover the RTP/RTCP packets. Even though sending
streaming data via HTTP is the least efficient option, it ensures the most
reliable delivery.
RTMP Streaming
Real-Time Messaging Protocol (RTMP) is a proprietary, stateful Adobe media
protocol used for streaming over TCP. It supports live and on-demand streaming
of audio, video and data between a Flash media player and a Flash Media Server.
It supports video formats such as FLV and H.264 (MP4/MOV/F4V) and audio formats
such as MP3 and AAC (M4A). The RTMP variations are:
· RTMP – Adobe's Real-Time Messaging Protocol.
· RTMPE – RTMP Encrypted. This version uses Adobe's own security mechanism.
· RTMPT – RTMP Tunneled. RTMP data is encapsulated and exchanged via HTTP to
avoid the firewall/NAT issues of normal RTMP transfer.
· RTMPTE – RTMP Encrypted, tunneled over HTTP.
After session establishment, the Flash Media Server divides each media stream
into a number of small fragments of varying size and sends the media as a
steady stream of small information packets until the session ends. The
receiver and sender dynamically negotiate the size of the fragments to be
transmitted.
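The fragmentation idea can be sketched as splitting one message payload into
pieces no larger than the negotiated size. This is only an illustration: real
RTMP chunks also carry chunk headers, which are omitted here. The 128-byte
default matches RTMP's default chunk size, while the function name and sample
message are assumptions.

```python
def split_into_chunks(message: bytes, chunk_size: int = 128):
    """Split a message payload into pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [message[i:i + chunk_size] for i in range(0, len(message), chunk_size)]


message = bytes(300)  # 300 zero bytes standing in for an encoded media message

# Before negotiation: the default fragment size.
default_chunks = split_into_chunks(message)
# After the peers agree on a larger size, the same message needs fewer fragments.
negotiated_chunks = split_into_chunks(message, chunk_size=4096)

print([len(c) for c in default_chunks], [len(c) for c in negotiated_chunks])
```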
Adaptive streaming
The basic concept of adaptive streaming is to divide the audio and video into a
number of small chunks of a fixed duration, encode each chunk at several
different bit rates, store the results and deliver them to the client using
HTTP download.
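Because every chunk exists at several bit rates, the client can pick a
different one for each request. The sketch below shows one simple selection
rule: take the highest bit rate that fits within a fraction of the measured
throughput. The rule, the safety factor of 0.8 and the bit rate ladder are
illustrative assumptions, and real players use more sophisticated heuristics.

```python
def choose_bitrate(available_bitrates, measured_throughput, safety_factor=0.8):
    """Pick the highest encoded bit rate that fits the measured throughput.

    All values are in bits per second. Falls back to the lowest bit rate when
    even that one exceeds the budget.
    """
    budget = measured_throughput * safety_factor
    candidates = [rate for rate in available_bitrates if rate <= budget]
    return max(candidates) if candidates else min(available_bitrates)


# Hypothetical ladder of encodings for the same content.
ladder = [400_000, 800_000, 1_500_000, 3_000_000]

for throughput in (300_000, 2_000_000, 10_000_000):
    print(throughput, "->", choose_bitrate(ladder, throughput))
```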
Microsoft Smooth Streaming, Apple
HTTP Live Streaming (HLS), Adobe HTTP Dynamic Streaming (HDS) and MPEG DASH
(Dynamic Adaptive Streaming over HTTP) are the frequently used adaptive
streaming techniques.
Microsoft Smooth streaming
A Smooth Streaming server stores manifest files along with the media files. For
each media file the server holds a client manifest and a server manifest. The
client initiates the connection by requesting a URL from the server. The server
responds with the client manifest, which contains the metadata the client needs
to start the session. The server manifest file has the .ism extension.
An .ismv file contains video and audio data, or only video data. In an
audio-video representation, the audio track is multiplexed into the .ismv file
together with the video instead of being stored in a separate file. Each bit
rate representation is stored in a separate .ismv file. An .isma file contains
only audio and is required for audio-only streaming.
HTTP Live Streaming (HLS)
The basic principle of HLS is to divide the overall stream into a number of
small, segmented MPEG-2 TS files that are fetched by HTTP download. The
segmenter produces the media segments for each representation and creates the
index (m3u8) files. For multiple bit rate representations, the main m3u8 file
lists the sub-index m3u8 files, one for each encoding of the same presentation.
An HLS server manages thousands of individual fragments and sends the fragment
stream to the client. Each fragment starts with a Program Association Table
(PAT) and a Program Map Table (PMT), followed by the media data.
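To make the main/sub-index relationship concrete, the sketch below reads a main
m3u8 file and lists the sub-index playlist behind each bit rate. The playlist
content and URIs are invented for the example, and only the
#EXT-X-STREAM-INF tag is handled.

```python
import re

# Matches KEY=value pairs, where the value may be a quoted string with commas.
_ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_master_playlist(text):
    """Return one entry per variant stream listed in a main m3u8 file."""
    variants = []
    pending = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attribute_text = line.split(":", 1)[1]
            pending = {key: value.strip('"')
                       for key, value in _ATTRIBUTE.findall(attribute_text)}
        elif line and not line.startswith("#") and pending is not None:
            # The first non-comment line after the tag is the sub-index URI.
            variants.append({
                "uri": line,
                "bandwidth": int(pending["BANDWIDTH"]),
                "attributes": pending,
            })
            pending = None
    return variants


SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
high/index.m3u8
"""

variants = parse_master_playlist(SAMPLE_PLAYLIST)
for variant in variants:
    print(variant["bandwidth"], variant["uri"])
```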
HTTP Dynamic Streaming (HDS)
HTTP Dynamic Streaming (HDS) is an open-standard streaming solution developed
by Adobe. It supports adaptive bit rate live and on-demand streaming over HTTP,
using HTTP caching servers and a fragmented MP4 container format.
Dynamic Adaptive Streaming over HTTP (DASH)
An XML-based manifest file, the Media Presentation Description (MPD), describes
the media presentation details and playlists, similar to Smooth Streaming. The
client requests media segments at the various bit rate representations based on
the information in the manifest. The MPD file starts with the stream
information, followed by the media presentation contents for the various time
periods. Each period contains a number of adaptation sets for the audio and
video streams of that duration.
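The period and adaptation set layout described above can be walked with a
standard XML parser. The sketch below lists every representation in a tiny
hand-written MPD. The sample manifest is a hypothetical, heavily trimmed
illustration (no segment information), and the ids and bit rates in it are
made up.

```python
import xml.etree.ElementTree as ET

# XML namespace used by MPD documents.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     mediaPresentationDuration="PT30S">
  <Period id="p0" duration="PT30S">
    <AdaptationSet mimeType="video/mp4">
      <Representation id="v-low" bandwidth="500000" width="640" height="360"/>
      <Representation id="v-high" bandwidth="2000000" width="1280" height="720"/>
    </AdaptationSet>
    <AdaptationSet mimeType="audio/mp4">
      <Representation id="a-main" bandwidth="128000"/>
    </AdaptationSet>
  </Period>
</MPD>
"""


def list_representations(mpd_text):
    """Return (period id, mime type, representation id, bandwidth) tuples."""
    root = ET.fromstring(mpd_text)
    rows = []
    for period in root.findall("mpd:Period", NS):
        for adaptation_set in period.findall("mpd:AdaptationSet", NS):
            for representation in adaptation_set.findall("mpd:Representation", NS):
                rows.append((
                    period.get("id"),
                    adaptation_set.get("mimeType"),
                    representation.get("id"),
                    int(representation.get("bandwidth")),
                ))
    return rows


rows = list_representations(SAMPLE_MPD)
for row in rows:
    print(row)
```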
Adaptive HTTP streaming uses either the MPEG-TS or the fMP4 (fragmented MP4)
container format.
Content streaming using Cloud infrastructure and services
Cloud infrastructure can host both the media streaming servers and the content
files. The computing infrastructure makes it possible to create a virtual
environment and install media servers, either custom built or from a
third-party provider. Cloud infrastructure and services help with:
- Provisioning media servers
- Hosting content in cloud storage
- Content encoding/transcoding services
- Media engines that generate output in multiple formats
- Live streaming
- Edge content delivery
- Live video analytics