Desktop Streaming to the Raspberry Pi using VLC and H.264

These are my revised notes on streaming a live Linux desktop to the Raspberry Pi using VLC.

The Problem

Whatever else you try, the truth is that it is pretty hard to render complex web pages on a Raspberry Pi, and almost impossible to do so reliably and at speed if you need any sort of animation or complexity. Even if you manage to build WebKit with GLES support, the resulting browser will still be severely limited by the CPU side of things.

So for the past few months I’ve been looking for a way to reliably broadcast a desktop session to a number of Raspberry Pis for a digital signage system.

The two obvious approaches work, but have the following caveats:

  • VNC can be made view-only with ease and is very fast if you get rid of X on the client (I have a framebuffer client for the Pi). Its simple encoding makes for very low latency, but it is a bandwidth hog.
  • RDP and rdesktop work very well indeed, but they don’t take advantage of the Raspberry Pi’s GPU. Even using a framebuffer client doesn’t help, and running RDP is a can of worms security-wise.

So the next obvious step was to look into using the GPU’s H.264 decoding functionality, which, sadly, is only readily available through omxplayer. That in turn raises another kind of problem: omxplayer is somewhat buggy, tends to freeze often, and has a number of limitations regarding streaming formats (it only supports HTTP pseudo-streaming, as opposed to RTSP, which would be massively more useful).

The Setup

My test setup has evolved a fair bit, and now consists of an upstart script that sets up an in-memory X display. I’m using Xtightvnc for this since it allows for a fair bit of command-line tinkering, but you can use Xvfb as well.

The upstart script looks like this:

# cat /etc/init/signage-vnc.conf
start on net-device-up


exec sudo -u signage Xtightvnc :1 -geometry 1280x720 -depth 16\
 -rfbwait 120000 -rfbport 5900 -viewonly -nocursor -dpi 96 -alwaysshared\
 -fp /usr/share/fonts/X11/misc/,/usr/share/fonts/X11/Type1/,/usr/share/fonts/X11/75dpi/,/usr/share/fonts/X11/100dpi/ -co /etc/X11/rgb

post-start exec sudo -u signage DISPLAY=:1 sh /home/signage/.xsession &

It assumes there’s a signage user, of course, and sets up a view-only virtual display with 16-bit color (see note 1) and my target resolution. Note that the -dpi option is essential if you want text rendering to look right in Xtightvnc.

When started, the service will then run the .xsession file, which looks like this:

# cat /home/signage/.xsession
uzbl-core --geometry 1280x720+0+0 &
exec cvlc screen:// :screen-fps=12 :screen-caching=100\
--sout "#transcode{vcodec=h264,venc=x264{keyint=12,scenecut=80,preset=faster,intra-refresh,tune=zerolatency,bframes=0,nocabac},fps=12,scale=0.75,vb=512}:std{access=http,mux=ts,dst=/stream.mp4}"

This starts a full-screen uzbl browser with the D3 Show Reel demo (an excellent torture test for this kind of thing) and a VLC instance that:

  • Samples the screen at 12fps, with a 100ms pixmap cache
  • Encodes it with the faster preset, requesting one keyframe per second (keyint=12 at 12fps), disabling CABAC encoding and B-frames (and hence prediction from future frames), replacing IDR frames with periodic intra refresh, and tuning for minimal latency (see this for a list of applicable settings). The output is scaled to 75% and targets a video bitrate of 512 kbit/s
  • Streams the result as a transport stream via HTTP (on port 8080 by default)
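The relationship between frame rate and keyframe interval described above can be sketched as a small shell helper. This is only an illustration: build_sout is a hypothetical function name, the one-second default interval is an assumption, and the chain leaves out the preset, scenecut and scale options for brevity.

```shell
#!/bin/sh
# build_sout FPS VB [SECONDS]
# Hypothetical helper (not part of VLC): prints a --sout chain in which
# keyint is derived from the frame rate, so that one keyframe is requested
# every SECONDS seconds (assumed default: 1).
build_sout() {
    fps="$1"            # frames per second sampled from the screen
    vb="$2"             # target video bitrate in kbit/s
    interval="${3:-1}"  # seconds between keyframes
    keyint=$((fps * interval))
    printf '#transcode{vcodec=h264,venc=x264{keyint=%d,intra-refresh,tune=zerolatency,bframes=0},fps=%d,vb=%d}:std{access=http,mux=ts,dst=/stream.mp4}\n' \
        "$keyint" "$fps" "$vb"
}
```

It could then be used as cvlc screen:// :screen-fps=12 :screen-caching=100 --sout "$(build_sout 12 512)", which keeps keyint and fps in step when you change the frame rate.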

The result can be rendered on a Raspberry Pi by pointing omxplayer at the stream’s HTTP URL.



There are still a number of problems with this setup, though:

  1. omxplayer can take up to 30 seconds to display anything, depending on buffering settings and framerate, and will drop out without warning, as if the stream had ended.
  2. Some fine tweaking is required to achieve a balance between good encoding quality and CPU load on the server.
  3. Bandwidth use per display easily hits the 512kbps mark (the vb=512 setting is in kbit/s), which (given that I can’t use multicast) means this doesn’t scale well, even if VLC seems to have no trouble taking on multiple clients.
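To put the scaling concern into numbers, here is a rough back-of-the-envelope sketch. It assumes every display receives its own unicast copy at roughly the configured vb value in kbit/s, and it ignores transport stream and HTTP overhead; the function names and the example figures (20 displays, a 100 Mbit/s link) are made up for illustration.

```shell
#!/bin/sh
# total_kbps CLIENTS PER_STREAM_KBPS
# Aggregate unicast load on the server in kbit/s (integer arithmetic).
total_kbps() {
    echo $(( $1 * $2 ))
}

# max_clients LINK_MBPS PER_STREAM_KBPS
# Upper bound on how many streams fit on a link, ignoring all overhead.
max_clients() {
    echo $(( $1 * 1000 / $2 ))
}

# Example: 20 displays at 512 kbit/s each, and a 100 Mbit/s link.
total_kbps 20 512     # prints 10240
max_clients 100 512   # prints 195
```

The real ceiling will be lower, since the muxer adds overhead and the encoder overshoots its target whenever there is a lot of motion.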

So far I’ve tried no fewer than three different omxplayer builds, and the buffering is still inconsistent, regardless of the settings I try (and it feels silly to set the incoming buffer to 0.1 MB, really). Still, things are a trifle better than a few months ago, when omxplayer flat out refused to use transport streams and I had to use mux=ps on the server side.
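One possible workaround for the silent drop-outs (not something I can claim fixes the buffering itself) is to wrap the player in a restart loop on the Pi. The sketch below is an assumption-laden example: the host name signage-server is hypothetical, the two-second delay is arbitrary, and play_with_restarts is just an illustrative name.

```shell
#!/bin/sh
# Hypothetical values: replace the host name with your streaming server.
STREAM_URL="${STREAM_URL:-http://signage-server:8080/stream.mp4}"
PLAYER="${PLAYER:-omxplayer}"
RESTART_DELAY="${RESTART_DELAY:-2}"

# play_with_restarts MAX_RUNS
# Run the player and start it again whenever it exits.
# MAX_RUNS=0 loops forever; any other value stops after that many runs.
# The number of completed runs is left in RUN_COUNT.
play_with_restarts() {
    max_runs="$1"
    RUN_COUNT=0
    while :; do
        # Ignore the exit status: a "finished" stream and a crash are
        # treated alike, since both leave the display blank.
        "$PLAYER" "$STREAM_URL" || true
        RUN_COUNT=$((RUN_COUNT + 1))
        if [ "$max_runs" -gt 0 ] && [ "$RUN_COUNT" -ge "$max_runs" ]; then
            break
        fi
        sleep "$RESTART_DELAY"
    done
}
```

Calling play_with_restarts 0 from the Pi’s startup script would keep the display coming back after each drop-out, at the cost of paying the startup delay again every time.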

In the meantime, I’ve been trying to address the remaining issues by fiddling with VLC settings:

  • Increasing keyframe frequency improves the startup time, but increases stream bandwidth.
  • Increasing framerate makes for stupendous quality and good startup times, but, of course, bandwidth also suffers.
  • Enabling B-frames decreases bandwidth usage, but quality suffers tremendously (at least for this use case).
  • Turning on CABAC encoding improves quality markedly, but increases latency. It does help a lot if you turn down the bitrate, though.
  • Tweaking the presets (veryfast, slow, etc.) has a noticeable impact on stream quality and bandwidth, but CPU usage varies along with them.

But, of course, all of the above depends on the subject matter. If you don’t have stuff moving about on screen constantly, this seems to be a good compromise (at the expense of a fair amount of CPU usage on the server whenever there’s motion):

exec cvlc screen:// :screen-fps=12 :screen-caching=100\
--sout "#transcode{vcodec=h264,venc=x264{keyint=12,scenecut=80,preset=veryslow,intra-refresh,tune=zerolatency,bframes=0},fps=12,scale=1,vb=320}:std{access=http,mux=ts,dst=/stream.mp4}"
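When comparing settings like these, it helps to measure what a client actually receives rather than trusting the vb value. The following is a minimal sketch that samples the HTTP stream with curl for a fixed time and converts the byte count into an average bitrate. The function names, the ten-second default and the example URL are assumptions for illustration.

```shell
#!/bin/sh
# kbps_from_bytes BYTES SECONDS
# Average bitrate in kbit/s (integer arithmetic, 1 kbit = 1000 bits).
kbps_from_bytes() {
    echo $(( $1 * 8 / 1000 / $2 ))
}

# measure_stream URL [SECONDS]
# Download the stream for SECONDS seconds (assumed default: 10), discard
# the data and print the average bitrate seen by this client.
measure_stream() {
    url="$1"
    secs="${2:-10}"
    # curl exits non-zero when --max-time cuts off the endless stream;
    # that is expected here, so the exit status is ignored.
    bytes=$(curl -s -o /dev/null --max-time "$secs" \
        -w '%{size_download}' "$url") || true
    kbps_from_bytes "${bytes:-0}" "$secs"
}
```

For example, measure_stream http://signage-server:8080/stream.mp4 30 (with a hypothetical host name) would report the average over half a minute, which is long enough to include several keyframes.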

A Note on Audio

Since I’m streaming an X desktop, streaming the audio is the kind of hassle I don’t really want to attempt (fortunately I have zero interest in streaming the audio at all), but you can do it by invoking VLC on a console session and using something like this (stripped for readability):

... --input-slave=pulse:// --sout "#transcode{...acodec=mp3,channels=2,ab=128,audio-sync}

  1. I haven’t noticed any significant difference between using 16bit/24bit on the server. Both the browser and VLC would probably work slightly faster when rendering/scraping a 24bit buffer, but the jury’s still out on that.