Last updated: 2022-06-11

Live streaming on Plan 9: 10 Tips On How To Become A Successful Twitch Streamer!

Thanks to Sigrid, livestreaming to (at least) Twitch and Peertube with both video and audio has been possible for a while now. The README in the rtmp repo already gives a lot of details on how to accomplish it, so make sure that you read that first.

This document will mostly attempt to fill in some blanks and provide some advice on configuration to experience streaming to some of the most egregious online services to its fullest. The intended result is a working stream at 1920x1080 and 60 FPS with a face cam and mixing multiple sources of audio including a microphone.

Requirements

The software requirements are already mentioned in the README.md file at rtmp:

rtmp: audio and video streaming using the RTMP protocol.
hj264: h264 video encoder, based on minih264.
aacenc: AAC audio encoder.

Everything else is included in 9front.

A key from Twitch et al is necessary to connect and authenticate, for which you may refer once again to the README. For this service, the stream's configuration can, as far as I know, only be done through their webshit. This includes saving VOD's, latency, label, descriptions, etc.

Mouse and screen settings

Depending on the type of content, some tweaks may be necessary:

disable screen blanking: this is useful for instance when playing games with a joystick (through nusb/joy(4), since the controller does not wake the screen:
; echo blanktime 0 >/dev/mousectl
use software mouse cursor: because hardware cursors will not be visible in the video:
; echo hwgc soft >/dev/vgactl

change mouse acceleration: a minor, self-explanatory point, though only for USB mice:

; echo accelerated 3 >/dev/mousectl
# or disable acceleration:
; echo linear >/dev/mousectl

hj264 and picture quality

To reiterate, the goal here is a stream with audio and video at 1080p at 60Hz with legible text for programming streams. The machine I ended up streaming with is a 2014 Haswell desktop with an i7-4770k CPU running at 1920x1080x32 resolution at 60Hz through HDMI using the igfx driver, and runs very well with the final settings selected below. The screen supports 3840x2160x32 at 30Hz through HDMI as well, but running a stream at that resolution is impossible, at least for that desktop.

Aside from the streaming machine, the only but most important knobs are hj264's parameters. At least for now, they cannot change at runtime, and are set through the command line. Just a cursory glance at the source code will reveal all of them and their default and possible values. Here are the most important, to my mind, to consider:

-f [default 30]: framerate
-Q [0 to 51, default 33]: QP, quantization parameter
-k [-, default 0, ie. no limit]: maximum bitrate

The number of CPU cores to use is determined automatically through the $NPROC environment variable. It does not use all available cores and I haven't seen any big difference in changing it. There also is an overall quality parameter -q (0 to 10, default 0) and a few others, from which I have not noticed any significant impact.

Audio streaming

Besides h.264 video, the pipeline uses aacenc for the audio stream.

Quality-wise, its default settings are fine for streaming, but have noticeable artifacts. I find that a constant bitrate between 128 and 256 kb works better. The numbers needn't be a multiple of 8, but must be specified in bytes rather than kilobytes:

; audio/aacenc -B 262144
# or just:
; audio/aacenc -B 200000

Ensuring that audio works correctly is crucial for the stream to work correctly. This implies pushing a stable uninterrupted supply of audio samples to aacenc. As Sigrid shows in rtmp's README, one can use dd(1) and mixfs(3) to fill any silence with empty samples from /dev/zero. This isn't necessary if there is another source of audio which transmits samples even during "silence", as is the case with a usb-connected hardware mixer (as in my use case).

Note also that aacenc should be used with the -b flag to disable buffered writes.

Starting the stream

The basic command line with near-optimal settings which Work on My Machine™ is:

; video/hj264 -f 60 -k 1000 -Q 50 /dev/screen |\
	video/rtmp -a <{audio/aacenc -b </mnt/mix/audio} \
	$key

Depending on the setup, it would probably also be necessary to set the silence filling fix up at the same time:

; window -hide rc -c \
	'dd -bs 44100 -if /dev/zero -of /mnt/mix/audio'

At the end of a stream or recording, I just kill(1) the encoders and dd without attempting much cleanup:

; slay rtmp hj264 dd | rc

Performance

Since there is no hardware acceleration and draw performance is usually fairly limited even on well-supported but older machines, a busy screen will significantly degrade the stability and quality of the stream. Low performance quickly leads to audio/video desynchronization. I have a vague recollection of the desynchronization reducing itself once the picture stays essentially static for a while, but I'm not sure that I didn't just dream this up. aacenc's parametrization doesn't seem to have any noticeable impact on performance.

In coding streams, the screen does indeed stay static for the vast majority of the time, and this isn't a huge issue, unless performance is low to begin with. Playing a game such as doom(1) however will blur the screen (depending on hj264's quality settings), probably desynchronize audio and destabilize the stream.

There are several ways to try to compensate for it:

close all other graphical windows
reduce screen size
reduce hj264's framerate
stream a subrio

The last idea actually works very well. From what I can tell, there is no real requirement for the stream to be in a standard resolution. One may then limit the stream to a rio(1) window, or a subrio, the size of the game's output (scaled or not). I have used this to stream NES and SNES games with good results in terms of both quality and performance. The only thing needed is to point hj264 to a window instead of the entire screen:

; video/hj264 -f 60 -k 1000 -Q 50 /dev/wsys/11/window |\
	video/rtmp -a <{audio/aacenc -b </mnt/mix/audio} \
	$key

It is important to mention that performance and quality depend on screen brightness and contrast. The stream's brightness will be significantly lower than the picture of the screen. This should be taken into account for games, by for instance raising a brightness/gamma setting if available. Contrast is also critical for text to be legible. For idiots like me who like to use non-standard color schemes, high contrast colors must be used to avoid blurring. Obviously, the need for more contrast is diminished with better quality settings for hj264.

Multiple audio sources

The README in Sigrid's rtmp repo already shows simple ways to mix in and monitor a microphone/headset, and that will probably be sufficient for most people. In my use case, I would like to mix in audio from multiple systems (including Unix ones), a studio microphone and electric guitars or bass. I use a hardware mixer to manage all of this, and want to patch its output into rtmp. Thankfully, the mixer has a USB interface and can act as an audio card without any drivers, which 9front can happily use.

There are multiple problems there however. First of all, the Haswell desktop and other more recent hardware mostly have USB3 connectors, which may or may not work with the card on 9front (and they don't for me), probably due to bugs in the usb stack or xhci driver. Second, even if the ports work, there currently isn't a way to set volume with nusb/audio, ie. the volume control doesn't work (at least not on cards I've tested) and any output is at full volume, which is unsuitable for me.

To solve this, one may just use the USB connection on an older machine, and import its own #u (USB) device. The following really does the same thing as what Sigrid shows, but with a few adaptations:

; rimport -u qwx w500 '#u' /n/u
; video/hj264 -f 60 -k 1000 -Q 50 /dev/screen |\
	video/rtmp -a <{audio/aacenc -b </n/u/audioin} \
	$key

Notice the use of the audioin, and the lack of the need to use dd(1) to pump empty samples into mixfs(1).

Face cams

The fullest, most authentic livestreaming experience necessitates showing one's likeness on the screen. As others have already shown, this is possible with USB webcams via the use of the thus far undocumented nusb/cam(4) and camv(1) programs.

nusb/cam(4) requires the id of a USB endpoint of a USB camera. To find out the correct one, you can monitor /dev/usbevent when attaching the camera:

; cat /dev/usbevent
attach 7 0ac8 3500 0102ef cd8ba

If that doesn't help, one can just try every existing endpoint:

; ls -d /dev/usb/ep*.0/
/dev/usb/ep1.0
/dev/usb/ep2.0
/dev/usb/ep3.0
/dev/usb/ep4.0
/dev/usb/ep5.0
/dev/usb/ep6.0
/dev/usb/ep7.0
/dev/usb/ep8.0
; nusb/cam 7
;

Next, camv(1) is used on the directory nusb/cam(4) creates:

; camv /dev/cam7.1

If successful, it will display video from the camera. Clicking the screen with the middle mouse button will show all available settings, including resolution, framerate, color settings, etc. However, several of them including resolution and framerate cannot be changed while camv(1) is running. For that, exit camv(1) and write to the camera's ctl file:

; echo format 320x240 >/dev/cam7.1/ctl
; echo fps 30 >/dev/cam7.1/ctl

The names of the parameters are also those displayed in camv(1).

There is one last component to this setup, which is to have the camv(1) window always be on top without taking away focus from other windows. A simple hack for this is to just send a top message to the window's ctl file in an infinite loop:

; { while() sleep 2 && echo top >/dev/wctl } &

If using Sigrid's riow(1) and for this to work across workspaces, one must also make the window sticky with Mod4+s.

Message and status bars

It's easy to implement a simple text bar to stick to a border of the screen and display messages: for instance the name of the music being played, with perhaps a likelihood of being auto-muted. One idea is to open a statusmsg(8) window, and feed text to it from a pipe posted on /srv (see srv(3)). Here's one implementation, dubbed sbar(1):

#!/bin/rc
rfork n
rm -f /srv/bar
bind -a '#|' /mnt/bar
<>[3]/mnt/bar/data1 {
	echo 3 >/srv/bar
	aux/statusmsg -k -w 720,1048,1590,1080 </mnt/bar/data &
}

Its placement is hardcoded right now, but that's easy to change. You can then either manually write (append) text to the /srv/bar file, or write more scripts to prompt for text to send there. I use Sigrid's riow(1) and bar(1), so I've elected to add a button to the bar(1) setup which does just that:

[...]
window -r $riowrect 'label riow;
fn ⑨ { if(test -f /srv/bar) window ''echo -n >/dev/snarf; hold /dev/snarf && cat /dev/snarf >>/srv/bar'' }
fn bar{
	sed -u ''s/$/│⏪│⏯│⏩│⑨│⏵│⏸│⏭│⏶│⏷/g'' \
	| /bin/bar -s ''│'' \
	| awk ''
		/⑨/{system("⑨")}
		# zuke
		/⏪/{system("plumb -d audio ''''key <''''")}
		/⏯/{system("plumb -d audio ''''key p''''")}
		/⏩/{system("plumb -d audio ''''key >''''")}
		# shp
		/⏵/{system("Sta")}
		/⏸/{system("Sto")}
		/⏭/{system("Fw")}
		/⏶/{system("v+")}
		/⏷/{system("v-")}
	'' >[2]/dev/null
}; riow'

Streaming the Nintendo Switch

This console's video output can be captured using a cheap HDMI → USB3 capture card. If the card is recognized, it will act as a USB camera (just like on Unix with v4l2) and can be used in the same way as above, even at the same time as a regular webcam.

The main limitation is that currently the only supported video format is YUY2, with which larger resolutions may only be displayed at very low framerates. Optimal resolutions and framerates can only be used with the MJPEG format. However, once added there isn't much in the way of smooth drawing at 720p and 60Hz, other than 9front's own draw performance. The capture card also provides audio, but this isn't detected and seems to require additional work as well.

Putting it all together

Here's what a script using all of this may look like:

#!/bin/rc
echo hwgc soft >/dev/vgactl
echo blanktime 0 >/dev/mousectl
window -r 1730 839 1920 1055 '
	label cam
	nusb/cam 7
	cat <<! >/dev/cam7.1/ctl
format 320x240
fps 30
backlight-compensation 1
brightness 20
contrast 95
saturation 40
sharpness 7
gamma 200
!
	{ while() sleep 2 && echo top >/dev/wctl} &
	camv /dev/cam7.1'
echo 'proceed?'
read -n 1 >/dev/null
rimport -u qwx w500 '#u' /n/u
video/hj264 -f 50 -k 1000 -Q 50 /dev/screen |\
	video/rtmp -a <{audio/aacenc -b </n/u/audioin} \
	$key

Offline recording

Whether for tests or for another purpose, it can be useful to record offline to a file. I don't think that there currently is a way to circumvent using ffmpeg on a Unix to join the audio and video files/streams in a container like MP4. One could use ssh(1) to script conversion with ffmpeg. Either way, the basic command to create an MP4 file is simple:

$ ffmpeg -i vid.264 -i aud.aac -c:v copy -c:a copy vid.mp4

Of course, you could encode the audio in FLAC or Ogg/Opus instead, use an MKV or WEBM container, etc.

Experiment: streaming another machine's screen

I've made many tests with many different setups at this point. The most interesting one from my point of view is to use a separate, faster machine to do the encoding and streaming. I have another small Comet Lake desktop which is nearly current-gen, and by that virtue alone at least two times faster than the Haswell machine. However, its UEFIshit doesn't support any legacy BIOS features of any kind. Once 9front is booted, UEFI provides a screen buffer the size of which we cannot control, and it turns out that it's a paltry 640x480 resolution, not enough for my purposes.

One idea therefore was to rexport(1) the Haswell desktop's 1920x1080 /dev/screen to the Comet Lake one (as well as an audio source) and start hj264, rtmp and the rest on that one. Unfortunately, this did not work out well enough, since the transfers choke performance even on local LAN. On the other hand, it may work slowly, but it does work, for what that's worth. Here's what the script looked like (parts of it can certainly be done better:

rfork ne
# record w500 audio/video on x240 or other machine
cmd=`"{
cat <<'EOF'
ramfs
touch /tmp/^(audio screen)
f=`{ls /srv/rio.$user.* | sed 1q}
mount $f /mnt/wsys 1
bind /mnt/wsys/screen /tmp/screen
mount /srv/mixfs /mnt/mix
bind /mnt/mix/audio /tmp/audio
rexport -s rec /tmp x240
EOF
}
rcpu -u $user -h w500 -c rc -c $cmd &
rpid=$apid
while(! test -f /srv/rec){
	echo waiting on connection
	sleep 5
}
theo
mount /srv/rec /n/u
window -scroll 'echo slay hj264 aacenc cam exportfs ''|'' rc; echo rm -f /srv/rec; echo echo kill ''>''/net/tcp/'^$rpid^'/ctl ''|'' rc; rc'
window -m rc -c 'audio/aacenc -B 262144 -b </n/u/audio >/tmp/rec.`{date -t}^.aac'
video/hj264 -f 15 -k 1000 -Q 50 /n/u/screen >/tmp/rec.^`{date -t}^.264
unmount -q /n/u
rm -f /srv/rec

Troubleshooting

If you're using a USB audio card, you may encounter problems with both new and old machines. On older machines, watch out for USB 1 ports (UHCI): reading the device will work for a short while then die with a babble detected read error (insofar as I understood that to be the cause of the trouble anyway). Make sure you use a USB 2 port (EHCI): for example, one of my t61p's has 2 USB 1 ports and 1 USB 2 port. Nice symmetry, bad surprise. With USB 3, results may vary. On some machines it doesn't work at all, on others it works fine. Make sure you try other ports and/or other machines.

Epilogue

Streaming on Twitch has been both fun and a pain in the ass over all, mostly because of their webshit and DMCA practices. I intend to continue to do it until the moment where my stream gets auto-muted because of the music I listen to. Playing audio clips from a movie always mutes it almost immediately.

Performance-wise, as long as the machine I use for the actual streaming performed well enough, that is with settings such that neither video nor audio encoding could choke it, the stream is smooth, does not break up, and audio does not desynchronize. The Haswell desktop has been thus far my machine of choice. All other ones (incl. x240, t490s, w520, w500) did not fare as well and saw either low framerates and audio underflows, the encoding process failing abruptly, or the stream being cut off, though admittedly also because I tried hard to have the highest possible picture quality that I could get away with.

Suggestions for alternatives to Twitch though with similar basic features would be very welcome. I don't quite know how to use eg. Peertube for my purposes.