Hyperaudio

Making Audio a First Class Citizen of the Web.

hy·per·au·di·o

Non passive
Dynamically generated
Integrated into the web experience

As hypertext is to text, hyperaudio is to audio.

What makes audio so special?

Requires only partial attention
Conveys emotion well
Goes deeper ...

What do we need to make audio interactive?

Simply put ...

Set the audio playhead position
Get the audio playhead position

Word time-aligned transcripts. Hypertranscripts?

  <span data-t="123">Hello</span>
  <span data-t="456">Edinburgh!</span>

more info on custom data attributes
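
A minimal sketch of how the two playhead operations and the `data-t` attributes could fit together. It assumes `data-t` holds each word's start time in milliseconds (the slides do not state the unit) and that the spans appear in time order. `findActiveWord`, `wireTranscript` and the `active` class name are hypothetical, not part of any library.

```javascript
// Assumption: data-t values are word start times in milliseconds, ascending.
const MS_PER_SECOND = 1000;

// Return the index of the last word whose start time is <= timeMs, or -1 if
// the playhead is before the first word. Binary search keeps this cheap
// enough to run on every timeupdate event.
function findActiveWord(startTimesMs, timeMs) {
  let lo = 0;
  let hi = startTimesMs.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (startTimesMs[mid] <= timeMs) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Browser wiring: `audio` is an <audio> element and `container` is the
// element holding the <span data-t="..."> words.
function wireTranscript(audio, container) {
  const spans = Array.from(container.querySelectorAll("span[data-t]"));
  const times = spans.map((span) => Number(span.dataset.t));

  // Set the playhead: clicking a word seeks the audio to that word.
  container.addEventListener("click", (event) => {
    const span = event.target.closest("span[data-t]");
    if (span) audio.currentTime = Number(span.dataset.t) / MS_PER_SECOND;
  });

  // Get the playhead: mark the word currently being spoken.
  audio.addEventListener("timeupdate", () => {
    const active = findActiveWord(times, audio.currentTime * MS_PER_SECOND);
    spans.forEach((span, i) => span.classList.toggle("active", i === active));
  });
}
```

Only `findActiveWord` runs outside a browser; `wireTranscript` needs a page containing the audio element and the transcript.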

Transcripts break audio out of its black box

And make it ...

Navigable
Searchable
Shareable
Scotland.js 2013 - Mark Boas - @maboa

Navigable

A transcript can give you a very good idea of media content.

We can scroll through a transcript and take in content more easily than we can scrub audio or video.

Searchable

Transcribing your audio makes it much more findable and shareable.

Search engines can index your media and you can also search through text for keywords.
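
As a rough illustration of keyword search over a timed transcript, the sketch below returns the start time of each match so a player could jump straight to it. The `{ text, t }` word shape and the function names are assumptions made for this example, with `t` mirroring the `data-t` attribute.

```javascript
// Each word is { text, t }, where t is the word's start time
// (assumed to be in the same unit as the data-t attribute).

// Lower-case a word and strip anything that is not a letter or digit,
// so "Edinburgh!" matches "edinburgh".
function normalise(word) {
  return word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
}

// Return the start time of every place the phrase occurs in the transcript.
function searchTranscript(words, phrase) {
  const needle = phrase.split(/\s+/).map(normalise).filter(Boolean);
  if (needle.length === 0) return [];
  const haystack = words.map((w) => normalise(w.text));
  const hits = [];
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    if (needle.every((term, j) => haystack[i + j] === term)) {
      hits.push(words[i].t);
    }
  }
  return hits;
}
```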

Shareable

Now that we have a good grip on our content's word timings, we can allow people to link to excerpts of audio and their associated text.

We can add social mechanisms.
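
One way to make an excerpt linkable is to encode its time range in the URL. The sketch below borrows the temporal `#t=start,end` syntax listed under Media Fragments. It assumes word times are in milliseconds and that the receiving page reads the fragment itself; `excerptLink` is a hypothetical helper.

```javascript
// Build a link to an excerpt that runs from words[firstIndex] to
// words[lastIndex]. Assumptions: each word is { text, t } with t in
// milliseconds, and the fragment carries seconds.
function excerptLink(pageUrl, words, firstIndex, lastIndex) {
  if (firstIndex < 0 || lastIndex >= words.length || firstIndex > lastIndex) {
    throw new RangeError("excerpt indices out of range");
  }
  const startSec = words[firstIndex].t / 1000;
  // End where the following word starts; if the excerpt includes the final
  // word, leave the end open.
  const next = words[lastIndex + 1];
  const range = next ? `${startSec},${next.t / 1000}` : `${startSec}`;
  const url = new URL(pageUrl);
  url.hash = `t=${range}`;
  return url.toString();
}
```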

Language Switching

Precise timings mean we can switch languages on-the-fly.

Use Case : language learning applications.
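
A sketch of the idea, under loud assumptions: each language has its own recording, both transcripts are split into the same number of segments (sentences, say), and segment i in one language translates segment i in the other. `equivalentTime` is a hypothetical name.

```javascript
// Map a playhead position in one language to the matching position in
// another. Both arrays hold segment start times in seconds, ascending.
function equivalentTime(fromStarts, toStarts, currentTime) {
  if (fromStarts.length !== toStarts.length) {
    throw new Error("transcripts must have the same number of segments");
  }
  // Find the segment being spoken: the last one starting at or before
  // currentTime.
  let index = -1;
  for (let i = 0; i < fromStarts.length; i++) {
    if (fromStarts[i] <= currentTime) index = i;
    else break;
  }
  // Before the first segment, start the other language from the beginning.
  return index === -1 ? 0 : toStarts[index];
}
```

After loading the other language's audio, a player would set its playhead to the returned time, so the listener resumes at the start of the same sentence.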

Technology

Popcorn.js
jPlayer
jQuery

Popcorn.js

✔ Light
✔ Modular
✔ Time based events built in
✔ Active community
✔ Library independent
✔ IE8+ compatible to an extent

jPlayer

✔ Light
✔ Modular
✔ Skinnable
✔ Active community
✔ Flash fallback built in
✔ IE6+ compatible

Popcorn.js + jPlayer

A winning combo?

Supports IE8 using a custom player
Lets us use one audio/video format
Masks browser differences

jPlayer Popcorn Player

Going further with text

Highlight words as they are spoken.
Use colour and size of text.
Provide a mobile device experience.

The elephant in the room

How do we create word-aligned timed transcripts?

Use third-party services
By hand
Both

Third party transcription services (paid) $

3PlayMedia
Koemei
DotSub
Dragon Speech Engine
PlyMedia
Ramp
SpeakerText
VoiceBase (freemium)

Third party transcription services (free) ☮

Amara (formerly Universal Subtitles)
CMU Sphinx
Shout Toolkit
Something that hasn't been made yet!

Hypervideo?

As most video has an audio track, we can apply to video the same techniques we use to control audio.

Idea

When we synchronise audio with text we can manipulate audio in the same way we can text.

Hyperaudio Pad

Create audio/video programmes easily.
Web based intuitive interface.
Each programme comes with a hypertranscript.
Remix the remixes.
Programmes come with source intact.

Nothing left on the cutting room floor.

Hyperaudio Pad Applications

Citizen journalism.
Mainstream journalism when time is an issue.
Prototype (first cut).
Casual mash-ups. Art?

... and now for something completely different.

We've talked a lot about speech-to-text.

What about text-to-speech?

Introducing speak.js

Demo 7 - Perceptive Media

FutureBroadcasts.com

Dynamically generated audio

We are starting to see the ability to dynamically generate audio content.

Libs like Speak.js can generate audio on the fly.
Advanced audio APIs allow 'real-time' effects.
Standards for future audio are being forged!

The Web Audio API

The Web Audio API is a high-level JavaScript API for processing and synthesizing audio in web applications.

It takes a node-based approach: each node performs an audio function, and nodes are connected together to define the overall audio rendering.
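
The node-based approach can be sketched as a three-node graph: an oscillator feeding a gain node feeding the output. This is an illustration rather than code from the talk. `playTone` is a hypothetical helper that takes the audio context as a parameter, which in a browser would be `new AudioContext()`.

```javascript
// Build a three-node graph: oscillator -> gain -> destination.
// `ctx` is expected to be an AudioContext.
function playTone(ctx, frequencyHz, seconds) {
  const oscillator = ctx.createOscillator(); // source node
  const gain = ctx.createGain(); // volume control node

  oscillator.frequency.value = frequencyHz;
  gain.gain.value = 0.2; // keep the tone quiet

  // Connecting the nodes defines the rendering path.
  oscillator.connect(gain);
  gain.connect(ctx.destination);

  // Schedule the tone relative to the context's own clock.
  oscillator.start(ctx.currentTime);
  oscillator.stop(ctx.currentTime + seconds);
  return { oscillator, gain };
}
```

In a page, `playTone(new AudioContext(), 440, 1)` would play one second of the A above middle C, although browsers may require a user gesture before audio starts.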

It can be as simple as this ...

Source : Getting Started with Web Audio API (HTML5Rocks)

or as complex as this ...

Source : Web Audio API spec (w3.org)

Web Audio API - Three Audio

acko.net/files/three-audio/
(@unconed)

What's around the corner?

Lots of new media related web technology:

WebRTC
Web Speech API
Media Fragments

WebRTC

A technology to facilitate P2P comms.

MediaStream API - accessing camera and mic
PeerConnection API - connecting media streams
DataChannel API - real-time P2P transfer of data

WebRTC

Designed for browser-to-browser comms
Handles live streaming of audio and video
Facilitates the streaming of any type of data

Useful for ...

Voice Calling / Video Chat
P2P File sharing
Inter-application comms

Opus Audio Codec

Mandatory part of the WebRTC standard
Support for both constant bit-rate (CBR) and variable bit-rate (VBR)
Audio bandwidth from narrowband to fullband
Dynamically adjustable bitrate, audio bandwidth, and frame size
Good loss robustness and packet loss concealment (PLC)

Live Streaming is Possible

Pausing a stream causes audio to buffer. Reconnect to stay live.
Streams can break. The browser generally recovers, but detecting the break and reconnecting yourself is faster.
Only Flash can handle RTMP streams.
Chrome does not like the ICY response headers (often used by SHOUTcast). AAC will break.

Media Fragments

The three main parts are :

Spatial Dimension - eg #xywh=160,120,320,240
Temporal Dimension - eg #t=9,20
Track - eg #track=audio
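
To show how the temporal dimension might be consumed, here is a minimal parser for fragments such as `#t=9,20`. It is a simplification: only plain seconds are handled (optionally prefixed with `npt:`), not clock formats such as hh:mm:ss. `parseTemporalFragment` is a hypothetical name.

```javascript
// Parse the temporal dimension of a media fragment, e.g. "#t=9,20".
// Returns { start, end } in seconds, or null if there is no usable t value.
function parseTemporalFragment(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return null;

  // Fragment dimensions are name=value pairs separated by "&".
  const params = new URLSearchParams(url.slice(hashIndex + 1));
  const value = params.get("t");
  if (value === null) return null;

  const [startText, endText] = value.replace(/^npt:/, "").split(",");
  // A missing start means "from the beginning"; a missing end means
  // "to the end of the media".
  const start = startText === "" ? 0 : Number(startText);
  const end =
    endText === undefined || endText === "" ? Infinity : Number(endText);
  if (Number.isNaN(start) || Number.isNaN(end) || start >= end) return null;
  return { start, end };
}
```

A player could seek to `start` and pause once the playhead passes `end`. Some browsers also honour `#t=` directly on audio and video URLs.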

Web Speech API

An API for speech based input
Currently Chrome only
Server based

Demo

Mobile is Getting Better

But watch out for older mobile browsers.

Cannot autoplay
May not be able to affect volume
May not be able to play simultaneous audio
Will most likely not preload

Web Native Audio & Video - The Stats

[Chart not reproduced. Legend: Supports, Mixed, Does Not Support]

source: statcounter.com

The Future is Now

The audio element is very well supported. Already we can break audio out of its black box and integrate it into our web experiences in new ways and forms.

Very soon we will have cross-browser support for advanced audio and real-time communications.

source: statcounter.com

Thanks for listening, you can find us on...

More info Hyperaudio Google Group

