Hyper Audio – A New Way to Interact

Friday, April 8th, 2011

Mark Boas

Recently I had the privilege of working on a very interesting project with a few folk from Mozilla – it’s the type of project I love to work on, as it involves web audio and its deep integration into the general web experience.

Web audio is no longer consigned to being the passive play-and-pause experience of yesteryear. It has the potential to be much more: a driver of much richer interactions, which Henrik Moltke explores under the name Hyper Audio. The remit of the project was to take the various media elements of a radio interview broadcast by Danish radio station DR (audio, subtitles, transcripts, footnotes and so on) and link them in an intuitive and useful manner.

To say this project was right up my street would be an understatement – this project was in my flat, raiding my fridge and drinking my beerz. I was already fascinated by the concept. I’d been playing about, creating audio-related demos for a couple of years, and in November last year I decided to attend the Mozilla Drumbeat festival and created a demo for the event. The demo was accepted for exhibition at the science fair on the opening evening and garnered some interesting feedback both on and offline. What it effectively demonstrated was the synchronization and bi-directional control of text and audio.

When Henrik asked me to work on this project, I naturally jumped at the opportunity. Due to time differences, pressing deadlines and the luxury of having a nice quiet office, I stayed up late most nights for a week, happily hacking away, helped and supported by various Mozillians and the popcorn.js community.

So that’s the back-story. Here’s the demo.

Screenshot from HyperDisken Demo (Hyper Audio)

Some things to try:

  • Switch the audio from English to Danish – it should continue from the same point in Danish, subtitles and the transcript should also change appropriately.
  • Try clicking on words in the transcript – the audio should start playing from the corresponding point.
  • Highlight a passage of transcript text – this should add a tweetable excerpt to the ‘share’ box. The URL included should just play that part of the audio.
  • Clicking the music note icons in the ‘media’ box should take you to the point of the audio where that resource was mentioned.

How did we achieve this? We used popcorn.js to display subtitles, footnotes and other time-related resources. In fact, a lot of this was already in place when I picked up the project. I then integrated jPlayer for the audio playback and deeper interaction. Popcorn allows us to associate timings with actions and have these actions triggered by the media when it hits said timings, so it was pretty much perfect for our needs. jPlayer provided a solid abstraction above the native audio API: it allowed me to easily synchronize and switch audio tracks and jump to specific points or sections in the audio with very few lines of code. Importantly, it also protected us from any cross-browser issues and allowed our designers to effortlessly create a custom skin for the player.
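The idea of tying actions to timings can be sketched in a few lines of plain JavaScript. This is only an illustration of the concept, not Popcorn's API and not the demo's code: the cue list, its field names and the functions `activeCues` and `diffCues` are all hypothetical.

```javascript
// Hypothetical cue list: each cue has a start and end time in seconds.
const cues = [
  { start: 0, end: 4, type: "subtitle", text: "First subtitle" },
  { start: 2, end: 10, type: "footnote", text: "A footnote" },
  { start: 4, end: 8, type: "subtitle", text: "Second subtitle" },
];

// Return every cue that should be showing at the given playback time.
function activeCues(cueList, currentTime) {
  return cueList.filter(
    (cue) => currentTime >= cue.start && currentTime < cue.end
  );
}

// Compare two consecutive time updates and report which cues
// have just become active (show) and which have just expired (hide).
function diffCues(cueList, previousTime, currentTime) {
  const before = new Set(activeCues(cueList, previousTime));
  const after = new Set(activeCues(cueList, currentTime));
  return {
    show: [...after].filter((cue) => !before.has(cue)),
    hide: [...before].filter((cue) => !after.has(cue)),
  };
}
```

A player would call something like `diffCues` on each time update and run the show or hide action for every cue returned, which also copes with the listener jumping around in the audio.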

So this was the control, but what about the media? Well, this part was a massive team effort. Henrik managed to provide a very accurately timed transcript. We had hoped to use the subtitles in SRT format, but for convenience we parsed them (or rather, Scott Downe parsed them) into JSON format.
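For readers unfamiliar with SRT, here is a minimal sketch of what such a conversion could look like. It is not Scott's code, and the output shape (`start`, `end`, `text`) is an assumption chosen for illustration. It handles only the basic SRT layout: an index line, a timing line, then one or more text lines.

```javascript
// Convert an SRT timestamp such as "00:01:02,500" to seconds.
function srtTimeToSeconds(stamp) {
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{1,3})$/.exec(stamp.trim());
  if (!match) throw new Error("Bad SRT timestamp: " + stamp);
  const [, h, m, s, ms] = match;
  return (
    Number(h) * 3600 +
    Number(m) * 60 +
    Number(s) +
    Number(ms.padEnd(3, "0")) / 1000
  );
}

// Parse SRT text into an array of { start, end, text } objects.
function parseSrt(srt) {
  return srt
    .replace(/\r\n?/g, "\n") // normalize line endings
    .trim()
    .split(/\n{2,}/) // subtitles are separated by blank lines
    .filter((block) => block.trim() !== "")
    .map((block) => {
      const lines = block.split("\n");
      // The first line is the numeric index, the second holds the timings.
      const [start, end] = lines[1].split("-->").map(srtTimeToSeconds);
      return { start, end, text: lines.slice(2).join(" ") };
    });
}
```

Calling `JSON.stringify(parseSrt(srtText))` then gives a static JSON file that the page can load directly.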

One of the bigger issues we encountered was that we only had the transcript in English, and the timings for the Danish transcript were naturally different. Luckily we had accurately timed Danish subtitles and the legendary Bobby Richter on hand to convert the subtitles to individual words complete with their timings, which he did by cunningly interpolating the timing of each word based on its length and its position within the subtitle. All knocked out in about 10 minutes and in 20 lines of code. It worked surprisingly well, although of course you need to be able to understand Danish to truly tell. We could probably have parsed the subtitles into the transcript on the fly, but due to time limitations we made them static.
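One plausible reading of that interpolation is sketched below. Bobby's actual code is not reproduced here, so treat the function name `interpolateWordTimings`, the input shape and the exact weighting as assumptions. Each word is given a share of the subtitle's duration proportional to its character count, and it starts once the words before it have used up their share.

```javascript
// Spread one subtitle's time span across its words, giving longer words
// a proportionally larger share of the duration.
// `subtitle` is assumed to look like { start, end, text } (times in seconds).
function interpolateWordTimings(subtitle) {
  const words = subtitle.text.split(/\s+/).filter(Boolean);
  const totalChars = words.reduce((sum, word) => sum + word.length, 0);
  const duration = subtitle.end - subtitle.start;
  let charsBefore = 0;
  return words.map((word) => {
    // A word starts after the characters of all earlier words have elapsed.
    const start = subtitle.start + (duration * charsBefore) / totalChars;
    charsBefore += word.length;
    return { word, start };
  });
}
```

For a subtitle running from 10s to 14s with the text "ab cd efgh", the words would start at 10s, 11s and 12s respectively. The estimate ignores pauses and speaking rate, so it is only as good as the subtitle timings it starts from.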

As an aside not directly related to audio, I managed to hack together some code that places highlighted transcript text in the ‘share’ box and grabs the timings of the first and last words. From there it was pretty straightforward to make this excerpt tweetable.
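The step from a selection to a shareable excerpt might look roughly like this. It is a sketch under assumptions rather than the demo's code: the word list shape, the function `buildExcerpt` and the `mediaDuration` parameter are hypothetical, and the URL borrows the `#t=start,end` temporal syntax from the W3C Media Fragments specification purely for illustration, since the URL scheme the demo really uses is not described here.

```javascript
// Build a shareable excerpt from a range of selected transcript words.
// `words` is assumed to be an array of { word, start } in playback order,
// `firstIndex` and `lastIndex` are the first and last selected words,
// and `mediaDuration` is the total length of the audio in seconds.
function buildExcerpt(words, firstIndex, lastIndex, pageUrl, mediaDuration) {
  const selected = words.slice(firstIndex, lastIndex + 1);
  const start = selected[0].start;
  // End the clip where the next word begins, or at the end of the audio
  // when the selection runs to the final word.
  const next = words[lastIndex + 1];
  const end = next ? next.start : mediaDuration;
  const text = selected.map((entry) => entry.word).join(" ");
  return { text, url: `${pageUrl}#t=${start},${end}` };
}
```

On page load, the player would read the two times back out of the URL, seek to the first and stop at the second, so the link plays just that passage.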

This whole endeavor was very much a group effort. A huge thanks to the popcorn.js team, who made joining their IRC channel feel like walking into a pub full of friends.

Special credit and thanks then should go to Scott Downe, Bobby Richter, Barry Threw, David Humphrey, Brett Gaylor, Ben Moskowitz, Christian Valentiner, Silvia Benvenuti and of course Henrik ‘Tank’ Moltke whose baby all this was. It was great being part of such a talented team. Awesomesauce indeed.

Mark B

Tags: Audio, HTML5, Hyper Audio, javascript, jPlayer, Popcorn.js, Web Apps

7 Comments to Hyper Audio – A New Way to Interact

  • [...] Here’s how it was built; this is a Mozilla project using popcorn.js. [...]

  • [...] Boas worked tirelessly on the demo, and has written up a great post that outlines the technology and procedure behind the demo. My workflow was [...]

  • [...] the fun we had making the Hyperdisken demo, I was happy to be asked by Mozilla, in collaboration with Radiolab and [...]

  • [...] known as Hyperaudio and was given the opportunity to create a couple of proof-of-concept demos : Denmark Radio’s Hyperdisken Demo and the Radiolab/Soundcloud collaboration ( a shout-out to Henrik Moltke for doing much of the [...]

  • Hey Mark – I believe we met briefly in the jPlayer Google group – again – kudos for creating such a great tool for the web.

    I am particularly interested in what you’ve done here for our purposes at Farm Radio International. http://www.farmradio.org – we work with over 350 partner radio stations across Africa to support the creation of “good farm radio”. Radio is still the medium of choice with the most reach in Africa – especially in rural areas. We are looking at ways to improve this using new technologies to make radio even better at reaching farmers in Africa.

    As you can imagine, there are many languages in Africa. Most of our 350+ partners broadcast in a local dialect/language. We can currently share audio examples of the radio programs (typically between 15 and 30 minutes); however, most online listeners wouldn’t understand. So I’m interested in implementing your solution here to play the audio in the native language but have an English/French subtitle track running along.

    By implementing some of the scripts like Bobby Richter’s and finding a way to dynamically create the subtitle pages – if we provided transcripts of the audio/radio pieces (we already have these but without timings) – could we implement something similar?

    Where would the pitfalls be?
    How hard would it be?
    Where would we need to invest the most energy?

    Thanks for your thoughts…
    Bart

  • Oh yes – this service would be useful for strengthening the online community of radio broadcasters so they can share their radio pieces with one another and actually understand what is happening in the audio

  • [...] One example: My Little Ponies audio mixed with video from a toy gun ad. He mentioned a tool called Hyper Audio as a new way to engage critically with media — I know it has something to do with popcorn.js, [...]