Imagine that making your own video or audio news programs was really easy.
Imagine if you could pull together news stories from various sources, combine them and publish the results in minutes.
Imagine that the story you made came complete with a transcript, was fully accessible and could be picked up by search engines.
Imagine that you could easily embed your story into a web page and that others could incorporate your story into new stories.
Imagine that viewers of you story could share any part of it in seconds.
Imagine all this and you are imagining the functionality of the Hyperaudio Pad.
Listen to a use case.
Watch the screencast.
Currently, the process of putting together a media program, the process will probably involve some fairly complex audio/video editing software. The result – a fairly ‘static’ representation. The whole process is time consuming, wasteful and inflexible.
By changing the way we think about media, we can create a new way of manipulating it. The Hyperaudio Pad allows quick manipulation of media and takes advantage of a form of media representation referred to as a hypertranscript.
When we tell a story we invariably use the spoken word, the spoken word can be transcribed, and once transcribed can be used as an accurate reference to the media it is associated with.
Hypertranscripts are part of a broader concept dubbed Hyperaudio by Henrik Moltke, they are transcripts hyper-linked to the media they represent, a type of fine-grained subtitle, they are separate entities from the media they describe, existing as a word-level aligned references expressed in HTML.
Tools and services exist to create Hypertranscripts, word level timing can even be approximated from subtitles. In the past I have used libraries such as Popcorn.js, jPlayer and data from the Universal Subtitles project.
The good news is that Hypertranscripts already exist, here are a few examples:
Using Hypertranscripts we can:
- Navigate through the media via the text.
- Select and play parts of the media by highlighting the relevant parts of the text.
- Share parts of the media creating URLs that contain start and end points.
- Make our media more discoverable (via search engines)
- Make our media more accessible.
- More easily visualize the media.
- Combine, embed and easily integrate media (using simple tools)
The Hyperaudio Pad
The Hyperaudio Pad is a tool to allow users to easily assemble spoken word based audio or video from Hypertranscripts.
It will allow users to create their media by simply copying and pasting text from existing transcripts into a new transcript. These Hypertranscripts link to the actual media that they transcribe, so when parts of the transcripts are copied the references to their associated media are also copied.
Note that the media itself is not copied, just references to parts of it, the result is mash-up of video or audio and an associated representative transcript.
The tool will be :
- Web based, cross-browser, cross-platform (work on mobile devices) and open source.
- Simply presented, taking cues from minimalist distraction-free tools such as iAWriter.
- Intuitive and extremely easy to use, taking advantage of the text editing paradigm.
- Extensible – written in such a way that it can easily be built upon.
The interface should be clear and minimal, the user can search for and choose from existing hypertranscripts, open them, play them and copy parts of them to their ‘document’ all from within the same page.
Who is this tool for?
- Newsroom journalists who need to very quickly assemble media from different existing sources and perhaps adapt the result as news unfurls.
- Podcasters, citizen journalists and mediabloggers who want a free and easy way of putting together media programs or newscasts.
- Anyone who wants their resulting media programs to include a transcript for increased accessibility and visibility.
- Anybody who is happy for their results to be taken apart, re-combined or referenced by others.
How does it work ?
Hypertranscripts are essentially defined in HTML. The simplest possible example for the use by a tool such as the Hyperaudio Pad could be:
Note, we can add to this format should we wish to include meta and other data, for example :
Transcripts will contain a series of these marked up words each individual word containing enough data to describe it’s associated media and at which time in the media it occurs.
Note that copying the Hypertranscript into any text editor will result in valid HTML and so can be used in other tools and applications external to the Hyperaudio Pad.
It is likely that the user’s resulting transcript will reference several different pieces of media and so it may beneficial for the application to ‘load’ and possibly cache the various media sources in advance for smoothest possible playback. We can achieve this by parsing the transcript after every save.
Future developments could include a simple scripting language that users can insert between transcripts to make transitions smoother and even the possibility to add background music or sound effects for example:
[fade out over 5 seconds]
[background fade in 'threatening music' over 3 seconds]
and then at the bottom
[reference 'threatening music' http://soundcloud.com/some.mp3]
You may say I’m a Dreamer
In order to start using the Hyperaudio Pad to its full potential we first need our media to be transcribed and to do this we need to make tools that make this easier and if possible free. However I believe that hypertranscripts deliver so many benefits that the incentive to transcribe media is high.
The Hyperaudio Pad is just one of many tools that could be built on the underlying Hypertranscript platform. To build this platform we should collaborate with others to :
- establish a standard markup for transcripts
- make the platform easy to build upon and enhance as new technologies become viable
- create a community with a pioneering spirit
- develop an ecosystem to facilitate the creation of a toolkit
During the course of this project it’s been a pleasure to find many other participants interested in the general theme of describing/transcribing media and utilizing the result. Over the last couple of weeks I have been collaborating with Julien Dorra, Samuel Huron, Nicholas Doiron and Shaminder Dulai. The excitement has been palpable and although only communicating virtually it felt like we were sparking off each other.
So I guess at least as far as this dream is concerned, I’m not the only one.
4 Comments to The Hyperaudio Pad – a Software Product Proposal
- The Hyperaudio Pad – Next Steps and Media Literacy
- Breaking Out – The Making Of
- Breaking Out – Web Audio and Perceptive Media
- Altrepreneurial vs Entrepreneurial and Why I am going to Work with Al Jazeera
- HTML5 Audio APIs – How Low can we Go?
- Hyperaudio at the Mozilla Festival
- The Hyperaudio Pad – a Software Product Proposal
- Introducing the Hyperaudio Pad (working title)
- Accessibility, Community and Simplicity
- Build First, Ask Questions Later
- Further Experimentation with Hyper Audio
- Hyper Audio – A New Way to Interact
- P2P Web Apps – Brace yourselves, everything is about to change
- A few HTML5 questions that need answering
- Drumbeat Demo – HTML5 Audio Text Sync
Add new tag