Manual Audio and Slide Capture

From Bjoern Hassler's website


2007-12-20 , Capturing a lecture, Capturing a mathematics lecture, Manual Audio and Slide Capture, Automated Audio and Slide Capture, OpenOffice Slide Capture, Lecture browsing system.

1 Manual lecture capture

This article describes a manual lecture capturing process, with slides and audio. Emphasis is on:

  • recording a lecture with synchronised slides
  • recording a lecture 'non-invasively', i.e.
    • without interrupting the flow of the lecture, or without special requirements on the lecturer.
    • without special requirements in the lecture theater
  • with minimal equipment
  • so that the output is compatible with low-bandwidth accessibility requirements.

These requirements impose significant constraints. Certainly in the longer run, you may want to go to a more automated system, but the present article shows how you can effectively record lectures, and present them online, with a maximum use of existing resources.

You should first read Capturing a lecture to choose the appropriate method for capturing the lecture. You may find that audio and slides are sufficient (without synchronisation). If you then decide that you need synchronisation, read on.

If you can use Camtasia (on PC/ppt), you have an easier way of recording timings. Likewise, Apple Keynote 4 can record presentations (and timings can be extracted, see Keynote 4 Extract Recordings). ProfCast on Mac can also record presentations. Similarly, you might be able to use a stills camera taking pictures automatically. For more information, see Automated Audio and Slide Capture. However, the notes below are for situations where those tools aren't available!

2 Before and during the lecture

2.1 Requirements

You need:

  • Some way of recording audio, ideally a microphone plus an audio recorder (you could record into a computer, or into some other recording device)
  • Some way of recording pictures, e.g. a stills camera, or a mobile phone
  • Ideally, you have a printout of the slides in front of you (e.g. as 4-up)

Recorder: Ideally, you would have a digital recorder. These are available quite cheaply now, and will save you a lot of work. However, if a recorder isn't available, you can record straight into a computer (with the risk of the computer crashing and thus losing the recording), or record onto an analogue medium (cassette), and transfer to a computer later.

Microphone: The microphone needs to be close to the speaker. If you do not have an external microphone, place the recorder as close to the speaker as possible. Use an external microphone if you can, ideally a radio microphone.

2.2 Recording

Audio: We used a Zoom H2 and Sennheiser EW112P microphone. This is quite good gear. Use whatever gear you have. The details of audio recording are described in the Audio Recording Tutorial.

Pictures: You first start recording with the recorder, and the first picture you take with the 'camera' is of the display of the recorder. This is to determine the offset between the recorder time and the camera time. During the lecture, you take a picture of the powerpoint show immediately after the lecturer changes slide. I recommend that, at a first attempt, you should only take images when slides change. If you take images in between, it makes things harder later on. In principle, you can of course take more pictures, e.g. every time something significant happens.

If you have a recorder with chapter marking ability (such as the FR2LE): Instead of using the camera, you can insert a track mark whenever slides change.

You should have a printout of the slides in front of you, so that you can mark down correspondence between the images and the slides, or points where the lecturer goes back.

Notes: You might also want to take notes. Every time the slide changes, you look across to the audio recorder, and note down the time. Ideally you type this straight into a computer.

2.3 Release Form

Ideally you speak to the lecturer well in advance, and get them to sign a release form. The shape of the release form will vary between different jurisdictions, and what you want to do with the materials, but essentially you need to make the lecturer aware of what you want to use the material for.

If the lecturer has not signed a form in advance, they need to sign one immediately after the lecture. Also, get their presentation.

You need to be careful about 3rd party copyright. The lecturer may be using materials that cannot be used for public broadcast. Typically this will be materials that

  • have not been produced by the lecturer themselves (e.g. music, feature film extracts, photographs from the web)
  • do not have explicit licensing attached (unlike e.g. creative commons materials, or US government materials, say from NASA)

You'll need to remove all uncleared materials from the recording or slides.

From an OER perspective, it's a good idea to publish materials under a creative commons license. Let the lecturer know that you're intending to do this, and get the relevant releases from the lecturers and from 3rd parties where necessary.

3 After the lecture

You should now have:

  • A digital audio recording
  • A completed release form
  • A set of images from the camera (or mobile phone)
  • The powerpoint or pdf from the lecturer in digital form (if you have a ppt, make an optimised pdf also)
  • A set of notes from the lecture

3.1 Export ppt or pdf slides to images

Firstly, take the powerpoint or pdf, and export one image per slide. You can do this e.g. from powerpoint using Open Office, but there are quicker ways of exporting all images in one go, e.g. using pdf utilities under Ubuntu/Linux. You'll end up with a set of image files, one per slide.
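
One possible way to do the export in one go is the pdftoppm utility (part of the poppler tools on Ubuntu/Linux). The sketch below only builds and runs that command from Python; the file name 'Lecture_123.pdf', the output prefix and the resolution are assumptions for illustration, and the article itself does not say which utility was used.

```python
import subprocess


def slide_export_command(pdf_path, output_prefix, dpi=100):
    """Build a pdftoppm command that writes one JPEG per PDF page."""
    # pdftoppm names its output <output_prefix>-<page number>.jpg
    return ["pdftoppm", "-jpeg", "-r", str(dpi), pdf_path, output_prefix]


def export_slides(pdf_path, output_prefix, dpi=100):
    """Run the export; raises CalledProcessError if pdftoppm fails."""
    subprocess.run(slide_export_command(pdf_path, output_prefix, dpi), check=True)
```

For example, export_slides("Lecture_123.pdf", "Images-Slides/Slide") would write one image per slide into an existing 'Images-Slides' folder. You would still need to rename the results to match whatever naming scheme you use for the photos.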


3.2 Rename and extract timings from images

You now take the photos from the camera (or mobile phone etc.), copy them to a folder called 'images-original-all', and then manually create a correspondence between the photos and the slides as follows: For each image, work out the slide during which the image was taken, and insert this into the name. I.e. you compare the photos with the slides created from the pdf file you got from the lecturer. After you have amended the names, your list should look like this:

Slide01.DSC035.jpg
Slide01.DSC036.jpg
Slide02.DSC037.jpg
Slide43.DSC053.jpg
Slide43.DSC054.jpg
Slide43.DSC056.jpg

In this example, two photos were taken during slide 1 (one hopefully immediately after slide 1 appeared, the other one at some point between slide 1 and 2), one photo during slide 2 (immediately after the switch from slide 1 to slide 2), and later on, three images were taken during slide 43.

Then copy the folder 'images-original-all' to 'images-original'. If you took more than one image per slide, you now have the option to discard some of those extra images.

3.3 Discarding images

This step only applies if you took more than one image per slide. You then need to decide which images to keep. Of course you must keep each image taken immediately after a powerpoint slide transition, so that you can process the audio accordingly later. For the images you keep, add a three-digit counter after the slide number, counting from '000' for the image taken immediately after the transition, then '001', '002' and so on for any extra images.

Slide01.DSC035.jpg -> Slide01.000.DSC035.jpg (corresponds to lecture slides)
Slide01.DSC036.jpg (discarded)
Slide02.DSC037.jpg -> Slide02.000.DSC037.jpg (corresponds to lecture slides)
Slide43.DSC053.jpg -> Slide43.000.DSC053.jpg (corresponds to lecture slides)
Slide43.DSC054.jpg -> Slide43.001.DSC054.jpg (keep extra image)
Slide43.DSC056.jpg -> Slide43.002.DSC056.jpg (keep extra image)

Note that if you only took images immediately after slide transitions, then you don't need to discard any images, and you can skip this whole step.
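
If you would rather not type the counters by hand, the renumbering can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and it assumes the kept file names follow the 'SlideNN.original-name.jpg' pattern shown above and are listed in the order in which the photos were taken.

```python
from collections import defaultdict


def number_kept_images(kept_names):
    """Return (old_name, new_name) pairs with a per-slide counter inserted.

    kept_names: names such as 'Slide43.DSC054.jpg', in shooting order,
    with discarded images already left out.
    """
    counters = defaultdict(int)  # next counter value for each slide
    renames = []
    for name in kept_names:
        slide, rest = name.split(".", 1)  # e.g. 'Slide43', 'DSC054.jpg'
        sequence = counters[slide]
        counters[slide] += 1
        renames.append((name, f"{slide}.{sequence:03d}.{rest}"))
    return renames
```

The function only computes the new names; the actual renaming (e.g. with os.rename) is left out so that you can check the result first.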

3.4 Extracting timings

The task is to extract the timings from the images from the camera. Use this python script, or some other script you may have. If necessary, calculate this by hand: From the first image (of the recorder) note down the 'offset time', i.e. timing shown on the screen of the recorder. If the picture was taken immediately after the recorder was switched on, this will be close to zero.

For each image, calculate:

Current image time relative to audio recording = (Timestamp of current image) - (Timestamp of first image) + (offset time)

It's easier to use python or perl to work this out automatically! For a script to compute times for each slide, see Compute image timings.

Rather than using time stamps on the files, you can also use exif data.

You now have a list like this:

123 (tab) Slide01.000.jpg
243 (tab) Slide02.000.jpg
1823 (tab) Slide43.000.jpg
1901 (tab) Slide43.001.jpg
2104 (tab) Slide43.002.jpg

where the first column is timings in seconds, relative to the start of the audio file, and the 2nd column is image file names.
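
The calculation above can be sketched as follows. This is not the script referred to in the text, just a minimal illustration: the function names are invented here, the first entry is assumed to be the photo of the recorder display, and file modification times are used as camera timestamps (exif data would work the same way once read).

```python
import os


def relative_times(images, offset_seconds=0.0):
    """Apply: image time = timestamp - first timestamp + offset.

    images: (file_name, camera_timestamp_in_seconds) pairs in shooting
    order; the first pair is the photo of the recorder display.
    """
    first_timestamp = images[0][1]
    return [
        (round(timestamp - first_timestamp + offset_seconds), name)
        for name, timestamp in images[1:]
    ]


def format_timing_list(timings):
    """Format (seconds, name) pairs as tab-separated lines."""
    return "\n".join(f"{seconds}\t{name}" for seconds, name in timings)


def timestamps_from_files(folder):
    """Read (name, modification time) pairs from a folder, oldest first."""
    pairs = [
        (name, os.path.getmtime(os.path.join(folder, name)))
        for name in os.listdir(folder)
    ]
    return sorted(pairs, key=lambda pair: pair[1])
```

For example, format_timing_list(relative_times(timestamps_from_files("images-original"), offset_seconds=3)) would produce a list in the format shown above, provided the recorder photo is the oldest file in the folder.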

Note: As we will order the images based on time stamps of the images, you can't reorder images at this stage. You can swap images and adjust timings once you are editing in Audacity.

Well done so far! The process of synchronising images is messy, and will require some detective work to get all the slides lined up. It will become a lot easier with time! BUT - This was the hardest bit. From now on, it's downhill.

3.5 Converting images

For images, you now have

  • A set of images from the camera (in a folder called 'Images-Camera')
  • A set of images from the slides. (in a folder called 'Images-Slides')

You then copy 'Images-Camera' to a new folder, called 'Images-Combined'. You then copy all images from 'Images-Slides' into 'Images-Combined', overwriting existing images. You now have a mix of images from the camera and from the slides.
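
The two copy steps can also be scripted. The sketch below uses Python's shutil (the dirs_exist_ok argument needs Python 3.8 or later); the function name is made up, and it assumes that slide images and camera images for the same segment share the same file name, so that the slide version replaces the camera version.

```python
import shutil
from pathlib import Path


def combine_images(camera_dir, slides_dir, combined_dir):
    """Copy camera images, then let slide images overwrite same-named files."""
    shutil.copytree(camera_dir, combined_dir, dirs_exist_ok=True)
    shutil.copytree(slides_dir, combined_dir, dirs_exist_ok=True)
    return sorted(path.name for path in Path(combined_dir).iterdir())
```

Calling combine_images("Images-Camera", "Images-Slides", "Images-Combined") returns the sorted list of file names now in 'Images-Combined', which is handy for a quick check that nothing is missing.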

If you need to compress your images further, make a folder called 'Images-Ready' and put small versions into that folder (e.g. at 320x240 resolution).

3.6 Audacity

Now import the audio into Audacity.

Step 1: Process the audio as necessary (see e.g. Tutorials or ICTP_Workshop_2007/Content_Sessions). You want to

  • Turn the stereo track into mono.
  • Normalise.
    • There may be one or two loud noises (e.g. the mic being dropped, or a door slamming) that stop the audio from increasing in volume when normalised. Reduce the volume on any loud noises, and normalise again.
    • When you are done with this, apply dynamic range compression. Make sure the sound quality is ok afterwards! For details on this, see Dynamic Range Compression.
    • Reduce further peaks, normalise again if necessary, or treat different sections of the audio differently.

Step 2: Synchronise with images. Thanks to the timestamps from the images, and thanks to Audacity label tracks, this is now very easy. Simply import the above list with timings as a label track. You'll see a set of labels appear in Audacity, corresponding to the start of each slide.
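
Depending on your Audacity version, the label file may need a slightly different layout from the two-column list. As far as I know, recent versions write label files with three tab-separated columns (start time, end time, label text), so the sketch below converts the list accordingly. Treat the format as an assumption and check it against a label track exported from your own copy of Audacity. Dropping the image extension from the label text is also an assumption, made so that files exported per label end up with names like 'Slide01.000'.

```python
import os


def to_audacity_labels(timing_lines, strip_extension=True):
    """Convert 'seconds<TAB>name' lines into 'start<TAB>end<TAB>label' lines.

    Start and end are set to the same value, i.e. each label marks a
    single point in time (the start of a slide).
    """
    labels = []
    for line in timing_lines:
        if not line.strip():
            continue  # skip blank lines
        seconds, name = line.rstrip("\n").split("\t", 1)
        if strip_extension:
            name = os.path.splitext(name)[0]  # 'Slide01.000.jpg' -> 'Slide01.000'
        start = float(seconds)
        labels.append(f"{start:.6f}\t{start:.6f}\t{name}")
    return "\n".join(labels) + "\n"
```

Write the returned text to a file (e.g. 'labels.txt') and import that file as a label track.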

Ideally, with the list of images next to you, check the beginning of each segment: The segment breaks should fall during silences, e.g. between sentences or at least between words, and ideally they should be at a suitable moment. Also, maybe one of the images was taken a little late, and you need to move the marker earlier. So you need to adjust the labels slightly as necessary.

Re-export the label track from Audacity when you are done.

Step 3: Export. When you are done with this, use 'export multiple' and select 'export the audio based on markers'. Export this to Audio-wav.


3.7 Notes on export (1) - Why to export as wav.

Ideally you would export each segment (using 'export multiple' in Audacity) as uncompressed wav. You might also want to export the entire file as uncompressed wav.

Once you have all these wav files, you can then encode them to mp3. Why not export to mp3 straight away from audacity?

Generally speaking, you want to keep a finished version of your project, output to a media file.

  1. You might say "But I can go back to Audacity and re-export at any point later on!", but that's not always the case: You might have lost some of your source files, or Audacity may have moved on to a different version that no longer understands the older file format etc. If you export a wav, it's likely to have a longer lifetime.
  2. Secondly, you might want to export your audio to several different formats, e.g. low and high bitrate mp3. For more details see Multiformat Media Delivery.
  3. Finally, Audacity doesn't presently let you export to mono mp3 at 32kbps (which is what you want!)

To convert the wavs to mp3s, you can use lame, or iTunes, or any other program you have access to. However, make sure that you are encoding into mono 32kbps. You can use iTunes to check this! For playback in flash (which we'll do later), you also need to make sure that your sampling rate is 1/2 or 1/4 of 44100 Hz, so we'll choose 22050 Hz (and not 32000 Hz).

When you've converted, put the results into a folder Audio-mp3.
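
With lame, the settings above (mono, 32kbps, 22050 Hz) can be passed on the command line. The sketch below builds one such command per wav file from Python; the function names are made up for illustration, and the folder names are the ones used in this article.

```python
import subprocess
from pathlib import Path


def lame_command(wav_path, mp3_dir="Audio-mp3"):
    """Build a lame command for a mono, 32kbps, 22.05 kHz mp3."""
    mp3_path = Path(mp3_dir) / (Path(wav_path).stem + ".mp3")
    return [
        "lame",
        "-m", "m",              # mono
        "-b", "32",             # bitrate in kbps
        "--resample", "22.05",  # output sampling rate in kHz
        str(wav_path),
        mp3_path.as_posix(),
    ]


def encode_folder(wav_dir="Audio-wav", mp3_dir="Audio-mp3"):
    """Encode every wav file in wav_dir; requires lame to be installed."""
    Path(mp3_dir).mkdir(exist_ok=True)
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        subprocess.run(lame_command(wav_path, mp3_dir), check=True)
```

It is still worth checking one of the resulting files (e.g. in iTunes, as mentioned above) to confirm the bitrate and sampling rate.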

3.8 Notes on export (2) - Single file vs. many files.

Once you have done the export above, you will have a set of images, with a set of audio files, where each audio file corresponds to one image. This makes for simple playback, and also allows you to mix and match segments from various lectures, as well as to access segments individually (see comments on searching below).

However, it means that once the audio has been exported, you cannot easily change the timings. To change the timings, you'd have to go back to the original Audacity project, and re-export. Clearly this isn't a great prospect.

Another possibility is to take a single audio file, and 'attach' the slides to it at various points. In principle, that's a good thing, but it has two drawbacks.

  1. The technology is less straightforward. You can e.g. use SMIL, but that locks you to quicktime player or realplayer (both with slightly different dialects of SMIL). Or you can use SAMI, but then you're locked to Windows Media Player. These options will all massively reduce your audience, so they cannot be recommended. You could package the file as an enhanced podcast, but again, it's then only usable on quicktime player, in iTunes, or on iPods. That's not good enough. As of Flash CS3, you can load cue points via XML. That provides a suitable alternative, but currently no freely available player supports this, so you need to know some flash to do this. We'll try to do a tutorial on this, see Flash Cue Points via XML.
  2. You're caught between a rock and a hard place regarding delivery. If you do progressive download, especially over low bandwidth, the lecture has to be accessed more or less linearly. It's not easy for viewers to fast forward to the middle of the lecture. So to make this easier, you'd then need to move to a streaming format, e.g. flash streaming, or mp3 streaming. You'll need more expertise to get this going, and you'll encounter some of the problems mentioned in the previous point. Moreover, with streaming, if your viewers don't have the required bandwidth, they'll just have a lot of buffering, and won't be able to access the content easily. Ideally, you'd do flash pseudo-streaming (an http/php based method), which we'll leave for a future tutorial, see Flash Cue Points via XML.

Hence for now, we're going to stick with the 'many files' solution, and will hopefully find a better flash streaming solution at some point.

3.9 Your notes

Type up your notes, slide by slide.

Copy the slide contents (if appropriate) to a separate file per slide ("Slide01.txt", "Slide02.txt", ...). You can do this manually, or use a utility like pdftotext.
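
With pdftotext, a single page can be selected using its first-page and last-page options, so one text file per slide is a matter of calling it once per page. The sketch below only builds those commands; the function name and the 'Slides-Text' output folder are taken as assumptions, and it assumes one pdf page per slide.

```python
def slide_text_commands(pdf_path, slide_count, output_dir="Slides-Text"):
    """Build one pdftotext command per slide (page), writing SlideNN.txt."""
    commands = []
    for number in range(1, slide_count + 1):
        output_path = f"{output_dir}/Slide{number:02d}.txt"
        # -f and -l select the first and last page to convert
        commands.append(
            ["pdftotext", "-f", str(number), "-l", str(number), pdf_path, output_path]
        )
    return commands
```

Each command can then be run with subprocess.run(command, check=True), once the output folder exists.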

4 Publishing

4.1 Step 1

You now have:

  • A set of high quality wav files, which have the same names as the images (in a folder called 'Audio-wav')
  • A set of compressed mp3 files, which have the same names as the images (in a folder called 'Audio-mp3')
  • A set of original images from the camera (in a folder called 'Images-Camera')
  • A set of original images from the slides. (in a folder called 'Images-Slides')
  • A combined set of images from the camera and from the slides (in a folder called 'Images-Combined')
  • A set of small, compressed versions of the combined images (in a folder called 'Images-Ready')
  • The pdf of the presentation (called 'Lecture_123.pdf')
  • A file with notes on each section (Notes.txt)
  • A set of files with text from slides (in a folder called 'Slides-Text')

We now write all of this to CD/DVD for safe keeping. Then from the above folders, we now take:

  • A set of mp3 files (in 'Audio-mp3')
  • A set of image files (in 'Images-Ready')
  • The pdf of the presentation (called 'Lecture_123.pdf')

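and copy these to our webserver, into a folder called 'Lecture_123' (or similar), so that you have 'Lecture_123/Audio-mp3', 'Lecture_123/Images-Ready' and 'Lecture_123/Lecture_123.pdf' on the server.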


4.2 Step 2: The Glue

Although all content is now available, it's not available in a user friendly way. We now tie the results together, to give a good user experience. We produce web-viewable content as Flash, as well as downloadable content as a podcast feed.

IMPORTANT: The code below is an example. Really, your job is finished once you have provided the slides, the audio, and other materials. The podcast feeds and pages are generated automatically, and you need to make friends with somebody who is a competent perl, python or other scripter!

4.2.1 Podcast feed

We now use these items:

  • A file with notes on each section (Notes.txt)
  • A set of files with text from slides (in a folder called 'Slides-Text')

To make the feed, we create a file 'Lecture123.rss' as follows:

<general rss stuff, see elsewhere/>
  <item>
    <title>Lecture 123: Pdf file of slides</title>
    <description>General description of the lecture</description>
    <enclosure url="http://MyWebServerURL/Lecture_123/Lecture_123.pdf"/>
  </item>
  <item>
    <title>Lecture 123, Part 01: Bla bla</title>
    <description>Content on slide 1 from Notes.txt, as well as content of Slide01.txt</description>
    <enclosure url="http://MyWebServerURL/Lecture_123/Audio-mp3/Slide01.000.mp3"/>
  </item>
  <item>
    <title>Lecture 123, Part 01 continued: Bla bla</title>
    <description>Content on slide 1 from Notes.txt, as well as content of Slide01.txt</description>
    <enclosure url="http://MyWebServerURL/Lecture_123/Audio-mp3/Slide01.001.mp3"/>
  </item>
  <item>
    <title>Lecture 123, Part 43: Bla bla bla bla</title>
    <description>Content on slide 43 from Notes.txt, as well as content of Slide43.txt</description>
    <enclosure url="http://MyWebServerURL/Lecture_123/Audio-mp3/Slide43.000.mp3"/>
  </item>
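
Since the feed should be generated automatically, here is a minimal sketch of how a script might build it with Python's standard xml library. The function name and the dictionary keys are invented for this example. RSS 2.0 enclosures also carry a length (file size in bytes) and a MIME type, which the outline above leaves out, so the caller has to supply them.

```python
import xml.etree.ElementTree as ET


def build_feed(feed_title, feed_link, feed_description, items):
    """Return an RSS 2.0 document as a string.

    items: dicts with the keys 'title', 'description', 'url',
    'length' (file size in bytes) and 'type' (e.g. 'audio/mpeg').
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "description").text = item["description"]
        ET.SubElement(
            node,
            "enclosure",
            url=item["url"],
            length=str(item["length"]),
            type=item["type"],
        )
    return ET.tostring(rss, encoding="unicode")
```

The titles and descriptions would come from Notes.txt and the files in 'Slides-Text'; the file sizes can be read with os.path.getsize before uploading.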

4.2.2 Flash

For flash, we use either flowplayer or the JW media player. An example in flowplayer is:

A very interesting lecture on 123
<object type="application/x-shockwave-flash" data="flowplayer.swf" width="320" height="240" id="FlowPlayer">
  <param name="allowScriptAccess" value="sameDomain" />
  <param name="movie" value="flowplayer.swf" />
  <param name="quality" value="high" />
  <param name="scale" value="noScale" />
  <param name="wmode" value="transparent" />
  <param name="flashvars" value="config={ 
    showPlayListButtons: true, 
    playList:[ { url: 'http://MyWebServerURL/Lecture_123/Audio-mp3/Slide01.000.mp3', 
             overlay: 'http://MyWebServerURL/Lecture_123/Images-Ready/Slide01.000.jpg' }, 
               { url: 'http://MyWebServerURL/Lecture_123/Audio-mp3/Slide01.001.mp3',
             overlay: 'http://MyWebServerURL/Lecture_123/Images-Ready/Slide01.001.jpg' }],
    autoPlay: false, 
    autoBuffering: false,
    splashImageFile: 'http://MyWebServerURL//ClickHereToPlay.jpg',
    showMenu: false, 
    loop: false,  
    showLoopButton: false,  
    showFullScreenButton: false}" />
</object>
<a href="http://MyWebServerURL/Lecture_123/Lecture_123.pdf">Download the entire presentation as pdf</a>
<a href="http://MyWebServerURL/Lecture_123/Lecture_123.rss">Subscribe to podcast feed for this lecture</a>
<a href="http://MyWebServerURL/Lecture_Podcasts.rss">Subscribe to podcast feed for all lectures</a>
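
For a real lecture the playList has one entry per segment, so you will want to generate it. The sketch below builds the playList value in the same syntax as the example above; it is only string formatting, the function name is made up, and it has not been checked against any particular flowplayer version.

```python
def playlist_entries(base_url, segment_names):
    """Build the playList value: one audio url and overlay image per segment.

    base_url: e.g. 'http://MyWebServerURL/Lecture_123'
    segment_names: e.g. ['Slide01.000', 'Slide01.001']
    """
    entries = [
        "{ url: '%s/Audio-mp3/%s.mp3', overlay: '%s/Images-Ready/%s.jpg' }"
        % (base_url, name, base_url, name)
        for name in segment_names
    ]
    return "[ " + ", ".join(entries) + " ]"
```

The returned string goes after 'playList:' in the flashvars config; the segment names are the file names shared by 'Audio-mp3' and 'Images-Ready', without their extensions.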

All the slides are about 32kB or less, and would download over a slow connection within a few seconds (perhaps up to 10sec). You can browse all the slides without listening to the audio, and only where you want to start listening, you switch on the audio.

5 Lecture browsing system

Of course the last few steps should be automated, e.g. using a little bit of php to provide some glue.

Lecture browsing system