From the iPhone 3Gs up, it’s possible to encode compressed AAC audio from PCM audio data. That means great things for apps that deal with audio sharing and transmission, as the audio can be sent in compressed form, rather than sending huge PCM audio files over the network.
Apple’s produced some [sample code (iPhoneExtAudioFileConvertTest)](http://developer.apple.com/library/ios/samplecode/iPhoneExtAudioFileConvertTest/Introduction/Intro.html), which demonstrates how it’s done, but their implementation isn’t particularly easy to use in existing projects, as it requires some wrapping to make it play nice.
For my upcoming looper app [Loopy](http://loopyapp.com), I’ve put together a simple Objective-C class that converts any audio file to an AAC-encoded m4a, asynchronously with a delegate, or converts any audio provided by a data source class (which allows recording straight to AAC), and I thought I’d share it.
Grab the code, and a sample project demonstrating its use at the [GitHub repository for TPAACAudioConverter](https://github.com/michaeltyson/TPAACAudioConverter).
To use it:
- Include the class in your project, and make sure you’ve got the AudioToolbox framework added, too.
- Audio session setup:
If you already have an audio session set up in your app, make sure you disable mixing with other device audio for the duration of the conversion, as mixing stops the hardware encoder from working (you’ll see odd errors like `kAudioQueueErr_InvalidCodecAccess` (error 66672)). I know that `AVAudioSessionCategoryPlayAndRecord`, `AVAudioSessionCategorySoloAmbient` and `AVAudioSessionCategoryAudioProcessing` work for sure. `TPAACAudioConverter` will automatically disable `kAudioSessionProperty_OverrideCategoryMixWithOthers` if it’s set.
If you’re not already setting up an audio session, you could do so just before you start the conversion process.
You’ll need to provide an interruption handler to be notified of audio session interruptions, which impact the encoding process. You’ll also need to create a member variable to store the converter instance, so you can tell it when interruptions begin and end (via `interrupt` and `resume`).
```objc
// Callback to be notified of audio session interruptions (which have an impact on the conversion process)
static void interruptionListener(void *inClientData, UInt32 inInterruption) {
    AACConverterViewController *THIS = (AACConverterViewController *)inClientData;

    if (inInterruption == kAudioSessionEndInterruption) {
        // make sure we are again the active session
        checkResult(AudioSessionSetActive(true), "resume audio session");
        if ( THIS->audioConverter ) [THIS->audioConverter resume];
    }

    if (inInterruption == kAudioSessionBeginInterruption) {
        if ( THIS->audioConverter ) [THIS->audioConverter interrupt];
    }
}

/*snip*/

- (void)startConverting {
    /*snip*/

    // Initialise audio session, and register an interruption listener, important for AAC conversion
    if ( !checkResult(AudioSessionInitialize(NULL, NULL, interruptionListener, self), "initialise audio session") ) {
        [[[[UIAlertView alloc] initWithTitle:NSLocalizedString(@"Converting audio", @"")
                                     message:NSLocalizedString(@"Couldn't initialise audio session!", @"")
                                    delegate:nil
                           cancelButtonTitle:nil
                           otherButtonTitles:NSLocalizedString(@"OK", @""), nil] autorelease] show];
        return;
    }

    // Set up an audio session compatible with AAC conversion.  Note that AAC conversion
    // is incompatible with any session that provides mixing with other device audio.
    UInt32 audioCategory = kAudioSessionCategory_MediaPlayback;
    if ( !checkResult(AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(audioCategory), &audioCategory), "setup session category") ) {
        [[[[UIAlertView alloc] initWithTitle:NSLocalizedString(@"Converting audio", @"")
                                     message:NSLocalizedString(@"Couldn't setup audio category!", @"")
                                    delegate:nil
                           cancelButtonTitle:nil
                           otherButtonTitles:NSLocalizedString(@"OK", @""), nil] autorelease] show];
        return;
    }

    /*snip*/
}
```
- Make the relevant view controller implement the `TPAACAudioConverterDelegate` protocol. That means implementing `AACAudioConverterDidFinishConversion:` and `AACAudioConverter:didFailWithError:`, and optionally `AACAudioConverter:didMakeProgress:` to receive progress updates.
- Create an instance of the converter, pass it the view controller as the delegate, and call `start`:
```objc
audioConverter = [[[TPAACAudioConverter alloc] initWithDelegate:self
                                                         source:mySourcePath
                                                    destination:myDestinationPath] autorelease];
[audioConverter start];
```
Alternatively, if you wish to encode live audio, or provide another source of audio data, you can implement the `TPAACAudioConverterDataSource` protocol. It defines `AACAudioConverter:nextBytes:length:`, which provides a buffer to copy at most `length` bytes of audio into, and expects you to update `length` to the number of bytes actually provided. For that you’ll need to use the second initialiser, `initWithDelegate:dataSource:audioFormat:destination:`.
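To give you an idea, a rough sketch of setting up the converter with a data source might look like this (the audio format shown is just an illustrative 44.1kHz 16-bit mono PCM description; substitute whatever matches the audio your data source supplies):

```objc
// Illustrative PCM format for the audio the data source will supply
// (sketch only: the exact initialiser argument types may differ slightly)
AudioStreamBasicDescription audioFormat = {
    .mSampleRate       = 44100.0,
    .mFormatID         = kAudioFormatLinearPCM,
    .mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
    .mBytesPerPacket   = 2,
    .mFramesPerPacket  = 1,
    .mBytesPerFrame    = 2,
    .mChannelsPerFrame = 1,
    .mBitsPerChannel   = 16
};

// Here the view controller acts as both delegate and data source
audioConverter = [[TPAACAudioConverter alloc] initWithDelegate:self
                                                    dataSource:self
                                                   audioFormat:audioFormat
                                                   destination:myDestinationPath];
[audioConverter start];
```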
I noted previously that you can’t encode AAC live, which is what Apple’s docs say, but Alex in the comments informed me that it wasn’t so. So I added the data source method, and sure enough, it does work live!
The one caveat is that it’s a relatively heavy process. As it turns out, because my app Loopy is busily mixing and displaying visualisations and such, it was too much to also encode straight to AAC, and I was getting glitches. But it would probably work fine for plain recording. Thanks, Alex!
You can record to AAC directly. We have been doing so since the 3GS.
No way! I hadn’t even thought to try because Apple’s docs say it can’t be done.
Right, I’m going to add another method that encodes in successive chunks and use that live instead.
Thanks for saying!
Hello Michael
Thank you very much for your code, it works like a charm on my iPad (4.3.2) to convert from .wav to .m4r / .m4a
However, exactly the same code gives me the error “Couldn’t setup intermediate conversion format” on my iPhone (4.2.1)
Do you know why or how could I debug it?
Thanks!
Hey Marc,
Hmmm – What kinda iPhone are you using? Only the 4 and the 3Gs have AAC support, so if it’s anything else, that might explain it. Otherwise… Have you got the correct audio session setup?
It’s an iPhone 4, (4.2.1), so it should have AAC support.
I think that the audio session is correctly set up, as it’s working on my iPad 2 (4.3.2).
Is it possible that it’s not finding the original file, etc.? I tried converting to .pcm and it doesn’t work either
Any other hints will be appreciated
Thanks
Hello again Michael,
The part that is actually failing is this
```objc
checkResult(ExtAudioFileSetProperty(destinationFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat),
            "ExtAudioFileSetProperty(destinationFile, kExtAudioFileProperty_ClientDataFormat")
```
which returns false. The strange thing is that I get false even if I convert from .wav to .wav
About the audio session: now I remember that it’s being handled by FMOD automatically. Is there a way to trace the session properties to check if they are the right ones?
Thank you
Marc
Weird! I have no idea, I’m afraid, I don’t recognise the error. I’d be googling it, I think!
There’s a function you can call to find out the current session – AudioSessionGetSomething
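Something along these lines should do it, I think (a rough sketch with the C audio session API; the call I mean is probably AudioSessionGetProperty):

```objc
// Read back the current audio session category (e.g. to check what FMOD has set up)
UInt32 category = 0;
UInt32 size = sizeof(category);
OSStatus status = AudioSessionGetProperty(kAudioSessionProperty_AudioCategory, &size, &category);
if ( status == noErr ) {
    NSLog(@"Audio session category: %lu (MediaPlayback is %lu)",
          (unsigned long)category, (unsigned long)kAudioSessionCategory_MediaPlayback);
}
```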
Finally I found out that changing the destination format to kAudioFormatMPEG4AAC_ELD solved the issue… but the resulting file, while working, is unusable for my purposes
Maybe this can give you a hint of what’s happening?
Thanks
You’ve probably fixed it by now, but could it be that the hardware is in use at some other point in your app? I had this problem because I was reading an AAC file at the same time.
cheers, Dom
Hmm, not a clue, I’m afraid! Out of curiosity, does the sample app work?
Any way of catching the output AAC samples in memory? I am trying to make a VoIP application.
Hey Thomas – it’s probably doable using the AudioConverter services (instead of the ExtAudioFile stuff), but I wouldn’t advise its use for VoIP, due to latency issues. Probably best to use something like IMA4, I think!
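If you do want to experiment with it, a very rough sketch of setting up an in-memory PCM-to-AAC encoder with the AudioConverter services would look something like this (illustrative formats only; you then drive the encoder with AudioConverterFillComplexBuffer and a callback that supplies your PCM):

```objc
#include <AudioToolbox/AudioToolbox.h>

AudioStreamBasicDescription pcmFormat = {
    .mSampleRate       = 44100.0,
    .mFormatID         = kAudioFormatLinearPCM,
    .mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
    .mBytesPerPacket   = 2,
    .mFramesPerPacket  = 1,
    .mBytesPerFrame    = 2,
    .mChannelsPerFrame = 1,
    .mBitsPerChannel   = 16
};

// Partially specify the AAC output format, and let Core Audio fill in the rest
AudioStreamBasicDescription aacFormat = {
    .mSampleRate       = 44100.0,
    .mFormatID         = kAudioFormatMPEG4AAC,
    .mChannelsPerFrame = 1
};
UInt32 size = sizeof(aacFormat);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &aacFormat);

AudioConverterRef encoder = NULL;
OSStatus status = AudioConverterNew(&pcmFormat, &aacFormat, &encoder);
// If status == noErr, call AudioConverterFillComplexBuffer to pull AAC packets into
// memory; it calls back into your code whenever it needs more PCM.
```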
Hello Michael ~ I’m a beginner at Objective-C for iOS. Could you post sample code showing how to encode live audio? Excuse me, I had read the segment that you wrote, but I can’t understand and implement it, because I want to transcode live audio every 10 seconds. Thanks a lot.
I tried to do this with an LPCM-encoded file but I get this error:
```
2011-06-27 14:08:17.961 OnePrayer[4344:9603] XXX file://localhost/var/mobile/Applications/D8303E63-CDBC-4F08-B420-157061D17B88/Documents/1309198090.760125.lpcm
2011-06-27 14:08:18.013 OnePrayer[4344:9603] /Users/nmcdavit/Documents/Projects/Programming-xCode/2011-03-One Prayer/OnePrayer_svn/src/AudioConversion/TPAACAudioConverter.m:189: ExtAudioFileOpenURL result -43 FFFFFFD5
```
Any recommendations? I have no idea why it would think this file is somehow invalid, since it’s not invalid.
Well it looks like it can’t open up the file for whatever reason. I tried converting the class to work with URLs instead, since converting a URL to a path seemed wonky. Looks like that wasn’t the issue.
Hey, thanks for the code… I’m trying to compress and send audio I’m recording from the mic. I started off with the “SpeakHere” Apple sample project. I’m sending data by receiving each buffer and sending its bytes via UDP. It looks like your code saves the data to disk… is there any way I can obtain small chunks of data and send them, rather than saving them to disk?
Also, I’m trying to compile your project with Xcode 4 and it gives me errors: error: operator ‘<‘ has no left operand. This happens in a file named AvailabilityInternal.h. The code compiles fine on Xcode 3; any idea?
Hi Michael.
It’s a long time that I’ve been following your blog but I’ve never thanked you personally. I owe you the very first lines of Ricepad, that is the setup of the audio in/out. I’m also going to use your code for AAC encoding, which by the way works also on my iPod Touch 2G, iOS 3.1.3 (I just had to substitute the setThreadPriority call with its static version for compatibility).
Thank you very much for your blog posts, you’ve helped a lot of people! And keep up your great works! I’m a Loopy fan :)
Alex
P.S. If you happen to pass by Italy again give me a shout :)
Thanks heaps for the message, Alessandro! I’m very pleased it’s been of use =)
Cheers! (Next time we’re in Italy – will do!)
Hi Michael,
Many thanks for all your posts on Core Audio and AUGraphs, docs containing someone’s actual personal experience are indispensable. :-)
Do you know if live conversion can be carried out going the other way? I want to store MP3/AAC/CAF directly in a buffer (not as 32-bit AudioUnitSamples :-/) and decode out into an AUGraph, largely to save memory at run-time.
Adam
My pleasure, Adam!
I don’t see why that wouldn’t be doable (using the AudioConverter services). If you’re storing that much audio, don’t you want to store it on the filesystem, though?
I didn’t mention, it’s an iOS app. But since I posted I have found AVFoundation’s AVComposition class, and a brief test has shown this could be my solution (I was only mixing audio in the AUGraph, but I needed precise synch). So I’m going to be doing a bit of rewriting this weekend ;-)
Thanks for getting back to me, hope your Europe journey is going well!
Unfortunately TPAACAudioConverter is incompatible with AVFoundation as all the interesting uses require a non-exclusive audio category.
For example, you can’t use it to encode LPCM->AAC while decoding H264.
To be fair, this is a criticism of the ExtendedAudioFile API; it’s not really TPAACAudioConverter’s fault.
RF
Thanks for this Michael.
In iOS 4.3, AVAudioRecorder would happily record kAudioFormatMPEG4AAC to a *.m4a file directly. In iOS 5.0, this behaves as if it’s working still, but the resulting file’s contents are unrecognisable by AVAudioPlayer or Audacity. Not sure if this is a bug in 5.0 or whether it was never meant to be used.
However, we can record MPEG4AAC into a *.caf file, and then convert to *.m4a using TPAACAudioConverter, which is the next best thing.
Colin
Turns out iOS 5.0 will record kAudioFormatMPEG4AAC to a *.m4a file directly, but only if you remember to set the audio session category to Record before doing so. In iOS 4.3, you could get away without doing this, and in 5.0 you can still get away with it when recording to *.caf, but not to *.m4a.
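For anyone hitting the same thing, this is roughly what I mean (a sketch only; the recorder settings are illustrative and m4aFileURL is a placeholder):

```objc
#import <AVFoundation/AVFoundation.h>

NSError *error = nil;

// Set the Record category *before* creating the recorder; without this, recording
// straight to .m4a on iOS 5.0 appears to work but produces an unreadable file
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryRecord error:&error];
[[AVAudioSession sharedInstance] setActive:YES error:&error];

NSDictionary *settings = [NSDictionary dictionaryWithObjectsAndKeys:
                          [NSNumber numberWithInt:kAudioFormatMPEG4AAC], AVFormatIDKey,
                          [NSNumber numberWithFloat:44100.0],            AVSampleRateKey,
                          [NSNumber numberWithInt:1],                    AVNumberOfChannelsKey,
                          nil];

AVAudioRecorder *recorder = [[AVAudioRecorder alloc] initWithURL:m4aFileURL
                                                        settings:settings
                                                           error:&error];
[recorder record];
```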
Colin
Thanks heaps for the update, Colin – good to know.
Thanks for all the great examples. I’m new to iOS. I just wanted to record some audio and upload it. Somebody should write an example demonstrating the use of all 100 different iOS audio APIs. :p I unleashed the spastic monkey and tried converting a second time but it generated an alert. AudioSessionInitialize should only be called once so I added a member variable sessionInitialized and changed -convert:
```objc
if ( !sessionInitialized ) {
    if ( !checkResult(AudioSessionInitialize(NULL, NULL, interruptionListener, self), "initialise audio session") ) {
        …
    }
}
sessionInitialized = true;
```
“You can instead write a C callback function to handle audio interruptions. To attach that code to the audio session, you must initialize the session explicitly using the AudioSessionInitialize function. Do this just once, during application launch.”
http://developer.apple.com/library/ios/#documentation/Audio/Conceptual/AudioSessionProgrammingGuide/Cookbook/Cookbook.html#//apple_ref/doc/uid/TP40007875-CH6-SW2
Hi Michael, thanks for this example. It works great for conversion between different sound formats!
I want to edit sound files. For example, I want to merge two sound files, generate a new file and play it. I have already spent a couple of days on this but I’m not getting anywhere. Basically I’m using NSData to store the sound file, appending the required sound files, and then writing the NSData to a sound file. The resulting sound file doesn’t play! :(
Do you know what am I doing wrong? Thanks.
```objc
NSString *file1 = [[NSBundle mainBundle] pathForResource:@"file1" ofType:@"caf"];
NSString *file2 = [[NSBundle mainBundle] pathForResource:@"file2" ofType:@"caf"];

NSData *file1Data = [[NSData alloc] initWithContentsOfFile:file1];
NSData *file2Data = [[NSData alloc] initWithContentsOfFile:file2];

NSMutableData *mergedData = [[NSMutableData alloc] initWithCapacity:([file1Data length] + [file2Data length])];

// First chunk from original sound file
NSRange firstChunk;
firstChunk.length = startingPositionInSeconds;
firstChunk.location = 0;
[mergedData appendData:[file1Data subdataWithRange:firstChunk]];

// Add new sound
[mergedData appendData:file2Data];

// Second chunk from original sound file
NSRange secondChunk;
secondChunk.length = [file1Data length] - startingPositionInSeconds;
secondChunk.location = startingPositionInSeconds;
[mergedData appendData:[file1Data subdataWithRange:secondChunk]];

NSLog(@"File1: %d, File2: %d, Merged: %d", [file1Data length], [file2Data length], [mergedData length]);

// Write mergedData to an audio file
[mergedData writeToFile:[self filePath:@"converted.caf"] atomically:YES];

[file1Data release];
[file2Data release];
[mergedData release];
```
Hmm, I’d suggest doing some reading up on audio programming, Jignesh – it’s not just a matter of concatenating audio files, which contain headers and footers and other wrapping. You need to work with the audio samples.
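To give you an idea of the kind of thing I mean, here’s a very rough sketch using the ExtAudioFile services: decode both files into a common PCM client format, then write the decoded samples out one file after the other (error checking omitted; the URLs are placeholders):

```objc
#include <AudioToolbox/AudioToolbox.h>

// Common 16-bit stereo PCM client format, used for both reading and writing
AudioStreamBasicDescription pcm = {
    .mSampleRate       = 44100.0,
    .mFormatID         = kAudioFormatLinearPCM,
    .mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
    .mBytesPerPacket   = 4,
    .mFramesPerPacket  = 1,
    .mBytesPerFrame    = 4,
    .mChannelsPerFrame = 2,
    .mBitsPerChannel   = 16
};

ExtAudioFileRef destination;
ExtAudioFileCreateWithURL((CFURLRef)destinationURL, kAudioFileCAFType, &pcm, NULL,
                          kAudioFileFlags_EraseFile, &destination);

for ( NSURL *sourceURL in [NSArray arrayWithObjects:file1URL, file2URL, nil] ) {
    ExtAudioFileRef source;
    ExtAudioFileOpenURL((CFURLRef)sourceURL, &source);
    ExtAudioFileSetProperty(source, kExtAudioFileProperty_ClientDataFormat, sizeof(pcm), &pcm);

    char buffer[32768];
    while ( 1 ) {
        AudioBufferList bufferList;
        bufferList.mNumberBuffers = 1;
        bufferList.mBuffers[0].mNumberChannels = pcm.mChannelsPerFrame;
        bufferList.mBuffers[0].mDataByteSize = sizeof(buffer);
        bufferList.mBuffers[0].mData = buffer;

        UInt32 frames = sizeof(buffer) / pcm.mBytesPerFrame;
        ExtAudioFileRead(source, &frames, &bufferList);   // decodes to our PCM client format
        if ( frames == 0 ) break;
        ExtAudioFileWrite(destination, frames, &bufferList);
    }
    ExtAudioFileDispose(source);
}
ExtAudioFileDispose(destination);
```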
Thank you very much for quick turn around! I have been going through – Core Audio Programming – https://developer.apple.com/library/mac/#documentation/MusicAudio/Conceptual/CoreAudioOverview/WhatisCoreAudio/WhatisCoreAudio.html
Do you know of anything else I can look into? I see many examples of conversion between different sound formats but not a single example of editing sound! Do you know of any good documentation on editing/inserting/manipulating sound files?
Thanks again.
I’m afraid I can’t help you there, but you might try asking on the Core Audio mailing list
Thanks for this great, stable wrapper, Michael.
I’ve run into one problem, however. When using the included example program without modification, the interruptionHandler is never called to signal the end of an interruption (with kAudioSessionEndInterruption). For example, if I start the app and start a conversion, then start playing from the Music app, I get kAudioSessionBeginInterruption; if I then pause the Music app, kAudioSessionEndInterruption is never delivered. Again, this is the 100% original example app. Am I missing something?
Much appreciated.
Hey Morgan,
I don’t know! I haven’t extensively tested the interruption stuff, so I’m not very familiar with what it takes to begin and end an interruption. Note that that’s all handled by the system, so we don’t have much control over it.
Right-o! Thanks for the reply. If/when I figure it out I’ll let you know!
Cheers!
Hi Michael,
Thank you so much for this website, it’s a really fantastic resource! And good on you for everything you’re doing, the travelling sounds amazing.
I am trying to implement live AAC encoding as you describe on this page, but I am getting an “unrecognized selector sent to instance” error when I try to call AACAudioConverter:nextBytes:length:
I think that the audio converter is correctly initialized, and I am setting the srcBuffer and length to pass to the method as follows:
```objc
UInt32 bufferByteSize = 32768;
char srcBuffer[bufferByteSize];
bufferList.mBuffers[0].mDataByteSize = bufferByteSize;
bufferList.mBuffers[0].mData = srcBuffer;
[listener writeBufferToDisk:srcBuffer Length:(NSUInteger *)bufferByteSize];
```
And then running AACAudioConverter:nextBytes:length: in that method.
Do you have any idea what might be going wrong?
Many thanks
Thanks heaps for the kind words! =)
Hmm, you shouldn’t be calling AACAudioConverter:nextBytes:length: – that’s a data source delegate method. The AAC converter calls that on your object that you’ve specified as the data source, when it’s ready to write some more audio.
Hi Michael,
Thanks. Can you please provide a bit more explanation? I have an AudioUnitRender call that returns buffers of audio. How should those buffers then be passed into the converter? Or am I completely off track here…
No, you’re all good – you’ll need to use a circular buffer (I, of course, recommend TPCircularBuffer =)) to store the audio, then pull audio out of that when the data source method (which you implement on your data source object) is called. Note that you’ll need to use 16-bit noninterleaved audio.
I know it’s a bit goofy – I’m currently creating a new class that’s specifically for recording AAC, as part of my new audio engine, which will be much easier to use.
It’s not goofy at all, I’m sure it’ll be a damn sight superior to the Core Audio samples that Apple have provided! Still, I’ve been doggedly determined to try and scale the mountain myself… even if my code more closely resembles Frankenstein’s monster than the Sistine Chapel…
If you don’t mind providing a bit more… Am I correct in understanding that you set up the circular buffer to be the TPAACAudioConverterDataSource? How do you set up that pointer/cast? (I’m new to programming.) Do you only need to initialise the audioConverter with that as the data source once, and then it all happens magically?
Hmm, I don’t really have the time to explain it in much detail – TPCircularBuffer is just a C ring buffer. (take a look at this entry for some explanation). You need to write your own code to serve up the next audio from the buffer to the AAC converter when requested.
So in your input callback, you store the audio (TPCircularBufferProduceBytes), then in your data source callback, you retrieve and return the audio (use TPCircularBufferTail to get a pointer and the number of audio bytes available, then memcpy onto the buffer provided as a parameter to the method).
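In code, that looks roughly like this (a sketch only: it assumes a TPCircularBuffer ivar called _ringBuffer that your input callback is feeding with 16-bit PCM, and that the data source method hands you a char* buffer and an NSUInteger* length as described in the post):

```objc
// In the input callback (audio thread), push the incoming 16-bit PCM into the ring buffer:
//   TPCircularBufferProduceBytes(&THIS->_ringBuffer, ioData->mBuffers[0].mData,
//                                ioData->mBuffers[0].mDataByteSize);

// Data source method: the converter calls this whenever it wants more audio
- (void)AACAudioConverter:(TPAACAudioConverter*)converter nextBytes:(char*)bytes length:(NSUInteger*)length {
    int32_t availableBytes;
    void *tail = TPCircularBufferTail(&_ringBuffer, &availableBytes);

    // Give the converter at most what it asked for, and at most what we have buffered
    NSUInteger bytesToCopy = MIN((NSUInteger)availableBytes, *length);
    memcpy(bytes, tail, bytesToCopy);
    TPCircularBufferConsume(&_ringBuffer, (int32_t)bytesToCopy);

    // Report back how many bytes were actually provided.  Depending on how the converter
    // treats a zero-length response, you may need to wait here for more audio instead.
    *length = bytesToCopy;
}
```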
I’ve been trying to implement your class this evening, but I’ve received this error:
Error Domain=com.atastypixel.TPAACAudioConverterErrorDomain Code=0 “Couldn’t open the source file” UserInfo=0x26cd80 {NSLocalizedDescription=Couldn’t open the source file}
Would you go into a little more detail about the type of string this class would like as its “sourcePath”?
Thanks for your work, I really appreciate it. I’m planning to modify it to suit my needs. Thanks a lot.
Hi Michael. Thanks so much for this. I actually started on this myself and then googled a related question. Ended up on this page. Thanks, A LOT !!
I was curious, is it a relatively simple task to reverse the process? Specifically, I need to go from m4a to wav…
Thanks for this awesome tool! It’s saved me a few times.
Jon
Fairly simple, Jon, yes, although this class isn’t designed to do it – you’ll want to work directly with the ExtAudioFile services; but they’re really quite nicely designed and easy to use once you get the hang of it.
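Roughly, it comes down to something like this (a sketch only, with error checking omitted; the 44.1kHz stereo client format is just an example, and m4aURL/wavURL are placeholders):

```objc
#include <AudioToolbox/AudioToolbox.h>

// 16-bit stereo PCM, used both as the decode (client) format and the WAV file format
AudioStreamBasicDescription pcm = {
    .mSampleRate       = 44100.0,
    .mFormatID         = kAudioFormatLinearPCM,
    .mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
    .mBytesPerPacket   = 4,
    .mFramesPerPacket  = 1,
    .mBytesPerFrame    = 4,
    .mChannelsPerFrame = 2,
    .mBitsPerChannel   = 16
};

ExtAudioFileRef source, destination;
ExtAudioFileOpenURL((CFURLRef)m4aURL, &source);
ExtAudioFileSetProperty(source, kExtAudioFileProperty_ClientDataFormat, sizeof(pcm), &pcm);
ExtAudioFileCreateWithURL((CFURLRef)wavURL, kAudioFileWAVEType, &pcm, NULL,
                          kAudioFileFlags_EraseFile, &destination);

char buffer[32768];
while ( 1 ) {
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = pcm.mChannelsPerFrame;
    bufferList.mBuffers[0].mDataByteSize = sizeof(buffer);
    bufferList.mBuffers[0].mData = buffer;

    UInt32 frames = sizeof(buffer) / pcm.mBytesPerFrame;
    ExtAudioFileRead(source, &frames, &bufferList);   // ExtAudioFile decodes the AAC for us
    if ( frames == 0 ) break;
    ExtAudioFileWrite(destination, frames, &bufferList);
}
ExtAudioFileDispose(source);
ExtAudioFileDispose(destination);
```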
I want to thank you for the code. Currently using it for an upcoming app by the end of the year if not sooner. More sooner due to you sharing your work. Thank you!
You’re most welcome, Dave – glad it helped!
Hi, is there any way to make TPAACAudioConverter work on the iPhone 3G and earlier versions? If not, can anyone suggest an alternative converter?
Nope: AAC encoding is only supported on the 3Gs and up. There’s a chance you may be able to find a third party library, but I’ve no idea where one would look.
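Incidentally, you can check at runtime whether a device has an AAC encoder at all; a quick sketch using the AudioFormat API:

```objc
#include <AudioToolbox/AudioToolbox.h>

// Returns YES if the device offers any AAC encoder at all
BOOL AACEncoderAvailable(void) {
    UInt32 formatID = kAudioFormatMPEG4AAC;
    UInt32 size = 0;
    if ( AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(formatID),
                                    &formatID, &size) != noErr ) return NO;
    return size > 0;   // size is the byte count of the available AudioClassDescriptions
}
```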
Thanks Michael. So why doesn’t TPAACAudioConverter support the 3G and earlier versions? Is it a hardware limitation?
That’s because the hardware just isn’t there – it’s all done in hardware, and the earlier models don’t have the equipment =)
Michael, are you using AAC for the actual live loops or only for exporting/sharing? What format do you use for the loops, if you don’t mind me asking? I’ve read compressed formats tend to have issues with regard to seamless looping. Thx.
Hey Hari – I’m only using AAC for exporting. The loops themselves are AIFF.
It’s a little harder to seamlessly loop AAC because you need to account for padding, but it’s just a matter of following the steps – you can read about seamless AAC here.
Cheers Michael…
Hi Michael – can you give me any suggestions for converting CAF to MP3?
As a side note, I was encoding live audio on iOS 5 quite well (not using this well-written class). However, since 6.0 came out, the live recording started behaving very poorly when the session was interrupted. Basically mediaserverd would crash before even calling my application’s audio session interruption listener, thus completely screwing up the app.
I guess I am posting this for two reasons: 1. to warn people of the issue and 2. To ask if anyone else has seen this and has a good solution.
If no one has a solution, my current pattern is to record to Apple Lossless and then use this class to transcode to AAC.
Thanks!
As a follow-up, I tried using this class to transcode a file recorded in Apple Lossless into AAC, and I experienced the same problem. If I background the app and open up Voice Memos while the transcoding is happening, the audio session interruption notification does not make it to my application. Then mediaserverd dies on the phone and the app stops behaving properly. Also, Voice Memos locks up for 10 seconds or so while mediaserverd restarts.
I noticed your code is licensed under the terms of the MIT license. Just wanted to confirm that this means I can use it in closed-source projects, as long as I attribute that part of the project to you?
That’s right, Erik – thanks for checking!
Thanks for making such a useful library open-source!
I can’t seem to get this to work. I’m using [AVAudioSession sharedInstance] in my app to play and record audio, and when I try to use this library it says “Couldn’t initialise audio session”, even though I’m calling [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryAudioProcessing withOptions:nil error:nil]; at the start of the startConverting method. What could be going wrong? Thanks
Hi, is it possible to have a delegate which provides access to the compressed chunks when converting live?