I’ve had a nasty old time trying to get some audio stuff going on the iPhone, no thanks to Apple’s lack of documentation. If you’re an iPhone developer interested in getting RemoteIO/IO Remote/whatever it’s called working on the iPhone… do I have good news for you. Read on.
Wanna skip the Core Audio learning curve and start writing code straight away? Check out my new project:
Update: Thanks to Joel Reymont, we now have an explanation for the “CrashIfClientProvidedBogusAudioBufferList” iPhone simulator bug: The simulator doesn’t like mono audio. Thanks, Joel!
Update: Happily, Apple have now created some excellent documentation on Remote IO, with some good sample projects. I recommend using that as a resource, now that it’s there, as that will continue to be updated.
Update: Tom Zicarelli has created a very extensive sample app that demonstrates the use of AUGraph, with all sorts of goodies.
So, we need to obtain an instance of the RemoteIO audio unit, configure it, and hook it up to a recording callback, which is used to notify you that there is data ready to be grabbed, and where you pull the data from the audio unit.
Overview
- Identify the audio component (kAudioUnitType_Output / kAudioUnitSubType_RemoteIO / kAudioUnitManufacturer_Apple)
- Use AudioComponentFindNext(NULL, &descriptionOfAudioComponent) to obtain the AudioComponent, which is like a factory from which you obtain the audio unit
- Use AudioComponentInstanceNew(ourComponent, &audioUnit) to make an instance of the audio unit
- Enable IO for recording and possibly playback with AudioUnitSetProperty
- Describe the audio format in an AudioStreamBasicDescription structure, and apply the format using AudioUnitSetProperty
- Provide a callback for recording, and possibly playback, again using AudioUnitSetProperty
- Allocate some buffers
- Initialise the audio unit
- Start the audio unit
- Rejoice
Here’s my code: I’m using both recording and playback. Use what applies to you!
Initialisation
Initialisation looks like this. We have a member variable of type AudioComponentInstance which will contain our audio unit.
The audio format described below uses SInt16 for samples (i.e. signed, 16 bits per sample)
```objc
#define kOutputBus 0
#define kInputBus 1

// ...

OSStatus status;
AudioComponentInstance audioUnit;

// Describe audio component
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

// Get component
AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

// Get audio units
status = AudioComponentInstanceNew(inputComponent, &audioUnit);
checkStatus(status);

// Enable IO for recording
UInt32 flag = 1;
status = AudioUnitSetProperty(audioUnit,
                              kAudioOutputUnitProperty_EnableIO,
                              kAudioUnitScope_Input,
                              kInputBus,
                              &flag,
                              sizeof(flag));
checkStatus(status);

// Enable IO for playback
status = AudioUnitSetProperty(audioUnit,
                              kAudioOutputUnitProperty_EnableIO,
                              kAudioUnitScope_Output,
                              kOutputBus,
                              &flag,
                              sizeof(flag));
checkStatus(status);

// Describe format
audioFormat.mSampleRate       = 44100.00;
audioFormat.mFormatID         = kAudioFormatLinearPCM;
audioFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket  = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel   = 16;
audioFormat.mBytesPerPacket   = 2;
audioFormat.mBytesPerFrame    = 2;

// Apply format
status = AudioUnitSetProperty(audioUnit,
                              kAudioUnitProperty_StreamFormat,
                              kAudioUnitScope_Output,
                              kInputBus,
                              &audioFormat,
                              sizeof(audioFormat));
checkStatus(status);
status = AudioUnitSetProperty(audioUnit,
                              kAudioUnitProperty_StreamFormat,
                              kAudioUnitScope_Input,
                              kOutputBus,
                              &audioFormat,
                              sizeof(audioFormat));
checkStatus(status);

// Set input callback
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = recordingCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit,
                              kAudioOutputUnitProperty_SetInputCallback,
                              kAudioUnitScope_Global,
                              kInputBus,
                              &callbackStruct,
                              sizeof(callbackStruct));
checkStatus(status);

// Set output callback
callbackStruct.inputProc = playbackCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit,
                              kAudioUnitProperty_SetRenderCallback,
                              kAudioUnitScope_Global,
                              kOutputBus,
                              &callbackStruct,
                              sizeof(callbackStruct));
checkStatus(status);

// Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
flag = 0;
status = AudioUnitSetProperty(audioUnit,
                              kAudioUnitProperty_ShouldAllocateBuffer,
                              kAudioUnitScope_Output,
                              kInputBus,
                              &flag,
                              sizeof(flag));

// TODO: Allocate our own buffers if we want

// Initialise
status = AudioUnitInitialize(audioUnit);
checkStatus(status);
```
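By the way, checkStatus isn’t defined anywhere in this post – it’s just a little error-checking helper, and something as simple as this sketch will do:

```objc
// A trivial checkStatus helper, since it isn't defined elsewhere in this post:
// just log the error code if a Core Audio call fails.
static void checkStatus(OSStatus status) {
    if ( status != noErr ) {
        NSLog(@"Core Audio returned an error: %d", (int)status);
    }
}
```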
Then, when you’re ready to start:
```objc
OSStatus status = AudioOutputUnitStart(audioUnit);
checkStatus(status);
```
And to stop:
```objc
OSStatus status = AudioOutputUnitStop(audioUnit);
checkStatus(status);
```
Then, when we’re finished:
```objc
AudioComponentInstanceDispose(audioUnit);
```
And now for our callbacks.
Recording
```objc
static OSStatus recordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData) {
    // TODO: Use inRefCon to access our interface object to do stuff
    // Then, use inNumberFrames to figure out how much data is available, and make
    // that much space available in buffers in an AudioBufferList.

    AudioBufferList *bufferList; // <- Fill this up with buffers (you will want to malloc it, as it's a dynamic-length list)

    // Then:
    // Obtain recorded samples
    OSStatus status;
    status = AudioUnitRender([audioInterface audioUnit],
                             ioActionFlags,
                             inTimeStamp,
                             inBusNumber,
                             inNumberFrames,
                             bufferList);
    checkStatus(status);

    // Now, we have the samples we just read sitting in buffers in bufferList
    DoStuffWithTheRecordedAudio(bufferList);
    return noErr;
}
```
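A note on that bufferList: if you’ve disabled the unit’s own buffer allocation (as in the init code above), you need to supply the memory yourself. Here’s a rough sketch for the mono, 16-bit, interleaved format used in this post – allocating inside the callback is shown only for brevity; in a real app you’d allocate once up front and reuse it:

```objc
// Rough sketch: set up an AudioBufferList for AudioUnitRender, matching the mono,
// 16-bit, interleaved format above. Allocating per callback is for clarity only -
// in practice, allocate once outside the callback and reuse it.
AudioBufferList *bufferList = (AudioBufferList *)malloc(sizeof(AudioBufferList));
bufferList->mNumberBuffers = 1;                                   // interleaved: one buffer
bufferList->mBuffers[0].mNumberChannels = 1;
bufferList->mBuffers[0].mDataByteSize = inNumberFrames * sizeof(SInt16);
bufferList->mBuffers[0].mData = malloc(inNumberFrames * sizeof(SInt16));

// ... pass bufferList to AudioUnitRender as above, use the samples, then clean up:
free(bufferList->mBuffers[0].mData);
free(bufferList);
```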
Playback
```objc
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Notes: ioData contains buffers (may be more than one!)
    // Fill them up as much as you can. Remember to set the size value in each buffer to match how
    // much data is in the buffer.
    return noErr;
}
```
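Update: Since a few people have asked for a concrete playback example, here’s a minimal sketch of a filled-in playback callback. It assumes the mono, 16-bit format set up above and just generates a quiet 440 Hz tone to prove the output path works – the static phase variable and the frequency are purely illustrative, and a real app would copy samples from its own audio source (reached via inRefCon) instead. You’ll also need math.h for sin().

```objc
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Illustrative tone-generator state - a real app would pull samples from its
    // own source via inRefCon rather than using static state like this.
    static double phase = 0;
    const double frequency = 440.0, sampleRate = 44100.0;

    for ( UInt32 i = 0; i < ioData->mNumberBuffers; i++ ) {
        SInt16 *samples = (SInt16 *)ioData->mBuffers[i].mData;
        for ( UInt32 frame = 0; frame < inNumberFrames; frame++ ) {
            samples[frame] = (SInt16)(sin(phase) * 32767.0 * 0.25);   // quarter volume
            phase += 2.0 * M_PI * frequency / sampleRate;
            if ( phase > 2.0 * M_PI ) phase -= 2.0 * M_PI;
        }
        // Tell the system how much we actually put in the buffer
        ioData->mBuffers[i].mDataByteSize = inNumberFrames * sizeof(SInt16);
    }
    return noErr;
}
```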
Finally, rejoice with me in this discovery ;)
Resources that helped
- http://pastie.org/pastes/219616
- http://developer.apple.com/samplecode/CAPlayThrough/listing8.html
- http://listas.apesol.org/pipermail/svn-libsdl.org/2008-July/000797.html
No thanks at all to Apple for their lack of accessible documentation on this topic – they really have a long way to go here! Also, boo to them for their lack of a search engine, and for their refusal to open up their docs to Google. It’s a jungle out there!
Update: You can adjust the latency of RemoteIO (and, in fact, any other audio framework) by setting the kAudioSessionProperty_PreferredHardwareIOBufferDuration property:

```objc
float aBufferLength = 0.005; // In seconds
AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                        sizeof(aBufferLength), &aBufferLength);
```
This adjusts the length of the buffers that are passed to you – if the buffer length was originally, say, 1024 samples, then halving the number of samples per buffer halves the time you wait for each one, and thus the latency.
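You can also read back what you actually ended up with – a small sketch, assuming the audio session is already initialised and active:

```objc
// Ask for a shorter IO buffer, then read back what the hardware actually granted.
// (The hardware rounds to a nearby supported size, so treat the result as approximate.)
Float32 preferredDuration = 0.005;   // 5 ms, as above
AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                        sizeof(preferredDuration), &preferredDuration);

Float32 actualDuration = 0;
UInt32 size = sizeof(actualDuration);
AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                        &size, &actualDuration);
// At 44100 Hz, actualDuration * 44100 is roughly the frames you'll see per callback
// (about 0.023 s for 1024 frames, 0.012 s for 512).
```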
Another Update: In the comments below, Florian Bomers pointed out that I was incorrectly using AudioUnitUninitialize to clean up the audio unit; it should in fact be AudioComponentInstanceDispose. Further [discussion here](http://stackoverflow.com/questions/12688883/ios-5-6-low-volume-after-first-usage-of-coreaudio). Cheers Florian!
Hi, I just read your post on the forums… I wish that I could have seen something like that 2 weeks ago– would have saved me a few days of headache :)
It looks like you’re doing I/O with the RemoteIO – I was wondering if you’ve noticed that when you enable input on the RemoteIO, Apple puts a filter on the output that boosts the midrange and sends the audio out the headset rather than the speaker, presumably because they assume that if you’re using full duplex, you’re probably doing VOIP or some such.
I was able to get the audio coming out the speaker using AudioSession, but I was wondering if you’ve had any experience changing the filter?
Hi, I’m using this same post and have written an app which takes input from the mic and sends it to the headphone speaker. But my output callback is not consistent, and it’s crashing. Can anyone post the output callback code here? It would be a great help. I’ve been stuck here for the last two weeks. Thanks in advance.
Regards, Ganesh Pisal Vavni Inc.
Hey Andrew!
I certainly did notice that; in fact, it’s been troubling me. I’m excited you found a way to force it out the speaker (I’m gonna try it right now), although that doesn’t really help with the filter. There’s a good chance that stuff’s just hard-coded, with little we can do about it – if that’s the case, the only solution I can think of is applying our own inverse filter…
I don’t suppose you’ve noticed insane headphone feedback, have you?
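Edit: For anyone else after the speaker trick – I believe it’s the audio route override property in the AudioSession C API, something like the sketch below (I haven’t verified that this is exactly what Andrew did):

```objc
// Hypothetical sketch: force output to the speaker instead of the receiver while
// recording. Assumes the session is initialised and using the PlayAndRecord category.
UInt32 route = kAudioSessionOverrideAudioRoute_Speaker;
AudioSessionSetProperty(kAudioSessionProperty_OverrideAudioRoute,
                        sizeof(route), &route);
```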
No feedback — but I haven’t been testing with recording and headphones at the same time.. Output isn’t coming out of the speakers right when I’m recording, but saving it playing it later so I’ve had no problems with feedback.
Also – about the filter… I just had the chance to do testing on the 3G phone (up till now just on the first gen and the iPod touch) and the filter isn’t added on the newer generations… interesting.
Hi Michael, thanks for the post about Remote I/O units and all the work you did without any help from Apple. I am currently not planning to use Remote I/O in my app, but that may change soon. I have used Audio Queue Services and it is OK for my case, except that doing an async stop of a currently playing stream seems to cause a system hangup occasionally.
Since you are deeply into the pain and fun of this core iPhone audio stuff and I am a new developer (for fun, for now), maybe you can give me some suggestions on the following questions, if you have already done this sort of thing.
Currently I am using System Sound Services for short PCM sounds, and it is pretty simple but does not provide a lot of options, and Audio Session Services is not available with this option. I am trying to control the system volume through my app – the same volume that is controlled through the Apple music app. So far I have not found a way to do this, and MPVolume (from Media Player) would not work. Audio Queue Services does provide its own volume control, but I think the master system volume is what I am looking for.
Regards, HTM
Hi Michael,
Thanks for sharing valuable codes.
I was just trying to play compressed audio with low latency. I slightly changed your code: I disabled the unit’s input and changed the input audio format to MP3, but unluckily it doesn’t work. I don’t know if RemoteIO doesn’t support decompression, or if I failed to fill the correct data into the audio buffer.
Without the audio format changes, a .caf file works, though it sounds a little weird – I guess I filled the buffer with some noise…
Hoping for your reply! Thanks again!
Hao
Did you have any luck with the caf sound problem or mp3 format? I’m having the same issue.
Read your post, playing with it now. Thanks very much. Just what I was looking for.
Michael,
Am I correct in my understanding that you can use a single RemoteIO unit instance for both input and output? Also, can you weigh in on buffer size (and count) considerations—i.e., a few packets or a few hundred milliseconds worth?
Finally, if I wanted to play multiple audio streams simultaneously, does Audio Unit (on the iPhone) provide a mechanism for automatically mixing multiple streams (via multiple ‘file input’ units connected to a mixer unit, connected in turn to the RemoteIO output) or would I have to do that math manually (I’m thinking: divide each sample value in each stream by the total number of streams, then add the results to obtain a mixed value)?
Sorry for the barrage of questions (and probably some misunderstanding and/or misuse of terminology). Any insight you might have is much appreciated.
Thanks, benjamin
HTM: Sorry for the delay in getting back to you! Unfortunately, Apple don’t let us touch the hardware volume. All we can do is alter the volume for queues, as you mentioned.
Hao: I don’t think it will work with mp3 as-is; I haven’t had any experience with compression, but I’m pretty sure you need to do some explicit conversion. Other than that, I don’t have any details! I’m sure someone on the official forums can give you a hand.
Benjamin: That’s right, the one IORemote audio unit can work for both recording and playback. I’m working successfully with 3 buffers of 4096 bytes. You could probably use the graph services with a mixer audio unit, absolutely; you’ll need to do things slightly differently, and I can’t really advise you as I haven’t had much experience with AUGraph. I’m mixing manually, pretty much the way you describe, because I need seriously fast response times and lots of control.
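For what it’s worth, the manual mix is basically just averaging – a rough sketch, assuming two mono SInt16 buffers of equal length:

```objc
// Rough sketch of manual mixing: average two mono SInt16 streams into an output buffer.
// Both inputs are assumed to hold 'frames' samples; dividing before adding avoids overflow.
void MixTwoStreams(const SInt16 *a, const SInt16 *b, SInt16 *out, UInt32 frames) {
    for ( UInt32 i = 0; i < frames; i++ ) {
        out[i] = (SInt16)(a[i] / 2 + b[i] / 2);
    }
}
```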
Hi Michael, this is very helpful, but the same code is not working at a sampling rate of 8000 Hz. Is this a limitation of the iPhone SDK or of the iPhone? Thanks in advance!
Hi usha: I would imagine that there would certainly be limitations in what sample rates are available; that said, I’d be surprised if an 8000 Hz rate wasn’t supported. So your code definitely works at 44100? (Just to isolate the problem)
Mate, this is great stuff. I’ve used the AudioQueue for my Commodore 64 emulator, running on the iPhone. The overhead is pretty low, but I might take a look at this, since the APIs are fairly similar.
Cheers,
Stu
Hi Michael, sorry for the delay… yes, it’s working with 44.1k but not with 8k. I am able to get it working using the AudioQueue interface.
Best Regards, Usha.
Hi Michael,
Do you know of a way to control the Remote I/O unit’s individual volume using AudioUnitSetProperty()? I cannot find a kAudioUnitProperty_XXX to set the volume.
Regards, HTM
Hi HTM,
There are no volume controls on the RemoteIO audio unit – you either need to use an AUGraph with a RemoteIO unit attached to a mixer audio unit, or you need to multiply your audio samples manually.
There are some comments on the developer forums post linked to at the start of this entry from a guy who’s been using a mixer unit.
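If you go the manual route, it’s just scaling every sample – a quick sketch, assuming SInt16 samples and a volume between 0.0 and 1.0:

```objc
// Quick sketch: apply a software volume to a buffer of SInt16 samples.
// 'volume' is assumed to be between 0.0 (silent) and 1.0 (unity gain).
void ApplyVolume(SInt16 *samples, UInt32 count, float volume) {
    for ( UInt32 i = 0; i < count; i++ ) {
        samples[i] = (SInt16)(samples[i] * volume);
    }
}
```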
Hi Michael,
Thanks very much for posting this, it is very helpful. I’m using it to stream output, and I find that the callback asks me to fill 1 buffer with 1k samples (at a 44100 sample rate). If I dip below 1k samples, I get some distortion. 1k samples at 44100 samples/sec is about 23ms, so the maximum latency (say, modifying the audio stream in response to an event) is about 23ms. Is that what you have found? It doesn’t seem very “low latency” if this is true.
Also, no matter what I set the sample rate to, it still uses 44100.
Thanks, Marc
Hi Marc,
That’s pretty much what I’m getting too – ‘low latency’ is a relative term (compared to the alternative mechanisms). We’re talking software-based (not hardware, so the pipeline is much longer) audio here, and on an embedded processor, so it may be unrealistic to expect much better. I haven’t tried, but there’s a slight chance you could alter the buffer size using kAudioUnitProperty_MaximumFramesPerSlice. That’s all I got. Good luck!
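If you want to try it, setting that property looks roughly like this (untested for this purpose on my part, so no promises):

```objc
// Untested sketch: cap the number of frames the unit handles per render slice.
// Set this before AudioUnitInitialize; the session's buffer duration still governs
// how often the hardware calls you, so this may not change the latency at all.
UInt32 maxFrames = 512;
status = AudioUnitSetProperty(audioUnit,
                              kAudioUnitProperty_MaximumFramesPerSlice,
                              kAudioUnitScope_Global,
                              0,
                              &maxFrames,
                              sizeof(maxFrames));
checkStatus(status);
```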
Thanks Michael. I’ll try the MaximumFramesPerSlice property. BTW, I just came across the following, which has some helpful information on RemoteIO:
http://developer.apple.com/technotes/tn2002/tn2091.html
Best, Marc
Hi Michael,
thanks for your sample, it cleared up the call sequence needed to get something working.
The thing is, I can’t get the recording part working. The recording callback is never called, while playback works successfully.
I’ve tried using separate AU instances, and tried pasting your code exactly, but nothing happens.
Any ideas?
I’ll have to move to AudioSessions, I know, but it would be nice to see something working first…
Thanks.
Found the problem. It’s not necessary to install both callbacks. Only the playback callback gets called, and within it you can call AudioUnitRender to retrieve the recording buffers. If the recorded audio is rendered into the same playback ioData, you get playthrough.
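In code, that playthrough approach looks roughly like this – a sketch, assuming inRefCon points to a struct holding the RemoteIO unit, and kInputBus is 1 as in the post above:

```objc
// Rough playthrough sketch: pull the recorded audio straight into the output buffers.
typedef struct { AudioUnit audioUnit; } AudioState;   // hypothetical state passed as inRefCon

static OSStatus playthroughCallback(void *inRefCon,
                                    AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp *inTimeStamp,
                                    UInt32 inBusNumber,
                                    UInt32 inNumberFrames,
                                    AudioBufferList *ioData) {
    AudioState *state = (AudioState *)inRefCon;
    // Render the input bus (the microphone) directly into the buffers we must fill
    return AudioUnitRender(state->audioUnit,
                           ioActionFlags,
                           inTimeStamp,
                           kInputBus,
                           inNumberFrames,
                           ioData);
}
```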
Could you give me a little more information? I’d like to see an example of filling the playback buffers.
Hi Dave, Apple’s sample code gives excellent examples on how to fill buffers with audio data – SpeakHere is quite useful. Otherwise, refer to the post corresponding to this article on Apple’s developer forum.
SpeakHere seems to use an AudioQueueBufferRef buffer object instead of an AudioBufferList in the callback function. Isn’t that for the Audio Queue interface only?
If you could provide a snippet that reads a sound file into the AudioBufferList object that would be extremely helpful.
Hey Michael – I am new to iPhone development so sorry if this question is basic, but do you need to setup an AudioSession before trying this code?
I keep getting the following stack crash and I am not sure what is missing…
```
Thread 6 Crashed:
0   libSystem.B.dylib           0xffff0b08 __memcpy + 872 (cpu_capabilities.h:246)
1   AudioToolbox                0x315f495f RemoteIOClient::EnqueueInput(XAudioTimeStamp const&, int, AudioBufferList const&) + 191
2   AudioToolbox                0x315e2536 AQMEDevice::IO_PerformInput(AudioBufferList const&, AudioTimeStamp const&, unsigned long) + 438
3   AudioToolbox                0x315e9557 AQMEIO_AU::InputIsAvailable(void, unsigned long, AudioTimeStamp const, unsigned long, unsigned long, AudioBufferList) + 103
4   AudioToolbox                0x31612f32 AUHAL::AUIOProc(unsigned long, AudioTimeStamp const, AudioBufferList const, AudioTimeStamp const, AudioBufferList, AudioTimeStamp const, void) + 738
5   com.apple.audio.CoreAudio   0x00189c88 HP_IOProc::Call(AudioTimeStamp const&, AudioTimeStamp const&, AudioBufferList const, AudioTimeStamp const&, AudioBufferList) + 322
6   com.apple.audio.CoreAudio   0x00189970 IOA_Device::CallIOProcs(AudioTimeStamp const&, AudioTimeStamp const&, AudioTimeStamp const&) + 292
7   com.apple.audio.CoreAudio   0x0019ed52 HP_IOThread::PerformIO(AudioTimeStamp const&, double) + 1186
8   com.apple.audio.CoreAudio   0x00188762 HP_IOThread::WorkLoop() + 1518
9   com.apple.audio.CoreAudio   0x0018816f HP_IOThread::ThreadEntry(HP_IOThread) + 17
10  com.apple.audio.CoreAudio   0x00178a30 CAPThread::Entry(CAPThread) + 96
11  libSystem.B.dylib           0x931d8095 _pthread_start + 321
12  libSystem.B.dylib           0x931d7f52 thread_start + 34
```
Hi jd1; yes, whenever you do any audio stuff on the iPhone, you need to initialise the session. SpeakHere shows how.
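The bare minimum looks something like this – a sketch using the AudioSession C API, with error checking omitted:

```objc
// Minimal audio session setup before starting the RemoteIO unit (error checking omitted).
AudioSessionInitialize(NULL, NULL, NULL, NULL);
UInt32 category = kAudioSessionCategory_PlayAndRecord;   // we want both input and output
AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
                        sizeof(category), &category);
AudioSessionSetActive(true);
```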
Please please provide code for playbackCallback! I really need to play a file with remote io.
After a lot more pathetic flailing, I have a working object that plays a CAF with RemoteIO.
One thing you might want to know is that the ioData buffers represent the left and right channels.
One thing I’m still struggling with is that the pitch is too high, even though the sample rate for the file is the same one I set for the audio unit (44100).
Dave,
Can you shed any light on what you did to get CAF file playback in the callback buffer?
I have a different goal – to play a very small sample indefinitely by looping it.
I’m to the point where I have a looping sound playing, but something doesn’t sound right (should sound like a sine wave, but sounds more like a sawtooth).
I’m somewhat confused as to how many bytes I should be moving into ioData on each firing of the playbackCallback. The least awful result is when I shoot for 2048 bytes – I just iterate over my sample N times until I’ve filled 2048 (saving the “last cursor position” so I don’t break the loop on the next callback).
Any guidance is welcome.
-Glenn
ioData->mBuffers[0].mDataByteSize should contain the number of bytes you can load.
Unfortunately I was unable to make this sound right because the documentation for Core Audio sucks. The only thing that helped was that I discovered ioData->mBuffers[0] and ioData->mBuffers[1] are the right and left stereo channels, so you can load them with the same bytes.
Hi guys,
Another thing to be aware of is that CAF audio is stored big endian, while the iPhone is little endian. You’ll either need to use the AudioConverter (I think that’s what it’s called) framework to convert the sound before playback, or swap the endianness manually.
That is…
```objc
UInt16 *buffer;            // (allocated elsewhere, large enough for the read)
unsigned long numPackets, numBytesRead;
AudioFileReadPackets(audioFile, NO, &numBytesRead, NULL, position, &numPackets, buffer);

// ...

// Sound is big endian; flip it
int i;
for ( i = 0; i < length; i++ ) {
    ((UInt16*)buffer)[i] = (((UInt16*)buffer)[i] >> 8) | (((UInt16*)buffer)[i] << 8);
}
```
I hadn’t considered the endianness – however, as soon as I saw your note, I afconvert-ed the LEI16 CAFs to BEI16 CAFs, and unfortunately got a similar result (sawtooth-sounding sine wave samples).
Thanks for the ideas though – I’ll post over on the dev forums and see if anyone has had luck there.
-Glenn
If you figure out a solution, let me know! (Other things to make sure you’re doing right – handling multi-channel audio correctly, incrementing the position counter by the right number of bytes/packets, using the correct read length to match your buffer size)
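To make the fill loop concrete, here’s a rough sketch of the approach Glenn describes – assuming a mono SInt16 source sample and a read position you keep somewhere persistent (the variable names are illustrative):

```objc
// Rough sketch: loop a small mono SInt16 sample into the output buffer each callback.
// 'sourceSamples', 'sourceLength' (in samples) and 'readPosition' are illustrative names
// for state you'd keep around between callbacks (e.g. reached via inRefCon).
SInt16 *output = (SInt16 *)ioData->mBuffers[0].mData;
UInt32 framesToFill = ioData->mBuffers[0].mDataByteSize / sizeof(SInt16);

for ( UInt32 i = 0; i < framesToFill; i++ ) {
    output[i] = sourceSamples[readPosition];
    readPosition = (readPosition + 1) % sourceLength;   // wrap around to loop seamlessly
}
```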
I am following the same steps mentioned here. Which framework should I add? I added the AudioUnit framework, but on the simulator I get a ‘framework not found’ error. It works on the device. In my frameworks directory there is an AudioUnit framework directory, but the framework only contains header files – no framework lib.
You’re after the ‘AudioToolkit’ framework, Ganesh
Hi Michael,
Are you using the AudioToolKit framework or the AudioToolBox framework? I am not able to find an AudioToolKit framework for the iPhone. And when I use the AudioToolBox framework, I get compilation errors on AudioUnitSetProperty.
Please help me, I have been stuck on this same problem for a long time.
Oops.. Yes, AudioToolbox. My bad. Sounds like perhaps you’re using the wrong SDK version? Otherwise, dunno, ask on the forums perhaps
Michael, do you have any sample project for this? I have been trying your post for the last few days, but no luck – I am a newbie to iPhone programming. If possible please mail the project to [email protected] I am waiting for your reply. Thanks buddy.
I’m afraid not, Ganesh; I suggest you seek further assistance at the dev forums, where you’ll have a much bigger audience than just me =)
Hey Michael, good news. Now I am able to input and output sound. Thanks for writing such a good post. But the problem is that my program runs for only 2-3 seconds and then crashes. :( Any idea what’s happening – is there an issue with buffer size? I am using the input/output callbacks from the CAPlayThrough application.
Thanks, Ganesh pisal Vavni Inc.
Hi Ganesh, I’m afraid I don’t know, without seeing crash info from gdb. I suggest you post this question on the apple dev forums, where you’ll get a bigger audience.
Michael, see http://tinyco.de/2009/02/12/solution-for-crashifclientprovidedbogusaudiobufferlist-in-iphone-simulator.html for solution to the simulator crash. Any particular reason why you are using 1 channel?
Great work, Joel, I’m very excited you’ve found this – thank you!
I was originally using 1 channel because it was all I needed for Loopy – it’s going to go to 2 channels as soon as I implement panning, but without that feature there wasn’t any point (the mic’s mono anyway, presumably).
Hi Michael,
Just want to say thanks for your post. I’m now synthesizing some low-latency audio on my iPhone. I appreciate you sharing your knowledge with the community.
Marcus
First of all, thanks for your RemoteIO-evangelism! You’ve pointed a class I’m taking in the right direction (away from Audio Queue). I’ve got a question about the Audio Unit architecture that you might be able to answer:
What exactly are the componentTypes & -subTypes for in an AudioComponentDescription and how do they affect an audio unit you’re setting up?
I’m asking because it seems like you can get full duplex going as long as you have a PlayAndRecord AudioSession activated. This then leads to asking “why can’t I use an AU effect subtype (‘aufx’) instead of remoteIO?” The goal, in my case, is to do real-time DSP on buffers filled with sample frames which, in essence, is just an AU effect for the iPhone. Please feel free to e-mail me at [email protected] or IM me (AIM: tbone41987) when you’re able. BTW, I’m a (jazz) trombone player myself ^_^.
Hey Michael, love your work, I am trying to develop an app where a user can toggle between two tracks and apply an audio unit filter to whichever is playing back. I'm having trouble in determining what method would be best to accomplish something like this and was hoping that after all your research you might be able to point me in the right direction. Is this a task for RemoteIO?
Hi Tony; please excuse the delay.
RemoteIO will probably do the job, although you'd probably need to use it in an AUGraph: something with which I have no experience yet, I'm afraid. It would probably look like [Your callback to provide buffers from playing audio] -> [AU filter] -> [Remote IO with output] -> speakers
Sorry I can't offer more advice there.
Michael, Thanks for your response. At this point I'm willing to take any help I can get. I will look into doing that and I'll keep you posted on what I'm able to figure out.