Thursday, October 11, 2012

Adobe AIR (AS3.0) Tutorial - Transferring Audio/Text in Real Time Over a UDP Socket

Preface

To learn more about how data moves at a lower level over UDP, I set out to write a quick proof of concept that connects two clients and allows real-time voice communication between them.

Some Quick Notes About UDP and TCP

UDP and TCP are two transport protocols, both built on IP, that are used to transfer data over a network such as the internet. Web pages use TCP because it guarantees delivery: if a packet does not reach the recipient, it is resent. UDP, in contrast, makes no delivery guarantee, and lost packets are never resent. For real-time data that is usually what you want, because a voice sample that arrives late is no longer useful. UDP also requires no handshake or acknowledgement, so it tends to be better equipped for real-time audio and video and for real-time multiplayer games.

Why Adobe AIR?

The language I work fastest in is ActionScript 3.0. I also knew that AIR already ships lower-level classes that handle the complexities of network data transfer, namely ServerSocket (a TCP implementation) and DatagramSocket (a UDP implementation). Since I wanted to stay fairly agnostic to the lower-level workings of the protocol for the time being, I decided this would work well for me.

These two socket classes exist only in Adobe AIR and are not available to standard SWF files. This is a security measure Adobe has implemented. However, I believe you can still use the Socket class to connect to a socket server; only the classes that bind to a local IP address are inaccessible.
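
Before relying on DatagramSocket, it can be worth checking that the runtime actually supports it. The following is a minimal sketch, assuming it sits in the same application class as the rest of the code in this article. The helper name datagramSocketsAvailable is hypothetical; isSupported is a static property of DatagramSocket.

```actionscript
import flash.net.DatagramSocket;

// Hypothetical helper: true when this runtime can create UDP sockets.
private function datagramSocketsAvailable():Boolean
{
    // isSupported is static, so no socket instance is needed to ask
    return DatagramSocket.isSupported;
}
```

Calling this once at startup and disabling the Connect button when it returns false is one way to fail early instead of at bind time.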

The source for this project is linked following this article.

Our Goal

We are going to create a client that binds to a network IP address and sends UDP packets to another client that is listening for incoming transmissions. While sending data, we will want to know what kind of data is being sent (i.e. audio data, text data, etc.). We also want the audio to play in real time, minus the delay added by network latency and computation.

Goal Review:
  1. Client binds to local IP/port
  2. Client sends/receives UDP data
  3. UDP data can consist of more than one type of data (voice, text, etc.)
  4. Allow microphone input to be sent via UDP to the other client (and vice versa)
  5. Client plays incoming audio data as it is received

Let's Begin!

Step 1

First, we need to set up a quick GUI to control our client. This means telling our client what IP address and port to bind to. We will also want a way to initiate the binding procedure. Because we want to send text and voice data, we will need controls that let the user type and send messages, as well as a button that begins transmitting voice data. To make things simpler to debug, I included a log textarea to display helpful information.

You can organize your client any way that makes sense to you. I used Flash Builder for my project (although you could easily do this with Flash Professional or a free IDE like FlashDevelop). Here is my end result:

The view used for sending text data.

The view used for initializing voice data transfer.

Step 2

Now that we have a GUI, let's begin programming! Since we know that packets will be sent using UDP, we can determine that we will be using a DatagramSocket object. Let's add that to our instance scope now. 

private var updSocket:DatagramSocket = new DatagramSocket();

Next, let's add the socket binding process. Attach a click listener to your Connect button. Inside that method, we will do a couple of things. First, we want to ensure that our socket is not already bound; second, we want to bind the socket to our input values. The local binding is the address and port the socket listens on for incoming data; the target IP is where the UDP socket will send our data. My method is called bind and looks like this:

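private function bind( event:Event ):void
{
    trace('binding');
    // Close any existing binding and start over with a fresh socket
    if( updSocket.bound ) 
    {
        updSocket.removeEventListener( DatagramSocketDataEvent.DATA, dataReceived );
        updSocket.close();
        updSocket = new DatagramSocket();
    }
    try
    {
        // bind( port, address ): the port must be an int, so parse the text input
        updSocket.bind( parseInt( localPort.text ), localIP.text );
        updSocket.addEventListener( DatagramSocketDataEvent.DATA, dataReceived );
        updSocket.receive(); // start listening for incoming datagrams
        log( "Bound to: " + updSocket.localAddress + ":" + updSocket.localPort );
    }
    catch ( error:Error )
    {
        // for example, the port is already in use or the address is malformed
        log( "Bind failed: " + error.message );
    }
}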

You can see from my code that the local binding IP input is named localIP and the port input is named localPort. The bind method of the DatagramSocket class expects its first parameter to be an integer, which is why we call parseInt on the input text. We also attach a listener to the socket with a method called dataReceived, which will handle incoming packet data. Lastly, we call the DatagramSocket's receive method so that it begins listening.
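
Because the port comes from a text input, parseInt can return NaN or a number outside the valid port range. The following is a small sketch of one way to validate the value before handing it to bind. The helper name parsePort and the choice of -1 as the "invalid" marker are my own assumptions, not part of the original project.

```actionscript
// Hypothetical helper: returns a usable port number, or -1 if the text is not one.
private function parsePort( text:String ):int
{
    var value:Number = parseInt( text, 10 ); // NaN when the text has no leading digits
    if ( isNaN( value ) || value < 1 || value > 65535 )
    {
        return -1;
    }
    return int( value );
}
```

With this helper, parsePort( "8988" ) returns 8988, while parsePort( "port" ) and parsePort( "70000" ) both return -1, so the click handler can log a message and skip the bind call instead of letting it throw.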

Note: The log method used above is not critical to the application, but if you are following along and wish to output information to a textarea component, you can use a log method like mine:

private function log( text:String ):void
{
    trace( text );
    logField.appendText( text + "\n" );
    logField.scrollToRange(int.MAX_VALUE);
}

The textarea created for log data is named logField. The scrollToRange call keeps the most recent line of the log visible.

Question: Why does the localIP input value read 0.0.0.0? Binding to 0.0.0.0 lets the socket listen on all of the machine's local IPv4 addresses without you needing to know what they are. For a loopback test you could use 127.0.0.1 in place of 0.0.0.0 and you probably wouldn't notice a difference, but a socket bound to 127.0.0.1 only receives packets sent from the same machine.
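
If you do want to know which addresses the machine has (for example, to tell the person on the other computer what to type as the target IP), AIR can list them. This is a sketch under the assumption that the NetworkInfo API is available in your AIR version. The helper name localIPv4Addresses is hypothetical.

```actionscript
import flash.net.IPVersion;
import flash.net.InterfaceAddress;
import flash.net.NetworkInfo;
import flash.net.NetworkInterface;

// Hypothetical helper: collects the IPv4 addresses of all active interfaces.
private function localIPv4Addresses():Array
{
    var result:Array = [];
    if ( !NetworkInfo.isSupported )
    {
        return result; // nothing we can query on this runtime
    }
    var interfaces:Vector.<NetworkInterface> = NetworkInfo.networkInfo.findInterfaces();
    for each ( var ni:NetworkInterface in interfaces )
    {
        if ( !ni.active ) continue; // skip disconnected adapters
        for each ( var addr:InterfaceAddress in ni.addresses )
        {
            if ( addr.ipVersion == IPVersion.IPV4 )
            {
                result.push( addr.address );
            }
        }
    }
    return result;
}
```

The result usually includes 127.0.0.1 along with the LAN address; the LAN address is the one another machine on the network would use as its target IP.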

Question: Why port 8988? You could use almost any other available port on your machine. However, users sometimes have trouble binding to ports lower than 1024, because that range is reserved for well-known services and some operating systems require elevated privileges to use it. 8988 seemed high enough to stay out of trouble, uncommon enough to not conflict with other applications, and easy to remember!

Note: For our client applications to communicate, each one must send to the port the other is listening on. If we listen on 8989 but the other application sends its data to 8988, that data never reaches us and the clients will not be able to communicate.
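
The port pairing can be tried out with two sockets inside a single application. This is only an illustrative sketch: socketA, socketB, and portPairingDemo are hypothetical names, and the ports are arbitrary free ports on the loopback address.

```actionscript
import flash.events.DatagramSocketDataEvent;
import flash.net.DatagramSocket;
import flash.utils.ByteArray;

private var socketA:DatagramSocket = new DatagramSocket(); // hypothetical name
private var socketB:DatagramSocket = new DatagramSocket(); // hypothetical name

private function portPairingDemo():void
{
    // Two sockets on the same machine need different local ports
    socketA.bind( 8988, "127.0.0.1" );
    socketB.bind( 8989, "127.0.0.1" );

    socketB.addEventListener( DatagramSocketDataEvent.DATA,
        function( e:DatagramSocketDataEvent ):void
        {
            trace( "B got " + e.data.length + " bytes from port " + e.srcPort );
        } );
    socketB.receive();

    var payload:ByteArray = new ByteArray();
    payload.writeUTFBytes( "ping" );
    // A's target port (8989) matches B's local port, so B should receive this
    socketA.send( payload, 0, 0, "127.0.0.1", 8989 );
}
```

Changing the target port in the send call to anything other than 8989 should leave B silent, which is the situation the note describes. On two different machines, both clients can bind the same port number, because the addresses differ.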

Step 3

Congratulations! You have completed the first goal of this application! If you comment out the event listener in the bind method above, you can run your application and click the Connect button. It should bind in the background, although the only visible change will be the log output, if you implemented it.

Bytes: Data packets sent on a network (both UDP and TCP) are composed of bytes. These bytes are a breakdown of more complex data like strings or objects. In ActionScript they are stored in a container called a ByteArray, which lets you read, write, and manipulate byte data. Our datagram socket will be sending ByteArray objects between clients. In theory, you can exchange this data with programs written in other languages, as long as the other side interprets the bytes the same way: objects written with writeObject are serialized as AMF, which the receiver must be able to decode, and multi-byte numbers must be read with the same byte order (endianness) they were written with.
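
To make the byte order point concrete, here is a short sketch that writes the same number with both byte orders and prints the resulting bytes. The helpers toHex and endianDemo are hypothetical names introduced only for this illustration.

```actionscript
import flash.utils.ByteArray;
import flash.utils.Endian;

// Hypothetical helper: renders a ByteArray as space-separated hex pairs.
private function toHex( bytes:ByteArray ):String
{
    var parts:Array = [];
    for ( var i:uint = 0; i < bytes.length; i++ )
    {
        var value:uint = bytes[i];
        var hex:String = value.toString( 16 );
        parts.push( hex.length < 2 ? "0" + hex : hex );
    }
    return parts.join( " " );
}

private function endianDemo():void
{
    var big:ByteArray = new ByteArray(); // big-endian is the ByteArray default
    big.writeShort( 8988 );

    var little:ByteArray = new ByteArray();
    little.endian = Endian.LITTLE_ENDIAN;
    little.writeShort( 8988 );

    trace( toHex( big ) );    // 23 1c
    trace( toHex( little ) ); // 1c 23
}
```

Both arrays hold the number 8988 (0x231C), but a reader that assumes the wrong order would see 0x1C23, which is 7203. Two AIR clients using the default setting agree automatically; a client written in another language has to be told to read big-endian.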

Design Consideration: One of our goals above was to be able to send different types of data through our UDP socket to the target IP. To accomplish this, we need a way to describe the data in the packet. There are various ways to implement this; I chose to make the first byte of the packet an indication of what data follows it. For example, if the first byte is 0, that signifies to my application that the bytes after it contain text data; likewise, if the first byte is 1, the bytes after it are voice/audio data.

I call these flags and declare them in the instance scope, after the DatagramSocket object.

private static const TEXT_FLAG:int  = 0;
private static const AUDIO_FLAG:int = 1;
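
The resulting packet layout can be checked without any networking. The sketch below builds a text packet by hand and reads it back. The names wrap and framingDemo are hypothetical, and the literal 0 stands in for TEXT_FLAG.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: builds a packet laid out as [flag byte][payload bytes].
private function wrap( flag:int, payload:ByteArray ):ByteArray
{
    var packet:ByteArray = new ByteArray();
    packet.writeByte( flag );
    packet.writeBytes( payload );
    packet.position = 0; // rewind so the packet can be read from the start
    return packet;
}

private function framingDemo():void
{
    var text:ByteArray = new ByteArray();
    text.writeUTFBytes( "Hi" ); // UTF-8 bytes 0x48 0x69

    var packet:ByteArray = wrap( 0, text ); // 0 is the text flag
    // The packet now holds three bytes: 00 48 69
    trace( packet.length );     // 3
    trace( packet.readByte() ); // 0
    trace( packet.readUTFBytes( packet.bytesAvailable ) ); // Hi
}
```

One byte gives room for 256 distinct flags. Note that readByte returns a signed value, so if you ever define flags above 127, readUnsignedByte is the safer way to read them.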

Sending Data

Next we want to send text data over UDP. We are going to use the message textarea component and the Send button to handle this. Our goal here is to type a message in the textarea and click Send, which will place the text data into a UDP packet and send it to the target IP. Let's add a click listener to our Send button. In that method (mine is called send), we will add the following code:

private function send( e:Event = null ):void
{
    //Create a message in a ByteArray
    var data:ByteArray = new ByteArray();
    data.writeUTFBytes( message.text );
    message.text = "";
    sendWrappedData(data, TEXT_FLAG);
}

This method creates byte data for our string by first creating a ByteArray container and then calling writeUTFBytes(string) to have the string broken down into bytes. There are many helpful methods on the ByteArray class (I would suggest doing some exploring). Notice that once we have used the text value in our message textarea, we clear the textarea. Last, we call a method (which you will see soon), passing two arguments. The first argument is the byte data for the string, and the second is the type flag we want associated with this data.

The sendWrappedData method looks like this:


private function sendWrappedData(byte_data:ByteArray, wrapperFlag:int = 0):void{
    //Create a message in a ByteArray
    var data:ByteArray = new ByteArray();
    data.writeByte(wrapperFlag); // tell what kind of data this is
    data.writeBytes(byte_data); // add the data
    try
    {
        // send the data
        updSocket.send( data, 0, 0, targetIP.text, parseInt( targetPort.text )); 
        // log( "Sent message to " + targetIP.text + ":" + targetPort.text );
    }
    catch ( error:Error )
    {
        log( error.message );
    }
}

This method might look somewhat similar to the send method, but it provides an abstraction that lets us pass in anything that obeys our rules. This abstraction is important because it gives us one central location where the DatagramSocket sends data outbound. This method also uses the targetIP and targetPort input fields.

So we run the program and it still doesn't really appear to be doing much. Lucky for us, this is the fun part! We are going to implement the dataReceived method we commented out earlier for testing. This method will take the received packet, determine its contents, and handle them based on our first-byte flag. The dataReceived method looks like this:


private function dataReceived( e:DatagramSocketDataEvent ):void
{
    // log the sender before reading, because reading consumes the bytes
    log( "Received " + e.data.bytesAvailable + " bytes from " + e.srcAddress + ":" + e.srcPort );
    // get packet descriptor
    var data_flag:int = e.data.readByte();
    // log('data flag: '+data_flag.toString(2));
    switch(data_flag){
        case TEXT_FLAG:
            // read the rest of the datagram as text
            var msg_txt:String = e.data.readUTFBytes( e.data.bytesAvailable );
            log("Them: "+msg_txt);
            break;
        case AUDIO_FLAG:
            // trace('handle audio data');
            break;
    }
}

As you can see, this method first examines the first byte and uses a switch statement to determine what happens next.

Note: When you call a read* method on a byte array (which is what e.data is), it advances the array's position pointer. This allows you to call read methods in sequence without managing the position yourself.
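
The position pointer is easy to watch in isolation. This is a small sketch with a hand-built ByteArray; positionDemo is a hypothetical name.

```actionscript
import flash.utils.ByteArray;

private function positionDemo():void
{
    var bytes:ByteArray = new ByteArray();
    bytes.writeByte( 1 );      // 1 byte
    bytes.writeFloat( 0.5 );   // 4 bytes
    bytes.writeFloat( -0.5 );  // 4 bytes
    // Writing also advances position, so rewind before reading
    bytes.position = 0;

    trace( bytes.position, bytes.bytesAvailable ); // 0 9
    bytes.readByte();
    trace( bytes.position, bytes.bytesAvailable ); // 1 8
    bytes.readFloat();
    trace( bytes.position, bytes.bytesAvailable ); // 5 4
}
```

bytesAvailable is simply length minus position, which is why passing it to readUTFBytes after reading the flag byte means "everything that is left".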

In the first case, we match the TEXT_FLAG byte which allows us to call readUTFBytes on the remaining bytes. This converts the byte data back into a string which we can then print to the log or output.

When we run the application now, we should be able to enter a message and send it to the recipient. If we use ourselves as the recipient by typing in 127.0.0.1 as the target, we create a loopback: the packets are handed to the local loopback interface and delivered straight back to our own machine without ever going out on the network.

Step 4

Wow! We are now sending text data across the network to another client! This accomplishes goal 2 and part of goal 3. If you're still not convinced, export a build of your program and install it on another computer on your network, then use your network IP addresses to send data.

Now we want to get audio data moving over the network the same way that text does. To do this, we first need to capture the incoming audio from our microphone. In the instance scope, let's add a variable that will hold a reference to the Microphone object.

private var mic:Microphone;

We will wire up our Transmit Audio button by adding a click listener. Inside the method it calls, we will add:

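private function beginAudioTransfer( e:Event = null ):void{
    mic = Microphone.getMicrophone(); 
    // getMicrophone returns null when no microphone is available
    if ( mic == null ){
        log( "No microphone available" );
        return;
    }
    mic.gain = 60; // input gain, 0-100
    mic.rate = 44; // kHz 5,8,11,22,44
    mic.setUseEchoSuppression(true); 
    // mic.setLoopBack(true); 
    mic.addEventListener(SampleDataEvent.SAMPLE_DATA, micSampleDataHandler);
}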

The variable mic now references the Microphone object, on which we set a few properties: gain (input volume), rate (in kHz), and echo suppression if desired. You can also enable loopback to confirm that your audio is being captured, but this becomes confusing if you are also sending UDP packets back to your own computer. The micSampleDataHandler method looks like this:
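
If the default device is not the one you expect, it can help to see which capture devices the runtime knows about. This is a sketch only; listMicrophones is a hypothetical helper name, and it uses trace so that it does not depend on the log method.

```actionscript
import flash.media.Microphone;

// Hypothetical helper: traces the capture devices the runtime can see.
private function listMicrophones():void
{
    if ( !Microphone.isSupported )
    {
        trace( "Microphone access is not supported here" );
        return;
    }
    var names:Array = Microphone.names;
    for ( var i:int = 0; i < names.length; i++ )
    {
        trace( i + ": " + names[i] );
    }
    // Microphone.getMicrophone( i ) selects device i;
    // with no argument it returns the default device.
}
```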

private function micSampleDataHandler(e:SampleDataEvent):void {
    sendWrappedData(e.data, AUDIO_FLAG); // pass audio bytes out via UDP socket
    System.gc();// garbage collection, we do this to help prevent memory overflow
}

This is simple enough: the data we receive from the microphone is passed to our UDP send method and goes out to the target IP. We won't do anything else with the audio locally, although we could experiment with compression to see whether it improves performance.
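
To get a feel for how much data this sends, here is a rough estimate, assuming the microphone delivers mono samples as 32-bit floats (which is what the playback code later in this article reads). The helper names audioBytesPerSecond and bandwidthDemo are hypothetical.

```actionscript
// Hypothetical helper: bytes per second of uncompressed mono microphone audio.
private function audioBytesPerSecond( sampleRateHz:int ):int
{
    var bytesPerSample:int = 4; // one 32-bit float per sample
    return sampleRateHz * bytesPerSample;
}

private function bandwidthDemo():void
{
    trace( audioBytesPerSecond( 44100 ) ); // 176400 (mic.rate = 44)
    trace( audioBytesPerSecond( 11025 ) ); // 44100  (mic.rate = 11)
}
```

At rate 44 that is about 1.4 megabits per second before any UDP/IP header overhead. A lower microphone rate cuts this down proportionally, but since playback runs at 44.1 kHz, the receiver would then have to repeat each sample (four times at rate 11) before writing it out, which this article's code does not do.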

Step 5

See, you're already a pro at this! You just completed goal #4! We are now sending out UDP packets as data is captured from the microphone. We are not yet handling the audio data coming from the other client, and that is the last step of our program.

Now let's handle those packets as they come into the application. For AS3.0 to play the sound data, it must be sampled at 44.1 kHz (the microphone's rate setting of 44). The Sound object can't play dynamically supplied samples at any other rate, and since we are playing our sound data as it comes in, we need to make sure the microphone is set to 44. Let's create a Sound object in the instance scope.

private var soundPlayback:Sound = new Sound();

Once this object is told to play, it fires sample data events, and in the handler for those events we supply the samples that get played.

Before we implement the SampleDataEvent listener, however, we are going to build a sound buffer. This will allow us to push audio bytes into the buffer and have the Sound object play them back as though they were one continuous file. Because our buffer will hold byte data, we are going to use the ByteArray container. Add a sound buffer to your instance scope:

private var soundBuffer:ByteArray = new ByteArray();

Now we need to load the buffer, so we are going to modify the AUDIO_FLAG case of the dataReceived method. We need a couple of changes so that our received audio data is added to the sound buffer. Change the dataReceived method to look like this:

private function dataReceived( e:DatagramSocketDataEvent ):void
{
    // get packet descriptor
    var data_flag:int = e.data.readByte();
    switch(data_flag){
        case TEXT_FLAG:
            // read the rest of the datagram as text
            var msg_txt:String = e.data.readUTFBytes( e.data.bytesAvailable );
            log("Them: "+msg_txt);
            break;
        case AUDIO_FLAG:
            // trace('more audio data');
            // if everything buffered so far has been played, empty the buffer
            if (soundBuffer.bytesAvailable == 0){
                soundBuffer.clear();
            }
            // append the audio bytes to the end of the buffer; the second
            // argument is the offset in soundBuffer where writing starts, so
            // audio that has not been played yet is not overwritten
            e.data.readBytes(soundBuffer, soundBuffer.length);
            soundPlayback.play();
            break;
    }
    // log("Received from " + e.srcAddress + ":" + e.srcPort + "> " + e.data.readUTFBytes( e.data.bytesAvailable ) );
}

As you can see, we added a simple check to decide when to clear the buffer: we clear it once all of it has been read (that is, when bytesAvailable == 0). Below that, we call readBytes to copy the remaining bytes of the e.data ByteArray into soundBuffer. The syntax here can be a little confusing, since readBytes is called on the source and takes the destination as its argument, so reading the documentation will be helpful. After loading the buffer, we call play so that the soundPlayback object starts reading that data.

Now we want to make sure we catch the Sound's sample data events as they fire, so attach an event listener to the soundPlayback object in your constructor or first-run init method.

private function init():void{
    soundPlayback.addEventListener(SampleDataEvent.SAMPLE_DATA, outputSoundData);
}

Now we need to implement the outputSoundData method.

private function outputSoundData(e:SampleDataEvent):void{
    // trace('sampling: ' + soundBuffer.bytesAvailable);
    var written:int = 0;
    // each sample is a 4-byte float; a Sound accepts at most 8192 samples per event
    while (soundBuffer.bytesAvailable >= 4 && written < 8192){
        var sample:Number = soundBuffer.readFloat();
        e.data.writeFloat(sample); // left channel
        e.data.writeFloat(sample); // right channel
        written++;
    }
    System.gc(); // request garbage collection once per event, not once per sample
}

This method loops through the buffered bytes, taking each float and writing it out to the sound output. The loop only runs while at least one whole float (4 bytes) is available, and it stops after 8192 samples, the most a Sound accepts per event. You may notice two lines that appear exactly alike. This is not a mistake: the floats in the byte array are mono, but the sound output expects stereo. To turn our mono into stereo, we write the same value twice (once for each channel). The method ends with a request for garbage collection, made once per event rather than once per sample because forcing a collection is expensive.
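
One thing to be aware of: when a sample data handler supplies fewer than 2048 samples, the Sound plays what it was given and then stops, which is why dataReceived calls play again for every packet. An alternative is to pad short reads with silence so the Sound keeps running. The sketch below is my own variation, not the approach used in the downloadable project, and fillOutput is a hypothetical helper name.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: copies mono floats from source into a stereo output,
// then pads with silence so the Sound always receives at least minSamples.
private function fillOutput( source:ByteArray, output:ByteArray,
                             minSamples:int = 2048, maxSamples:int = 8192 ):int
{
    var written:int = 0;
    while ( source.bytesAvailable >= 4 && written < maxSamples )
    {
        var sample:Number = source.readFloat();
        output.writeFloat( sample ); // left channel
        output.writeFloat( sample ); // right channel
        written++;
    }
    while ( written < minSamples )
    {
        output.writeFloat( 0 ); // silence, left channel
        output.writeFloat( 0 ); // silence, right channel
        written++;
    }
    return written;
}
```

Inside outputSoundData this would be called as fillOutput( soundBuffer, e.data ), and play would then only need to be called once. The trade-off is that a late packet shows up as a short gap of silence (2048 samples is roughly 46 ms at 44.1 kHz) instead of a stop and restart.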

Conclusion

Well? How do you feel? Did you get all that? There is a lot to take in if you're fairly new to programming, but do your homework and it will all make sense. Assuming I wrote all the code right and explained everything clearly, and that you haven't made any syntax errors on your end, we should now have a running UDP socket client that takes text and audio data and sends them over the network to a listening client.

If you're not quite sure about something, or I made a mistake, please let me know.

Download the Flash Builder source file here


2 comments:

Jeff Spicer said...

I can get this to work in the local loop -- sending myself a message. But how oh how do i configure it to broadcast and somebody else's computer will pick it up. i've hacked about most of the night, typing in crazy ip addresses but still can't get the two to communicate. what's the magic ip mixture? what should the ports be set to? local ip i assume is ip4 like 10.10.10.12. i heard that both clients can't be set to the same port. ack. but thanks for the demo, it's gotten me pretty far.

Michael Corrigan said...

@Jeff, it should for the most part be the same kind of setup for a network as with the loop back. You will probably want to use some higher ports and then also ensure you can see both machines on the network by IP. Don't forget to allow the communication through the firewall too (bit.ly/9LaIZV on windows).