SIP phone on STM32F7-Discovery
Some time ago we wrote about how we managed to launch a SIP phone on the STM32F4-Discovery (1 MB of ROM and 192 KB of RAM) based on Embox. That version was minimal: it connected two phones directly, without a server, and transmitted voice in one direction only. So we decided to build a more complete phone that calls through a server and transfers voice in both directions, while still fitting into the smallest possible amount of memory.
How we came to this
First we had to choose a hardware platform. Since it was clear that the STM32F4-Discovery does not have enough memory, we chose the STM32F7-Discovery. It has 1 MB of flash and 256 KB of RAM (plus 64 KB of special fast memory, which we will also use). That is still not much for calling through a server, but we decided to try to fit.
We split the task into several stages:
1. Run PJSIP on QEMU. It was convenient for debugging, plus we already had support for the AC97 codec there.
2. Voice recording and playback on QEMU and STM32.
3. Port the simple_pjsua application from PJSIP. It lets you register on a SIP server and make calls.
4. Deploy our own Asterisk server and test against it, then try external servers such as sip.linphone.org.
Sound in Embox works through PortAudio, which PJSIP also uses. The first problems appeared on QEMU: WAV files played well at 44100 Hz, but at 8000 Hz something clearly went wrong. The cause turned out to be the sample rate setting: the hardware defaulted to 44100 Hz, and our software never changed it.
Here it is probably worth explaining briefly how sound playback works. You give the sound card a pointer to a block of memory to play from or record into at a predetermined sample rate. When the buffer ends, an interrupt is generated and execution continues with the next buffer. The catch is that each buffer has to be filled in advance, while the previous one is still playing. We will run into this problem again on the STM32F7.
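The buffer hand-off described above can be modeled with a small sketch. This is not Embox or PortAudio code: the struct, the function names and the tiny buffer size are all hypothetical, and the "interrupt" is just an ordinary function call. It only illustrates why software has to refill the idle buffer before the hardware reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SAMPLES 4          /* tiny on purpose; real buffers are much larger */
#define NUM_BUFS    2

/* Hypothetical model of a sound card that plays one buffer while
 * software refills the other one. */
struct audio_out {
    int16_t  buf[NUM_BUFS][BUF_SAMPLES];
    int      ready[NUM_BUFS];  /* 1 if the buffer holds fresh data */
    int      playing;          /* index of the buffer the hardware reads now */
    unsigned underruns;        /* times the hardware found a stale buffer */
};

static void audio_out_init(struct audio_out *ao) {
    memset(ao, 0, sizeof(*ao));
}

/* Called by the application: fill the buffer that is NOT being played.
 * Returns 0 on success, -1 if that buffer is still waiting to be played. */
static int audio_out_fill(struct audio_out *ao, const int16_t *src, size_t n) {
    int idle = 1 - ao->playing;
    assert(n <= BUF_SAMPLES);
    if (ao->ready[idle]) {
        return -1;
    }
    memset(ao->buf[idle], 0, sizeof(ao->buf[idle]));
    memcpy(ao->buf[idle], src, n * sizeof(src[0]));
    ao->ready[idle] = 1;
    return 0;
}

/* Stands in for the "buffer finished" interrupt handler: the hardware
 * moves on to the other buffer whether or not software refilled it. */
static void audio_out_buffer_done_irq(struct audio_out *ao) {
    ao->ready[ao->playing] = 0;      /* the buffer just played is now stale */
    ao->playing = 1 - ao->playing;
    if (!ao->ready[ao->playing]) {
        ao->underruns++;             /* nothing fresh to play: audible glitch */
    }
}
```

If audio_out_fill is not called between two consecutive interrupts, the underrun counter grows, which corresponds to the distorted sound discussed later in the article.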
Next, we rented a server and deployed Asterisk on it. Since there was a lot to debug and we did not want to keep talking into a microphone, we needed automatic playback and recording. To do this, we patched simple_pjsua so that files can be substituted for audio devices. In PJSIP this is quite simple, because it has the concept of a port, which can be either a device or a file, and ports can be flexibly connected to one another. You can see the code in our pjsip repositories. The resulting setup was as follows. On the Asterisk server we created two accounts, one for Linux and one for Embox. Embox runs the simple_pjsua_imported command and registers on the server, and then we call Embox from Linux. When the call connects, we check on the Asterisk server that the connection is established; after a while we should hear the sound from Linux in Embox, and on Linux we save to a file the audio played from Embox.
Once it worked on QEMU, we moved on to porting to the STM32F7-Discovery. The first problem was that the image did not fit into 1 MB of ROM without the “-Os” compiler optimization for size, so we enabled “-Os”. Next, a patch turned off C++ support, since it is needed only for pjsua, and we use simple_pjsua.
Once simple_pjsua fit, we decided there was now a real chance of running it. But first we had to get voice recording and playback working. The question was where to record to. We chose the external memory, SDRAM (128 Mbit). You can try it yourself:
The following creates a stereo WAV with a sample rate of 16000 Hz and a duration of 10 seconds, then plays it back:
record -r 16000 -c 2 -d 10000 -m C0000000
play -m C0000000
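As a rough check of why such a recording goes to external memory, the size of the raw audio can be estimated. The sketch below assumes 16-bit samples (2 bytes each), which the article does not state, and the helper name pcm_bytes is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of raw PCM needed for a recording.
 * bytes_per_sample is an assumption (2 for 16-bit samples). */
static uint32_t pcm_bytes(uint32_t rate_hz, uint32_t channels,
                          uint32_t bytes_per_sample, uint32_t duration_ms) {
    uint64_t bytes = (uint64_t)rate_hz * channels * bytes_per_sample
                     * duration_ms / 1000u;
    assert(bytes <= UINT32_MAX);   /* keep the result representable */
    return (uint32_t)bytes;
}
```

Under that assumption, pcm_bytes(16000, 2, 2, 10000) gives 640000 bytes, which is more than twice the 256 KB of internal RAM.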
There were two problems. The first was with the WM8994 codec. It has the notion of slots, and there are four of them. By default, if this is not configured, audio is played in all four slots. As a result, at 16000 Hz we got 8000 Hz, and at 8000 Hz playback simply did not work. When only slots 0 and 2 were selected, it worked as it should. The other problem was the audio interface in STM32Cube, in which audio output works through the SAI (Serial Audio Interface) synchronously with audio input (we did not dig into the details, but it turns out that they share a common clock, and when audio output is initialized, audio input is somehow tied to it). That is, they cannot be started separately, so we did the following: audio input and audio output always run, including the interrupts they generate. When nothing is being played in the system, we simply feed an empty buffer to the audio output, and when playback starts, we fill it for real.
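The "always running output" trick can be sketched as follows. This is a simplified model, not the actual Embox driver: the player struct, fill_output and the buffer size are hypothetical names chosen for the example. The point is that the output callback always produces a full buffer, with silence wherever there is nothing to play.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OUT_BUF_SAMPLES 8

struct player {
    const int16_t *src;    /* samples queued for playback, or NULL when idle */
    size_t         remaining;  /* samples still to be played */
};

/* Called from the (hypothetical) output "buffer done" interrupt, which
 * fires all the time because input and output share a clock. */
static void fill_output(struct player *p, int16_t out[OUT_BUF_SAMPLES]) {
    size_t n = 0;
    assert(p != NULL && out != NULL);
    if (p->src != NULL && p->remaining > 0) {
        n = p->remaining < OUT_BUF_SAMPLES ? p->remaining : OUT_BUF_SAMPLES;
        memcpy(out, p->src, n * sizeof(out[0]));
        p->src += n;
        p->remaining -= n;
    }
    /* Pad the rest (or the whole buffer when idle) with silence. */
    memset(out + n, 0, (OUT_BUF_SAMPLES - n) * sizeof(out[0]));
}
```

With this shape, starting playback only means setting src and remaining; the audio hardware itself is never stopped or restarted.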
Next, we found that recorded voice was very quiet. This is because the MEMS microphones on the STM32F7-Discovery do not work well at sample rates below 16000 Hz. So we always set 16000 Hz, even when 8000 Hz is requested. For this we had to add software conversion from one sample rate to the other.
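A minimal sketch of such a conversion between 8000 Hz and 16000 Hz is shown below. It is not the converter used in Embox; it assumes mono 16-bit samples, uses plain linear interpolation for upsampling and pair averaging for downsampling, and the function names are made up. A production resampler would use a proper low-pass filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 8000 Hz -> 16000 Hz: emit each input sample followed by the midpoint
 * between it and the next one. out must hold 2 * n samples. */
static void upsample_x2(const int16_t *in, size_t n, int16_t *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t cur = in[i];
        int32_t next = (i + 1 < n) ? in[i + 1] : cur;  /* hold the last sample */
        out[2 * i] = (int16_t)cur;
        out[2 * i + 1] = (int16_t)((cur + next) / 2);
    }
}

/* 16000 Hz -> 8000 Hz: average each pair of samples (a crude low-pass).
 * out must hold n / 2 samples; a trailing odd sample is dropped. */
static void downsample_x2(const int16_t *in, size_t n, int16_t *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i + 1 < n; i += 2) {
        out[i / 2] = (int16_t)(((int32_t)in[i] + in[i + 1]) / 2);
    }
}
```

In the setup described above, the microphone path would use the downsampling direction (hardware at 16000 Hz, application at 8000 Hz) and the playback path the upsampling one.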
Then we had to increase the size of the heap, which is located in RAM. According to our calculations, pjsip needed about 190 KB, and we had only about 100 KB. So we had to use a bit of external memory, SDRAM (about 128 KB).
After all these changes, we saw the first packets between Linux and Embox and heard sound! But the sound was terrible, nothing like on QEMU, and nothing could be made out. We started looking for the cause. Debugging showed that Embox simply did not have time to fill and drain the audio buffers: while pjsip processes one frame, two buffer-completion interrupts occur, which is too many. The first idea for speeding things up was compiler optimization, but it was already enabled in PJSIP. The second was the hardware floating point unit, which we described in a separate article. In practice, though, the FPU did not give a significant speedup. The next step was thread prioritization. Embox has several scheduling strategies, so we switched to one that supports priorities and gave the audio threads the highest possible priority. That did not help either.
The next idea was that, since we work with external memory, it would be good to move the most frequently accessed structures out of it. We did a preliminary analysis of when and for what simple_pjsua allocates memory. It turned out that of the 190 KB, the first 90 KB are allocated for PJSIP's internal needs and are not accessed very often. Then, on an incoming call, the pjsua_call_answer function is called, which allocates the buffers for incoming and outgoing frames. That is about 100 KB more. So we did the following. Before the call, data is allocated in external memory. As soon as the call starts, we immediately switch the heap to another one, located in RAM. This way all the “hot” data ended up in faster and more predictable memory.
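The heap switch can be illustrated with a toy allocator. This is only a sketch of the idea, not the Embox heap or the PJSIP pool API: two bump-pointer arenas stand in for SDRAM and internal RAM, and every name here (arena, heap_select, heap_alloc, in_arena) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct arena {
    uint8_t *base;
    size_t   size;
    size_t   used;
};

static _Alignas(8) uint8_t slow_mem[1024];  /* stands in for external SDRAM */
static _Alignas(8) uint8_t fast_mem[1024];  /* stands in for internal RAM */

static struct arena slow_heap = { slow_mem, sizeof(slow_mem), 0 };
static struct arena fast_heap = { fast_mem, sizeof(fast_mem), 0 };

/* All allocations go to whichever arena is currently selected. */
static struct arena *current_heap = &slow_heap;

static void heap_select(struct arena *a) {
    assert(a != NULL);
    current_heap = a;
}

/* Simple bump allocation; returns NULL when the arena is exhausted. */
static void *heap_alloc(size_t n) {
    struct arena *a = current_heap;
    n = (n + 7u) & ~(size_t)7u;          /* round sizes up to 8 bytes */
    if (n > a->size - a->used) {
        return NULL;
    }
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* Helper for checking which arena a pointer came from. */
static int in_arena(const struct arena *a, const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)a->base;
    return addr >= lo && addr < lo + a->size;
}
```

In this model, start-up allocations land in slow_heap, and a call to heap_select(&fast_heap) right before the call is answered makes the frame buffers land in fast memory.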
In the end, all of this together allowed us to run simple_pjsua and place a call through our own server, and then through other servers such as sip.linphone.org.
As a result, we got simple_pjsua running with voice transfer in both directions through a server. The extra 128 KB of SDRAM could be avoided by using a slightly more powerful Cortex-M7 (for example, the STM32F769NI with 512 KB of RAM), but we have not given up hope of fitting into 256 KB :) We will be glad if someone finds this interesting, and even more so if someone tries it. All sources, as usual, are in our repositories.