Using iterative solvers – a few lessons learnt

I’ve been working on a couple of projects relating to camera and joint calibration on the Nao Robot for the last couple of years. During the last few months I put more focus on calibrating joints of this robot, particularly the legs.

Almost all calibration tools I know of rely on non-linear iterative solvers, particularly the venerable Levenberg-Marquardt solver. In the usual setting, the user defines a cost function (or a fitness function, in the case of genetic and similar algorithms) for the solver to minimize. Depending on the field of interest, the same thing is also described as an optimization problem.

This post is not meant to be an exhaustive review of the vast field of optimization, or to give specific recommendations, as I’m not an expert at this. It is meant to share a few issues I came across and how they affect the results if not handled properly.

Lesson 1: Local or Global solver?

This is one of the most crucial selections in my opinion as it determines the time for calculation and many other factors. Why would this matter? Can’t we just throw in a global optimizer for all problems? – The simple answer is No!

Extrema example
Figure 1. Local and Global minimum; Source: I, KSmrq [CC BY-SA 3.0 (
Looking at Figure 1, if this plot is a cost function, the objective is to minimize the cost, so the solution is at the global minimum. Another example is inverse kinematics in robotics, where there can be multiple solutions for a robot to reach the same place.

The majority of solvers rely on the gradient of the curve (the derivative, or the Jacobian matrix for a multi-variable function). Why? Knowing the derivative quickly tells the solver whether it is heading in the right direction! Some solvers do not use the derivative directly, but check whether a step reduced or increased the cost.

The class of solvers called “local minima” solvers terminate when they find a minimum. Since they don’t spend time searching the entire solution space, they are usually fast to give a solution. The ones that rely on the derivative (AKA gradient) tend to converge fast, as they usually employ “acceleration factors”. The overall result is quicker results than a brute-force search.

Unfortunately, this behavior also makes local solvers sensitive to the initial position (the initial parameter vector), as the solution depends on where the search started. These solvers also work poorly when the cost function is noisy or discontinuous.
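To make the dependence on the starting point concrete, here is a minimal sketch of plain gradient descent. The cost function, step size and iteration limit are all made up for illustration (they are not from the calibration work); the function simply has a local minimum near x = 1.13 and a deeper, global minimum near x = -1.30.

```cpp
#include <cmath>

// Illustrative cost function (an assumption, not a real calibration cost):
// local minimum near x = 1.13, global minimum near x = -1.30.
double cost(double x) { return x * x * x * x - 3.0 * x * x + x; }

// Analytical derivative of cost().
double costGradient(double x) { return 4.0 * x * x * x - 6.0 * x + 1.0; }

// Plain gradient descent with a fixed step size.
// Stops early once the gradient is practically zero.
double descend(double start, double stepSize = 0.01, int maxIterations = 10000) {
    double x = start;
    for (int i = 0; i < maxIterations; ++i) {
        const double g = costGradient(x);
        if (std::fabs(g) < 1e-10) break;
        x -= stepSize * g;  // move downhill
    }
    return x;
}
```

Calling descend(2.0) settles in the shallower local minimum, while descend(-2.0) reaches the global one: same solver, same cost function, different answer.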

Why not use a global solver all the time?

Simply because it consumes a lot of time, as it has to check the entire solution space. Consider a system with 20 variables, a granularity of 0.1 and bounds of ±5: that is (10 / 0.1)^20 = 10^40 possible solutions, and there are huge systems with thousands of parameters. To reduce the workload, bounds for the parameters can be introduced and the model can be simplified. There are also methods such as particle swarm optimization that allow parallel processing. Yet the results weren’t that magnificent in a few quick trials, and given the calculation time as well, I would not recommend global solvers unless absolutely necessary.
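The size of that search space can be checked with a few lines of code. This is only a back-of-the-envelope sketch; the function names and the evaluation rate used in the note below are assumptions for illustration.

```cpp
#include <cmath>

// Number of grid points an exhaustive search has to visit:
// (range / step) values per parameter, raised to the number of parameters.
double gridPointCount(double lowerBound, double upperBound, double step, int parameterCount) {
    const double valuesPerParameter = std::round((upperBound - lowerBound) / step);
    return std::pow(valuesPerParameter, parameterCount);
}

// Years needed to visit every point at a given (assumed) evaluation rate.
double exhaustiveSearchYears(double pointCount, double evaluationsPerSecond) {
    const double secondsPerYear = 365.25 * 24.0 * 3600.0;
    return pointCount / evaluationsPerSecond / secondsPerYear;
}
```

With bounds of -5 to 5, a step of 0.1 and 20 parameters this gives about 10^40 points. Even at an optimistic 10^9 cost evaluations per second, visiting all of them would take on the order of 10^23 years.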

This is a widely researched field, and most machine learning and neural network training algorithms rely on one or more flavours of this class of solvers, so there are clearly many methods, new and old.

L1 : Conclusion

Try to use a local solver, such as Levenberg-Marquardt. But keep their limitations in mind, particularly the need for a good initial parameter set.

Lesson 2: Derivative free solvers?

This is another important choice, and it will definitely ruin your day if not handled properly! Three questions to ask:

  1. Is the cost/ fitness function differentiable?
  2. Is it noisy?
  3. Is there an analytical solution? Or is the numerical solution good/ stable?

If the answer to the first question is yes, the options are wider; otherwise derivative-free solvers have to be used, or the cost function has to be refactored to be differentiable.

If the answer to the second is also yes, one might need to employ smoothing functions or refactor the cost function.

If the function is differentiable but an analytical derivative is too much effort, numerical differentiation can be used. But numerical differentiation poses additional dangers and can severely degrade performance if not used with care. I would call this the most important lesson learnt in my case: once I got a 10% drop in failures just by changing to the central difference method! An alternative is automatic differentiation (available with libraries like Eigen).
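The difference between the two common finite-difference schemes is easy to see on a function with a known derivative. The sketch below is generic and hedged: the function names are made up for illustration, and it is not the code used in the calibration tool.

```cpp
#include <cmath>

// Forward difference: one extra evaluation per parameter,
// truncation error proportional to h.
template <typename F>
double forwardDifference(F f, double x, double h) {
    return (f(x + h) - f(x)) / h;
}

// Central difference: two evaluations per parameter,
// truncation error proportional to h squared.
template <typename F>
double centralDifference(F f, double x, double h) {
    return (f(x + h) - f(x - h)) / (2.0 * h);
}
```

For f(x) = sin(x) at x = 1 with h = 0.001, the forward difference is off from cos(1) by an amount on the order of 1e-4, while the central difference is off by an amount on the order of 1e-7. The price is roughly twice as many cost function evaluations per Jacobian.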

So far, in my experience, gradient-based and similarly motivated solvers converge faster, under the right conditions, than simple brute-force searches. So use them if possible. Otherwise genetic algorithms, CMA-ES, pattern search, particle swarm optimization, etc. could be used.

Lesson 3: Think before you leap!

Before wasting hours or days, think through which solver is good for the particular case. While it is tempting to use Levenberg-Marquardt (probably the 3rd mention in this post), which works pretty well in a vast range of situations, it might not be the best for the job. Also pay attention to differentiability, and to the quality of the differentiation if such solvers are used.

Low cost LIDAR experiment. Part 1


LIDAR (Light Detection And Ranging) is a technology similar to radar that uses light to measure distance, etc. Generally the measuring principle is the same as basic radar: the time-of-flight technique, i.e. measuring the time light takes to travel to the target and back.

I wanted to build or find a low cost setup a few years back, but the search was quite useless 😛. Building one needs sensitive components, and finding these specialized components in Sri Lanka is almost impossible, so I gave up.

At present there are one or two low cost products, like LIDAR lite (approx. $150) with a max range of 40m. There was another Kickstarter project as well, with similar pricing.


For the first stage, I’ll simply experiment with available research and low cost components. Later on, the optics and range will be improved.

  • Will use PIN photodiodes instead of the commonly used APD (Avalanche Photo Diode): the cost, and the high voltage circuit an APD requires, make it unattractive.
  • Attempt to use both the phase shift method and the time of flight method.


  1. Light emitting part
  2. Light receiver and amplification circuit. This is probably the most crucial stage for a low cost setup due to the less sensitive PIN photodiodes.
  3. Timing circuit
  4. Final processing circuit.

Trial 1

My first trial used an Osram SFH 4545 IR emitting diode and an Everlight PD333-3C photodiode.

IR emitter

I used a TI Tiva C launchpad to generate the needed pulse output and a BC337 transistor to drive the LED. This setup is not ideal, and I need to use a MOSFET to get good rise/fall times and current output. (This LED can handle up to 1A pulses.)

IR Receiver

I started with the components I had available. Judging by the literature, the circuit needed is called a “transimpedance amplifier”, which simply means the amplifier converts a current to a voltage. The reason is that photodiodes generate a current rather than a voltage in response to light. After referring to articles, mainly by Texas Instruments, I constructed the following circuit. I did not do any formal calculation or analysis; this was just a trial and error setup.

The op amp I had at hand was the TLC25L4A. This is a low power op amp with decent gain, but the bandwidth is not that impressive. Nevertheless, the circuit was the foundation of the next iteration.

Op amp : TLC25L4A, R2 = 3Mohm, CR1 = PD333-3C. Image from SBOA060 – TI application note.

Results and limitations

The circuit amplified the signals picked up by the photodiode quite okay, but the range was severely limited. Measurements were taken with my 5 year old DSO Quad oscilloscope.

The main issue I faced was the lack of an optical filter on the photodiode, so it picked up the 50Hz light coming from the lamps in my room, etc. Attempts at filtering were rather futile (I tried RC filters only, as I didn’t have inductors at hand).

The maximum range was approx. 60cm! The amplifier did not have much gain.

In the next installment, I’ll explore the next set of circuits and other changes such as IR pulse width, etc.

My Github repository

I started hosting a few projects and libraries on GitHub. My repo is located at

Why GitHub?

In recent years, GitHub has become the de-facto place to host open source projects. Its success is shown by the number of SourceForge projects moving to GitHub and by the closure of Google Code (and their suggestion to move to GitHub).

Git itself has been around for a long time: Linus Torvalds created it in 2005 to manage the source of the Linux kernel.

However, I believe the ease of use and the clean interface of GitHub led many people to use it rather than SourceForge. I have a SourceForge account too, but I almost never used it, and even to date browsing through the source, downloading, etc. is somewhat inconvenient there, and there were instances where the advertisements misled people!

Converting Keyboard to MIDI with a microcontroller

I have had some curiosity about and interest in MIDI devices for some years, and this small project came up after my Yamaha Portasound PSS-190 had a couple of burnt traces and the synthesizer IC was gone.

This particular instrument isn’t really high end, so there is no velocity sensing (velocity matters: try pressing a key on a piano slower and faster). The keys are wired to form a matrix, hence the name matrix keyboard. This means each key doesn’t connect directly to the microprocessor/synthesizer IC; instead the wiring is arranged like a matrix/table.

A matrix keypad schematic. Same story with the synthesizer keyboard! Image source :

This means you only need 8 wires to access 16 keys. In the case of the Yamaha keyboard I tried to fix, it had 7 + 6 wires and served 30+ keys. The drawback, however, is that you cannot read all the keys at the same instant (real time, I’m talking microseconds), because the matrix needs “scanning”: enabling one row, reading the values, and so on. But this can be implemented to be fast enough for a human. For example, the standard PC keyboard is almost always a matrix keyboard, with the internals scanning for key presses at least a hundred times per second.
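The scanning idea can be tried out on a PC before touching any hardware. The following is a small simulation sketch, not the microcontroller code: the KeyState array stands in for the physical switches, and the 7 × 6 layout and the active-low wiring with pull-ups on the rows are assumptions made for this illustration.

```cpp
#include <array>
#include <vector>

constexpr int kColumns = 7;
constexpr int kRows = 6;

// pressed[col][row] is true while that key is held down (stands in for the switches).
using KeyState = std::array<std::array<bool, kRows>, kColumns>;

// What the row pins would read with one column driven LOW and pull-ups on the rows:
// a pressed key in the active column pulls its row bit down to 0.
unsigned char readRows(const KeyState& pressed, int activeColumn) {
    unsigned char rowBits = 0xFF;
    for (int row = 0; row < kRows; ++row) {
        if (pressed[activeColumn][row]) {
            rowBits &= static_cast<unsigned char>(~(1u << row));
        }
    }
    return rowBits;
}

// One full scan: enable each column in turn and collect the indices of pressed keys.
std::vector<int> scanMatrix(const KeyState& pressed) {
    std::vector<int> keys;
    for (int col = 0; col < kColumns; ++col) {
        // Invert so that a pressed key reads as 1.
        const unsigned char inverted = static_cast<unsigned char>(~readRows(pressed, col));
        for (int row = 0; row < kRows; ++row) {
            if (inverted & (1u << row)) keys.push_back(row + col * kRows);
        }
    }
    return keys;
}
```

With 7 + 6 = 13 wires, this layout can address up to 7 × 6 = 42 keys, and each key gets a unique index of row + col × 6.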

Back to the topic: since I couldn’t source the original synth IC, I decided to build a MIDI synth and install it inside the keyboard! I had an STM32F4 Discovery board with me, so I went for the “Goom” synth ported to MidiBox. MidiBox is a platform for building various types of MIDI instruments. I will discuss this in a later post.

With the synth running, what I needed was to get MIDI signals upon key presses on the keyboard. So I did some googling and found out that MIDI is pretty much serial communication at 31250 baud. To test, I used an Arduino Mega.

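// Pin Definitions
// Rows are connected to PORT C (inputs, pull-ups enabled).
// Columns are driven from PORT A, as described in the text below.

const byte Mask = 255;
double oldtime;
uint8_t keyToMidiMap[32];

boolean keyPressed[50];
int command = 0x90;   // Note On, channel 1
int noteVelocity = 60;

//#define DEBUG

// use prepared bit vectors instead of shifting bit left everytime
byte bits[] = {
  B00000001, B00000010, B00000100, B00001000, B00010000, B00100000, B01000000, B10000000 };
byte colVals[] = {
  255, 255, 255, 255, 255, 255, 255, 255 };
byte bits1[] = {
  B11111110, B11111101, B11111011, B11110111, B11101111, B11011111, B10111111, B01111111 };

void scanColumn(int value) {
  PORTA = value;   // one column LOW, all the others HIGH
}

void setup() {
  DDRA = 0xFF;   // PORT A: all outputs (columns)
  DDRC = 0x00;   // PORT C: all inputs (rows)

  // Enable the pullups
  PORTC = PORTC | Mask;

  for (int i = 0; i < 50; i++) {
    keyPressed[i] = false;
  }

  Serial.begin(31250);   // MIDI baud rate

  // Start-up test: play every note once
  for (int note = 0x1E; note < 0x5A; note++) {
    //Note on channel 1 (0x90), some note value (note), middle velocity (0x45):
    noteOn(0x90, note, 0x45);
    delay(100);   // assumed pause, the original value was lost from this listing
    //Note on channel 1 (0x90), some note value (note), silent velocity (0x00):
    noteOn(0x90, note, 0x00);
  }
  // noteOn(176,124,0);
}

void loop() {
  for (int col = 0; col < 7; col++) {
    // shift scan matrix to following column
    scanColumn(bits1[col]);   //enable all except one.

    byte rowVal1 = PINC & Mask;
    byte rowVal = ~rowVal1;   //inverted rowVal => key press = 1
    if (colVals[col] == rowVal1) {
      continue;   // nothing changed in this column since the last scan
    }
    colVals[col] = rowVal1;

    for (int row = 0; row < 6; row++) {
      if (col == 0 && row > 0) {
        // body missing from the original listing
      }
      int index = row + ((int)col * 6);
      int note = index + 48;

      //and op. on each bit of rowval and determine note press.
      byte k = (bits[row] & rowVal);
      if (k > 0 && keyPressed[index] == false) {
        keyPressed[index] = true;
        noteOn(command, note, noteVelocity);
      }
      if (k == 0 && keyPressed[index] == true) {
        keyPressed[index] = false;
        noteOn(command, note, 0);   // velocity 0 acts as note off
      }
    }
  }
}

void noteOn(int cmd, int pitch, int velocity) {
#ifdef DEBUG
  // DEBUG stuff: human readable output instead of raw MIDI bytes
  Serial.print("Note: ");
  Serial.print(pitch);
  Serial.print(" Velocity :");
  Serial.println(velocity);
#else
  Serial.write(cmd);
  Serial.write(pitch);
  Serial.write(velocity);
#endif
}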

First I set up the basics: enable the internal pull-ups, then set the output port (PORT A) to a given arrangement, with one pin turned OFF and the others turned ON. The reason for doing it this way rather than the other way around is the use of pull-up instead of pull-down resistors.

Then I read the input at PORT C. This is where the rows are connected, so if a key is pressed, the corresponding pin goes LOW. For ease of processing I invert this reading, and I also keep track of the “change of state”, which means the code proceeds if and only if the state has changed since the previous scan.

Then, depending on whether a key was pressed or released, the appropriate MIDI command is sent. 0x90 means note ON on channel 1. Pitch is mapped so that “48” = C3.

With the code tested, all that remains is to wire it up to the keyboard and test!

Stellaris Launchpad – Starting with ARM Microcontrollers

Last year I ordered a Stellaris Launchpad Evaluation Board from Texas Instruments for $12.99. It arrived through FedEx in 3-4 days (free shipping!).

Update 1: They have changed the brand names from Stellaris to Tiva. 

Update 2: They now offer an updated variant with the chip number “TM4C123G”. This version has built-in PWM modules and some other differences, but the old Stellaris code can be uploaded directly. Follow the migration document:



Specifications (LM4F120 based Launchpad – Original version) :

Microcontroller Architecture : Arm Cortex M4

Maximum clock speed: 80MHz

RAM : 32kB

PWM pins: 16 (Using timer interrupts instead of dedicated hardware PWM modules)

GPIO pins on the microcontroller: 43 (including PWM; All GPIO are not accessible in the launchpad board)

SSI/ SPI Ports : 4

I2C Ports: 4

UART Ports: 8


Although the number of GPIO pins and other available peripherals looks impressive, it should be noted that pinmuxing is used. In simple terms, the same pin can be configured to use one of several available peripherals, so in practical terms you cannot use all the SSI/SPI, I2C, UART and PWM ports at the same time.

To reduce this pin-multiplexing (pinmux) confusion, the “PinMux utility” by Texas Instruments can be used to configure the pin usage. The program will generate the necessary code to use in the projects.

Word of caution: note the copyright notice in the generated code. If you are worried about the legal wording, I think the best idea is to use the program to get an idea of the pin configuration, but not to use the generated C files directly in your code, to reduce copyright troubles!

Similar to Arduino “Shields”, there are “Booster packs” that can be plugged into the launchpad. Or you can design your own boosterpack like we did.


Setting up toolchain

Several options are offered for development, from Texas Instruments’ own “Code Composer Studio” to Arm’s Keil or the GNU C compiler.

I went for the GNU C compiler based path. The IDE used was Eclipse, but other IDEs can be used without a problem. The link below explains the method in detail.

In essence, the configuration comes down to 3 steps:

  • Installing GNU Arm C compiler (and other stuff = toolchain)
  • Installing the flashing utility, setting up UDEV rules, etc
  • Setting up a project (template) on eclipse with the needed settings

For Tiva C setup, follow the link below. Referring to the above page is also highly recommended.

The language complications

The Stellarisware/Tivaware library set gives high level functions to access and control the peripherals and other hardware without touching the registers directly (these functions do it for you). This helps programmers who are used to encapsulated, high level programming to start working on the device instead of worrying about which register does what. But knowing how the low level works is highly recommended if you want to go further with Arm or other embedded architectures.

For me, working with this microcontroller in C helped me understand the somewhat confusing concept of “Pointers”. I also got practical use out of bit shifting and binary operations.


The highlight of this architecture is easily the interrupts. The NVIC (Nested Vectored Interrupt Controller) lets the programmer define the priority of different interrupts. For example, updating a display panel or responding to a polling query is lower priority than dealing with an encoder.

Also, the large number of available interrupts is quite useful for real time work, since running everything in the while loop is not only inefficient, it also cannot guarantee constant time between executions of each cycle of the loop.

So do not use the infinite while loop for control loops. Instead use timed interrupts, or measure the time duration (though this way is messy). I used these interrupts extensively in one of the major projects with the launchpad, which used UART, GPIO, SysTick and PWM timer interrupts.
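For comparison, the “measure the time duration” approach can be sketched as a small accumulator that decides how many control updates are due on each pass of the main loop. This is a hardware-independent illustration with made-up names, not Tivaware code; on the real board the elapsed time would come from a timer such as SysTick.

```cpp
#include <cstdint>

// Decides how many fixed-period control updates are due, given measured elapsed time.
// Note: a period of 0 would loop forever, so the caller must pass a positive period.
class FixedRateScheduler {
public:
    explicit FixedRateScheduler(std::uint32_t periodMicros) : periodMicros_(periodMicros) {}

    // Call once per pass of the main loop with the time since the previous call.
    // Returns the number of control steps to run now; leftover time is carried over.
    int stepsDue(std::uint32_t elapsedMicros) {
        accumulatedMicros_ += elapsedMicros;
        int steps = 0;
        while (accumulatedMicros_ >= periodMicros_) {
            accumulatedMicros_ -= periodMicros_;
            ++steps;
        }
        return steps;
    }

private:
    std::uint32_t periodMicros_;
    std::uint32_t accumulatedMicros_ = 0;
};
```

With a 10 ms period, a loop pass that took 25 ms reports two steps due and carries 5 ms over. The bookkeeping works, but the control update still only runs when the main loop gets around to it, which is exactly the jitter a timer interrupt avoids.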


There is less documentation and there are fewer libraries for this architecture (using Stellarisware/Tivaware) than for competitors’ development boards, such as those from ST Microelectronics. Still, this evaluation board is cheap (13 US Dollars) and decently fast, which makes it ideal for newcomers to Arm and for hobbyists!

Also you can try the easier way by using the “Energia IDE” (based on the Arduino project). It looks and works like the Arduino IDE!


Dead UPS to DIY Spot welder

While I was looking for spot welder ideas, I saw a project that used transformers from industrial UPS (Uninterruptible Power Supply) units. Then the old dead UPS lying around my room came to mind, and its transformer was as huge as I had hoped!

The rectification in the UPS was originally done by the diodes built into the MOSFETs used to drive the transformer from the battery. To test the rig I short circuited it with some 1mm copper wire; the result was quite good, as the wire got very hot.

Playing around more with the transformer showed that it had a center tap on the low voltage winding and a 40A fuse going to the battery. With some trouble, I split the winding at the centre tap into separate coils and connected them in parallel to increase the maximum current output (nearly doubling the initial amount). From the initial 1mm house wiring cable, which heated up, I switched to multi-stranded heavier gauge wire, which heats up less and can melt metal held between its two connectors pretty easily.

My final conclusion was that using a dead UPS transformer is much easier than modifying a microwave: the risks are lower, the coil needs minimal modification, and it comes with a nice casing!

Pinguino – PIC based Arduino-like board

For some time I’ve been looking for a microcontroller development board for my projects, so I can do prototyping a lot more easily. The first such system I came across was Arduino (an open source development board based on Atmel microcontrollers), so I started to search for something like it, but using PIC microcontrollers instead. With the current trend, serial ports are disappearing from newer computers and are almost absent on laptops, so I looked for a USB based solution. Finally, I wanted an open source system, so that messing with it is some fun!

After some googling, I found out about Pinguino, which is based on the 28 pin PIC18F2550 and its bigger version, the PIC18F4550 (40 pin). Depending on the requirements, the user can choose either the 2550 or the 4550, and both support USB in hardware (not bit banging). As I had some 18F2550s with me, I made the “traditional 2550” version.

That’s the basic description! The board communicates with the computer as a HID (the microcontroller must be flashed with the firmware from the Pinguino site to make it a Pinguino!), and the IDE provided is Python based software that can communicate with the microcontroller. The IDE compiles the code with SDCC, and the language can be learnt pretty quickly by following the tutorial.

Overall I like the whole system, as it’s very fast in communication (40 seconds over serial vs 2 seconds via Pinguino) and very responsive. Even though a boot loader usually affects a microcontroller’s performance, this one didn’t, contrary to what I assumed.