Using iterative solvers – a few lessons learnt

For the last couple of years I’ve been working on a few projects relating to camera and joint calibration of the Nao robot. During the last few months I focused more on calibrating the robot’s joints, particularly the legs.

Almost all calibration tools I know of rely on non-linear iterative solvers, particularly the venerable Levenberg-Marquardt solver. The usual setting is that the user defines a cost function (or a fitness function, in the case of a genetic or similar algorithm). Depending on the field of interest, this can also be framed as an optimization problem.

This post is not meant to be an exhaustive review of the vast field of optimization, nor to give specific recommendations, as I’m not an expert; it is simply to share a few issues I came across and how they can hurt if not taken proper care of.

Lesson 1: Local or Global solver?

This is one of the most crucial choices in my opinion, as it determines the calculation time among many other factors. Why does this matter? Can’t we just throw a global optimizer at every problem? The simple answer is no!

Figure 1. Local and global minimum. Source: KSmrq, CC BY-SA 3.0.
Taking a look at Figure 1: if this plot were a cost function, then minimizing the cost is the objective, so the solution would be at the global minimum. Another example from robotics is inverse kinematics, where there can be multiple solutions for a robot to reach the same pose.

The majority of solvers rely on the gradient of the curve (the derivative, or the Jacobian matrix in the case of a multi-variable function). Why? Knowing the derivative helps the solver quickly determine whether it is heading in the right direction! Some solvers do not use the derivative directly, but instead check whether the last step reduced or increased the cost.

The class of solvers called “local minima” solvers terminates on the first minimum found. Since they don’t spend time searching the entire solution space, they usually produce a solution quickly. The ones that rely on the derivative (i.e., the gradient) tend to converge fast, as they usually employ “acceleration factors”. The overall result is much quicker than a brute-force search.

Unfortunately, this behavior also makes local solvers sensitive to the initial position (the initial parameter vector), as the solution depends on where the search started. These solvers also work poorly when the cost function is noisy or discontinuous.
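To make that sensitivity concrete, here is a minimal sketch (not from any tool mentioned in this post) of plain gradient descent on the double-well function f(x) = (x² − 1)², which has two minima at x = ±1; two different starting points land in two different minima:

```python
def grad_descent(x0, lr=0.05, iters=200):
    """Minimize f(x) = (x^2 - 1)^2 by following its derivative f'(x) = 4x(x^2 - 1)."""
    x = x0
    for _ in range(iters):
        x -= lr * 4 * x * (x * x - 1)  # step against the gradient
    return x

# The solution depends entirely on the initial guess:
left = grad_descent(-0.5)   # converges near -1
right = grad_descent(0.5)   # converges near +1
```

Both answers are valid local minima, but only a good initial guess lands the solver in the one you actually want.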

Why not use a global solver all the time?

Simply because it consumes a lot of time, as it has to check the entire solution space. Consider a system of 20 variables with a granularity of 0.1 and bounds of ±5: that is (10/0.1)^20 possible solutions, and there are huge systems with thousands of parameters. To reduce the workload, bounds for the parameters can be introduced and the model can be simplified. There are also methods such as particle swarm optimization which enable parallel processing, etc. Yet in a few quick trials of mine the results weren’t that magnificent, and coupled with the calculation time, I would not recommend global solvers unless absolutely necessary.
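The blow-up above is easy to verify with back-of-the-envelope arithmetic (the numbers are the hypothetical ones from the example, not from any real system):

```python
bounds = 10.0   # total range covered by each variable (-5 .. +5)
step = 0.1      # granularity of the search grid
n_vars = 20     # number of parameters

values_per_var = int(round(bounds / step))  # 100 values per variable
grid_points = values_per_var ** n_vars      # total candidate solutions
# 100**20 == 10**40 grid points: hopeless for exhaustive search
```

Even evaluating a billion candidates per second, a brute-force sweep of 10^40 points would take about 3 × 10^23 years.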

This is a widely researched field; most machine learning and neural-network training algorithms rely on one or more flavours of this class of solvers, so there are plenty of methods, both old and new.

L1 : Conclusion

Try a local solver, such as Levenberg-Marquardt, first. But keep its limitations in mind, particularly the need to provide a good initial parameter set.
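As a concrete sketch (using SciPy, not any tool from this post), Levenberg-Marquardt via `scipy.optimize.least_squares` recovers the parameters of a small exponential model when given a reasonable initial guess:

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic, noise-free measurements from y = a * exp(b * x)
x = np.linspace(0.0, 1.0, 20)
a_true, b_true = 2.0, -1.5
y = a_true * np.exp(b_true * x)

def residuals(params):
    """Residual vector the solver drives toward zero."""
    a, b = params
    return a * np.exp(b * x) - y

# method='lm' selects Levenberg-Marquardt; the initial guess matters!
result = least_squares(residuals, x0=[1.0, 0.0], method='lm')
a_fit, b_fit = result.x
```

With a poor starting point, the same call can stall in a local minimum, which is exactly the limitation discussed above.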

Lesson 2: Derivative free solvers?

This is another important choice, and it will definitely ruin your day if not taken proper care of! Ask yourself:

  1. Is the cost/ fitness function differentiable?
  2. Is it noisy?
  3. Is there an analytical solution? Or is the numerical solution good/ stable?

If the answer to the first question is yes, then the options are wider; otherwise a derivative-free solver has to be used, or the cost function could be refactored to be differentiable.

If the second answer is also yes, then one might need to employ smoothing functions, or again refactor the cost function.

If the function is differentiable but an analytical derivative is too much effort, numerical differentiation can be used. But numerical differentiation poses additional dangers and can severely degrade performance if not used with care; I would call this the most important lesson learnt in my case. Once I saw a 10% drop in failures just by switching to the central difference method alone! An alternative is automatic differentiation (available in libraries such as Eigen).
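The gap between the two schemes is easy to demonstrate; a quick sketch comparing forward and central differences on sin(x), whose exact derivative is cos(x):

```python
import math

def forward_diff(f, x, h=1e-5):
    """First-order accurate: truncation error ~ O(h)."""
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h=1e-5):
    """Second-order accurate: truncation error ~ O(h^2)."""
    return (f(x + h) - f(x - h)) / (2 * h)

exact = math.cos(1.0)
err_forward = abs(forward_diff(math.sin, 1.0) - exact)
err_central = abs(central_diff(math.sin, 1.0) - exact)
# err_central is several orders of magnitude smaller than err_forward
```

The choice of step size h matters too: too large and truncation error dominates, too small and floating-point cancellation takes over.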

So far, in my experience, gradient-based (or similarly motivated) solvers converge faster than simple brute-force searches under the right conditions, so use them if possible. Otherwise, genetic algorithms, CMA-ES, pattern search, particle swarm optimization, etc. can be used.

Lesson 3: Think before you leap!

Before wasting hours or days, think through which solver is good for the particular case. While it is tempting to reach for Levenberg-Marquardt (probably its third mention in this post), which works pretty well for a vast range of situations, it might not be the best tool for the job. Also pay attention to differentiability and to the quality of the differentiation if such solvers are used.

Picking the right language for the task..

In the past year I had to juggle multiple programming languages much more intensively. While it helped me improve and refresh my skills, I also came to some conclusions.

Knowing how to select a language for a task, and knowing these languages and their capabilities, is useful in an R&D setting: time and effort can be saved on writing code, and focus can be put on the actual research.

The views below are my personal opinions based on past experience; they might not be the most correct.

C/ C++

Best for high-performance, real-time work, but can be a hindrance for quick testing or prototyping. My usual setup is CMake + C++, commonly used for OpenCV projects and the like. I’m not very happy with it when it comes to things like REST calls and JSON objects…

IDE support can sometimes be a bit annoying, especially finding an open source one; they tend to hog memory or CPU. I use Qt Creator, VS Code (not to be confused with Visual Studio) and plain text editors.

Yet when written well, it’s in its own class of beauty 😛


Python

After a long pause, I got back to using Python. I paid more attention to code quality, and learnt more Python 3 specifics.

Quite nice for quick prototyping and can handle many things well. Scientific calculations, matrices, etc. have excellent support through libraries such as NumPy and SciPy. In essence, it is entirely possible to use Python in place of Matlab; it can be quite fast and is of course free of charge.

These libraries perform well too. Thanks to the nice support for calling C/C++ APIs from Python, computationally intensive algorithms can be written in C/C++ and wrapped for Python (calling C++ from Java was a nightmare in comparison). The same feature is useful when providing Python APIs for C/C++ libraries, or when calling computationally intensive, optimized algorithms.
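As a minimal illustration of that C-calling story (using only the standard-library ctypes module against the system C math library; this is a generic sketch, not from any project in this post):

```python
import ctypes
import ctypes.util

# Locate and load the C math library (name resolution is platform dependent)
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

root = libm.sqrt(9.0)  # calls the compiled C implementation
```

For anything serious, wrapper generators (Cython, pybind11, SWIG and friends) do this plumbing for you, but the principle is the same.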

I’m not very convinced about production-level code; it is probably better to convert to Java or C++ (or use a wrapper?) when bundling. These questions mainly arise when dealing with proprietary code that has to be deployed to customers. There is a bytecode-based distribution option like Java’s, but I don’t know much about it.

Python definitely does better than JS for prototyping, especially for matrices and similar work.


Javascript

Alright for quick prototyping with NodeJS. Being a born-in-the-browser language, its support for the web and related things, plus async programming, is nice.

However, for larger projects, consider writing in TypeScript or a similar typed language for better code quality and easier debugging/IDE support.

There are at least two libraries for everything, but finding a feature-complete, good one isn’t that easy. Don’t even bother doing matrix calculations or “big data”/statistical work in this 😛

I rather stupidly implemented a camera calibration tool in JS, since the debug framework was NodeJS + browser; I pretty much wrote everything, including a (by now bug-free) Levenberg-Marquardt solver.


Java

Despite “write once, run anywhere”, C++ is preferable on some fronts (with many FOSS libraries available on most platforms). Java can do pretty much everything C/C++ can, but that doesn’t mean it should be used for everything!

However, deploying to many platforms is easier with Java. Yet I don’t fully agree with the “enterprise-grade stuff is written in Java” story.

I believe Java usage in enterprise environments was accelerated by Sun certifications, hardware, the very nature of the JVM, and the many mature tools written in Java. My disagreement comes from the fact that there are better tools out there for certain jobs, but some people cannot do anything without Java, which can become a bottleneck in R&D environments.

Languages and frameworks should be chosen for the task and situation, not because the programmer or system architect has __ years of experience with Java-based systems.


PHP

This used to be my de facto choice for server-side web apps, rendering, etc. WordPress, Joomla and friends are written in PHP!

It’s quite fast (according to a recent benchmark I saw, PHP 7.0 is only slower than -O2-compiled C++ code). In its approach to variable typing, PHP is similar to Javascript, and the code can look very messy.

Decent-looking, readable code can be written in PHP with good use of classes and well-made frameworks. For web servers, Java tends to be generally slower in this area, while PHP is optimized and pretty much made for this job.


While it is possible to do pretty much everything in any of these languages, it should be noted that all of them are generally implemented in C/C++. Why? Performance. But that doesn’t mean everything should be written in C/C++.

Some people like to write everything in Java, C/C++ or Python. What I wanted to point out in this rather entangled and not-so-verbose post is that tools are there for convenience and to get the job done!

Don’t use a sledgehammer to crack a nut

Low cost LIDAR experiment. Part 1


LIDAR (Light Detection and Ranging) is a technology similar to radar, used to measure distance using light. The measuring principle is generally the same as basic radar’s: the time-of-flight technique (measuring the time taken for light to travel).
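The underlying arithmetic is simple; a quick sketch of both ranging calculations used later in this project (time-of-flight and phase-shift), independent of any particular hardware:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_distance(round_trip_time_s):
    """Time-of-flight: light covers the distance twice (there and back)."""
    return C * round_trip_time_s / 2.0

def phase_distance(phase_rad, mod_freq_hz):
    """Phase-shift method: distance from the phase delay of a modulated beam.
    Unambiguous only within half the modulation wavelength."""
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)

# A 1 m target delays the pulse by ~6.67 ns round trip:
t = 2.0 * 1.0 / C
d = tof_distance(t)
```

The nanosecond-scale timing in the example is exactly why the receiver and timing circuits discussed below are the hard part of a low-cost build.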

I wanted to build or find a low-cost setup a few years back, but the search was quite useless 😛. Building one needs sensitive components, and finding such specialized components in Sri Lanka is almost impossible, so I gave up.

At present there are one or two low-cost products, such as the LIDAR-Lite (approx. $150, with a max range of 40 m). There was a Kickstarter project as well, with similar pricing.


For the first stage, I’ll simply experiment with available research and low-cost components. Later on, the optics and range will be improved.

  • Will use PIN photodiodes instead of the commonly used APD (avalanche photodiode); the cost and the high-voltage drive circuit an APD requires make it unattractive.
  • Attempt both the phase-shift method and the time-of-flight method.


  1. Light emitting part
  2. Light receiver and amplification circuit. This is probably the most crucial stage for a low cost setup due to the less sensitive PIN photodiodes.
  3. Timing circuit
  4. Final processing circuit.

Trial 1

My first trial used an Osram SFH 4545 IR emitter diode and an Everlight PD333-3C photodiode.

IR emitter

I used a TI Tiva C LaunchPad to generate the needed pulse output and a BC337 transistor to drive the LED. This setup is not ideal; I need to use a MOSFET to get good rise/fall times and current output (this LED can handle pulses up to 1 A).

IR Receiver

I started with the components I had available. Judging by the literature, the needed circuit is called a “transimpedance amplifier”; this simply means the amplifier converts a current into a voltage. The reason is that photodiodes actually generate a current, rather than a voltage, in response to light. After referring to articles mainly from Texas Instruments, I constructed the following circuit. I did not do any formal calculation or analysis; this was just a trial-and-error setup.

The op amp I had at hand was the TLC25L4A, a low-power op amp with decent gain, though its bandwidth is not that impressive. Nevertheless, this circuit was the foundation of the next iteration.

Op amp: TLC25L4A, R2 = 3 MΩ, CR1 = PD333-3C. Image from SBOA060, a TI application note.
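For orientation, the ideal transimpedance relation is just Ohm’s law across the feedback resistor; a rough sketch using the R2 value above (the photocurrent value is a made-up example, not a measurement from this circuit):

```python
def tia_output_volts(photocurrent_a, feedback_ohms):
    """Ideal transimpedance amplifier: |Vout| = Ipd * Rf."""
    return photocurrent_a * feedback_ohms

# A hypothetical 1 uA photocurrent through the 3 Mohm feedback resistor:
v_out = tia_output_volts(1e-6, 3e6)
```

A large feedback resistor gives a large gain, but together with the photodiode capacitance it also limits the bandwidth, which matters for nanosecond-scale pulses.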

Results and limitations

The circuit amplified the signals picked up from the photodiode reasonably well, but the range was severely limited. Measurements were taken with my five-year-old DSO Quad oscilloscope.

The main issue I faced was the lack of an optical filter on the photodiode: it picked up the 50 Hz flicker from the room lighting, etc. My attempts at filtering were rather futile (I tried RC filters only; I didn’t have inductors at hand).

The maximum range was approximately 60 cm! The amplifier simply did not have enough gain.

In the next installment, I’ll explore the next set of circuits and other changes such as IR pulse width, etc.

File services with Dreamfactory (file creation)

This and the following posts will cover areas of the file API that are not clearly covered in the Dreamfactory documentation. The focus of this article is file and folder creation.


  1. Create a file
    1. Set content via JSON Request Body
    2. Set content via Multi-part form upload (file upload)
    3. Download to server from URL
  2. Create a folder
  3. Combine folder and file creation

General Information

For all these requests, a JSON request body is used in the following basic format.

POST : each element of the resource array defines one file or folder request.

  • type : file or folder
  • name : name of file
  • path : path to file
  • content_type : media type, e.g. application/json
  • content : contents of the file
    {
        "resource": [
            {
                "name": "folder2",
                "path": "folder2"
            },
            {
                "name": "folder2/file1.txt",
                "path": "folder2/file1.txt",
                "content_type": "text"
            }
        ]
    }

1. Create a file

1.1 One or multiple files with JSON data.

Type : POST

Note: setting the property “is_base64”: true enables uploading images with “content” set to a Base64-encoded string.

Warning: using Base64 encoding for large images is highly discouraged! (I tried uploading an 18 MP image and it was not worth the trouble.) For those cases, go for direct file upload via a multi-part form.
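For small payloads, producing the encoded “content” string is one line of Python (a generic sketch, not Dreamfactory-specific):

```python
import base64

raw = b"hello"  # file bytes to embed in the JSON body
content = base64.b64encode(raw).decode("ascii")
# content == "aGVsbG8=" -- goes into the "content" field with "is_base64": true
```

Note that Base64 inflates the payload by about a third, which is part of why it scales so badly for large images.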

    {
        "resource": [
            {
                "name": "folder2/file1.txt",
                "path": "folder2/file1.txt",
                "content_type": "text"
            },
            {
                "name": "folder2/file2.txt",
                "path": "folder2/file2.txt",
                "content_type": "text"
            }
        ]
    }

This will create two files in the folder named “folder2”. If the folder does not exist, an error will occur.
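A hedged sketch of how such a request could be assembled in Python (the host, service name and header values are placeholders of mine, not from this post; the actual POST is left commented out):

```python
import json

# JSON body matching the example above: two text files inside "folder2"
payload = {
    "resource": [
        {"name": "folder2/file1.txt", "path": "folder2/file1.txt",
         "content_type": "text", "content": "first file"},
        {"name": "folder2/file2.txt", "path": "folder2/file2.txt",
         "content_type": "text", "content": "second file"},
    ]
}
body = json.dumps(payload)

# Hypothetical call -- adjust host, service name and credentials for a real instance:
# requests.post("https://example.com/api/v2/files/",
#               data=body,
#               headers={"Content-Type": "application/json",
#                        "X-DreamFactory-Api-Key": "<api key>",
#                        "X-DreamFactory-Session-Token": "<token>"})
```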

1.2 Multi-part form upload (file upload)

When I tried to upload some big images, Base64 encoding hurt performance at every level, so direct file upload was the best option for my scenario. Unfortunately this was not clearly documented in the wiki.

Type : POST (multi part form)

I have tested and used the following three methods.

Method 1. Plain HTML + minimal Javascript.

This can be done in a few ways. The example below is adapted from “test_rest.html”, which ships with Dreamfactory for testing REST calls. Of course, the Javascript can be dropped entirely if needed.


<form enctype="multipart/form-data" onsubmit="postForm(this)" action="/api/v2/system/user/" method="POST">
 <input type="hidden" name="app_name" value="admin" />
 <!-- MAX_FILE_SIZE must precede the file input field -->
 <input type="hidden" name="MAX_FILE_SIZE" value="3000000000000" />
 <!-- Name of input element determines name in $_FILES array -->
 Test importing users via file: <br/>
 <input name="files" type="file" />
 <input type="submit" value="Send File" />
</form>


function postForm(form){
    var jwt = $('#token').val();   // session token
    var apiKey = $('#app').val();  // API key for the app
    var url = $('#url').val();     // URL for the REST call
    form.action = url + '?session_token=' + jwt + '&api_key=' + apiKey;
    // the token and API key can also be sent as headers (if going for an AJAX call)
}

Method 2. JQuery AJAX Based.


Method 3. Java (okHTTP) based

This method adapts the Java SDK example already provided for Dreamfactory; I have modified it for my purposes.

Note: this method was written for Java 8 on a PC. The method for Android is different and can be found in the API info.

public void addImageFromLocalFile(String fileServiceName, String imageDir, String imageName,
        File imgFile, Callback<FileRecord> callBackFileRecord) {

    RequestBody requestBody = new MultipartBody.Builder()
            .setType(MultipartBody.FORM)
            .addFormDataPart("files", "imageName-1.png",
                    RequestBody.create(MediaType.parse("image/png"), imgFile))
            .build();

    final ImageService imageService =
            DreamFactoryAPI.getInstance(App.SESSION_TOKEN).getService(ImageService.class);

    imageService.addLocalImage(fileServiceName, imageDir, imageName, requestBody)
            .enqueue(callBackFileRecord);
}

// The corresponding service declaration (part of the ImageService interface):
Call<FileRecord> addLocalImage(@Path(value = "file_service_name") String fileServiceName,
        @Path(value = "id") String contactId,
        @Path(value = "name") String name,
        @Body RequestBody file);

1.3 Download to server from URL

This method is quite straightforward and is explained in the Dreamfactory wiki.

2. Create Folder

Creating a directory is similar to creating a file; the request format is called FolderRequest. The difference is that you address a directory in the API call. From the looks of it, this may even work without specifying the exact folder beforehand!

    {
        "resource": [
            {
                "name": "folder2",
                "path": "folder2"
            }
        ]
    }

3. Combine Folder and File Creation

Using the same URL format as above, the only difference is that you can ask Dreamfactory to create a folder and put files into it in one request. First the folder to be created must be specified, then the files to place inside it.

    {
        "resource": [
            {
                "name": "folder2",
                "path": "folder2"
            },
            {
                "name": "folder2/file1.txt",
                "path": "folder2/file1.txt",
                "content_type": "text"
            }
        ]
    }


The file API is quite nice and creates a layer between the filesystem and our applications: you can easily switch to cloud-based storage or a different network drive without the users noticing.


Dreamfactory – API Automation!

I came across Dreamfactory while a colleague and I were searching for a REST API framework for PHP. In summary, this framework saved a lot of setup and development time! Notably, it is an open source project and has enterprise support.

The most valuable feature I found is the ability to connect to almost all major database types and automatically generate the REST API calls. On top of that, the system offers role-based authentication and many other features.

What you can do with Dreamfactory;

  • Connect to a database and get all necessary REST calls
  • User management, role based authentication, application level access control
  • Custom server side scripting with v8JS, PHP, Node.JS, Python
  • Auto generated API documentation with “try now” option (Based on swagger)

Setting up this framework needs some practice and experience with the command line, but following the wiki articles will certainly do the job.

Performance: not so great! Depending on the server, processing a REST call may take up to half a second or more.

You can try Dreamfactory on their trial accounts or you can clone the Git repo and set it up on a local machine or a hosted server.

Setting up GCC, CMake, Boost and Opencv on Windows

Background Story

For a project I’ve been working on, the need arose to build the program for Windows. The project was written in C++ and used the OpenCV and Boost libraries; for ease of configuration I employed CMake.

Despite the target being Windows, I was developing and testing everything under GNU/Linux 😛. Fortunately I had managed to write the code with a minimal amount of native Unix API calls; for example, file handling was done via Boost Filesystem, and so on.

Therefore the only consideration was running the CMake script on Windows, using either Visual Studio or GCC (via MinGW). The first attempt was with VS 2013, but compilation failed due to a VS C++ compiler bug related to C++ template classes, and getting a later version was taking time. So I gave GCC on Windows a try!


First, I wanted to see whether building C++ on Windows with relative ease is possible. Second, I wanted to avoid a dependency on Visual Studio. This might be especially useful if you code for commercial work but cannot afford the license, or simply dislike Visual Studio 😛

Step 1. Install MinGW

I went for a MinGW-w64 build since it’s more widely accepted and supports 64-bit. (Don’t expect citations justifying this 😀)

  1. The MinGW-Builds distribution was chosen, as I didn’t want to install Cygwin or Win-Builds. It’s a simple install, and the following combination of settings worked for me.
    • Target Architecture – 64 bit (personal choice, depends on target system)
    • Threads – Win32 (Some recommend POSIX over Win32; however, the OpenCV build failed with mysterious problems under POSIX threads)
    • Exception – seh (I didn’t do much research here, just kept the first available option)
  2. Once the installation is complete, navigate to the install location and look for “bin” folder.
  3. Add that location to the PATH variable
    1. Control Panel -> System -> Advanced System settings -> Environment Variables -> system variables -> choose Path and click Edit
    2. Append the path to “bin” folder into the PATH variable. (There are tons of guides of how to do this)
  4. Open a command line (Start -> cmd.exe)
  5. Type the following command; it should show the version and other info
    • gcc -v
    • This merely confirms that setting the PATH variable worked.
  6. Once gcc works, navigate to the “bin” folder of the MinGW install, make a copy of “mingw32-make” and rename it to “make” (this executable provides “make”). This step is for convenience with CMake 😉

Step 2. Install CMake

  1. Download the installer from the CMake website
  2. Run the installer
    • It’ll ask whether to add the CMake binary location to the PATH variable – tick yes, preferably system-wide.
  3. Open a command line (cmd.exe) and run the following
    • cmake --version
    • Provided everything works, the output will show the CMake version.

Step 3. Build and Install OpenCV

  1. Download the OpenCV source archive
    • I built 3.1.0 with default options.
  2. Extract the archive (ie: D:\opencv-3_1_0)
  3. Open CMD and change directory to source folder of opencv. From this step, all commands would be executed through CMD unless otherwise noted.
    • D:
    • cd D:\opencv-3_1_0\source
    • The first command is not needed if OpenCV resides in C:. For other partitions, enter the partition name to switch to it, then use the cd command. (I find this inconvenient 😛)
  4. Run CMake. The main change is the additional “MinGW Makefiles” generator parameter. Other than that, this step pretty much follows the standard OpenCV documentation. Add whatever arguments fit your needs!
    • cmake -G "MinGW Makefiles" [other arguments] ../build
    • The “../build” at the end designates the “build” directory as the destination of the Makefiles
  5. Once Cmake configures successfully, change to the build directory and execute make.
    • cd ../build
    • make -j5
    • I used -j5 as my computer has 4 logical processors, so 5 threads is well enough to fully load it! If the computer has 8 cores, use -j8 or -j9
    • Use the multithreaded compile option (-j5) with caution, some laptops tend to go into thermal shutdown with maximum load!!
  6. If the build completes successfully, next run make install
    • make install
    • This step will finalize the install.

Step 4. Build and Install Boost

  1. Download the Boost source archive
  2. Make sure to download the source package!
  3. Extract the archive, enter the directory from CMD.exe (example below)
    • cd "D:\Program Files\boost\"
  4. Run the following commands
    • bootstrap.bat gcc
    • b2 --build-dir=build cxxflags="-std=c++11" -j5 --with-filesystem --with-system define=BOOST_SYSTEM_NO_DEPRECATED toolset=gcc stage
    • Note the use of the “toolset=gcc” and “-j5” options; these are self-explanatory!
    • The “--with-<library_name>” flag is used to explicitly include only the necessary libraries. If you choose to build all libraries, don’t specify it at all.
  5. If everything works fine, the Boost build is complete! Navigate to the “stage” folder and go through the inner folders; there you’ll find the compiled DLL files (ie: libboost_system_xxx_mingw_xx.dll).

Step 5. Setting up the CMake Script to work with Windows

  1. Usage
    • cmake -G "MinGW Makefiles" .
  2. I mashed up the OpenCV and Boost CMake example scripts plus some of my own experiments to come up with the following CMake script.
  3. This is a bare-bones CMake script; it has to be modified to suit your own project. I’m not an expert in CMake, so there is surely room for improvement.
  4. I’ve tested the script with following configurations for the same exact project.
    • Kubuntu (16.04 LTS) with CMake 3.2.2, OpenCV 3.1.0, Boost 1.58
    • Windows 7 Professional, with CMake 3.6.0-rc2, OpenCV 3.1.0, Boost 1.61
  5. The script is an updated version of the one featured in my previous post on Boost, OpenCV, CUDA and CMake on Linux
cmake_minimum_required(VERSION 3.2 FATAL_ERROR)

set(execFiles test.cpp)

# Use forward slashes in paths; backslashes are escape characters in CMake strings.
set(OpenCV_DIR "D:/opencv-3-10/build") # Change this

set(CMAKE_CXX_FLAGS "-std=c++11 -lm")
set(BOOST_ROOT "F:/Program Files/boost_1_61_0") # Change this, critical!

find_package(OpenCV REQUIRED)
find_package(Boost 1.58 COMPONENTS filesystem system REQUIRED)
message(STATUS "Boost include dir: ${Boost_INCLUDE_DIRS}")

add_executable(testApp ${execFiles})

target_link_libraries(testApp ${OpenCV_LIBS} ${Boost_LIBRARIES})

Possible Issues, Observations

I came across a few issues while setting up GNU Toolchain on Windows as well as configuring Boost, Opencv.

  1. CMake complains “CMake was unable to find a build program corresponding to “MinGW Makefiles”. CMAKE_MAKE_PROGRAM is not set.”
    • Cause : CMake does not seem to recognize “mingw32-make” as the make program, despite the CMake documentation saying it works
    • Workaround is making a copy of mingw32-make and rename it as “make”
    • The workaround may clash with existing “make” executable in the PATH. So take care!!
  2. When configuring Boost, “‘cl’ is not recognized as an internal command”
    • Cause : Not specifying toolchain when executing bootstrap.bat and b2.exe
  3. Windows shows errors like “libboost_system_xx_mingw_xx.dll is not installed”, “libopencv_imgproc310.dll is not installed” or similar
    • Cause : Windows cannot locate the DLL files.
    • Simplest fix is just copying the necessary DLL files and package them when distributing.
    • Warning : Always check legal matters (license agreement) before packing libraries that are not owned by you. Even if the libraries are open source, the license type may restrict distribution in binary format like this.

Nvidia Optimus, Bumblebee and CUDA on Kubuntu 15.10

I decided to write this post after experiencing a chain of weird events while setting up CUDA on Ubuntu (Kubuntu, etc.)!

I’d used CUDA with OpenCV on Arch Linux in 2014–2015 and it wasn’t too hard to get working. But the story with Ubuntu is completely different 😛

The first path is the nvidia developer repo; it has its own perils, but you get the latest CUDA version (7.5). The second path is the Ubuntu-provided way, which is much safer but not the latest (6.5).

Option 1: CUDA through the nvidia repo (nvidia proprietary drivers + nvidia-prime)

Follow the guide on the nvidia developer site.

It will replace the Ubuntu driver with nvidia’s version (352.79 instead of 352.68 in my case).

Warning 1: Do not reboot right away, or you may land on a black screen and have to boot from a live CD and chroot!! From my experience, you have to run gpu-manager manually first (explained below).

Warning 2: nvidia-prime will set the nvidia chipset as the default. This doesn’t work very well and consumes more power (it caused KDE to crash with multiple monitors, strange font sizes, etc.).

  • Use nvidia-settings and set intel as the primary chipset.
  • Login from the command line (alt + ctrl +f1);
  • Run the following commands: the first stops the display manager (sddm for Kubuntu; KDE dropped KDM with Plasma 5), the second runs gpu-manager, which goes through the configuration.
  1. sudo systemctl stop sddm
  2. sudo gpu-manager
  • Observe the output of gpu-manager. Now you can reboot and see the results.
  • Make sure everything works fine (multiple monitors, etc).

Option 2: CUDA through the Ubuntu repo (nvidia proprietary + bumblebee or nvidia-prime)

Install the nvidia drivers. The easiest path is the “driver manager” tool of Ubuntu/Kubuntu; in Kubuntu it’s accessible in System Settings.

Note “Driver Management” icon in hardware section.
Choose the nvidia proprietary driver. (352 recommended)

Next, install the following packages: nvidia-cuda-dev and nvidia-cuda-toolkit

sudo apt-get install nvidia-cuda-dev nvidia-cuda-toolkit

Now run “nvcc -V” and see if the compiler runs.

Getting CUDA to work with cmake and gcc

I prefer a CMake script for OpenCV projects, so I’ll explain that method; other options are easily found on the internet.

If you look closely at the compatibility matrix on the nvidia website, the maximum supported gcc version at this time is 4.9 for CUDA 7.5. The issue is that Ubuntu 15.10 ships gcc 5+.

So the first fix was to install gcc 4.9 and point nvcc at it. In CMake scripts, the following declarations worked for me; in addition I had to specify some more info. Some people suggest editing nvcc.profile, but I didn’t bother since I was already using CMake for the OpenCV projects!

set(CUDA_TOOLKIT_ROOT_DIR "/usr/local/cuda")
set(CUDA_HOST_COMPILER "/usr/bin/gcc-4.9")
find_package(CUDA 7.5)

The first two lines were found via the OpenCV CMake config; the third had to be added after CMake complained about a missing CUDA_CUDART_LIBRARY.

Now everything should work fine.

If you get “invalid device ordinal” when running CUDA apps, the reason is that the driver does not seem to reload properly on resume from sleep; dmesg shows this as the GPU “falling off the bus”. I couldn’t find a fix so far; I guess editing the config of nvidia-prime or bumblebee may help.

CUDA devicequery

Some history without dates 😛

Initially, the only solution for graphics switching and keeping the nvidia chipset from overheating or acting strange required some hacking of ACPI calls. Then the Bumblebee project emerged. Thanks to the bumblebee daemon, proper power management became possible. Some time later they released a kernel module called “bbswitch”, which makes life even easier by enabling power management automatically.

From the beginning I went with bumblebee + the nvidia proprietary drivers, so as usual I went that way with Ubuntu.

By now, Nvidia has released their own version of Optimus switching for Linux, packaged as nvidia-prime; in the past, the only alternative was bumblebee. I’m yet to see which works better, but I’m not much concerned, as my only use for the nvidia chip on Linux is CUDA-based work. On Windows, however, it sees enough action ;).