Pitfalls of im2rec in MXNet

I gave MXNet a try, and it does feel noticeably faster than TensorFlow. I also deployed it to mobile before [LINK], which was quite convenient.
During training, MXNet can read images through ImageDataIter from a preprocessed rec file. A rec file is basically one big file that packs labels and images together. Training from a rec file with a DataIter keeps GPU utilization high.

I ran into two problems that cost me some time.

1. When generating a rec file with im2rec.py, each line of the input list file is: image index + label + image path. The fields must be separated by \t; otherwise you get strange errors.
2. im2rec.py reads images with OpenCV and then converts them from BGR to RGB. The decoded image is HxWxC and needs to be transposed to CxHxW; otherwise the results will, naturally, be much worse.
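Both points can be sketched in a few lines (pure Python for illustration; in practice the transpose is just numpy's `img.transpose(2, 0, 1)`):

```python
# Build one line of the .lst file: index, label, path, separated by TABs.
def make_lst_line(index, label, path):
    return "\t".join([str(index), str(label), path])

# Transpose a nested-list image from HxWxC to CxHxW.
# chw[ch][y][x] == hwc[y][x][ch]
def hwc_to_chw(img):
    h, w, c = len(img), len(img[0]), len(img[0][0])
    return [[[img[y][x][ch] for x in range(w)] for y in range(h)]
            for ch in range(c)]
```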

Crazyflie 2.0 + Camera

I recently added a camera to my Crazyflie, which was a lot of fun.

I had read quite a few posts online discussing possible solutions, but never found one that was simple and practical.

IMG_6237

IMG_6239

IMG_6238

In the end, an all-in-one camera with built-in wireless transmission, like the one above, turned out to be the most convenient option. It is also light enough that the Crazyflie carries it with no trouble.

Crazyflie 2.0 + Flow Deck + Camera!

Powering the camera directly from the VCOM pins lets the Crazyflie's own power switch turn the camera on and off. Apart from power, the camera is completely independent of the Crazyflie.

The next problem is how to receive the image from the Crazyflie on another machine. Since the camera outputs an analog signal, an analog-to-digital module is needed. I had already found this part troublesome long ago, back when building a smart wheelchair (^_^). The usual advice is to find a little device called an EasyCap, but there seem to be so many clones of EasyCap (or Ez-Cap) that it is hard to find one actually compatible with both your signal and your system. Moreover, an EasyCap is only the A/D module; before it you still need a multi-channel wireless receiver, which usually requires a 12V power supply. The whole receiving setup ends up quite bulky, and it does not necessarily even work well…

Fortunately, I recently found an all-in-one USB-powered module that combines the wireless receiver and the A/D converter. It is excellent. A quick test with a random camera-equipped drone:

OpenCV can read the frames directly. Time to do some computer vision!

IMG_6268

"Distribute" an under-development app without enrolling in the iOS Developer Program

You are developing an iOS App.

Prior to Xcode 7, you had to enroll in Apple's iOS Developer Program (IDP) to install an app on a device, either yours or your friend's.

Now in Xcode 7, you can compile and deploy an app still under development to your own device.
The problem is: how do you "distribute" your app to your friend?

I put "distribute" in quotes because I am talking about a casual situation: not serious distribution, just sharing within a small circle.

If your friend lives nearby, this is easy. You do the same thing you did for yourself: connect his/her device to your Mac and use Xcode to install the app.

Making this happen remotely takes some work. The solution below only works if your friend has a Mac with Xcode 7 installed.

There are two options.

1. You send the whole project to your friend and ask him/her to compile it and deploy it to the device. This requires some expertise if your project is at all complex.

2. After compiling your app for a device, you can get the actual "AppName.app" bundle from Xcode. You send this "AppName.app" to your friend. He/she can open Xcode, go to "Window -> Devices", and drag-and-drop "AppName.app" onto his/her device.

However, the second option will very likely fail with the error "a valid provisioning profile for this executable was not found" on your friend's screen.

This is because, without IDP, the provisioning profile we get from Xcode contains exactly the UDIDs (unique device IDs) of the devices connected to Xcode during compilation. Any other device will refuse the app unless its UDID is listed in the provisioning profile. And without IDP, we cannot explicitly add a UDID to a provisioning profile.

Here is the workaround.

1. Register a new Apple ID if you don't want to share your own Apple ID with your friend (for good reason).
2. Ask your friend to add this new Apple ID in Xcode (Preferences -> Accounts -> + button).
3. Your friend then creates an iOS project (the single-view application template is fine), compiles it and installs it on the device. During this process, Xcode will offer to "fix issues". Confirm, and make sure the right Apple ID is selected in the "Team" field.
4. Now the UDID of the remote device is associated with that Apple ID.
5. On your own machine, use the same Apple ID to compile the app for a device and send out the "AppName.app" bundle. Because the UDIDs of both your device and the remote device are now listed in the provisioning profile, "AppName.app" can be installed on the remote device through either iTunes or Xcode.

This solution is not perfect, but it works well without paying $99 a year. Good luck!

Build a Robot From A Power Wheelchair (1/2)

I have been working on a robotics project for a while. In this project, we build a robotic wheelchair controlled by head motions.

IMG_2245

It starts from a commercial powered wheelchair (Titan Front Wheel Drive Power Chair).

We took over the control signals, added a number of sensors and wrote some computer vision software. It is a lot of fun. At this point, we can drive it with head motions; a little slow, but very easy to control. A short video is posted below.

Robotic Wheelchair
[KGVID width="640" height="360"]http://personal.stevens.edu/~hli18/video/robot.mp4[/KGVID]

The most challenging part so far, to our surprise, was taking over the driving signals. In fact, once a powered wheelchair can be controlled with digital signals, it is more or less a robot.

Quite a few posts discuss how to hack wheelchair joysticks with an Arduino. If you have a VR2 joystick, it is relatively easy. This post (http://myrobotnstuff.blogspot.com.au/2012/07/circuit-for-wheelchair-robot.html) explains the process well. Basically, you can reproduce with the Arduino the digital signals the joystick generates.

Unfortunately, the joystick that ships with the Titan power chair has a different design and is much harder to hack.
It is a Dynamic Shark controller. If your remote (joystick) looks like this one, you probably have this kind of joystick.

The difficulty in reproducing the control signals of this kind of joystick is that the remote sends 25V differential signals directly to the power module, a big black box under the seat. The signals are encoded, and there is probably no way to obtain the encoding protocol.

We went through quite a few rounds of trial and error, and finally figured out a plan after burning out a joystick…

IMG_1587

This kind of joystick works in a special way. At the end of the joystick's handle there is a small coil, and four small coils sit on the board underneath it. Through electromagnetic induction, the coil on the handle induces different signals in the four coils below depending on its position.

Roughly speaking, the (analog) signals produced by the four small coils are then processed by an on-board microcontroller to produce the driving signal. Because the microcontroller's output is encoded in an unknown protocol, we chose to tap in by reproducing the outputs of the four small coils.

If you take away all four small coils (DO NOT do that…), the board looks like this


After some processing circuitry, the outputs of the four coils finally appear as two signals past the two resistors in the red circles. If you probe them with an oscilloscope, you will find that they are sine waves!

With this observation, the plan is clear: solder two wires to the right ends of these two resistors (be careful!) and use an Arduino to reproduce these two sine waves to drive the wheelchair.

One of the two sine waves controls driving in the X direction and the other the Y direction.
The phase of each sine wave encodes the sign (forward or backward along the axis), and the amplitude encodes the speed. The two sine waves should be roughly aligned: their phase difference should be either 0 or 180 degrees.

With an Arduino, we produce these two sine waves on the two wires.
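The encoding just described can be sketched in code (Python for illustration; the frequency and sample rate below are made-up values, since the real ones depend on the controller hardware):

```python
import math

# One sine wave per axis: amplitude proportional to speed, phase flipped
# by 180 degrees for the negative direction. freq_hz and sample_rate are
# illustrative placeholders, not measured values.
def axis_wave(command, n_samples=100, freq_hz=50.0, sample_rate=5000.0):
    amplitude = abs(command)                    # speed
    phase = 0.0 if command >= 0 else math.pi    # direction (0 or 180 deg)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate + phase)
            for t in range(n_samples)]

# The drive command is just a pair of such waves, one per axis.
def drive_waves(x_command, y_command):
    return axis_wave(x_command), axis_wave(y_command)
```

Flipping the phase by 180 degrees inverts the wave sample-by-sample, which is exactly the forward/backward distinction described above.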

[Bug] g++ 4.6 argument order

I ran into a bug that looked like a g++-4.6 problem.

Here is the situation. This source file uses OpenCV:

//< file: test.cpp
#include <opencv2/opencv.hpp>

int main (int argc, char** argv) {
    cv::Mat image;
    return 0;
}

Compiled with this command:

g++-4.6 `pkg-config --libs opencv`  -o test.bin test.cpp

I got these errors:

/tmp/ccs2MlQz.o: In function `cv::Mat::~Mat()':
test.cpp:(.text._ZN2cv3MatD2Ev[_ZN2cv3MatD5Ev]+0x39): undefined reference to `cv::fastFree(void*)'
/tmp/ccs2MlQz.o: In function `cv::Mat::release()':
test.cpp:(.text._ZN2cv3Mat7releaseEv[cv::Mat::release()]+0x47): undefined reference to `cv::Mat::deallocate()'
collect2: ld returned 1 exit status

The cause is that g++ failed to link against the OpenCV libraries. After various attempts, it turned out that simply reordering the arguments makes it compile -_-!!
This command works fine:

g++-4.6 test.cpp `pkg-config --libs opencv`  -o test.bin

The exact cause was unclear to me at the time, but replacing g++-4.6 with g++-4.4 makes the problem go away.
This command also compiles normally:

g++-4.4 `pkg-config --libs opencv`  -o test.bin test.cpp 

So it looks like a g++-4.6 bug, or more likely an intentional change: the GNU linker scans libraries left to right and only pulls in what resolves symbols that are undefined at that point, so libraries should come after the source/object files that reference them. Toolchains of the g++-4.6 era (e.g. Ubuntu 11.04) also started passing --as-needed to the linker by default, which makes this ordering strict; that would explain why g++-4.4 still tolerated it.

[Vim] Using line numbers in a substitution

A small trick. Vim has a thousand virtues, and "substitute" is only one of them.

Besides the powerful regular expressions, \= is another handy tool.
Say we want to generate a file like this:

This is number 1
This is number 2
This is number 3
This is number 4
This is number 5
This is number 6
This is number 7
This is number 8
This is number 9
This is number 10

There are of course many ways. With \= it goes like this:
first type a single line

This is number X

then duplicate it into another 9 lines with

yy9p

which gives

This is number X
This is number X
This is number X
This is number X
This is number X
This is number X
This is number X
This is number X
This is number X

Then press colon to enter Command-line mode (see the post on Vim's several modes)

:%s@X@\=line('.')

and we get

This is number 1
This is number 2
This is number 3
This is number 4
This is number 5
This is number 6
This is number 7
This is number 8
This is number 9
This is number 10

\= simply evaluates the expression that follows it and uses the result as the replacement text. line('.') is a function that returns a number, the current line number, so each line's own number becomes the value of \= and replaces the X, producing exactly the desired result.

Other approaches, such as recording a macro that increments the number line by line, also work, but none is as convenient as \=.
Since everything after \= is evaluated as an expression, considerably more complex replacements are just as easy, for example (first undo the previous change; same below):

:%s@X@\=line('.')*line('.')

which yields

This is number 1
This is number 4
This is number 9
This is number 16
This is number 25
This is number 36
This is number 49
This is number 64
This is number 81
This is number 100

Personally I find this variant the most useful:

:%s@X@\=printf("%03d", line('.'))

which gives

This is number 001
This is number 002
This is number 003
This is number 004
This is number 005
This is number 006
This is number 007
This is number 008
This is number 009
This is number 010

Bringing printf into play opens up far too many possibilities. Very handy.

[OpenCV] detectMultiScale: output detection score

OpenCV provides a quite decent implementation of the Viola-Jones face detector.

A quick example looks like this (OpenCV 2.4.5 tested):

// File: main.cc
#include <opencv2/opencv.hpp>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            vector<Rect> objs;
            cascade.detectMultiScale(img, objs, scale_factor, min_neighbors);

            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}
Compile it with (keeping the OpenCV libraries after the source file, so the linker can resolve its symbols):

g++ -std=c++0x -I/usr/local/include main.cc `pkg-config --libs opencv` -o main

The detection results are as shown below:
result

For more serious use, it would be nice to have a detection score for each detected face.
OpenCV provides an overloaded function designed for exactly this, although it lacks detailed documentation:

vector<int> reject_levels;
vector<double> level_weights;
cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors);

The reject_levels and level_weights will stay empty until you call it like this (the whole file):

// File: main.cc
#include <opencv2/opencv.hpp>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            vector<Rect> objs;
            vector<int> reject_levels;
            vector<double> level_weights;
            cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors, 0, Size(), Size(), true);

            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
                putText(img_color, std::to_string(level_weights[n]),
                        Point(objs[n].x, objs[n].y), 1, 1, Scalar(0,0,255));
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}

However, this gives you a large number of detected rectangles:
result-org

This is because OpenCV skips the step of filtering out the overlapped small rectangles. I have no idea whether this is by design, but output like this is not helpful, at least in my own case.

So we need to make our own change in OpenCV's source code.
There are different ways to design a detection score. For example, from http://vis-www.cs.umass.edu/fddb/faq.html:
"In the OpenCV implementation, stage_sum is computed and compared against the i stage_threshold for each stage to accept/reject a candidate window. We define the detection score for a candidate window as K*stage_when_rejected + stage_sum_for_stage_when_rejected. If a window is accepted by the cascade, we just K*last_stage + stage_sum_for_last_stage. Choosing K as a large value e.g., 1000, we ensure that windows rejected at stage i have higher score than those rejected at stage i-1."

Actually, I found that a more straightforward detection score works well in my own work. In the last stage of the OpenCV face detector, detection rectangles are grouped into clusters to eliminate small overlapped rectangles while keeping the most promising ones. The number of final detected faces is at most the number of clusters. So we can simply use the number of rectangles grouped into a cluster as the detection score of the associated final rectangle. It may not be accurate, but it works.
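The scoring idea itself can be illustrated outside OpenCV. Below is a toy version in Python: the overlap test loosely mimics OpenCV's grouping predicate, but the eps threshold and the single-pass clustering are simplifications, not OpenCV's actual algorithm:

```python
# Toy sketch of "cluster size as detection score": rectangles (x, y, w, h)
# that overlap enough are grouped; each group is averaged into one final
# detection, and the group's size becomes its score.
def overlap(a, b, eps=0.2):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # tolerance scaled by rectangle size, similar in spirit to OpenCV's
    # grouping predicate
    delta = eps * 0.5 * (min(aw, bw) + min(ah, bh))
    return (abs(ax - bx) <= delta and abs(ay - by) <= delta and
            abs(ax + aw - bx - bw) <= delta and abs(ay + ah - by - bh) <= delta)

def group_with_scores(rects):
    clusters = []                       # list of lists of rectangles
    for r in rects:
        for c in clusters:
            if overlap(r, c[0]):
                c.append(r)
                break
        else:
            clusters.append([r])
    results = []
    for c in clusters:
        n = len(c)
        avg = tuple(sum(v) // n for v in zip(*c))
        results.append((avg, n))        # (final rectangle, score)
    return results
```

A cluster of three near-identical detections scores 3, a lone stray rectangle scores 1, so the score roughly tracks how strongly the cascade responded at that location.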

To make this change in OpenCV-2.4.5, open the file modules/objdetect/src/cascadedetect.cpp (line 200):

// modules/objdetect/src/cascadedetect.cpp (line 200)
// int n1 = levelWeights ? rejectLevels[i] : rweights[i]; //< comment out this line
int n1 = rweights[i]; //< the change

We then modify main.cc accordingly:

// File: main.cc
#include <opencv2/opencv.hpp>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            vector<Rect> objs;
            vector<int> reject_levels;
            vector<double> level_weights;
            cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors, 0, Size(), Size(), true);

            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
                putText(img_color, std::to_string(reject_levels[n]),
                        Point(objs[n].x, objs[n].y), 1, 1, Scalar(0,0,255));
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}

And we get detection scores like this:
result-final

On 2-dimensional arrays in C++

I was asked about this today. In practice, I rarely use 2-dimensional arrays; instead I use vectors of vectors.

To allocate a 2-d array on the stack, the C-style array is

int d[2][3];

Then an element is referred to as

d[i][j];

For dynamic allocation, one can NOT write

int **wrong_d = new int[2][3];

since the d in int d[2][3]; is not an int**; instead it is of the type

int (*)[3]

or, in your evil human words, a pointer to int[3].

It is a little tricky to declare a 2-d array dynamically.

int (*d)[3] = new int[2][3];

Or

int v1 = 2;
int (*d)[3] = new int[v1][3];

The number 3 here can NOT be replaced by a non-constant. My understanding is that this value is part of the type of d, which, in a strongly typed language like C/C++, must be known to the compiler.

We can check the sizes of the variables to verify this interpretation.

#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    int v1 = 2;
    int (*d)[3] = new int[v1][3];
    cout << "sizeof(d): " << sizeof(d) << endl;
    cout << "sizeof(d[0]): " << sizeof(d[0]) << endl;
    cout << "sizeof(d[1]): " << sizeof(d[1]) << endl;
    cout << "sizeof(d[0][0]): " << sizeof(d[0][0]) << endl;
    return 0;
}

The output is:

$./test 
sizeof(d): 8        //< a single pointer, on a 64-bit machine
sizeof(d[0]): 12    //< int[3], d[0][0] d[0][1] d[0][2]
sizeof(d[1]): 12    //< int[3], d[1][0] d[1][1] d[1][2]
sizeof(d[0][0]): 4  //< an integer

To save your life, I would recommend using vectors.

int v1 = 2;
int v2 = 3;
vector<vector<int> > d(v1, vector<int>(v2, 0));

Nested Array in Bash

After using Bash for so long, I only now learned that it supports arrays. What it lacks is support for nested, or multi-dimensional, arrays. My experiments need structured data that is easy to read and modify, and since this part processes experiment results and changes often, it does not belong in the C++ code.
Hence the need for nested arrays in Bash.

The final solution is not pretty, but it is good enough for my own use. The idea starts from this observation:

Bash uses the IFS environment variable when splitting strings into arrays. Take this string:

Li,Age*1;Weight*2;Height*3;Friends*Sun^Wang Wang,Age*11;Weight*12;Height*13;Friends*Li^Sun

If IFS is a space, we get two elements

  • Li,Age*1;Weight*2;Height*3;Friends*Sun^Wang
  • Wang,Age*11;Weight*12;Height*13;Friends*Li^Sun

If IFS is a semicolon (;), we get a different array

  • Li,Age*1
  • Weight*2
  • Height*3
  • Friends*Sun^Wang Wang,Age*11
  • Weight*12
  • Height*13
  • Friends*Li^Sun

In other words, the same string becomes different arrays depending on the IFS we choose.

So we can represent a nested array by using a different IFS for each level. For the string above, we split first by space ( ), then by comma (,), then by semicolon (;)…
and so on, peeling the structure off layer by layer (-_-!)

In practice, we only have to make sure the IFS character of each level never appears in the array contents.
To generate such a string, we naturally need a few helper functions.
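The layered string format itself is language-independent. As an illustration, here is the peeling applied to the example record above (in Python for brevity; the key names come from the example):

```python
# Parse the layered record format described above. Separators, from the
# outermost level inward: ' ' between people, ',' between name and fields,
# ';' between fields, '*' between key and value, '^' between list items.
def parse_records(text):
    people = {}
    for person in text.split(' '):
        name, fields = person.split(',', 1)
        entry = {}
        for field in fields.split(';'):
            key, value = field.split('*', 1)
            entry[key] = value.split('^') if '^' in value else value
        people[name] = entry
    return people
```

As long as no separator character appears inside a value, each level splits cleanly, which is exactly the constraint stated above.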

The code is on GitHub. The approach works, but it is still not very clean to use; if no better Bash solution turns up, it may be time to switch to a 20th-century scripting language…

Github: Nested-Array-Bash

rand() is not reentrant

When writing C code, srand(int seed) and rand() are the go-to pseudo-random number functions.
They are simple to use, but one easily overlooked detail is that rand() relies on an internal, global state variable. As a result, rand() is neither reentrant nor thread-safe.

If multiple threads call rand() concurrently, then no matter how you use srand(int seed), the results cannot be made reproducible: on every run of the program, the pseudo-random sequence each thread obtains from rand() differs from the previous run.

Non-reproducible results are a nasty obstacle when debugging.

Fortunately, we can use the pseudo-random number generators that C++11 provides (Pseudo-random number generation). Usage examples are easy to find online; here is a minimal one.

#include <random>
//.....
{
std::default_random_engine gen(0);
int a_random_number = gen();
}

A default_random_engine maintains its own internal state, so if each thread initializes its own default_random_engine with the same parameters, every thread obtains an identical pseudo-random sequence.
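The same pattern exists in other languages. In Python, for instance, the module-level random functions share one global generator, while separate random.Random instances are independent, which mirrors giving each thread its own engine:

```python
import random

# Each worker gets its own generator seeded identically, so the sequences
# are reproducible regardless of how threads interleave; this mirrors one
# std::default_random_engine per thread.
def make_generator(seed=0):
    return random.Random(seed)

gen_a = make_generator(0)
gen_b = make_generator(0)
seq_a = [gen_a.randrange(100) for _ in range(5)]
seq_b = [gen_b.randrange(100) for _ in range(5)]
```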