This post was originally published at Antmicro.
Antmicro is involved in many projects where video processing is the key focus – from SDI-enabled AV equipment, through multi-camera robotics systems and biometric identification devices, to VR and medical cameras.
When working with off-the-shelf video sources (or prerecorded test datasets) it is easy to assume that a camera always just produces a good-looking picture, and all you have to do is run your high-level algorithms on it.
Things are wildly different when – like us – you’re actually involved in building camera systems, which is often required for custom, innovative applications that push the limits of what is possible. In fact, when you start working at a lower level, say by capturing the data with an FPGA board or implementing camera drivers for new cameras or new platforms, the number of moving parts you are juggling before you get usable video makes debugging pretty difficult. That is why we developed a new open source image analysis tool called Raviewer to assist in such cases. As with many other tools originally created for our internal needs while developing products for our customers, we released it to help reduce the frustration of working on difficult engineering problems.
Processing video data 101
The theory behind image data is very simple. Most capture devices use one of the popular predefined formats (an RGB, YUV or Bayer variant). In addition to that, images have a width and a height. In practice, before reaching the user, the video data typically goes through many layers of processing, and even a small misconfiguration in the pipeline can render the image useless.
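For packed formats, this theory boils down to simple arithmetic: a raw frame should occupy width × height × bytes-per-pixel bytes. A quick sanity check on the file size can already rule out many format candidates. The sketch below is illustrative only (the names are not Raviewer’s API, and it ignores line padding/stride and subsampled YUV formats):

```python
# Bytes per pixel for a few hypothetical packed-format candidates.
BYTES_PER_PIXEL = {"GRAY8": 1, "RGB24": 3, "RGB32": 4}

def size_matches(file_size, width, height, fmt):
    """Check whether a raw file could hold a width x height frame in fmt."""
    return file_size == width * height * BYTES_PER_PIXEL[fmt]

file_size = 1920 * 1080 * 4  # size of our captured dump, in bytes
print(size_matches(file_size, 1920, 1080, "RGB32"))  # True  - plausible
print(size_matches(file_size, 1920, 1080, "RGB24"))  # False - ruled out
```

Note that a matching size is only a necessary condition – several width/height/format combinations can yield the same byte count, which is why the visual inspection described below is still needed.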
There are several things that can go wrong. The receiver (the FPGA pipeline or the sensor driver) needs to be configured to match the data provider (e.g. a camera sensor). The match needs to be bit-perfect. The problems we usually encounter include reversing the bit order, mismatched image format configuration (e.g. RGB instead of YUV, or RGB24 instead of RGB32), switched endianness, incorrect dimensions, etc.
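To see why the match has to be bit-perfect, consider how the same raw buffer falls apart into pixels depending only on the assumed bytes-per-pixel. The toy decoder below (hypothetical, not Raviewer code) interprets an 8-byte buffer first as two RGB32 pixels, then as RGB24 – in the latter case every channel after the first pixel lands in the wrong slot:

```python
# Two pixels captured as 32-bit RGBA words (values taken from a typical dump).
raw = bytes([0xFF, 0x7F, 0x13, 0xFF,
             0xFF, 0x7E, 0x12, 0xFF])

def decode(buf, bytes_per_pixel):
    """Split a raw buffer into per-pixel channel tuples."""
    usable = len(buf) - len(buf) % bytes_per_pixel  # drop trailing partial pixel
    return [tuple(buf[i:i + bytes_per_pixel])
            for i in range(0, usable, bytes_per_pixel)]

print(decode(raw, 4))  # [(255, 127, 19, 255), (255, 126, 18, 255)] - as intended
print(decode(raw, 3))  # [(255, 127, 19), (255, 255, 126)] - channels misaligned
```

After the first pixel, the RGB24 interpretation picks up the alpha byte of the previous pixel as a color channel, and the error compounds with every subsequent pixel – which is exactly why a wrong format guess produces garbage rather than a slightly-off image.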
In the world of embedded platforms that we typically deal with, most vendors (like Nvidia or NXP) equip their chips with dedicated video capture blocks which do some processing and conversion themselves, providing an extra layer of complication. Similar issues arise when working with proprietary codecs and other closed SoC building blocks which more often than not involve binary blobs which make debugging even harder.
Debugging issues with image data
Experience shows that debugging visual problems is best done with visual tools. The very first step to solving the problem needs to be acquisition and a proper identification of the data. Once the exact format of the data is known, the problem becomes much easier to fix.
Raviewer is an open source tool for handling arbitrary binary data and visualizing it using selected parameters. Changing the parameters gives visual cues that help identify the data. Let’s take a look at an example.
Assume that we captured a file called unknown_data, which we know to contain an image – but we do not know anything else about it. The usual first step (without the right tools) would be to look at a dump of the data:
$ hexdump unknown_data | head -n 20
00000000 ff 7f 13 ff ff 7e 12 ff fb 7d 18 ff f9 7b 16 ff
00000010 f5 7b 21 ff f3 79 1f ff e9 71 22 ff e3 6b 1c ff
00000020 da 65 21 ff da 65 21 ff d5 65 2f ff d4 64 2e ff
00000030 c4 5f 33 ff b9 54 28 ff ac 4d 2b ff a9 4a 28 ff
00000040 99 49 37 ff 91 41 2f ff 83 3f 30 ff 80 3c 2d ff
00000050 76 3f 37 ff 72 3b 33 ff 66 37 3b ff 62 33 37 ff
00000060 58 2d 38 ff 57 2c 37 ff 53 2b 37 ff 53 2b 37 ff
00000070 52 2c 37 ff 52 2c 37 ff 4f 2d 35 ff 4f 2d 35 ff
00000080 54 2e 37 ff 53 2d 36 ff 52 2d 33 ff 52 2d 33 ff
00000090 53 2e 34 ff 52 2d 33 ff 51 2c 30 ff 4f 2a 2e ff
000000a0 51 2a 2f ff 51 2a 2f ff 50 29 2d ff 51 2a 2e ff
000000b0 55 2c 2e ff 56 2d 2f ff 56 2d 2f ff 55 2c 2e ff
000000c0 4f 2c 30 ff 4f 2c 30 ff 4f 2c 30 ff 4e 2b 2f ff
000000d0 4c 29 2d ff 4b 28 2c ff 4c 29 2d ff 4d 2a 2e ff
000000e0 50 2d 31 ff 50 2d 31 ff 50 2d 31 ff 50 2d 31 ff
000000f0 4f 2c 30 ff 4e 2b 2f ff 4c 29 2d ff 4c 29 2d ff
00000100 48 2c 2d ff 47 2b 2c ff 46 2a 2b ff 45 29 2a ff
00000110 45 29 2c ff 45 29 2c ff 46 2a 2d ff 46 2a 2d ff
00000120 45 2a 2d ff 45 2a 2d ff 45 2a 2d ff 44 29 2c ff
00000130 43 27 2d ff 43 27 2d ff 42 26 2c ff 42 26 2c ff
We can clearly see some kind of data is there, but very little beyond that. Raviewer can help with that. Importing the file into Raviewer renders the following:
Now we can see something that resembles an image – but is it one at all? Typically, to arrive at a meaningful conclusion it’s good to experiment a bit. The first step will be to change the format we use to interpret the data and see if we get a better result.
Changing the format to a 32-bit RGB format starts showing some shapes. The image seems to exhibit interlacing-like artifacts, though, which usually means the line width is still interpreted incorrectly. Let’s see if that can be fixed.
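The reason a wrong line width produces those diagonal, interlacing-like artifacts is easy to show on a toy example: when the assumed row length is off even by one pixel, each reconstructed row starts one pixel too early or too late, so vertical features get sheared into diagonals. A minimal sketch (hypothetical values, not Raviewer internals):

```python
# A 4-pixel-wide "image": every row is the same gradient 0 1 2 3.
width, height = 4, 3
flat = [x for _ in range(height) for x in range(width)]

def rows(data, assumed_width):
    """Slice a flat pixel buffer into rows of the assumed width."""
    usable = len(data) - len(data) % assumed_width  # drop a trailing partial row
    return [data[i:i + assumed_width] for i in range(0, usable, assumed_width)]

print(rows(flat, 4))  # correct width: every row identical, columns line up
print(rows(flat, 3))  # wrong width: the pattern drifts one pixel per row
```

With the correct width the columns align; with width 3 each row is shifted relative to the previous one, which on a real photo shows up as the slanted, smeared pattern seen in the screenshot above.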
Much better. However, the colors still seem a little off and some artifacts are visible near the edges. The color format is still not right; let’s try other formats, keeping what we already know about the image constant.
Switching the endianness produces a correct, good-looking image.
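What endianness does to a pixel is worth spelling out: when pixels are stored as 32-bit words, reading a word with the wrong byte order reverses its bytes, so the channel order flips (e.g. RGBA becomes ABGR) and the colors come out wrong while the image geometry stays intact. A small sketch using the first pixel from the dump above:

```python
import struct

# One 32-bit pixel word as it appears in the raw dump: ff 7f 13 ff.
word = bytes([0xFF, 0x7F, 0x13, 0xFF])

little = struct.unpack('<I', word)[0]  # interpret as little-endian
big = struct.unpack('>I', word)[0]     # interpret as big-endian

print(hex(little))  # 0xff137fff - channels read back-to-front
print(hex(big))     # 0xff7f13ff - channels in stored order
```

This is why the wrong-endianness image in the screenshot still shows correct shapes, just with distorted colors – only the per-pixel byte order is wrong, not the layout.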
Raviewer has already proved useful in our recent projects, for example when developing an FPGA debayering core implementing a demosaicing system that converts raw data obtained from CCD or CMOS sensors.
In addition to that, at Antmicro we believe that proper continuous testing is needed to keep things working. That’s why Raviewer comes with a command line interface that can be used in CI pipelines.
Boost your video processing platform with Raviewer
The described use case is just a small taste of what Raviewer is capable of. Our goal is to make sure that Raviewer can be used in other projects for more complex and detailed analysis. To achieve that, we are working on several enhancements which will, among other things, enable working with V4L2 cameras directly or using neural networks to interpret the data and e.g. detect the image format.