Hello fellow reader, and thank you in advance for your interest :). It's a nice sunny Thursday in Istanbul and I'm in the mood for a chat. This one is about my latest project and code quality. So... I've been working on a video conferencing application for the last 25 days and managed to hand the project over before the deadline. The first phase was to develop the "middleware" that the application will be built on. I developed 2 DirectShow filters to capture/push media samples on the fly, plus a simple packet interchange framework and a sample P2P video conferencing program that demonstrates how to use them.
First, a crash course on DirectShow (go RTFM if you're serious about DirectShow :) ). In DS any media stream is divided into packages called "media samples". These packages run through a graph of "filters", which itself is called a "filter graph". I assume you know what a "graph" is. Every filter (node) in the graph is connected to its neighbours via its "pins", and there are shared buffers for exchanging the media samples. These buffers are accessed via the IMediaSample interface when a pin's Receive() or FillBuffer() method is called. These are callback methods, and they are called as the stream flows through the graph. So you could write a custom DS filter whose implementation, for example, zeroes out the green and blue components of every pixel, leaving only the red channel of the video sample, and call it an "OhYeahRed! Filter" :) Or you could compress the pixels, put your own type ID in the header and create your own compressor codec :) Got the idea? For the clueless reader's convenience, an example filter graph is as follows :)
Suppose you have an application that just displays what your webcam captures. The DS filter graph would be like:
[WebCam device filter] ---> [Color Space Converter] ---> [Video Renderer]
The WebCam device filter is a wrapper filter around your webcam driver.
The Color Space Converter filter converts the pixel format of the picture into one that the renderer can display. And finally the Video Renderer filter shows the bits in a window it exposes (it's the ancient renderer; VMR7 or VMR9 is the way to go for a nicer application).
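Back to the "OhYeahRed!" idea for a second. Here is a minimal sketch of what the pixel work inside such a filter could look like, stripped of all the DirectShow plumbing. It assumes an uncompressed 24-bit RGB frame stored in B, G, R byte order with `stride` bytes per row; the function name `KeepRedOnly` and its parameters are made up for illustration, and a real filter would get the buffer pointer and size from the IMediaSample it receives.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: keeps only the red channel of a 24-bit RGB frame, in place.
// Assumes B, G, R byte order per pixel and `stride` bytes per row (>= width * 3).
// Returns false without touching the buffer if the arguments do not add up.
bool KeepRedOnly(std::uint8_t* data, std::size_t size,
                 std::size_t width, std::size_t height, std::size_t stride)
{
    if (data == nullptr || stride < width * 3 || size < stride * height)
        return false;

    for (std::size_t y = 0; y < height; ++y) {
        std::uint8_t* row = data + y * stride;
        for (std::size_t x = 0; x < width; ++x) {
            row[x * 3 + 0] = 0;  // blue
            row[x * 3 + 1] = 0;  // green
            // row[x * 3 + 2] is red and is left untouched
        }
    }
    return true;
}
```

Note that even this toy checks its arguments and reports failure instead of scribbling over memory it doesn't own.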
So a video conferencing application should basically do this:
WebCam ---> Video Compressor ---> SinkFilter ---> [App] .... Internet .... [App] ---> PushFilter ---> Decompressor ---> Video Renderer
SinkFilter: passes all media samples captured from the upstream filter to the app through a custom COM interface.
PushFilter: passes all media samples retrieved from the network into the filter graph.
This is also my naive approach to the problem.
Well, I developed a SinkFilter, a PushFilter, a very simple protocol (only 4 types of packets, namely audiotype, videotype, audiosample and videosample), a demo server app and a demo client app. The demo applications were developed in C#, and all the buffering work is done in these applications. I avoided doing any complex work in the filters, because DirectShow filters are COM objects and it's a tedious job (not overwhelming, but it takes time) to develop a reliable buffering and synchronization mechanism at that level (with the added hassle of resource management in C++, reference counting trouble with COM objects etc.). The thing is, everything worked fine on my computer during my tests, but when my client tried to run the demo he ran into many errors which I hadn't bothered to handle :). Shame on me in that regard. You know the feeling when this kind of thing happens. And yes, these things happen :). So I went back over all the work I had done and noticed many places that could be improved with proper error handling, which would get rid of that feeling of uncertainty. While gazing for a moment at my 250 hours of keyboard crunching (which is 5 distinct projects), I realized that:
The most important characteristic of high quality code is
ERROR HANDLING (period)
Shame on me, shame on me... :)
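To make the point a bit more concrete, here is a minimal sketch of the kind of defensive check I mean, applied to the four packet types mentioned above. The wire format (1 byte of type, 4 bytes of big-endian payload length, then the payload), the numeric type values and all the names (`ParsePacket`, `ParseError`, `Describe`) are assumptions made up for this example; they are not the actual protocol from the project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire format, for illustration only:
// [1 byte type][4 bytes payload length, big-endian][payload bytes]
enum class PacketType : std::uint8_t {
    AudioType = 1, VideoType = 2, AudioSample = 3, VideoSample = 4
};

enum class ParseError { None, TooShort, UnknownType, LengthMismatch };

struct Packet {
    PacketType type = PacketType::AudioType;
    std::vector<std::uint8_t> payload;
};

// Parses one complete packet from buf. On any error, `out` is left untouched
// and the caller gets a specific reason instead of a crash further down the line.
ParseError ParsePacket(const std::uint8_t* buf, std::size_t len, Packet& out)
{
    constexpr std::size_t kHeaderSize = 5;
    if (buf == nullptr || len < kHeaderSize)
        return ParseError::TooShort;
    if (buf[0] < 1 || buf[0] > 4)
        return ParseError::UnknownType;

    const std::uint32_t payloadLen =
        (static_cast<std::uint32_t>(buf[1]) << 24) |
        (static_cast<std::uint32_t>(buf[2]) << 16) |
        (static_cast<std::uint32_t>(buf[3]) << 8)  |
         static_cast<std::uint32_t>(buf[4]);
    if (payloadLen != len - kHeaderSize)
        return ParseError::LengthMismatch;

    out.type = static_cast<PacketType>(buf[0]);
    out.payload.assign(buf + kHeaderSize, buf + len);
    return ParseError::None;
}

// Turns an error code into something you can actually show to a user or log.
const char* Describe(ParseError e)
{
    switch (e) {
        case ParseError::None:           return "ok";
        case ParseError::TooShort:       return "packet shorter than header";
        case ParseError::UnknownType:    return "unknown packet type";
        case ParseError::LengthMismatch: return "payload length does not match header";
    }
    return "unexpected error code";
}
```

Nothing fancy: every way the input can be wrong has a name, and the caller decides what to do about it. That is exactly the part I skipped the first time around :)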