Building a video softphone part III

It feels like I’ve made a lot of progress in the last few days, although reflecting on what I’ve achieved, I actually haven’t got much closer to the softphone goal. My two accomplishments, which seemed exciting at the time, were:

  • Successfully getting a list of all the video modes that my webcam supports,
  • Getting a video stream from my webcam and saving a single frame as a bitmap.

The first step was to get an IMFSourceReader from the IMFMediaSource (my webcam) I created in part II. My understanding of the way these two interfaces work is that IMFMediaSource is implemented by a class that wraps a device, file, network stream etc. that is capable of providing some audio or video, and IMFSourceReader is implemented by the class that knows how to read samples from the media source.

The code I used to list my webcam’s video modes is shown below.

// Initialize the Media Foundation platform.
hr = MFStartup(MF_VERSION);
if (SUCCEEDED(hr))
{
    // Create the source reader.
    IMFSourceReader *pReader = NULL;

    hr = MFCreateSourceReaderFromMediaSource(*ppSource, pConfig, &pReader);

    if (SUCCEEDED(hr))
    {
        DWORD dwMediaTypeIndex = 0;

        // Walk the native media types of the first stream until there are no more.
        while (SUCCEEDED(hr))
        {
            IMFMediaType *pType = NULL;
            hr = pReader->GetNativeMediaType(0, dwMediaTypeIndex, &pType);
            if (hr == MF_E_NO_MORE_TYPES)
            {
                hr = S_OK;
                break;
            }
            else if (SUCCEEDED(hr))
            {
                // Examine the media type.
                CMediaTypeTrace *nativeTypeMediaTrace = new CMediaTypeTrace(pType);
                printf("Native media type: %s.\n", nativeTypeMediaTrace->GetString());
                delete nativeTypeMediaTrace;
                pType->Release();
            }

            ++dwMediaTypeIndex;
        }

        pReader->Release();
    }
}

The code in the snippet just uses standard Media Foundation calls, except for the CMediaTypeTrace class. That’s actually the useful class since it takes the IMFMediaType, which is mostly a bunch of GUIDs that map to constants describing one of the webcam’s modes, and spits out some plain English to represent the resolution, format etc. of that mode. The CMediaTypeTrace class is not actually in the Media Foundation library; instead it is provided in mediatypetrace.h, which is in one of the samples in the MediaFoundation directory that comes with the Windows SDK (on my system it’s in Windows\v7.1\Samples\multimedia\mediafoundation\topoedit\tedutil). As it happens, the two video formats that my camera supports, RGB24 and I420, were not included in the list of GUIDs in mediatypetrace.h, so I had to search around the place to find what they were and then add them in.

LPCSTR STRING_FROM_GUID( GUID Attr )
{
    ...
    INTERNAL_GUID_TO_STRING( MFVideoFormat_RGB24, 14 );	  // RGB24
    INTERNAL_GUID_TO_STRING( WMMEDIASUBTYPE_I420, 15 );   // I420
}
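As a side note on where a GUID like the I420 one comes from: subtype GUIDs for FOURCC-based video formats follow a fixed pattern, with the four-character code in the first 32 bits and a constant tail of 0000-0010-8000-00AA00389B71. The sketch below is portable C++ rather than Media Foundation code, and the helper names fourccToDword and fourccGuidString are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Pack four ASCII characters into a little-endian 32-bit FOURCC value.
// Hypothetical helper for illustration, not part of any SDK.
inline std::uint32_t fourccToDword(const std::string &code)
{
    assert(code.size() == 4);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
    {
        // The first character ends up in the lowest byte.
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(code[i])) << (8 * i);
    }
    return value;
}

// Format the subtype GUID string for a FOURCC-based video format.
// Hypothetical helper for illustration, not part of any SDK.
inline std::string fourccGuidString(const std::string &code)
{
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%08X-0000-0010-8000-00AA00389B71",
                  static_cast<unsigned>(fourccToDword(code)));
    return buffer;
}
```

Calling fourccGuidString("I420") should give the GUID to compare against whatever turns up when searching the headers. RGB24 is not a FOURCC format, so this pattern does not apply to it directly; as far as I can tell its GUID carries a D3DFORMAT number in the first field instead.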

The full list of modes my webcam supports is shown below.

Device Name: Logitech QuickCam Pro 9000.
Current media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 640, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 640, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 160, H: 90.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 160, H: 100.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 160, H: 120.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 176, H: 144.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 320, H: 180.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 320, H: 200.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 320, H: 240.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 352, H: 288.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 640, H: 360.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 640, H: 400.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 864, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 768, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 800, H: 450.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 800, H: 500.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 800, H: 600.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 960, H: 720.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1280, H: 720.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1280, H: 800.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1280, H: 1024.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1600, H: 900.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1600, H: 1000.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=RGB24, FRAME_SIZE=W 1600, H: 1200.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 640, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 160, H: 90.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 160, H: 100.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 160, H: 120.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 176, H: 144.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 320, H: 180.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 320, H: 200.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 320, H: 240.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 352, H: 288.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 640, H: 360.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 640, H: 400.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 864, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 768, H: 480.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 800, H: 450.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 800, H: 500.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 800, H: 600.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 960, H: 720.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1280, H: 720.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1280, H: 800.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1280, H: 1024.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1600, H: 900.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1600, H: 1000.
Native media type: Video: MAJOR_TYPE=Video, SUBTYPE=I420, FRAME_SIZE=W 1600, H: 1200.
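To put the two subtypes in the list above in perspective, here is a rough sketch of how many bytes one frame takes in each format, assuming tightly packed rows with no padding. RGB24 stores 3 bytes per pixel, while I420 stores a full-resolution luma plane plus two chroma planes at half resolution in each dimension. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes in one tightly packed RGB24 frame: 3 bytes per pixel.
inline std::size_t rgb24FrameBytes(std::size_t width, std::size_t height)
{
    return width * height * 3;
}

// Bytes in one tightly packed I420 frame: a full-size Y plane followed by
// U and V planes that are each half the width and half the height.
inline std::size_t i420FrameBytes(std::size_t width, std::size_t height)
{
    // This sketch assumes even dimensions, which holds for every mode listed above.
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t luma = width * height;
    const std::size_t chroma = (width / 2) * (height / 2);
    return luma + 2 * chroma;
}
```

For the default 640 x 480 mode that works out to 921,600 bytes per RGB24 frame and half that for I420.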

The second thing I was able to do was to take a sample from my webcam and save it as a bitmap. To do this I took a lot of short-cuts, namely hard-coding the size of the sample, which I know from my webcam’s default mode (640 x 480), and relying on the fact that that mode does not result in any padding (I’m not 100% sure on that and have taken an educated guess). I found someone else’s sample that created a bitmap file and blatantly copied it. Below is the code I used to extract the sample and save the bitmap.

// Initialize the Media Foundation platform.
hr = MFStartup(MF_VERSION);
if (SUCCEEDED(hr))
{
	// Create the source reader.
	IMFSourceReader *pReader = NULL;

	hr = MFCreateSourceReaderFromMediaSource(
		*ppSource,
		pConfig,
		&pReader);

	//GetCurrentMediaType(pReader);
	//ListModes(pReader);

	DWORD streamIndex = 0, flags = 0;
	LONGLONG llTimeStamp = 0;
	IMFSample *pSample = NULL;

	// The initial read can succeed but still return a NULL sample, so keep
	// reading until a sample arrives or a call fails.
	while (SUCCEEDED(hr) && !pSample)
	{
		hr = pReader->ReadSample(
			MF_SOURCE_READER_ANY_STREAM,    // Stream index.
			0,                              // Flags.
			&streamIndex,                   // Receives the actual stream index.
			&flags,                         // Receives status flags.
			&llTimeStamp,                   // Receives the time stamp.
			&pSample                        // Receives the sample or NULL.
			);

		wprintf(L"Stream %lu (%I64d)\n", streamIndex, llTimeStamp);
	}

	if (SUCCEEDED(hr))
	{
		// Use non-2D version of sample.
		IMFMediaBuffer *mediaBuffer = NULL;
		BYTE *pData = NULL;
		DWORD writePosn = 0;

		hr = pSample->ConvertToContiguousBuffer(&mediaBuffer);

		if (SUCCEEDED(hr))
		{
			hr = mediaBuffer->Lock(&pData, NULL, NULL);

			if (SUCCEEDED(hr))
			{
				HANDLE file = CreateBitmapFile(&writePosn);

				// Hard-coded frame size: 640 x 480 pixels at 24 bits per pixel, assuming no row padding.
				WriteFile(file, pData, 640 * 480 * (24/8), &writePosn, NULL);

				CloseHandle(file);

				mediaBuffer->Unlock();
			}

			mediaBuffer->Release();
		}
	}

	if (pSample) pSample->Release();
	if (pReader) pReader->Release();

	// Shut down Media Foundation.
	MFShutdown();
}

HANDLE CreateBitmapFile(DWORD *writePosn)
{
	HANDLE file;
	BITMAPFILEHEADER fileHeader;
	BITMAPINFOHEADER fileInfo;

	// Create (or overwrite) the bitmap file that the frame will be written to.
	file = CreateFile(L"sample.bmp", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

	fileHeader.bfType = 19778;      // 0x4D42, the characters "BM" that identify a bitmap file.
	fileHeader.bfReserved1 = 0;     // The reserved fields must be 0.
	fileHeader.bfReserved2 = 0;
	fileHeader.bfOffBits = sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER);   // The pixel data starts right after the two headers.
	fileHeader.bfSize = fileHeader.bfOffBits + 640 * 480 * (24/8);                // Total file size: both headers plus the pixel data.

	fileInfo.biSize = sizeof(BITMAPINFOHEADER);
	fileInfo.biWidth = 640;
	fileInfo.biHeight = 480;
	fileInfo.biPlanes = 1;
	fileInfo.biBitCount = 24;
	fileInfo.biCompression = BI_RGB;
	fileInfo.biSizeImage = 640 * 480 * (24/8);
	fileInfo.biXPelsPerMeter = 2400;
	fileInfo.biYPelsPerMeter = 2400;
	fileInfo.biClrImportant = 0;
	fileInfo.biClrUsed = 0;

	// Write the two headers; the caller appends the pixel data and closes the handle.
	WriteFile(file, &fileHeader, sizeof(fileHeader), writePosn, NULL);
	WriteFile(file, &fileInfo, sizeof(fileInfo), writePosn, NULL);

	return file;
}
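The padding guess and the size arithmetic in the bitmap code above can be checked in isolation. BMP rows are padded up to a multiple of 4 bytes, so a frame can be written unchanged only when the packed row length already is one. The sketch below is portable C++ with made-up helper names (bmpStride, rowsNeedNoPadding, bmpFileSize, bmpMagic), and it hard-codes the header sizes of 14 and 40 bytes rather than including the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of BITMAPFILEHEADER and BITMAPINFOHEADER on disk.
constexpr std::uint32_t kFileHeaderBytes = 14;
constexpr std::uint32_t kInfoHeaderBytes = 40;

// Bytes per row in a BMP file: each row is rounded up to a multiple of 4 bytes.
inline std::uint32_t bmpStride(std::uint32_t width, std::uint32_t bitsPerPixel)
{
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// True when tightly packed rows can be written to the file as they are.
inline bool rowsNeedNoPadding(std::uint32_t width, std::uint32_t bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);  // This sketch only covers whole-byte pixel formats.
    return bmpStride(width, bitsPerPixel) == width * (bitsPerPixel / 8);
}

// Total file size (the value for bfSize): both headers plus the padded pixel rows.
inline std::uint32_t bmpFileSize(std::uint32_t width, std::uint32_t height,
                                 std::uint32_t bitsPerPixel)
{
    return kFileHeaderBytes + kInfoHeaderBytes + bmpStride(width, bitsPerPixel) * height;
}

// The "BM" signature for bfType as a little-endian 16-bit value.
inline std::uint16_t bmpMagic()
{
    return static_cast<std::uint16_t>('B' | ('M' << 8));
}
```

For 640 x 480 at 24 bits per pixel the stride comes out at 1920 bytes, which is exactly 640 * 3, so no padding is needed for that mode. A width that is not a multiple of 4 would need padding bytes at the end of each row.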

So that was all fun, but it hasn’t gotten me much closer to having an H.264 stream ready for bundling into my RTP packets. Getting the H.264 stream will be my next focus. I think I’ll try capturing it to an .mp4 file as a first step. Actually, I wonder if there’s a way I can test an .mp4 file with a softphone and VLC? That would be a handy way to test if the H.264 stream I get is actually going to work when I use it in a VoIP call.

I also ordered Developing Microsoft Media Foundation Applications from Amazon thinking it might help me on this journey, only to find it available for free online a couple of days later :(.
