
ffmpeg: mixing local microphone audio with system audio, then muxing the result with the local desktop into a final mp4 file (revised)

2021-12-10 20:06:56 Author: Internet

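I previously wrote a blog post titled:
ffmpeg: mixing local microphone audio with system audio, then muxing the result with the local desktop into a final mp4 file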

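However, it had the following two problems:
1. The system-audio device and the microphone device use different sample rates, and I did not resample. For example, the system-audio device runs at 48000 Hz; without resampling, the system audio in the final file plays back noticeably slower.
2. When frames captured with av_read_frame are encoded with the two functions below, avcodec_receive_packet often returns AVERROR(EAGAIN). In a program that only records the desktop, grabbing and encoding frames on the main thread, things usually work. But if a worker thread is created and the grabbing and encoding run there, failures become very likely.
I once invoked the ffmpeg command line from a worker thread, and a 1-minute recording produced a video file of only 2 MB.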

ret = avcodec_send_frame(pCodecEncodeCtx_Video, pFrameYUV);
if (ret == AVERROR(EAGAIN))
{
	continue;
}

ret = avcodec_receive_packet(pCodecEncodeCtx_Video, &packet);
if (ret == AVERROR(EAGAIN))
{
	continue;
}
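
A note on the pattern above: in FFmpeg's send/receive API, AVERROR(EAGAIN) from avcodec_receive_packet is not a failure in itself. It means the encoder wants more input before it can emit a packet, which is normal for encoders that buffer frames. The sketch below models that contract with a hypothetical `ToyEncoder` (it is not an FFmpeg type, and `send`, `receive`, `kAgain` and `kEof` are illustrative names) to show the usual loop shape: send one frame, drain until the encoder asks for more, and flush at the end.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for an encoder with internal latency.
// It is NOT FFmpeg API; it only mimics the send/receive contract.
enum Status { kOk = 0, kAgain = -1, kEof = -2 };

class ToyEncoder {
public:
    explicit ToyEncoder(std::size_t delay) : delay_(delay) {}

    // Passing nullptr switches to flush mode, like sending a NULL frame.
    Status send(const int* frame) {
        if (frame == nullptr) { flushing_ = true; return kOk; }
        queue_.push_back(*frame);
        return kOk;
    }

    // Returns kAgain while no more than delay_ frames are buffered.
    Status receive(int* packet) {
        if (queue_.size() > delay_ || (flushing_ && !queue_.empty())) {
            *packet = queue_.front();
            queue_.pop_front();
            return kOk;
        }
        return flushing_ ? kEof : kAgain;
    }

private:
    std::size_t delay_;
    bool flushing_ = false;
    std::deque<int> queue_;
};

// Encodes frameCount frames and returns how many packets came out.
int encode_all(int frameCount, std::size_t delay) {
    ToyEncoder enc(delay);
    int packets = 0;
    int pkt = 0;
    for (int i = 0; i < frameCount; ++i) {
        enc.send(&i);
        // kAgain here only means "feed me the next frame".
        while (enc.receive(&pkt) == kOk) ++packets;
    }
    enc.send(nullptr);                            // enter flush mode
    while (enc.receive(&pkt) == kOk) ++packets;   // drain buffered frames
    return packets;
}
```

With this loop shape every frame eventually yields a packet, for example `encode_all(600, 3)` returns 600, even though the first few `receive` calls report `kAgain`.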

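For the first problem: the system-audio device samples at 48000 Hz and the microphone at 44100 Hz, while the encoded audio must be 44100 Hz. So the system audio is first resampled to 44100 Hz and then mixed with the microphone audio.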
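
To make the slowdown concrete: if 48000 Hz samples are treated as 44100 Hz data without resampling, one second of captured audio lasts 48000/44100 ≈ 1.088 s on playback. The helper names below are illustrative and not part of the original project; the round-up mirrors what is commonly done when sizing a resampler's output buffer.

```cpp
#include <cstdint>

// Playback duration in seconds of `samples` samples interpreted at `playRate` Hz.
double playback_seconds(int64_t samples, int playRate) {
    return static_cast<double>(samples) / playRate;
}

// Number of output samples needed to hold `inSamples` converted
// from `inRate` Hz to `outRate` Hz, rounded up.
int64_t resampled_count(int64_t inSamples, int inRate, int outRate) {
    return (inSamples * outRate + inRate - 1) / inRate;
}
```

For instance, `resampled_count(1024, 48000, 44100)` gives 941, so a 1024-sample block of system audio shrinks to about 941 samples before it can be mixed with the microphone stream.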

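For the second problem: this is probably down to my own limited experience, but av_read_frame caused serious trouble, so I capture frames with GDI myself instead of using the ffmpeg library function av_read_frame. For the details, see my other blog post: Recording the desktop with ffmpeg (capturing frames with GDI myself)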

Below is an outline of how I record desktop video, system audio, and microphone audio.

m_hAudioInnerCapture = CreateThread(NULL, 0, AudioInnerCaptureProc, this, 0, NULL);
m_hAudioInnerResample = CreateThread(NULL, 0, AudioInnerResampleProc, this, 0, NULL);
m_hAudioMicCapture = CreateThread(NULL, 0, AudioMicCaptureProc, this, 0, NULL);
m_hAudioMix = CreateThread(NULL, 0, AudioMixProc, this, 0, NULL);
m_hScreenCapture = CreateThread(NULL, 0, ScreenCaptureProc, this, 0, NULL);
m_hScreenAudioMix = CreateThread(NULL, 0, ScreenAudioMixProc, this, 0, NULL);

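The code above creates six threads:
m_hAudioInnerCapture captures the system audio.
m_hAudioInnerResample resamples the system audio.
m_hAudioMicCapture captures the microphone audio.
m_hAudioMix does the mixing: it combines the resampled system audio with the microphone audio.
m_hScreenCapture captures the desktop image.
m_hScreenAudioMix muxes the desktop images with the mixed audio and produces the final mp4 file.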
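
Conceptually, mixing two PCM streams means combining them sample by sample, which is why both inputs must first share one sample rate: sample i of one stream has to represent the same instant as sample i of the other. The project delegates the actual mixing to FFmpeg's amix filter; `mix_s16` below is a hypothetical helper that uses a plain saturating sum purely for illustration and does not claim to reproduce what amix does internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mixes two S16 buffers (same rate, same channel layout) by summing each
// pair of samples and clamping the result to the 16-bit range.
// The output is as long as the shorter input.
std::vector<int16_t> mix_s16(const std::vector<int16_t>& a,
                             const std::vector<int16_t>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    std::vector<int16_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        int sum = static_cast<int>(a[i]) + static_cast<int>(b[i]);
        sum = std::max(-32768, std::min(32767, sum));  // avoid wrap-around
        out[i] = static_cast<int16_t>(sum);
    }
    return out;
}
```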

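As shown below, I define five queues. m_pVideoFifo holds desktop images: the m_hScreenCapture thread writes to it and the m_hScreenAudioMix thread reads from it.
m_pAudioInnerFifo holds the raw system audio: the m_hAudioInnerCapture thread writes to it, and the m_hAudioInnerResample thread reads from it, resamples the data, and puts the result into the m_pAudioInnerResampleFifo queue.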

AVFifoBuffer *m_pVideoFifo = NULL;
AVAudioFifo *m_pAudioInnerFifo = NULL;
AVAudioFifo *m_pAudioInnerResampleFifo = NULL;
AVAudioFifo *m_pAudioMicFifo = NULL;
AVAudioFifo *m_pAudioMixFifo = NULL;
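
The producer/consumer hand-off between two threads can be sketched without FFmpeg. The `FrameFifo` class below is a hypothetical stand-in for a lock-protected queue such as m_pVideoFifo guarded by its critical section; it uses std::mutex instead of CRITICAL_SECTION, and the drop-when-full policy is an assumption made for the sketch, not something taken from the original code. `yuv420_frame_size` shows why a 30-frame 1920x1080 buffer is large: each YUV420P frame takes width × height × 3/2 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Bytes in one YUV420P frame with even dimensions:
// a full-size Y plane plus quarter-size U and V planes.
std::size_t yuv420_frame_size(int width, int height) {
    return static_cast<std::size_t>(width) * height * 3 / 2;
}

// Hypothetical bounded frame queue. Every method takes the lock, so a
// capture thread can call write() while an encoder thread calls read().
class FrameFifo {
public:
    FrameFifo(std::size_t frameSize, std::size_t maxFrames)
        : frameSize_(frameSize), maxFrames_(maxFrames) {}

    // Returns false (the frame is dropped) if the queue is full
    // or the frame has the wrong size.
    bool write(const std::vector<uint8_t>& frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frame.size() != frameSize_ || frames_.size() >= maxFrames_) return false;
        frames_.push_back(frame);
        return true;
    }

    // Returns false if no complete frame is available yet.
    bool read(std::vector<uint8_t>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.empty()) return false;
        out = std::move(frames_.front());
        frames_.pop_front();
        return true;
    }

private:
    std::size_t frameSize_;
    std::size_t maxFrames_;
    std::mutex mutex_;
    std::deque<std::vector<uint8_t>> frames_;
};
```

A reader that gets `false` from `read` would typically sleep briefly and retry, which matches a polling loop controlled by a recording flag.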

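Here is how my code is organized:
[Figure: screenshot of the project directory layout]
The appfun and log folders carry no business logic and can be ignored.
CaptureScreen.cpp captures the desktop image using GDI.
ULinkRecord.cpp handles audio and video capture, audio mixing, and muxing the audio with the video.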

FfmpegVideoAudioAndMicOneFileTest.cpp is the caller; its code is as follows:

#include "ULinkRecord.h"
#include <stdio.h>
#include <conio.h>

int main()
{
	ULinkRecord cULinkRecord;

	cULinkRecord.SetMicName(L"麦克风 (2- Synaptics HD Audio)");
	cULinkRecord.SetRecordPath("E:\\learn\\ffmpeg\\FfmpegTest\\x64\\Release");

	RECT rect;
	rect.left = 0;
	rect.top = 0;
	rect.right = 1920;
	rect.bottom = 1080;

	cULinkRecord.SetRecordRect(rect);

	cULinkRecord.StartRecord();

	Sleep(60000);

	printf("begin StopRecord\n");
	cULinkRecord.StopRecord();
	printf("end StopRecord\n");
	return 0;
}

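As you can see, this capture lasts 1 minute.

Below are the contents of the other four main files. They are long, but I still think it is better to include them.
The contents of CaptureScreen.h: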

#ifndef _CCAPTURE_SCREEN_HH
#define _CCAPTURE_SCREEN_HH

#include<time.h>
#include <d3d9.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <windows.h>

#include <tchar.h>
#include <winbase.h>
#include <winreg.h>
#include <Strsafe.h>


//
// Screen capture class
//
class CCaptureScreen
{
public:
	CCaptureScreen(void);
	~CCaptureScreen(void);

public:
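	/*----------- Public interface -----------*/
	int Init(int&, int&); // initialize; outputs the capture width and height
	BYTE* CaptureImage(); // capture the screen

private:
	/*----------- Internal helpers -----------*/
	void* CaptureScreenFrame(int, int, int, int); // capture a screen region
	HCURSOR FetchCursorHandle(); // get the mouse cursor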

private:
	/*----------- Private members -----------*/
	int m_width;
	int m_height;
	UINT   wLineLen;
	DWORD  dwSize;
	DWORD  wColSize;

	// Device context handles
	HDC hScreenDC;
	HDC hMemDC;
	// RGB pixel buffer for the captured image
	PRGBTRIPLE m_hdib;
	// Bitmap header information
	BITMAPINFO pbi;

	HBITMAP hbm;
	// Mouse cursor
	HCURSOR m_hSavedCursor;


};

#endif //--_CCAPTURE_SCREEN_HH

The contents of CaptureScreen.cpp:

//#include "stdafx.h"
#include "CaptureScreen.h"

CCaptureScreen::CCaptureScreen(void)
{
	m_hdib = NULL;
	m_hSavedCursor = NULL;
	hScreenDC = NULL;
	hMemDC = NULL;
	hbm = NULL;
	m_width = 1920;
	m_height = 1080;
	FetchCursorHandle();
}
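//
// Release resources
//
CCaptureScreen::~CCaptureScreen(void)
{
	if (m_hdib){

		free(m_hdib);
		m_hdib = NULL;
	}
	if (hScreenDC){

		// Release against the same window the DC was obtained from in Init()
		::ReleaseDC(GetDesktopWindow(), hScreenDC);
	}
	if (hMemDC) {

		DeleteDC(hMemDC);
	}
	if (hbm)
	{
		// Delete the bitmap exactly once, after the memory DC that had it selected is gone
		DeleteObject(hbm);
	}
}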

//
// Initialization
//
int CCaptureScreen::Init(int& src_VideoWidth, int& src_VideoHeight)
{
	hScreenDC = ::GetDC(GetDesktopWindow());
	if (hScreenDC == NULL) return 0;

	int m_nMaxxScreen = GetDeviceCaps(hScreenDC, HORZRES);
	int m_nMaxyScreen = GetDeviceCaps(hScreenDC, VERTRES);

	hMemDC = ::CreateCompatibleDC(hScreenDC);
	if (hMemDC == NULL) return 0;

	m_width = m_nMaxxScreen;
	m_height = m_nMaxyScreen;

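	if (!m_hdib){
		// 24-bit DIB rows are padded to a multiple of 4 bytes, so size the buffer by stride
		m_hdib = (PRGBTRIPLE)malloc((size_t)(((m_width * 24 + 31) & ~31) / 8) * m_height);
	}
	// Bitmap header; zero it first so unused fields are not left uninitialized
	ZeroMemory(&pbi, sizeof(pbi));
	pbi.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);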
	pbi.bmiHeader.biWidth = m_width;
	pbi.bmiHeader.biHeight = m_height;
	pbi.bmiHeader.biPlanes = 1;
	pbi.bmiHeader.biBitCount = 24;
	pbi.bmiHeader.biCompression = BI_RGB;

	src_VideoWidth = m_width;
	src_VideoHeight = m_height;

	hbm = CreateCompatibleBitmap(hScreenDC, m_width, m_height);
	SelectObject(hMemDC, hbm);

	wLineLen = ((m_width * 24 + 31) & 0xffffffe0) / 8;
	wColSize = sizeof(RGBQUAD)* ((24 <= 8) ? 1 << 24 : 0);
	dwSize = (DWORD)(UINT)wLineLen * (DWORD)(UINT)m_height;

	return 1;
}

// Capture the screen data
BYTE* CCaptureScreen::CaptureImage()
{

	VOID*  alpbi = CaptureScreenFrame(0, 0, m_width, m_height);
	return (BYTE*)(alpbi);
}

void* CCaptureScreen::CaptureScreenFrame(int left, int top, int width, int height)
{

	if (hbm == NULL || hMemDC == NULL || hScreenDC == NULL) return NULL;

	BitBlt(hMemDC, 0, 0, width, height, hScreenDC, left, top, SRCCOPY);
	/*------------------------- Capture the mouse cursor -------------------------------*/
	{
		POINT xPoint;
		GetCursorPos(&xPoint);
		HCURSOR hcur = FetchCursorHandle();
		xPoint.x -= left;
		xPoint.y -= top;

		ICONINFO iconinfo;
		BOOL ret;
		ret = GetIconInfo(hcur, &iconinfo);
		if (ret){
			xPoint.x -= iconinfo.xHotspot;
			xPoint.y -= iconinfo.yHotspot;

			if (iconinfo.hbmMask) DeleteObject(iconinfo.hbmMask);
			if (iconinfo.hbmColor) DeleteObject(iconinfo.hbmColor);
		}
		/* Draw the cursor */
		::DrawIcon(hMemDC, xPoint.x, xPoint.y, hcur);
	}
	// Buffer allocated in Init()
	PRGBTRIPLE hdib = m_hdib;
	if (!hdib)
		return hdib;

	GetDIBits(hMemDC, hbm, 0, m_height, hdib, (LPBITMAPINFO)&pbi, DIB_RGB_COLORS);
	return hdib;
}

//
// Get the window's mouse cursor
//
HCURSOR CCaptureScreen::FetchCursorHandle()
{
	if (m_hSavedCursor == NULL)
	{
		m_hSavedCursor = GetCursor();
	}
	return m_hSavedCursor;
}
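
The `wLineLen` expression in `Init` is the standard DIB stride formula: each scanline is padded to a multiple of 32 bits. A positive `biHeight`, as used above, also means the buffer returned by GetDIBits is bottom-up, so the last image row is stored first. The two helpers below are illustrative names (not part of the original class) that isolate both calculations.

```cpp
#include <cstddef>

// Bytes per scanline of an uncompressed DIB: rows are padded to a
// multiple of 4 bytes (32 bits).
std::size_t dib_stride(int width, int bitsPerPixel) {
    const std::size_t bits = static_cast<std::size_t>(width) * bitsPerPixel;
    return ((bits + 31) & ~static_cast<std::size_t>(31)) / 8;
}

// Byte offset of the image row that is `y` rows from the top in a
// bottom-up DIB (positive biHeight) with the given stride.
std::size_t dib_row_offset(int y, int height, std::size_t stride) {
    return static_cast<std::size_t>(height - 1 - y) * stride;
}
```

For a 1920-pixel-wide 24-bit image the stride equals 1920 × 3 = 5760 bytes, so no padding is involved; for a width such as 1366 the stride is 4100 bytes rather than 4098, and code that assumes `width * 3` bytes per row would drift by two bytes per line.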

The contents of ULinkRecord.h:

#pragma once

#include <string>
#include <Windows.h>

#ifdef	__cplusplus
extern "C"
{
#endif
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libswscale/swscale.h"
#include "libswresample/swresample.h"
#include "libavdevice/avdevice.h"
#include "libavutil/audio_fifo.h"
#include "libavutil/avutil.h"
#include "libavutil/fifo.h"
#include "libavutil/frame.h"
#include "libavutil/imgutils.h"
#include "libavutil/opt.h"

#include "libavfilter/avfilter.h"
#include "libavfilter/buffersink.h"
#include "libavfilter/buffersrc.h"


#pragma comment(lib, "avcodec.lib")
#pragma comment(lib, "avformat.lib")
#pragma comment(lib, "avutil.lib")
#pragma comment(lib, "avdevice.lib")
#pragma comment(lib, "avfilter.lib")
#pragma comment(lib, "postproc.lib")
#pragma comment(lib, "swresample.lib")
#pragma comment(lib, "swscale.lib")


#ifdef __cplusplus
};
#endif

class ULinkRecord
{
public:
	ULinkRecord();
	~ULinkRecord();
public:
	void SetMicName(const wchar_t* pMicName);
	void SetRecordPath(const char* pRecordPath);
	void SetRecordRect(RECT rectRecord);
	int StartRecord();
	void StopRecord();
private:
	int OpenAudioInnerCapture();
	int OpenAudioMicCapture();
	int OpenOutPut();
	int InitFilter(const char* filter_desc);
	void Clear();
private:
	static DWORD WINAPI AudioInnerCaptureProc(LPVOID lpParam);
	void AudioInnerCapture();

	static DWORD WINAPI AudioInnerResampleProc(LPVOID lpParam);
	void AudioInnerResample();

	static DWORD WINAPI AudioMicCaptureProc(LPVOID lpParam);
	void AudioMicCapture();

	static DWORD WINAPI AudioMixProc(LPVOID lpParam);
	void AudioMix();

	static DWORD WINAPI ScreenCaptureProc(LPVOID lpParam);
	void ScreenCapture();

	static DWORD WINAPI ScreenAudioMixProc(LPVOID lpParam);
	void ScreenAudioMix();
private:
	std::wstring m_wstrMicName;
	std::string m_strRecordPath;
	std::string m_strFilePrefix;
private:
	CRITICAL_SECTION m_csVideoSection;
	CRITICAL_SECTION m_csAudioInnerSection;
	CRITICAL_SECTION m_csAudioInnerResampleSection;
	CRITICAL_SECTION m_csAudioMicSection;
	CRITICAL_SECTION m_csAudioMixSection;

	AVFifoBuffer *m_pVideoFifo = NULL;
	AVAudioFifo *m_pAudioInnerFifo = NULL;
	AVAudioFifo *m_pAudioInnerResampleFifo = NULL;
	AVAudioFifo *m_pAudioMicFifo = NULL;
	AVAudioFifo *m_pAudioMixFifo = NULL;

	AVFormatContext *m_pFormatCtx_Out = NULL;
	AVFormatContext	*m_pFormatCtx_AudioInner = NULL;
	AVFormatContext	*m_pFormatCtx_AudioMic = NULL;

	AVCodecContext *m_pReadCodecCtx_AudioInner = NULL;
	AVCodecContext *m_pReadCodecCtx_AudioMic = NULL;
	AVCodec *m_pReadCodec_Video = NULL;

	AVCodecContext	*m_pCodecEncodeCtx_Video = NULL;
	AVCodecContext	*m_pCodecEncodeCtx_Audio = NULL;
	AVCodec			*m_pCodecEncode_Audio = NULL;

	SwsContext *m_pImgConvertCtx = NULL;
	SwrContext *m_pAudioInnerResampleCtx = NULL;
	SwrContext *m_pAudioConvertCtx = NULL;


	AVFilterGraph* m_pFilterGraph = NULL;
	AVFilterContext* m_pFilterCtxSrcInner = NULL;
	AVFilterContext* m_pFilterCtxSrcMic = NULL;
	AVFilterContext* m_pFilterCtxSink = NULL;

	int m_iVideoStreamIndex = 0;
	int m_iAudioStreamIndex = 0;
	bool m_bRecord = false;

	HANDLE m_hAudioInnerCapture = NULL;
	HANDLE m_hAudioInnerResample = NULL;
	HANDLE m_hAudioMicCapture = NULL;
	HANDLE m_hAudioMix = NULL;
	HANDLE m_hScreenCapture = NULL;
	HANDLE m_hScreenAudioMix = NULL;

	int m_iYuv420FrameSize = 0;

	int m_iRecordPosX = 0;
	int m_iRecordPosY = 0;
	int m_iRecordWidth = 0;
	int m_iRecordHeight = 0;

	int m_iFrameNumber = 0;
};

The contents of ULinkRecord.cpp:

#include "ULinkRecord.h"
#include "log/log.h"
#include "appfun/appfun.h"

#include "LocalRecord.h"
#include "CaptureScreen.h"



typedef struct BufferSourceContext {
	const AVClass    *bscclass;
	AVFifoBuffer     *fifo;
	AVRational        time_base;     ///< time_base to set in the output link
	AVRational        frame_rate;    ///< frame_rate to set in the output link
	unsigned          nb_failed_requests;
	unsigned          warning_limit;

	/* video only */
	int               w, h;
	enum AVPixelFormat  pix_fmt;
	AVRational        pixel_aspect;
	char              *sws_param;

	AVBufferRef *hw_frames_ctx;

	/* audio only */
	int sample_rate;
	enum AVSampleFormat sample_fmt;
	int channels;
	uint64_t channel_layout;
	char    *channel_layout_str;

	int got_format_from_params;
	int eof;
} BufferSourceContext;


static char *dup_wchar_to_utf8(const wchar_t *w)
{
	char *s = NULL;
	int l = WideCharToMultiByte(CP_UTF8, 0, w, -1, 0, 0, 0, 0);
	s = (char *)av_malloc(l);
	if (s)
		WideCharToMultiByte(CP_UTF8, 0, w, -1, s, l, 0, 0);
	return s;
}


/* pick the supported sample rate closest to 44100 Hz */
static int select_sample_rate(const AVCodec *codec)
{
	const int *p;
	int best_samplerate = 0;

	if (!codec->supported_samplerates)
		return 44100;

	p = codec->supported_samplerates;
	while (*p) {
		if (!best_samplerate || abs(44100 - *p) < abs(44100 - best_samplerate))
			best_samplerate = *p;
		p++;
	}
	return best_samplerate;
}




/* select layout with the highest channel count; layouts are 64-bit masks */
static uint64_t select_channel_layout(const AVCodec *codec)
{
	const uint64_t *p;
	uint64_t best_ch_layout = 0;
	int best_nb_channels = 0;

	if (!codec->channel_layouts)
		return AV_CH_LAYOUT_STEREO;

	p = codec->channel_layouts;
	while (*p) {
		int nb_channels = av_get_channel_layout_nb_channels(*p);

		if (nb_channels > best_nb_channels) {
			best_ch_layout = *p;
			best_nb_channels = nb_channels;
		}
		p++;
	}
	return best_ch_layout;
}


unsigned char clip_value(unsigned char x, unsigned char min_val, unsigned char  max_val) {
	if (x > max_val) {
		return max_val;
	}
	else if (x < min_val) {
		return min_val;
	}
	else {
		return x;
	}
}

//RGB to YUV420
bool RGB24_TO_YUV420(unsigned char *RgbBuf, int w, int h, unsigned char *yuvBuf)
{
	unsigned char*ptrY, *ptrU, *ptrV, *ptrRGB;
	memset(yuvBuf, 0, w*h * 3 / 2);
	ptrY = yuvBuf;
	ptrU = yuvBuf + w * h;
	ptrV = ptrU + (w*h * 1 / 4);
	unsigned char y, u, v, r, g, b;
	for (int j = h - 1; j >= 0; j--) {
		ptrRGB = RgbBuf + w * j * 3;
		for (int i = 0; i < w; i++) {

			b = *(ptrRGB++);
			g = *(ptrRGB++);
			r = *(ptrRGB++);


			y = (unsigned char)((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
			u = (unsigned char)((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128;
			v = (unsigned char)((112 * r - 94 * g - 18 * b + 128) >> 8) + 128;
			*(ptrY++) = clip_value(y, 0, 255);
			if (j % 2 == 0 && i % 2 == 0) {
				*(ptrU++) = clip_value(u, 0, 255);
			}
			else {
				if (i % 2 == 0) {
					*(ptrV++) = clip_value(v, 0, 255);
				}
			}
		}
	}
	return true;
}


ULinkRecord::ULinkRecord()
{
	InitializeCriticalSection(&m_csVideoSection);
	InitializeCriticalSection(&m_csAudioInnerSection);
	InitializeCriticalSection(&m_csAudioInnerResampleSection);
	InitializeCriticalSection(&m_csAudioMicSection);
	InitializeCriticalSection(&m_csAudioMixSection);

	avdevice_register_all();
}

ULinkRecord::~ULinkRecord()
{
	DeleteCriticalSection(&m_csVideoSection);
	DeleteCriticalSection(&m_csAudioInnerSection);
	DeleteCriticalSection(&m_csAudioInnerResampleSection);
	DeleteCriticalSection(&m_csAudioMicSection);
	DeleteCriticalSection(&m_csAudioMixSection);
}

void ULinkRecord::SetMicName(const wchar_t* pMicName)
{
	m_wstrMicName = pMicName;
}

void ULinkRecord::SetRecordPath(const char* pRecordPath)
{
	m_strRecordPath = pRecordPath;
	if (!m_strRecordPath.empty())
	{
		if (m_strRecordPath[m_strRecordPath.length() - 1] != '\\')
		{
			m_strRecordPath = m_strRecordPath + "\\";
		}
	}
}

void ULinkRecord::SetRecordRect(RECT rectRecord)
{
	m_iRecordPosX = rectRecord.left;
	m_iRecordPosY = rectRecord.top;
	m_iRecordWidth = rectRecord.right - rectRecord.left;
	m_iRecordHeight = rectRecord.bottom - rectRecord.top;
}

int ULinkRecord::StartRecord()
{
	int iRet = -1;
	do 
	{
		m_pAudioConvertCtx = swr_alloc();
		av_opt_set_channel_layout(m_pAudioConvertCtx, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
		av_opt_set_channel_layout(m_pAudioConvertCtx, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
		av_opt_set_int(m_pAudioConvertCtx, "in_sample_rate", 44100, 0);
		av_opt_set_int(m_pAudioConvertCtx, "out_sample_rate", 44100, 0);
		av_opt_set_sample_fmt(m_pAudioConvertCtx, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
		//av_opt_set_sample_fmt(audio_convert_ctx, "in_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
		av_opt_set_sample_fmt(m_pAudioConvertCtx, "out_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);

		iRet = swr_init(m_pAudioConvertCtx);
		if (iRet < 0)
		{
			LOG_ERROR("swr_init failed, m_pAudioConvertCtx");
			break;
		}


		m_pAudioInnerResampleCtx = swr_alloc();
		av_opt_set_channel_layout(m_pAudioInnerResampleCtx, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
		av_opt_set_channel_layout(m_pAudioInnerResampleCtx, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
		av_opt_set_int(m_pAudioInnerResampleCtx, "in_sample_rate", 48000, 0);
		av_opt_set_int(m_pAudioInnerResampleCtx, "out_sample_rate", 44100, 0);
		av_opt_set_sample_fmt(m_pAudioInnerResampleCtx, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
		av_opt_set_sample_fmt(m_pAudioInnerResampleCtx, "out_sample_fmt", AV_SAMPLE_FMT_S16, 0);

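		iRet = swr_init(m_pAudioInnerResampleCtx);
		if (iRet < 0)
		{
			LOG_ERROR("swr_init failed, m_pAudioInnerResampleCtx");
			break;
		}
		// Reset to failure so that a later break is not mistaken for success
		iRet = -1;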


		if (OpenAudioInnerCapture() < 0)
		{
			break;
		}

		if (OpenAudioMicCapture() < 0)
		{
			break;
		}

		if (OpenOutPut() < 0)
		{
			break;
		}

		const char* filter_desc = "[in0][in1]amix=inputs=2[out]";
		iRet = InitFilter(filter_desc);
		if (iRet < 0)
		{
			break;
		}

		if (NULL == m_pAudioInnerResampleFifo)
		{
			m_pAudioInnerResampleFifo = av_audio_fifo_alloc((AVSampleFormat)m_pFormatCtx_AudioInner->streams[0]->codecpar->format,
				m_pFormatCtx_AudioInner->streams[0]->codecpar->channels, 3000 * 1024);
		}

		if (NULL == m_pAudioInnerFifo)
		{
			m_pAudioInnerFifo = av_audio_fifo_alloc((AVSampleFormat)m_pFormatCtx_AudioInner->streams[0]->codecpar->format,
				m_pFormatCtx_AudioInner->streams[0]->codecpar->channels, 3000 * 1024);
		}

		m_iYuv420FrameSize = av_image_get_buffer_size(AV_PIX_FMT_YUV420P, m_iRecordWidth, m_iRecordHeight, 1);

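		// Allocate a buffer for 30 frames
		m_pVideoFifo = av_fifo_alloc(30 * m_iYuv420FrameSize);

		// Set the flag before starting the threads, because their loops test m_bRecord
		m_bRecord = true;
		m_hAudioInnerCapture = CreateThread(NULL, 0, AudioInnerCaptureProc, this, 0, NULL);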
		m_hAudioInnerResample = CreateThread(NULL, 0, AudioInnerResampleProc, this, 0, NULL);
		m_hAudioMicCapture = CreateThread(NULL, 0, AudioMicCaptureProc, this, 0, NULL);
		m_hAudioMix = CreateThread(NULL, 0, AudioMixProc, this, 0, NULL);
		m_hScreenCapture = CreateThread(NULL, 0, ScreenCaptureProc, this, 0, NULL);
		m_hScreenAudioMix = CreateThread(NULL, 0, ScreenAudioMixProc, this, 0, NULL);

		iRet = 0;
	} while (0);
	
	if (0 == iRet)
	{
		m_bRecord = true;
	}
	else
	{
		Clear();
	}


	return iRet;
}


void ULinkRecord::StopRecord()
{
	m_bRecord = false;
	if (m_hAudioInnerCapture == NULL && m_hAudioInnerResample == NULL && m_hAudioMicCapture == NULL && m_hAudioMix == NULL && m_hScreenCapture == NULL && m_hScreenAudioMix == NULL)
	{
		return;
	}
	Sleep(1000);
	HANDLE hThreads[6];
	hThreads[0] = m_hAudioInnerCapture;
	hThreads[1] = m_hAudioInnerResample;
	hThreads[2] = m_hAudioMicCapture;
	hThreads[3] = m_hAudioMix;
	hThreads[4] = m_hScreenCapture;
	hThreads[5] = m_hScreenAudioMix;

	WaitForMultipleObjects(6, hThreads, TRUE, INFINITE);

	CloseHandle(m_hAudioInnerCapture);
	m_hAudioInnerCapture = NULL;

	CloseHandle(m_hAudioInnerResample);
	m_hAudioInnerResample = NULL;

	CloseHandle(m_hAudioMicCapture);
	m_hAudioMicCapture = NULL;

	CloseHandle(m_hAudioMix);
	m_hAudioMix = NULL;

	
	if (m_hScreenCapture != NULL)
	{
		CloseHandle(m_hScreenCapture);
		m_hScreenCapture = NULL;
	}

	if (m_hScreenAudioMix != NULL)
	{
		CloseHandle(m_hScreenAudioMix);
		m_hScreenAudioMix = NULL;
	}
	

	Clear();
}

int ULinkRecord::OpenAudioInnerCapture()
{
	int iRet = -1;

	do 
	{
		// Look up the input format
		const AVInputFormat *pAudioInputFmt = av_find_input_format("dshow");

		// Open the device through DirectShow and attach the input format to the format context
		//const char * psDevName = dup_wchar_to_utf8(L"audio=麦克风 (2- Synaptics HD Audio)");
		char * psDevName = dup_wchar_to_utf8(L"audio=virtual-audio-capturer");

		int iOpenRet = avformat_open_input(&m_pFormatCtx_AudioInner, psDevName, pAudioInputFmt, NULL);
		av_free(psDevName); // the UTF-8 copy is not needed after the open call
		if (iOpenRet < 0)
		{
			LOG_ERROR("avformat_open_input failed, m_pFormatCtx_AudioInner");
			break;
		}

		if (avformat_find_stream_info(m_pFormatCtx_AudioInner, NULL) < 0)
		{
			break;
		}

		if (m_pFormatCtx_AudioInner->streams[0]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO)
		{
			LOG_ERROR("Couldn't find audio stream information");
			break;
		}


		const AVCodec *tmpCodec = avcodec_find_decoder(m_pFormatCtx_AudioInner->streams[0]->codecpar->codec_id);

		m_pReadCodecCtx_AudioInner = avcodec_alloc_context3(tmpCodec);

		m_pReadCodecCtx_AudioInner->sample_rate = m_pFormatCtx_AudioInner->streams[0]->codecpar->sample_rate;
		m_pReadCodecCtx_AudioInner->channel_layout = select_channel_layout(tmpCodec);
		m_pReadCodecCtx_AudioInner->channels = av_get_channel_layout_nb_channels(m_pReadCodecCtx_AudioInner->channel_layout);

		m_pReadCodecCtx_AudioInner->sample_fmt = (AVSampleFormat)m_pFormatCtx_AudioInner->streams[0]->codecpar->format;
		//pReadCodecCtx_Audio->sample_fmt = AV_SAMPLE_FMT_FLTP;

		if (0 > avcodec_open2(m_pReadCodecCtx_AudioInner, tmpCodec, NULL))
		{
			LOG_ERROR("avcodec_open2 failed, m_pReadCodecCtx_AudioInner");
			break;
		}

		avcodec_parameters_from_context(m_pFormatCtx_AudioInner->streams[0]->codecpar, m_pReadCodecCtx_AudioInner);

		iRet = 0;
	} while (0);

	if (iRet != 0)
	{
		if (m_pReadCodecCtx_AudioInner != NULL)
		{
			avcodec_free_context(&m_pReadCodecCtx_AudioInner);
			m_pReadCodecCtx_AudioInner = NULL;
		}

		if (m_pFormatCtx_AudioInner != NULL)
		{
			avformat_close_input(&m_pFormatCtx_AudioInner);
			m_pFormatCtx_AudioInner = NULL;
		}
	}

	return iRet;
}


int ULinkRecord::OpenAudioMicCapture()
{
	int iRet = -1;
	do 
	{
		// Look up the input format
		const AVInputFormat *pAudioInputFmt = av_find_input_format("dshow");

		// Open the device through DirectShow and attach the input format to the format context
		std::wstring strDeviceName = L"audio=";
		strDeviceName += m_wstrMicName;
		char * psDevName = dup_wchar_to_utf8(strDeviceName.c_str());

		int iOpenRet = avformat_open_input(&m_pFormatCtx_AudioMic, psDevName, pAudioInputFmt, NULL);
		av_free(psDevName); // the UTF-8 copy is not needed after the open call
		if (iOpenRet < 0)
		{
			LOG_ERROR("avformat_open_input failed, m_pFormatCtx_AudioMic");
			break;
		}

		if (m_pFormatCtx_AudioMic->streams[0]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO)
		{
			LOG_ERROR("Couldn't find audio stream information");
			break;
		}


		const AVCodec *tmpCodec = avcodec_find_decoder(m_pFormatCtx_AudioMic->streams[0]->codecpar->codec_id);

		m_pReadCodecCtx_AudioMic = avcodec_alloc_context3(tmpCodec);

		m_pReadCodecCtx_AudioMic->sample_rate = select_sample_rate(tmpCodec);
		m_pReadCodecCtx_AudioMic->channel_layout = select_channel_layout(tmpCodec);
		m_pReadCodecCtx_AudioMic->channels = av_get_channel_layout_nb_channels(m_pReadCodecCtx_AudioMic->channel_layout);

		m_pReadCodecCtx_AudioMic->sample_fmt = (AVSampleFormat)m_pFormatCtx_AudioMic->streams[0]->codecpar->format;
		//pReadCodecCtx_Audio->sample_fmt = AV_SAMPLE_FMT_FLTP;

		if (0 > avcodec_open2(m_pReadCodecCtx_AudioMic, tmpCodec, NULL))
		{
			LOG_ERROR("avcodec_open2 failed, m_pReadCodecCtx_AudioMic");
			break;
		}

		avcodec_parameters_from_context(m_pFormatCtx_AudioMic->streams[0]->codecpar, m_pReadCodecCtx_AudioMic);

		iRet = 0;
	} while (0);

	if (iRet != 0)
	{
		if (m_pReadCodecCtx_AudioMic != NULL)
		{
			avcodec_free_context(&m_pReadCodecCtx_AudioMic);
			m_pReadCodecCtx_AudioMic = NULL;
		}

		if (m_pFormatCtx_AudioMic != NULL)
		{
			avformat_close_input(&m_pFormatCtx_AudioMic);
			m_pFormatCtx_AudioMic = NULL;
		}
	}
	return iRet;
}


int ULinkRecord::OpenOutPut()
{
	std::string strFileName = m_strRecordPath;
	m_strFilePrefix = strFileName + time_tSimpleString(time(NULL));

	int iRet = -1;

	AVStream *pAudioStream = NULL;
	AVStream *pVideoStream = NULL;

	do
	{
		std::string strFileName = m_strRecordPath;
		strFileName += time_tSimpleString(time(NULL));
		strFileName += ".mp4";

		const char *outFileName = strFileName.c_str();
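		avformat_alloc_output_context2(&m_pFormatCtx_Out, NULL, NULL, outFileName);
		if (!m_pFormatCtx_Out)
		{
			LOG_ERROR("avformat_alloc_output_context2 failed, m_pFormatCtx_Out");
			break;
		}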

		{
			m_iVideoStreamIndex = 0;

			AVCodec* pCodecEncode_Video = (AVCodec *)avcodec_find_encoder(m_pFormatCtx_Out->oformat->video_codec);

			m_pCodecEncodeCtx_Video = avcodec_alloc_context3(pCodecEncode_Video);
			if (!m_pCodecEncodeCtx_Video)
			{
				LOG_ERROR("avcodec_alloc_context3 failed, m_pCodecEncodeCtx_Video");
				break;
			}

			pVideoStream = avformat_new_stream(m_pFormatCtx_Out, pCodecEncode_Video);
			if (!pVideoStream)
			{
				break;
			}

			int frameRate = 10;
			m_pCodecEncodeCtx_Video->flags |= AV_CODEC_FLAG_QSCALE;
			m_pCodecEncodeCtx_Video->bit_rate = 4000000;
			m_pCodecEncodeCtx_Video->rc_min_rate = 4000000;
			m_pCodecEncodeCtx_Video->rc_max_rate = 4000000;
			m_pCodecEncodeCtx_Video->bit_rate_tolerance = 4000000;
			m_pCodecEncodeCtx_Video->time_base.den = frameRate;
			m_pCodecEncodeCtx_Video->time_base.num = 1;

			m_pCodecEncodeCtx_Video->width = m_iRecordWidth;
			m_pCodecEncodeCtx_Video->height = m_iRecordHeight;
			//pH264Encoder->pCodecCtx->frame_number = 1;
			m_pCodecEncodeCtx_Video->gop_size = 12;
			m_pCodecEncodeCtx_Video->max_b_frames = 0;
			m_pCodecEncodeCtx_Video->thread_count = 4;
			m_pCodecEncodeCtx_Video->pix_fmt = AV_PIX_FMT_YUV420P;
			m_pCodecEncodeCtx_Video->codec_id = AV_CODEC_ID_H264;
			m_pCodecEncodeCtx_Video->codec_type = AVMEDIA_TYPE_VIDEO;

			av_opt_set(m_pCodecEncodeCtx_Video->priv_data, "b-pyramid", "none", 0);
			av_opt_set(m_pCodecEncodeCtx_Video->priv_data, "preset", "superfast", 0);
			av_opt_set(m_pCodecEncodeCtx_Video->priv_data, "tune", "zerolatency", 0);

			if (m_pFormatCtx_Out->oformat->flags & AVFMT_GLOBALHEADER)
				m_pCodecEncodeCtx_Video->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

			if (avcodec_open2(m_pCodecEncodeCtx_Video, pCodecEncode_Video, 0) < 0)
			{
				// Failed to open the encoder; bail out
				LOG_ERROR("avcodec_open2 failed, m_pCodecEncodeCtx_Video");
				break;
			}
		}


		if (m_pFormatCtx_AudioInner->streams[0]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO && m_pFormatCtx_AudioMic->streams[0]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO)
		{
			pAudioStream = avformat_new_stream(m_pFormatCtx_Out, NULL);

			m_iAudioStreamIndex = 1;

			m_pCodecEncode_Audio = (AVCodec *)avcodec_find_encoder(m_pFormatCtx_Out->oformat->audio_codec);

			m_pCodecEncodeCtx_Audio = avcodec_alloc_context3(m_pCodecEncode_Audio);
			if (!m_pCodecEncodeCtx_Audio)
			{
				LOG_ERROR("avcodec_alloc_context3 failed, m_pCodecEncodeCtx_Audio");
				break;
			}


			//pCodecEncodeCtx_Audio->codec_id = pFormatCtx_Out->oformat->audio_codec;
			m_pCodecEncodeCtx_Audio->sample_fmt = m_pCodecEncode_Audio->sample_fmts ? m_pCodecEncode_Audio->sample_fmts[0] : AV_SAMPLE_FMT_FLTP;
			m_pCodecEncodeCtx_Audio->bit_rate = 64000;
			m_pCodecEncodeCtx_Audio->sample_rate = 44100;
			m_pCodecEncodeCtx_Audio->channel_layout = AV_CH_LAYOUT_STEREO;
			m_pCodecEncodeCtx_Audio->channels = av_get_channel_layout_nb_channels(m_pCodecEncodeCtx_Audio->channel_layout);


			AVRational timeBase;
			timeBase.den = m_pCodecEncodeCtx_Audio->sample_rate;
			timeBase.num = 1;
			pAudioStream->time_base = timeBase;

			if (avcodec_open2(m_pCodecEncodeCtx_Audio, m_pCodecEncode_Audio, 0) < 0)
			{
				// Failed to open the encoder; bail out
				LOG_ERROR("avcodec_open2 failed, m_pCodecEncodeCtx_Audio");
				break;
			}
		}


		if (!(m_pFormatCtx_Out->oformat->flags & AVFMT_NOFILE))
		{
			if (avio_open(&m_pFormatCtx_Out->pb, outFileName, AVIO_FLAG_WRITE) < 0)
			{
				LOG_ERROR("avio_open failed, m_pFormatCtx_Out->pb");
				break;
			}
		}

		avcodec_parameters_from_context(pVideoStream->codecpar, m_pCodecEncodeCtx_Video);
		avcodec_parameters_from_context(pAudioStream->codecpar, m_pCodecEncodeCtx_Audio);

		if (avformat_write_header(m_pFormatCtx_Out, NULL) < 0)
		{
			LOG_ERROR("avformat_write_header failed, m_pFormatCtx_Out");
			break;
		}

		iRet = 0;
	} while (0);


	if (iRet != 0)
	{
		if (m_pCodecEncodeCtx_Video != NULL)
		{
			avcodec_free_context(&m_pCodecEncodeCtx_Video);
			m_pCodecEncodeCtx_Video = NULL;
		}

		if (m_pCodecEncodeCtx_Audio != NULL)
		{
			avcodec_free_context(&m_pCodecEncodeCtx_Audio);
			m_pCodecEncodeCtx_Audio = NULL;
		}

		if (m_pFormatCtx_Out != NULL)
		{
			avformat_free_context(m_pFormatCtx_Out);
			m_pFormatCtx_Out = NULL;
		}
	}

	return iRet;
}


int ULinkRecord::InitFilter(const char* filter_desc)
{
	char args_inner[512];
	const char* pad_name_inner = "in0";
	char args_mic[512];
	const char* pad_name_mic = "in1";

	AVFilter* filter_src_spk = (AVFilter *)avfilter_get_by_name("abuffer");
	AVFilter* filter_src_mic = (AVFilter *)avfilter_get_by_name("abuffer");
	AVFilter* filter_sink = (AVFilter *)avfilter_get_by_name("abuffersink");
	AVFilterInOut* filter_output_inner = avfilter_inout_alloc();
	AVFilterInOut* filter_output_mic = avfilter_inout_alloc();
	AVFilterInOut* filter_input = avfilter_inout_alloc();
	m_pFilterGraph = avfilter_graph_alloc();

	/*sprintf_s(args_inner, sizeof(args_inner), "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%I64x",
		m_pReadCodecCtx_AudioInner->time_base.num,
		m_pReadCodecCtx_AudioInner->time_base.den,
		m_pReadCodecCtx_AudioInner->sample_rate,
		av_get_sample_fmt_name((AVSampleFormat)m_pReadCodecCtx_AudioInner->sample_fmt),
		m_pReadCodecCtx_AudioInner->channel_layout);*/

	sprintf_s(args_inner, sizeof(args_inner), "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%I64x",
		m_pReadCodecCtx_AudioMic->time_base.num,
		m_pReadCodecCtx_AudioMic->time_base.den,
		m_pReadCodecCtx_AudioMic->sample_rate,
		av_get_sample_fmt_name((AVSampleFormat)m_pReadCodecCtx_AudioMic->sample_fmt),
		m_pReadCodecCtx_AudioMic->channel_layout);

	sprintf_s(args_mic, sizeof(args_mic), "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%I64x",
		m_pReadCodecCtx_AudioMic->time_base.num,
		m_pReadCodecCtx_AudioMic->time_base.den,
		m_pReadCodecCtx_AudioMic->sample_rate,
		av_get_sample_fmt_name((AVSampleFormat)m_pReadCodecCtx_AudioMic->sample_fmt),
		m_pReadCodecCtx_AudioMic->channel_layout);

	//sprintf_s(args_spk, sizeof(args_spk), "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%I64x", _fmt_ctx_out->streams[_index_a_out]->codec->time_base.num, _fmt_ctx_out->streams[_index_a_out]->codec->time_base.den, _fmt_ctx_out->streams[_index_a_out]->codec->sample_rate, av_get_sample_fmt_name(_fmt_ctx_out->streams[_index_a_out]->codec->sample_fmt), _fmt_ctx_out->streams[_index_a_out]->codec->channel_layout);
	//sprintf_s(args_mic, sizeof(args_mic), "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%I64x", _fmt_ctx_out->streams[_index_a_out]->codec->time_base.num, _fmt_ctx_out->streams[_index_a_out]->codec->time_base.den, _fmt_ctx_out->streams[_index_a_out]->codec->sample_rate, av_get_sample_fmt_name(_fmt_ctx_out->streams[_index_a_out]->codec->sample_fmt), _fmt_ctx_out->streams[_index_a_out]->codec->channel_layout);


	int ret = 0;
	ret = avfilter_graph_create_filter(&m_pFilterCtxSrcInner, filter_src_spk, pad_name_inner, args_inner, NULL, m_pFilterGraph);
	if (ret < 0)
	{
		printf("Filter: failed to call avfilter_graph_create_filter -- src inner\n");
		return -1;
	}
	ret = avfilter_graph_create_filter(&m_pFilterCtxSrcMic, filter_src_mic, pad_name_mic, args_mic, NULL, m_pFilterGraph);
	if (ret < 0)
	{
		printf("Filter: failed to call avfilter_graph_create_filter -- src mic\n");
		return -1;
	}

	ret = avfilter_graph_create_filter(&m_pFilterCtxSink, filter_sink, "out", NULL, NULL, m_pFilterGraph);
	if (ret < 0)
	{
		printf("Filter: failed to call avfilter_graph_create_filter -- sink\n");
		return -1;
	}
	AVCodecContext* encodec_ctx = m_pCodecEncodeCtx_Audio;
	//ret = av_opt_set_bin(_filter_ctx_sink, "sample_fmts", (uint8_t*)&encodec_ctx->sample_fmt, sizeof(encodec_ctx->sample_fmt), AV_OPT_SEARCH_CHILDREN);
	ret = av_opt_set_bin(m_pFilterCtxSink, "sample_fmts", (uint8_t*)&m_pReadCodecCtx_AudioInner->sample_fmt, sizeof(m_pReadCodecCtx_AudioInner->sample_fmt), AV_OPT_SEARCH_CHILDREN);
	if (ret < 0)
	{
		printf("Filter: failed to call av_opt_set_bin -- sample_fmts\n");
		return -1;
	}
	ret = av_opt_set_bin(m_pFilterCtxSink, "channel_layouts", (uint8_t*)&encodec_ctx->channel_layout, sizeof(encodec_ctx->channel_layout), AV_OPT_SEARCH_CHILDREN);
	if (ret < 0)
	{
		printf("Filter: failed to call av_opt_set_bin -- channel_layouts\n");
		return -1;
	}
	ret = av_opt_set_bin(m_pFilterCtxSink, "sample_rates", (uint8_t*)&encodec_ctx->sample_rate, sizeof(encodec_ctx->sample_rate), AV_OPT_SEARCH_CHILDREN);
	if (ret < 0)
	{
		printf("Filter: failed to call av_opt_set_bin -- sample_rates\n");
		return -1;
	}

	filter_output_inner->name = av_strdup(pad_name_inner);
	filter_output_inner->filter_ctx = m_pFilterCtxSrcInner;
	filter_output_inner->pad_idx = 0;
	filter_output_inner->next = filter_output_mic;

	filter_output_mic->name = av_strdup(pad_name_mic);
	filter_output_mic->filter_ctx = m_pFilterCtxSrcMic;
	filter_output_mic->pad_idx = 0;
	filter_output_mic->next = NULL;

	filter_input->name = av_strdup("out");
	filter_input->filter_ctx = m_pFilterCtxSink;
	filter_input->pad_idx = 0;
	filter_input->next = NULL;

	AVFilterInOut* filter_outputs[2];
	filter_outputs[0] = filter_output_inner;
	filter_outputs[1] = filter_output_mic;

	ret = avfilter_graph_parse_ptr(m_pFilterGraph, filter_desc, &filter_input, filter_outputs, NULL);
	if (ret < 0)
	{
		printf("Filter: failed to call avfilter_graph_parse_ptr\n");
		return -1;
	}

	ret = avfilter_graph_config(m_pFilterGraph, NULL);
	if (ret < 0)
	{
		printf("Filter: failed to call avfilter_graph_config\n");
		return -1;
	}

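	avfilter_inout_free(&filter_input);
	// filter_src_spk and filter_src_mic point to FFmpeg's static filter definitions and must not be freed
	avfilter_inout_free(filter_outputs);
	//av_free(filter_outputs);

	char* temp = avfilter_graph_dump(m_pFilterGraph, NULL);
	if (temp)
	{
		printf("%s\n", temp);
		av_free(temp); // the dump string is allocated by FFmpeg and owned by the caller
	}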

	return 0;
}

void ULinkRecord::Clear()
{
	if (m_pReadCodecCtx_AudioInner)
	{
		avcodec_free_context(&m_pReadCodecCtx_AudioInner);
		m_pReadCodecCtx_AudioInner = NULL;
	}

	if (m_pReadCodecCtx_AudioMic != NULL)
	{
		avcodec_free_context(&m_pReadCodecCtx_AudioMic);
		m_pReadCodecCtx_AudioMic = NULL;
	}

	if (m_pCodecEncodeCtx_Audio != NULL)
	{
		avcodec_free_context(&m_pCodecEncodeCtx_Audio);
		m_pCodecEncodeCtx_Audio = NULL;
	}

	if (m_pAudioConvertCtx != NULL)
	{
		swr_free(&m_pAudioConvertCtx);
		m_pAudioConvertCtx = NULL;
	}

	if (m_pAudioInnerFifo != NULL)
	{
		av_audio_fifo_free(m_pAudioInnerFifo);
		m_pAudioInnerFifo = NULL;
	}

	if (m_pAudioInnerResampleFifo != NULL)
	{
		av_audio_fifo_free(m_pAudioInnerResampleFifo);
		m_pAudioInnerResampleFifo = NULL;
	}

	if (m_pAudioMicFifo != NULL)
	{
		av_audio_fifo_free(m_pAudioMicFifo);
		m_pAudioMicFifo = NULL;
	}

	if (m_pAudioMixFifo != NULL)
	{
		av_audio_fifo_free(m_pAudioMixFifo);
		m_pAudioMixFifo = NULL;
	}

	if (m_pFormatCtx_AudioInner != NULL)
	{
		avformat_close_input(&m_pFormatCtx_AudioInner);
		m_pFormatCtx_AudioInner = NULL;
	}

	if (m_pFormatCtx_AudioMic != NULL)
	{
		avformat_close_input(&m_pFormatCtx_AudioMic);
		m_pFormatCtx_AudioMic = NULL;
	}

	if (m_pFormatCtx_Out != NULL)
	{
		avformat_free_context(m_pFormatCtx_Out);
		m_pFormatCtx_Out = NULL;
	}

	if (m_hScreenCapture != NULL)
	{
		CloseHandle(m_hScreenCapture);
		m_hScreenCapture = NULL;
	}

	if (m_hScreenAudioMix != NULL)
	{
		CloseHandle(m_hScreenAudioMix);
		m_hScreenAudioMix = NULL;
	}
}

DWORD WINAPI ULinkRecord::AudioInnerCaptureProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->AudioInnerCapture();
	}
	return 0;
}

void ULinkRecord::AudioInnerCapture()
{
	AVFrame *pFrame;
	pFrame = av_frame_alloc();

	AVPacket packet = { 0 };
	int ret = 0;
	while (m_bRecord)
	{
		av_packet_unref(&packet);
		if (av_read_frame(m_pFormatCtx_AudioInner, &packet) < 0)
		{
			continue;
		}

		ret = avcodec_send_packet(m_pReadCodecCtx_AudioInner, &packet);
		if (ret >= 0)
		{
			ret = avcodec_receive_frame(m_pReadCodecCtx_AudioInner, pFrame);
			if (ret == AVERROR(EAGAIN))
			{
				continue;
			}
			else if (ret == AVERROR_EOF)
			{
				break;
			}
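			else if (ret < 0)
			{
				// Leave the capture loop instead of calling exit(), so the remaining threads can still finalize the output file.
				fprintf(stderr, "Error during decoding\n");
				av_packet_unref(&packet);
				break;
			}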

			// Check the free space and write under the same lock, because the resample thread drains this FIFO concurrently.
			EnterCriticalSection(&m_csAudioInnerSection);
			if (av_audio_fifo_space(m_pAudioInnerFifo) >= pFrame->nb_samples)
			{
				ret = av_audio_fifo_write(m_pAudioInnerFifo, (void **)pFrame->data, pFrame->nb_samples);
			}
			LeaveCriticalSection(&m_csAudioInnerSection);

			av_packet_unref(&packet);
		}

	}

	av_frame_free(&pFrame);
}

DWORD WINAPI ULinkRecord::AudioInnerResampleProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->AudioInnerResample();
	}
	return 0;
}

void ULinkRecord::AudioInnerResample()
{
	int ret = 0;

	while (1)
	{
		if (av_audio_fifo_size(m_pAudioInnerFifo) >=
			(m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size > 0 ? m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size : 1024))
		{
			AVFrame *frame_audio_inner = NULL;
			frame_audio_inner = av_frame_alloc();

			frame_audio_inner->nb_samples = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size > 0 ? m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size : 1024;
			frame_audio_inner->channel_layout = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->channel_layout;
			frame_audio_inner->format = m_pFormatCtx_AudioInner->streams[0]->codecpar->format;
			frame_audio_inner->sample_rate = m_pFormatCtx_AudioInner->streams[0]->codecpar->sample_rate;
			av_frame_get_buffer(frame_audio_inner, 0);

			EnterCriticalSection(&m_csAudioInnerSection);
			// Read exactly as many samples as the frame buffer was allocated for.
			int readcount = av_audio_fifo_read(m_pAudioInnerFifo, (void **)frame_audio_inner->data, frame_audio_inner->nb_samples);
			LeaveCriticalSection(&m_csAudioInnerSection);

			AVFrame *frame_audio_inner_resample = NULL;
			frame_audio_inner_resample = av_frame_alloc();

			// Upper bound on the output sample count: input samples * out_rate / in_rate, rounded up.
			// With 1024 input samples, 48000 Hz in and 44100 Hz out, this is 941.
			int dst_nb_samples = (int)av_rescale_rnd(frame_audio_inner->nb_samples, m_pCodecEncodeCtx_Audio->sample_rate, frame_audio_inner->sample_rate, AV_ROUND_UP);

			frame_audio_inner_resample->nb_samples = m_pCodecEncodeCtx_Audio->frame_size;
			frame_audio_inner_resample->channel_layout = m_pCodecEncodeCtx_Audio->channel_layout;
			frame_audio_inner_resample->format = m_pFormatCtx_AudioInner->streams[0]->codecpar->format;
			frame_audio_inner_resample->sample_rate = m_pCodecEncodeCtx_Audio->sample_rate;
			av_frame_get_buffer(frame_audio_inner_resample, 0);

			// swr_convert() returns the number of samples per channel it actually produced,
			// which can be less than dst_nb_samples because the resampler buffers some input internally.
			int nb = swr_convert(m_pAudioInnerResampleCtx, frame_audio_inner_resample->data, dst_nb_samples, (const uint8_t**)frame_audio_inner->data, frame_audio_inner->nb_samples);

			// Queue only the samples that were really produced; writing dst_nb_samples would append stale buffer contents.
			if (nb > 0)
			{
				EnterCriticalSection(&m_csAudioInnerResampleSection);
				ret = av_audio_fifo_write(m_pAudioInnerResampleFifo, (void **)frame_audio_inner_resample->data, nb);
				LeaveCriticalSection(&m_csAudioInnerResampleSection);
			}

			av_frame_free(&frame_audio_inner);
			av_frame_free(&frame_audio_inner_resample);

			if (!m_bRecord)
			{
				if (av_audio_fifo_size(m_pAudioInnerFifo) < 1024)
				{
					break;
				}
			}
		}
		else
		{
			if (!m_bRecord)
			{
				break;
			}
			// Not enough captured samples yet; yield briefly instead of spinning at full CPU.
			Sleep(1);
		}
	}
}
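
The size of each resampled block follows from the ratio of the two sample rates. As a minimal sketch, independent of FFmpeg, the helper below reproduces the round-up arithmetic that av_rescale_rnd(..., AV_ROUND_UP) performs for non-negative inputs. The name resampled_sample_count is hypothetical and not part of the original code; the rates 48000 and 44100 are the ones quoted earlier in the article.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the original code): upper bound on the number of
// output samples when in_samples at in_rate Hz are resampled to out_rate Hz.
// Computes ceil(in_samples * out_rate / in_rate) using integer arithmetic only.
inline int64_t resampled_sample_count(int64_t in_samples, int64_t in_rate, int64_t out_rate)
{
	assert(in_samples >= 0 && in_rate > 0 && out_rate > 0);
	return (in_samples * out_rate + in_rate - 1) / in_rate;
}
```

For 1024 samples captured at 48000 Hz and an encoder rate of 44100 Hz this gives 941, so an output frame of 1024 samples is large enough in that case. If the capture rate were lower than the encoder rate, the result would exceed 1024 and the output buffer would have to be sized from this value instead.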

DWORD WINAPI ULinkRecord::AudioMicCaptureProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->AudioMicCapture();
	}
	return 0;
}

void ULinkRecord::AudioMicCapture()
{
	AVFrame *pFrame;
	pFrame = av_frame_alloc();

	AVPacket packet = { 0 };
	int ret = 0;
	while (m_bRecord)
	{
		av_packet_unref(&packet);
		if (av_read_frame(m_pFormatCtx_AudioMic, &packet) < 0)
		{
			continue;
		}

		ret = avcodec_send_packet(m_pReadCodecCtx_AudioMic, &packet);
		if (ret >= 0)
		{
			ret = avcodec_receive_frame(m_pReadCodecCtx_AudioMic, pFrame);
			if (ret == AVERROR(EAGAIN))
			{
				continue;
			}
			else if (ret == AVERROR_EOF)
			{
				break;
			}
			else if (ret < 0) {
				break;
			}

			if (NULL == m_pAudioMicFifo)
			{
				m_pAudioMicFifo = av_audio_fifo_alloc((AVSampleFormat)m_pFormatCtx_AudioMic->streams[0]->codecpar->format,
					m_pFormatCtx_AudioMic->streams[0]->codecpar->channels, 3000 * pFrame->nb_samples);
			}

			// Check the free space and write under the same lock, because the mixing thread drains this FIFO concurrently.
			EnterCriticalSection(&m_csAudioMicSection);
			if (av_audio_fifo_space(m_pAudioMicFifo) >= pFrame->nb_samples)
			{
				ret = av_audio_fifo_write(m_pAudioMicFifo, (void **)pFrame->data, pFrame->nb_samples);
			}
			LeaveCriticalSection(&m_csAudioMicSection);

			av_packet_unref(&packet);
		}

	}

	av_frame_free(&pFrame);
}


DWORD WINAPI ULinkRecord::AudioMixProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->AudioMix();
	}
	return 0;
}

void ULinkRecord::AudioMix()
{
	int ret = 0;

	int iAudioFrameMixedIndex = 0;

	AVFrame *frame_audio_inner_resample = NULL;
	frame_audio_inner_resample = av_frame_alloc();

	AVFrame *frame_audio_mic = NULL;
	frame_audio_mic = av_frame_alloc();


	if (NULL == m_pAudioMixFifo)
	{
		m_pAudioMixFifo = av_audio_fifo_alloc((AVSampleFormat)m_pFormatCtx_AudioInner->streams[0]->codecpar->format,
			m_pFormatCtx_AudioInner->streams[0]->codecpar->channels, 3000 * 1024);
	}


	while (m_bRecord)
	{
		if (NULL == m_pAudioInnerResampleFifo || NULL == m_pAudioMicFifo)
		{
			// One of the input FIFOs may not exist yet; wait briefly instead of spinning until both are available.
			Sleep(1);
			continue;
		}

		int fifo_inner_resample_size = av_audio_fifo_size(m_pAudioInnerResampleFifo);
		int fifo_mic_size = av_audio_fifo_size(m_pAudioMicFifo);
		//int frame_inner_min_size = pReadCodecCtx_Audio->frame_size;
		//int frame_mic_min_size = pReadCodecCtx_AudioMic->frame_size;

		int frame_inner_resample_min_size = 1024;
		int frame_mic_min_size = 1024;

		if (fifo_inner_resample_size >= frame_inner_resample_min_size && fifo_mic_size >= frame_mic_min_size)
		{
			// Both inputs are fed to the filter graph as stereo frames at the microphone sample rate.
			frame_audio_inner_resample->nb_samples = frame_inner_resample_min_size;
			frame_audio_inner_resample->channel_layout = AV_CH_LAYOUT_STEREO;
			frame_audio_inner_resample->format = m_pFormatCtx_AudioInner->streams[0]->codecpar->format;
			frame_audio_inner_resample->sample_rate = m_pFormatCtx_AudioMic->streams[0]->codecpar->sample_rate;
			av_frame_get_buffer(frame_audio_inner_resample, 0);

			frame_audio_mic->nb_samples = frame_mic_min_size;
			frame_audio_mic->channel_layout = AV_CH_LAYOUT_STEREO;
			frame_audio_mic->format = m_pFormatCtx_AudioMic->streams[0]->codecpar->format;
			frame_audio_mic->sample_rate = m_pFormatCtx_AudioMic->streams[0]->codecpar->sample_rate;
			av_frame_get_buffer(frame_audio_mic, 0);

			EnterCriticalSection(&m_csAudioInnerResampleSection);
			int readcount = av_audio_fifo_read(m_pAudioInnerResampleFifo, (void **)frame_audio_inner_resample->data, frame_inner_resample_min_size);
			LeaveCriticalSection(&m_csAudioInnerResampleSection);


			EnterCriticalSection(&m_csAudioMicSection);
			readcount = av_audio_fifo_read(m_pAudioMicFifo, (void**)frame_audio_mic->data, frame_mic_min_size);
			LeaveCriticalSection(&m_csAudioMicSection);


			frame_audio_inner_resample->pts = iAudioFrameMixedIndex * m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;
			frame_audio_mic->pts = iAudioFrameMixedIndex * m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;

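			// Debug-only checks: compare the frame parameters with what the abuffer source expects.
			// BufferSourceContext is a private libavfilter structure, so this compiles only if its definition was copied into the project.
			// The results are not used; inspect b1..b4 in a debugger when av_buffersrc_add_frame() fails.
			BufferSourceContext* s = (BufferSourceContext*)m_pFilterCtxSrcInner->priv;
			bool b1 = (s->sample_fmt != frame_audio_inner_resample->format);
			bool b2 = (s->sample_rate != frame_audio_inner_resample->sample_rate);
			bool b3 = (s->channel_layout != frame_audio_inner_resample->channel_layout);
			bool b4 = (s->channels != frame_audio_inner_resample->channels);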

			ret = av_buffersrc_add_frame(m_pFilterCtxSrcInner, frame_audio_inner_resample);
			if (ret < 0)
			{
				printf("Mixer: failed to call av_buffersrc_add_frame (speaker)\n");
				break;
			}

			ret = av_buffersrc_add_frame(m_pFilterCtxSrcMic, frame_audio_mic);
			if (ret < 0)
			{
				printf("Mixer: failed to call av_buffersrc_add_frame (microphone)\n");
				break;
			}


			while (1)
			{
				AVFrame* pFrame_out = av_frame_alloc();

				ret = av_buffersink_get_frame_flags(m_pFilterCtxSink, pFrame_out, 0);
				if (ret < 0)
				{
					av_frame_free(&pFrame_out);
					//printf("Mixer: failed to call av_buffersink_get_frame_flags\n");
					break;
				}
				iAudioFrameMixedIndex++;
				EnterCriticalSection(&m_csAudioMixSection);
				ret = av_audio_fifo_write(m_pAudioMixFifo, (void **)pFrame_out->data, pFrame_out->nb_samples);
				LeaveCriticalSection(&m_csAudioMixSection);

				av_frame_free(&pFrame_out);
			}
		}
	}


	av_frame_free(&frame_audio_inner_resample);
	av_frame_free(&frame_audio_mic);
}


DWORD WINAPI ULinkRecord::ScreenCaptureProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->ScreenCapture();
	}
	return 0;
}


void ULinkRecord::ScreenCapture()
{
	CCaptureScreen* ccs = new CCaptureScreen();
	int width = 0;
	int height = 0;

	ccs->Init(width, height);

	AVFrame *pFrameYUV = av_frame_alloc();
	pFrameYUV->format = AV_PIX_FMT_YUV420P;
	pFrameYUV->width = width;
	pFrameYUV->height = height;

	int y_size = m_iRecordWidth * m_iRecordHeight;

	int frame_size = av_image_get_buffer_size(AV_PIX_FMT_YUV420P, m_iRecordWidth, m_iRecordHeight, 1);
	BYTE* out_buffer_yuv420 = new BYTE[frame_size];
	av_image_fill_arrays(pFrameYUV->data, pFrameYUV->linesize, out_buffer_yuv420, AV_PIX_FMT_YUV420P, m_iRecordWidth, m_iRecordHeight, 1);

	AVPacket* packet = av_packet_alloc();
	m_iFrameNumber = 0;

	int ret = 0;

	DWORD dwBeginTime = ::GetTickCount();
	while(m_bRecord)
	{
		BYTE* frameimage = ccs->CaptureImage();

		RGB24_TO_YUV420(frameimage, width, height, out_buffer_yuv420);
		//Sframe->pkt_dts = frame->pts = frameNumber * avCodecCtx_Out->time_base.num * avStream->time_base.den / (avCodecCtx_Out->time_base.den *  avStream->time_base.num);
		pFrameYUV->pkt_dts = pFrameYUV->pts = av_rescale_q_rnd(m_iFrameNumber, m_pCodecEncodeCtx_Video->time_base, m_pFormatCtx_Out->streams[m_iVideoStreamIndex]->time_base, (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
		pFrameYUV->pkt_duration = 0;
		pFrameYUV->pkt_pos = -1;

		if (av_fifo_space(m_pVideoFifo) >= frame_size)
		{
			EnterCriticalSection(&m_csVideoSection);
			av_fifo_generic_write(m_pVideoFifo, pFrameYUV->data[0], y_size, NULL);
			av_fifo_generic_write(m_pVideoFifo, pFrameYUV->data[1], y_size / 4, NULL);
			av_fifo_generic_write(m_pVideoFifo, pFrameYUV->data[2], y_size / 4, NULL);
			LeaveCriticalSection(&m_csVideoSection);

			m_iFrameNumber++;

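			// Frame pacing: frame N is due N * 100 ms after the start (10 frames per second).
			// Sleep only for the time that is still left; if the loop is already late, capture again immediately.
			DWORD dwCurrentTime = ::GetTickCount();
			int dwPassedMillSeconds = dwCurrentTime - dwBeginTime;
			int dwDiff = m_iFrameNumber * 100 - dwPassedMillSeconds;
			if (dwDiff > 0)
			{
				Sleep(dwDiff);
			}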
		}
	}
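	av_packet_free(&packet);
	av_frame_free(&pFrameYUV);
	delete[] out_buffer_yuv420;
	// The capture helper was created with new at the top of this function and must be released as well.
	delete ccs;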
}
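
The pacing rule used at the end of the capture loop can be isolated into a small pure function. This is only a sketch of the arithmetic; pacing_delay_ms is a hypothetical name, and the 100 ms interval is the value hard-coded in ScreenCapture() above.

```cpp
#include <cassert>

// Hypothetical helper (not part of the original code): how long the capture loop
// should sleep so that frames_written frames span frames_written * frame_interval_ms.
inline int pacing_delay_ms(int frames_written, int frame_interval_ms, int elapsed_ms)
{
	assert(frames_written >= 0 && frame_interval_ms > 0 && elapsed_ms >= 0);
	int due_ms = frames_written * frame_interval_ms; // point in time at which the next frame is due
	int diff = due_ms - elapsed_ms;
	return diff > 0 ? diff : 0; // never sleep when the loop is already late
}
```

With 3 frames written and 250 ms elapsed, the loop sleeps 50 ms; once 300 ms or more have passed, it does not sleep at all, which lets a slow capture catch up on its own.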


DWORD WINAPI ULinkRecord::ScreenAudioMixProc(LPVOID lpParam)
{
	ULinkRecord *pULinkRecord = (ULinkRecord *)lpParam;
	if (pULinkRecord != NULL)
	{
		pULinkRecord->ScreenAudioMix();
	}
	return 0;
}

void ULinkRecord::ScreenAudioMix()
{
	int ret = 0;
	int cur_pts_v = 0;
	int cur_pts_a = 0;

	AVFrame *pFrameYUVInMain = av_frame_alloc();
	uint8_t *out_buffer_yuv420 = (uint8_t *)av_malloc(m_iYuv420FrameSize);
	av_image_fill_arrays(pFrameYUVInMain->data, pFrameYUVInMain->linesize, out_buffer_yuv420, AV_PIX_FMT_YUV420P, m_iRecordWidth, m_iRecordHeight, 1);

	int AudioFrameIndex_mic = 1;
	AVPacket packet = { 0 };

	int iPicCount = 0;
	while (m_bRecord)
	{
		if (NULL == m_pVideoFifo)
		{
			continue;
		}

		if (NULL == m_pAudioMixFifo)
		{
			continue;
		}

		if (av_compare_ts(cur_pts_v, m_pFormatCtx_Out->streams[m_iVideoStreamIndex]->time_base,
			cur_pts_a, m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->time_base) <= 0)
		{
			if (av_fifo_size(m_pVideoFifo) >= m_iYuv420FrameSize)
			{
				EnterCriticalSection(&m_csVideoSection);
				av_fifo_generic_read(m_pVideoFifo, out_buffer_yuv420, m_iYuv420FrameSize, NULL);
				LeaveCriticalSection(&m_csVideoSection);

				pFrameYUVInMain->pkt_dts = pFrameYUVInMain->pts = av_rescale_q_rnd(iPicCount, m_pCodecEncodeCtx_Video->time_base, m_pFormatCtx_Out->streams[m_iVideoStreamIndex]->time_base, (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
				pFrameYUVInMain->pkt_duration = 0;
				pFrameYUVInMain->pkt_pos = -1;

				pFrameYUVInMain->width = m_iRecordWidth;
				pFrameYUVInMain->height = m_iRecordHeight;
				pFrameYUVInMain->format = AV_PIX_FMT_YUV420P;

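				ret = avcodec_send_frame(m_pCodecEncodeCtx_Video, pFrameYUVInMain);
				if (ret >= 0)
				{
					// Every frame handed to the encoder needs its own, strictly increasing pts.
					iPicCount++;
				}

				ret = avcodec_receive_packet(m_pCodecEncodeCtx_Video, &packet);
				if (ret >= 0)
				{
					// Advance the video clock from the packet that was just encoded, then release it after writing.
					packet.stream_index = m_iVideoStreamIndex;
					cur_pts_v = packet.pts;
					//av_packet_rescale_ts(packet, avCodecCtx_Out->time_base, avStream->time_base);
					av_write_frame(m_pFormatCtx_Out, &packet);
					av_packet_unref(&packet);
				}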
			}
		}
		else
		{
			if (av_audio_fifo_size(m_pAudioMixFifo) >=
				(m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size > 0 ? m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size : 1024))
			{
				AVFrame *frame_mix = NULL;
				frame_mix = av_frame_alloc();

				frame_mix->nb_samples = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size > 0 ? m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size : 1024;
				frame_mix->channel_layout = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->channel_layout;
				// The mix FIFO holds packed signed 16-bit samples (the enum value of AV_SAMPLE_FMT_S16 is 1).
				frame_mix->format = AV_SAMPLE_FMT_S16;
				frame_mix->sample_rate = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->sample_rate;
				av_frame_get_buffer(frame_mix, 0);

				EnterCriticalSection(&m_csAudioMixSection);
				int readcount = av_audio_fifo_read(m_pAudioMixFifo, (void **)frame_mix->data,
					(m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size > 0 ? m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size : 1024));
				LeaveCriticalSection(&m_csAudioMixSection);

				AVPacket pkt_out_mic = { 0 };

				pkt_out_mic.data = NULL;
				pkt_out_mic.size = 0;

				frame_mix->pts = AudioFrameIndex_mic * m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;


				AVFrame *frame_mic_encode = NULL;
				frame_mic_encode = av_frame_alloc();

				frame_mic_encode->nb_samples = m_pCodecEncodeCtx_Audio->frame_size;
				frame_mic_encode->channel_layout = m_pCodecEncodeCtx_Audio->channel_layout;
				frame_mic_encode->format = m_pCodecEncodeCtx_Audio->sample_fmt;
				frame_mic_encode->sample_rate = m_pCodecEncodeCtx_Audio->sample_rate;
				av_frame_get_buffer(frame_mic_encode, 0);



				int dst_nb_samples = av_rescale_rnd(swr_get_delay(m_pAudioConvertCtx, frame_mix->sample_rate) + frame_mix->nb_samples, frame_mix->sample_rate, frame_mix->sample_rate, AVRounding(1));

				//uint8_t *audio_buf = NULL;
				uint8_t *audio_buf[2] = { 0 };
				audio_buf[0] = (uint8_t *)frame_mic_encode->data[0];
				audio_buf[1] = (uint8_t *)frame_mic_encode->data[1];

				int nb = swr_convert(m_pAudioConvertCtx, audio_buf, dst_nb_samples, (const uint8_t**)frame_mix->data, frame_mix->nb_samples);

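				ret = avcodec_send_frame(m_pCodecEncodeCtx_Audio, frame_mic_encode);

				ret = avcodec_receive_packet(m_pCodecEncodeCtx_Audio, &pkt_out_mic);

				// Release both frames before any early exit; the encoder keeps its own copy of the data it still needs.
				av_frame_free(&frame_mix);
				av_frame_free(&frame_mic_encode);
				if (ret < 0)
				{
					// AVERROR(EAGAIN) means the encoder needs more input before it can emit a packet.
					continue;
				}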
				{
					pkt_out_mic.stream_index = m_iAudioStreamIndex;
					pkt_out_mic.pts = AudioFrameIndex_mic * m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;
					pkt_out_mic.dts = AudioFrameIndex_mic * m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;
					pkt_out_mic.duration = m_pFormatCtx_Out->streams[m_iAudioStreamIndex]->codecpar->frame_size;

					cur_pts_a = pkt_out_mic.pts;
					av_write_frame(m_pFormatCtx_Out, &pkt_out_mic);
					//int ret2 = av_interleaved_write_frame(m_pFormatCtx_Out, &pkt_out_mic);
					av_packet_unref(&pkt_out_mic);
				}
				AudioFrameIndex_mic++;
			}
		}
	}


	Sleep(100);
	av_write_trailer(m_pFormatCtx_Out);

	avio_close(m_pFormatCtx_Out->pb);

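	av_frame_free(&pFrameYUVInMain);
	// The picture buffer was allocated with av_malloc() and is not owned by the frame, so release it explicitly.
	av_free(out_buffer_yuv420);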
}
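
ScreenAudioMix() decides which stream to write next by comparing the last video and audio timestamps, which are expressed in different time bases. The sketch below shows the idea with plain integers. Rational, compare_timestamps and video_is_next are hypothetical names used only for illustration, and the code assumes positive time bases and values small enough not to overflow 64 bits; FFmpeg's av_compare_ts handles the general case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational: a time base of num/den seconds per tick.
struct Rational { int num; int den; };

// Returns -1, 0 or 1 depending on whether ts_a (in tb_a) is earlier than,
// equal to, or later than ts_b (in tb_b). Cross-multiplication avoids floating point.
inline int compare_timestamps(int64_t ts_a, Rational tb_a, int64_t ts_b, Rational tb_b)
{
	assert(tb_a.num > 0 && tb_a.den > 0 && tb_b.num > 0 && tb_b.den > 0);
	int64_t a = ts_a * tb_a.num * tb_b.den;
	int64_t b = ts_b * tb_b.num * tb_a.den;
	return (a > b) - (a < b);
}

// Mirrors the condition in ScreenAudioMix(): write video while it is not ahead of audio.
inline bool video_is_next(int64_t video_pts, Rational video_tb, int64_t audio_pts, Rational audio_tb)
{
	return compare_timestamps(video_pts, video_tb, audio_pts, audio_tb) <= 0;
}
```

For example, with a video time base of 1/10 and an audio time base of 1/44100, video pts 5 and audio pts 22050 both correspond to 0.5 s, so a video frame is written next; once the video pts reaches 6, the audio branch runs until it catches up.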

Source: https://blog.csdn.net/tusong86/article/details/121863390