Please recommend methods and algorithms for speech compression

mirong 2000-09-03 07:22:00
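I want to compress speech audio.
I would appreciate it if the experts here could recommend some algorithms and free resources.
My email: snoopcat@etang.com
Many thanks!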
Analyst 2000-09-09
I can't connect to your homepage. Could you send it to me by email instead?
Redspider 2000-09-05
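You're in luck, heh. I have the G.723.1 documentation and source code on my homepage: http://redspider.126.com

to mirong:
As for how well the audio codecs compress, you can see that from the examples in VC SAMPLES; one of the three compares the data size before and after compression. As for suitability for real-time transmission, G.723.1 is the usual choice, even though its frame-based compression introduces an unavoidable delay of at least one 30 ms frame. In practice DSP GROUP and similar codecs also work, as long as you are not too demanding about the compression ratio.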
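To put numbers on the frame delay and the compression ratio mentioned above, here is a minimal sketch of the frame arithmetic. It assumes the G.723.1 figures of 8 kHz sampling, 30 ms frames, and 24-byte (6.3 kbit/s) or 20-byte (5.3 kbit/s) coded frames. The FrameCodec struct and the helper names are made up for illustration and are not part of any codec API.

```cpp
#include <cstdint>

// Hypothetical description of a frame-based speech codec.
struct FrameCodec {
    std::uint32_t sampleRateHz;   // input sampling rate
    std::uint32_t frameMs;        // duration of one coded frame
    std::uint32_t bytesPerFrame;  // size of one coded frame
};

// PCM samples the encoder must buffer before it can emit one frame.
constexpr std::uint32_t samplesPerFrame(const FrameCodec& c) {
    return c.sampleRateHz * c.frameMs / 1000;
}

// Coded bytes produced per second of audio.
constexpr double codedBytesPerSecond(const FrameCodec& c) {
    return c.bytesPerFrame * 1000.0 / c.frameMs;
}

// Size of 16-bit mono PCM divided by the size of the coded stream.
constexpr double compressionRatio(const FrameCodec& c) {
    return static_cast<double>(c.sampleRateHz * 2) / codedBytesPerSecond(c);
}

// Assumed G.723.1 parameters for its two bit rates.
constexpr FrameCodec kG7231High{8000, 30, 24};  // 6.3 kbit/s mode
constexpr FrameCodec kG7231Low{8000, 30, 20};   // 5.3 kbit/s mode
```

Under these assumptions samplesPerFrame(kG7231High) is 240, so the encoder has to collect 30 ms of audio before it can output anything, and compressionRatio(kG7231High) is 20 relative to 16-bit PCM. The 24-byte frame and the resulting 800 bytes per second are the same values that appear as nBlockAlign and nAvgBytesPerSec in stone_fish's example in this thread.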
Analyst 2000-09-05
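Does anyone have material on the G.723.1 algorithm? Could you please mail me a copy?
I searched online for a long time and found that all of this material has to be paid for.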
Dev 2000-09-04
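A question for stone_fish: on a completely bare system, for example a fresh install without NETMEETING, is it enough to add MSG723.acm and register it following the steps you described? No other helper programs need to be installed?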
stone_fish 2000-09-04
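Dev is right, a G723 driver is indeed required. If you have installed Netmeeting, it has already installed the driver for you. Otherwise you need msg723.acm: put it in the system or system32 directory and modify the registry. On WIN98 you also need to add msacm.msg723=msg723.acm under [drivers32] in system.ini. msg723.acm is available from many places.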
Dev 2000-09-04
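ACM can be used, but only if a driver that supports ACM is present. For example, if you have not installed a Microsoft program that ships the G723 driver (such as NETMEETING), you simply cannot use G723. I am not sure whether I have put this correctly, so let's discuss!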
stone_fish 2000-09-04
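The previous poster is right. Below is an example of G723 compression. The key is the value of unsigned char extra[10] ={ 2, 0, 0xce, 0x9a, 0x32, 0xf7, 0xa2, 0xae, 0xde, 0xac }. Of course, different algorithms have different extra bytes, and some have none.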
// conv.cpp
//
// convert a PCM wave to some other format

#include <windows.h>
#include <mmsystem.h>
#include <mmreg.h>  // Multimedia registration
#include <msacm.h>  // Audio Compression Manager (link with msacm32.lib)
#include <stdio.h>
#include <stdlib.h> // malloc, free, exit
#include <string.h> // memset
#include <math.h>

// Locate a driver that supports a given format and return its ID
typedef struct {
    HACMDRIVERID hadid;
    WORD wFormatTag;
} FIND_DRIVER_INFO;


// callback function for format enumeration
// (dwInstance carries a pointer, so it must be pointer-sized)
BOOL CALLBACK find_format_enum(HACMDRIVERID hadid, LPACMFORMATDETAILS pafd, DWORD_PTR dwInstance, DWORD fdwSupport)
{
    FIND_DRIVER_INFO* pdi = (FIND_DRIVER_INFO*) dwInstance;
    if (pafd->dwFormatTag == (DWORD)pdi->wFormatTag) {
        // found it
        pdi->hadid = hadid;
        printf(" %4.4lXH, %s\n", pafd->dwFormatTag, pafd->szFormat);
        return FALSE; // stop enumerating
    }
    //printf(" FORMAT not MATCH.\n");
    return TRUE; // continue enumerating
}

// callback function for driver enumeration
BOOL CALLBACK find_driver_enum(HACMDRIVERID hadid, DWORD_PTR dwInstance, DWORD fdwSupport)
{
    FIND_DRIVER_INFO* pdi = (FIND_DRIVER_INFO*) dwInstance;

    // open the driver
    HACMDRIVER had = NULL;
    MMRESULT mmr = acmDriverOpen(&had, hadid, 0);
    if (mmr) {
        // some error
        return FALSE; // stop enumerating
    }

    // enumerate the formats it supports
    DWORD dwSize = 0;
    mmr = acmMetrics((HACMOBJ)had, ACM_METRIC_MAX_SIZE_FORMAT, &dwSize);
    if (dwSize < sizeof(WAVEFORMATEX)) dwSize = sizeof(WAVEFORMATEX); // for MS-PCM
    WAVEFORMATEX* pwf = (WAVEFORMATEX*) malloc(dwSize);
    if (pwf == NULL) {
        acmDriverClose(had, 0);
        return FALSE; // out of memory, stop enumerating
    }
    memset(pwf, 0, dwSize);
    pwf->cbSize = (WORD)(LOWORD(dwSize) - sizeof(WAVEFORMATEX));
    pwf->wFormatTag = pdi->wFormatTag;
    ACMFORMATDETAILS fd;
    memset(&fd, 0, sizeof(fd));
    fd.cbStruct = sizeof(fd);
    fd.pwfx = pwf;
    fd.cbwfx = dwSize;
    fd.dwFormatTag = pdi->wFormatTag;
    mmr = acmFormatEnum(had, &fd, find_format_enum, (DWORD_PTR)pdi, 0);
    free(pwf);
    acmDriverClose(had, 0);
    if (pdi->hadid || mmr) {
        // found it or some error
        return FALSE; // stop enumerating
    }

    return TRUE; // continue enumeration
}

// locate the first driver that supports a given format tag
HACMDRIVERID find_driver(WORD wFormatTag)
{
    FIND_DRIVER_INFO fdi;
    fdi.hadid = NULL;
    fdi.wFormatTag = wFormatTag;
    MMRESULT mmr = acmDriverEnum(find_driver_enum, (DWORD_PTR)&fdi, 0);
    if (mmr) return NULL;
    return fdi.hadid;
}

// get a description of the first format supported for a given tag
// (the caller releases the returned block with free())
WAVEFORMATEX* get_driver_format(HACMDRIVERID hadid, WORD wFormatTag)
{
    // open the driver
    HACMDRIVER had = NULL;
    MMRESULT mmr = acmDriverOpen(&had, hadid, 0);
    if (mmr) {
        return NULL;
    }

    // allocate a structure for the info
    DWORD dwSize = 0;
    mmr = acmMetrics((HACMOBJ)had, ACM_METRIC_MAX_SIZE_FORMAT, &dwSize);
    if (dwSize < sizeof(WAVEFORMATEX)) dwSize = sizeof(WAVEFORMATEX); // for MS-PCM
    WAVEFORMATEX* pwf = (WAVEFORMATEX*) malloc(dwSize);
    if (pwf == NULL) {
        acmDriverClose(had, 0);
        return NULL;
    }
    memset(pwf, 0, dwSize);
    pwf->cbSize = (WORD)(LOWORD(dwSize) - sizeof(WAVEFORMATEX));
    pwf->wFormatTag = wFormatTag;

    ACMFORMATDETAILS fd;
    memset(&fd, 0, sizeof(fd));
    fd.cbStruct = sizeof(fd);
    fd.pwfx = pwf;
    fd.cbwfx = dwSize;
    fd.dwFormatTag = wFormatTag;

    // set up a struct to control the enumeration
    FIND_DRIVER_INFO fdi;
    fdi.hadid = NULL;
    fdi.wFormatTag = wFormatTag;

    mmr = acmFormatEnum(had, &fd, find_format_enum, (DWORD_PTR)&fdi, 0);
    acmDriverClose(had, 0);
    if ((fdi.hadid == NULL) || mmr) {
        free(pwf);
        return NULL;
    }

    return pwf;
}

int main(int argc, char* argv[])
{
    // First we create a wave that might have been just recorded.
    // The format is 8 kHz, 16 bit mono PCM, matching the 8 kHz
    // G.723.1 destination format used below.
    // our sample wave will be 1 second long and will be a sine wave
    // of 1kHz which is exactly 1,000 cycles

    WAVEFORMATEX wfSrc;
    memset(&wfSrc, 0, sizeof(wfSrc));
    wfSrc.cbSize = 0;
    wfSrc.wFormatTag = WAVE_FORMAT_PCM; // pcm
    wfSrc.nChannels = 1; // mono
    wfSrc.nSamplesPerSec = 8000; // 8 kHz
    wfSrc.wBitsPerSample = 16; // 16 bit
    wfSrc.nBlockAlign = wfSrc.nChannels * wfSrc.wBitsPerSample / 8;
    wfSrc.nAvgBytesPerSec = wfSrc.nSamplesPerSec * wfSrc.nBlockAlign;
    DWORD dwSrcSamples = wfSrc.nSamplesPerSec;
    DWORD dwSrcBytes = wfSrc.nAvgBytesPerSec;
    unsigned char pSrcData[16000]; // 1 second of 16 bit mono at 8 kHz

    double f = 1000.0;
    double pi = 4.0 * atan(1.0);
    double w = 2.0 * pi * f;
    for (DWORD dw = 0; dw < dwSrcSamples; dw++) {
        double t = (double) dw / (double) wfSrc.nSamplesPerSec;
        // 16 bit PCM samples are signed and stored little-endian
        short s = (short)(32767.0 * sin(w * t));
        pSrcData[2 * dw] = (unsigned char)(s & 0xFF);
        pSrcData[2 * dw + 1] = (unsigned char)((s >> 8) & 0xFF);
    }


    // Select a format to convert to
    // WORD wFormatTag = WAVE_FORMAT_ADPCM;
    // WORD wFormatTag = WAVE_FORMAT_IMA_ADPCM;
    // WORD wFormatTag = WAVE_FORMAT_GSM610;
    // WORD wFormatTag = WAVE_FORMAT_ALAW;
    // WORD wFormatTag = WAVE_FORMAT_MULAW;
    // WORD wFormatTag = 0x32; // MSN
    // WORD wFormatTag = WAVE_FORMAT_DSPGROUP_TRUESPEECH;
    WORD wFormatTag = 0x42; // G.723.1 (WAVE_FORMAT_MSG723)

    // Now we locate a CODEC that supports the destination format tag
    HACMDRIVERID hadid = find_driver(wFormatTag);
    if (hadid == NULL) {
        printf("No driver found\n");
        exit(1);
    }
    printf("Driver found (hadid: %p)\n", (void*)hadid);


    // get the details of the format
    // Note: this is just the first of one or more possible formats for the given tag
    //WAVEFORMATEX* pwfDrv = get_driver_format(hadid, wFormatTag);
    // Instead of asking the driver, build the 28-byte format block by hand:
    // an 18-byte WAVEFORMATEX followed by 10 codec-specific extra bytes.
    unsigned char extra[10] = { 2, 0, 0xce, 0x9a, 0x32, 0xf7, 0xa2, 0xae, 0xde, 0xac };
    WAVEFORMATEX* pwfDrv;
    pwfDrv = (WAVEFORMATEX*)new char[sizeof(WAVEFORMATEX) + sizeof(extra)];
    pwfDrv->wFormatTag = 66; // 0x42, G.723.1
    pwfDrv->nChannels = 1;
    pwfDrv->nSamplesPerSec = 8000;
    pwfDrv->nAvgBytesPerSec = 800; // one 24-byte frame every 30 ms
    pwfDrv->nBlockAlign = 24;
    pwfDrv->wBitsPerSample = 0;
    pwfDrv->cbSize = sizeof(extra);
    for (int i = 0; i < pwfDrv->cbSize; i++)
        *((unsigned char *)pwfDrv + sizeof(WAVEFORMATEX) + i) = extra[i];

    // dump the whole format block for inspection
    for (int j = 0; j < 28; j++)
        printf("%d,", *((unsigned char *)pwfDrv + j));
    printf("\n");

    printf("Driver format: %u bits, %lu samples per second\n", pwfDrv->wBitsPerSample, pwfDrv->nSamplesPerSec);

    ///////////////////////////////////////////////////////////////////////////////////
    // convert the intermediate PCM format to the final format

    // open the driver
    HACMDRIVER had = NULL;
    MMRESULT mmr;
    mmr = acmDriverOpen(&had, hadid, 0);
    if (mmr) {
        printf("Failed to open driver\n");
        exit(1);
    }

    // open the conversion streams
    // Note: ACM_STREAMOPENF_NONREALTIME is not passed here. Without it
    // some software compressors report error 512 (not possible); if the
    // open fails, try that flag instead of 0
    HACMSTREAM hstr1;
    HACMSTREAM hstr2;
    mmr = acmStreamOpen(&hstr1,
        had, // driver handle
        &wfSrc, // source format
        pwfDrv, // destination format
        NULL, // no filter
        NULL, // no callback
        0, // instance data (not used)
        0);//ACM_STREAMOPENF_NONREALTIME); // flags
    if (mmr) {
        printf("Failed to open a stream to do PCM to driver format conversion\n");
        exit(1);
    }


    mmr = acmStreamOpen(&hstr2,
        had, // driver handle
        pwfDrv, // source format
        &wfSrc,// destination format
        NULL, // no filter
        NULL, // no callback
        0, // instance data (not used)
        0);//ACM_STREAMOPENF_NONREALTIME); // flags
    if (mmr) {
        printf("Failed to open a stream to do driver format to PCM conversion\n");
        exit(1);
    }

    // allocate a buffer for the result of the conversion.
    // compute the output size from the average byte rate; for one second
    // of G.723.1 at 800 bytes per second this is 800 bytes.
    // other codecs (the IMA_ADPCM driver, for example) need extra room,
    // so enlarge pDstData if you change the destination format
    DWORD dwDstBytes = pwfDrv->nAvgBytesPerSec * dwSrcSamples / wfSrc.nSamplesPerSec;
    unsigned char pDstData[800];
    if (dwDstBytes > sizeof(pDstData)) dwDstBytes = sizeof(pDstData);

    // fill the dest buffer with zeroes just so we can see if anything got
    // converted in the debugger
    memset(pDstData, 0, sizeof(pDstData));

    // fill in the conversion info
    ACMSTREAMHEADER strhdr1;
    ACMSTREAMHEADER strhdr2;
    memset(&strhdr1, 0, sizeof(strhdr1));
    strhdr1.cbStruct = sizeof(strhdr1);
    strhdr1.pbSrc = pSrcData; // the source data to convert
    strhdr1.cbSrcLength = dwSrcBytes;
    strhdr1.pbDst = pDstData;
    strhdr1.cbDstLength = dwDstBytes;

    // prep the header
    mmr = acmStreamPrepareHeader(hstr1, &strhdr1, 0);
    if (mmr) {
        printf("Failed to prepare the PCM to driver format header\n");
        exit(1);
    }

    // convert the data
    mmr = acmStreamConvert(hstr1, &strhdr1, 0);
    if (mmr) {
        printf("Failed to do PCM to driver format conversion\n");
        exit(1);
    }

    printf("Converted OK\n");

    printf("Source wave had %lu bytes\n", dwSrcBytes);
    printf("Converted wave has %lu bytes\n", strhdr1.cbDstLengthUsed);
    printf("Compression ratio is %f\n", (double) dwSrcBytes / (double) strhdr1.cbDstLengthUsed);


    // now decode the compressed data back to PCM
    memset(&strhdr2, 0, sizeof(strhdr2));
    strhdr2.cbStruct = sizeof(strhdr2);
    strhdr2.pbSrc = pDstData; // the compressed data to decode
    strhdr2.cbSrcLength = strhdr1.cbDstLengthUsed; // only the bytes actually produced
    strhdr2.pbDst = pSrcData; // decode back into the PCM buffer
    strhdr2.cbDstLength = dwSrcBytes;

    // prep the header
    mmr = acmStreamPrepareHeader(hstr2, &strhdr2, 0);
    if (mmr) {
        printf("Failed to prepare the driver format to PCM header\n");
        exit(1);
    }

    // convert the data
    mmr = acmStreamConvert(hstr2, &strhdr2, 0);
    if (mmr) {
        printf("Failed to do driver to PCM format conversion\n");
        exit(1);
    }
    printf("Converted 2 OK\n");

    // unprepare both headers, then close the streams and driver
    mmr = acmStreamUnprepareHeader(hstr1, &strhdr1, 0);
    mmr = acmStreamUnprepareHeader(hstr2, &strhdr2, 0);
    mmr = acmStreamClose(hstr1, 0);
    mmr = acmStreamClose(hstr2, 0);
    mmr = acmDriverClose(had, 0);
    delete[] (char*)pwfDrv;

    // show the stats for the decoding pass
    printf("Compressed data had %lu bytes\n", strhdr2.cbSrcLengthUsed);
    printf("Decoded wave has %lu bytes\n", strhdr2.cbDstLengthUsed);
    printf("Expansion ratio is %f\n", (double) strhdr2.cbDstLengthUsed / (double) strhdr2.cbSrcLengthUsed);

    getchar();

    return 0;
}
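The 28 bytes that main prints are an 18-byte WAVEFORMATEX header followed by the 10 extra bytes. As a rough, portable sketch of that layout (it does not use the Windows headers), the block can be assembled byte by byte. WaveHeader, put16, put32, buildFormatBlob and buildG7231Blob are hypothetical names introduced here for illustration; the field order and widths are assumed to match the packed, little-endian WAVEFORMATEX definition.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the WAVEFORMATEX fields, in declaration order.
struct WaveHeader {
    std::uint16_t formatTag;
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint32_t avgBytesPerSec;
    std::uint16_t blockAlign;
    std::uint16_t bitsPerSample;
    std::uint16_t cbSize;  // number of extra bytes that follow the header
};

// Append a 16-bit value in little-endian byte order.
static void put16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
}

// Append a 32-bit value in little-endian byte order.
static void put32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Serialize the header (18 bytes) followed by h.cbSize extra bytes.
std::vector<std::uint8_t> buildFormatBlob(const WaveHeader& h,
                                          const std::uint8_t* extra) {
    std::vector<std::uint8_t> out;
    put16(out, h.formatTag);
    put16(out, h.channels);
    put32(out, h.samplesPerSec);
    put32(out, h.avgBytesPerSec);
    put16(out, h.blockAlign);
    put16(out, h.bitsPerSample);
    put16(out, h.cbSize);
    out.insert(out.end(), extra, extra + h.cbSize);
    return out;
}

// The G.723.1 values used in the example above.
std::vector<std::uint8_t> buildG7231Blob() {
    static const std::uint8_t extra[10] = {2, 0, 0xce, 0x9a, 0x32,
                                           0xf7, 0xa2, 0xae, 0xde, 0xac};
    const WaveHeader h{0x42, 1, 8000, 800, 24, 0, 10};
    return buildFormatBlob(h, extra);
}
```

Calling buildG7231Blob() gives a 28-byte vector whose extra bytes start at offset 18, the same offset the example uses when it copies extra into pwfDrv. Writing the fields out explicitly makes it easier to compare against the dump printed by the program when trying a codec with a different cbSize.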
Redspider 2000-09-04
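No need to go to that much trouble. In the WIN98 setup options, under Accessories, there is an Audio Compression item. Select it and a whole set of compression codecs will be installed for you, including G.723.1, DSP GROUP, LHCODEC and others.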
stone_fish 2000-09-04
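Re Dev:
Yes, but the registry changes are rather tedious. I have an exported registry file, but unfortunately I can't upload it here. You can install netmeeting first and then export that key for later use!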
Redspider 2000-09-03
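WINDOWS provides a set of audio compression codecs through the ACM (Audio Compression Manager), so you don't need to implement them yourself. You can even play audio directly in a compressed format, without decompressing it yourself first.

For details, look at the three VC samples, ACMAPP, CAPS and CONV. Search for them in the SAMPLES directory.
My EMAIL: redspider@163.net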