[Original] Android: Streaming Audio over TCP in Real Time with AAC Encoding/Decoding

Preface

  A recent project of mine required streaming audio between two phones in real time over TCP and, to save bandwidth, encoding it with AAC, so I did some research on AAC. And since the audio data is AAC-encoded, playing it back also involves AudioTrack.

Prerequisites

  • "Some" Android development experience (at minimum, you have written Android code on your own and can troubleshoot problems independently). (I have a rant about this; see the end of the article.)

  • Basic familiarity with how MediaCodec is used.

What You Will Learn

  • How to record audio on an Android device
  • How to encode PCM to AAC with MediaCodec
  • How to decode AAC back to PCM with MediaCodec
  • How to play PCM on an Android device
  • How to convert PCM to WAV

Audio Basics

  • PCM (Pulse-code modulation): the technique of converting sound from an analog signal into a digital signal.

      PCM data is the raw audio data. Whatever the audio format, what the audio driver ultimately processes is PCM. Because PCM data is unencoded, it is very large, somewhat like a BMP image file.

      A PCM file cannot be played directly by a player, because it stores nothing but the raw samples. It records no metadata about the stored PCM data, such as sample rate, channel count, or bit depth, so a player has no way to interpret the file.
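To make "raw data" concrete: with 16-bit depth, every two bytes of PCM form one little-endian signed sample. A minimal plain-Java sketch (the class name is mine, not from any Android API):

```java
// Decodes little-endian 16-bit PCM bytes into signed samples.
class PcmSamples {
    static short[] toShorts(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // Low byte first (little-endian); the high byte carries the sign.
            samples[i] = (short) ((pcm[2 * i] & 0xFF) | (pcm[2 * i + 1] << 8));
        }
        return samples;
    }

    public static void main(String[] args) {
        short[] s = toShorts(new byte[]{0x00, 0x01, (byte) 0xFF, (byte) 0xFF});
        System.out.println(s[0] + ", " + s[1]); // 256, -1
    }
}
```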
    
  • Sample rate: the number of samples captured per second. Unit: hertz (Hz). The higher the sample rate, the better the quality. Common audio sample rates are 8 kHz, 16 kHz, 44.1 kHz, and 48 kHz. On Android, 44.1 kHz gives the best compatibility.

    Typical uses of common sample rates:

    8 kHz: telephony
    22.05 kHz: radio broadcasting
    44.1 kHz: audio CD, MP3, etc.
    48 kHz: miniDV, digital TV, DVD, film, and professional audio

      The highest frequency the human ear can perceive is about 20 kHz, so meeting the ear's requirements takes 40,000 samples per second, i.e. 40 kHz. The familiar CD rate is 44.1 kHz.

  • Channels: generally the number of sound sources when recording, or the number of corresponding speakers during playback. Mono and stereo are the common configurations.

  • Bit depth: how many binary bits represent each sample. Unit: bits. More bits mean better quality and, of course, more data. Common widths are 8-bit and 16-bit; 16-bit is mainstream, while 8-bit can suffice for low-quality voice transmission.

  • Bitrate (in the audio/video world also called the data rate): the number of bits transmitted or processed per unit of time. Unit: bps (bits per second) or kbps. The higher the bitrate, the lower the compression ratio, the better the quality, and the larger the audio data.
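The raw PCM data rate follows directly from the three parameters above: sample rate x channels x bit depth. For instance, 44.1 kHz mono 16-bit PCM comes to 705.6 kbps, which is exactly why we compress to AAC before sending. A quick sanity check (hypothetical helper, for illustration only):

```java
// Raw PCM bitrate in bits per second: sampleRate * channels * bitDepth.
class PcmBitrate {
    static int bps(int sampleRate, int channels, int bitDepth) {
        return sampleRate * channels * bitDepth;
    }

    public static void main(String[] args) {
        System.out.println(bps(44100, 1, 16)); // 705600 bps = 705.6 kbps
    }
}
```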

  • WAV (WAVE file): a common audio format; an uncompressed, playable raw audio file. The file consists of two parts: a WAV header and the PCM audio data. In other words, prepend a WAV header to a PCM file and a player can play it directly.
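Since a WAV file is just a 44-byte RIFF header in front of the PCM bytes, "adding the WAV header" can be sketched as follows (a minimal sketch of the standard RIFF/WAVE field layout; the helper name is mine):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Builds the standard 44-byte RIFF/WAVE header for raw PCM data.
class WavHeader {
    static byte[] build(int pcmLen, int sampleRate, int channels, int bitDepth) {
        int byteRate = sampleRate * channels * bitDepth / 8;
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes());
        b.putInt(36 + pcmLen);                          // remaining file size
        b.put("WAVE".getBytes());
        b.put("fmt ".getBytes());
        b.putInt(16);                                   // PCM "fmt " chunk size
        b.putShort((short) 1);                          // audio format: 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) (channels * bitDepth / 8));  // block align
        b.putShort((short) bitDepth);
        b.put("data".getBytes());
        b.putInt(pcmLen);                               // PCM data size
        return b.array();
    }
}
```

Write these 44 bytes first, then the raw PCM data, and an ordinary player can open the file.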

  • AAC (Advanced Audio Coding): a common audio format; a lossy compression format designed specifically for sound data, and one of the most widely used. Compared with MP3, AAC delivers better quality in smaller files. Android's hardware encoder supports AAC natively. Captured raw PCM audio is generally not transmitted over the network as-is but compressed to AAC first, which improves transmission efficiency and saves bandwidth.

About MediaCodec

  When I introduced the basics of H.264 earlier, I also used MediaCodec, but that article contained nothing code-related. Since the usage is identical, let me make up for that with the code here.

  MediaCodec is the audio/video codec API that Android introduced in API 16 (Android 4.1). The Android application layer gets its encoding and decoding capabilities uniformly from the MediaCodec API, with configuration parameters determining which codec algorithm is used and whether hardware acceleration is applied. Rumor online says: "because it uses hardware codecs, compatibility is patchy; MediaCodec reportedly has plenty of pitfalls." Mainstream devices today, however, run API levels far above 16 on much better hardware, and on the devices at my disposal I encountered none of the alleged compatibility problems. The "plenty of pitfalls" part is true, though: mismatched configuration regularly leads to playback failures.

  MediaCodec processes data asynchronously, using a producer-consumer model built on ring buffers. On the input side, the client is the producer and MediaCodec the consumer; on the output side, MediaCodec is the producer and the client becomes the consumer.
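The producer-consumer relationship can be pictured with a plain BlockingQueue. This is only an analogy for the ring-buffer model, not the MediaCodec API itself:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Analogy only: the client produces input buffers that the "codec" consumes,
// then the codec produces output buffers that the client consumes.
class CodecQueueAnalogy {
    static int roundTrip(byte[] frame) {
        BlockingQueue<byte[]> inputQueue = new ArrayBlockingQueue<>(4);
        BlockingQueue<byte[]> outputQueue = new ArrayBlockingQueue<>(4);
        inputQueue.offer(frame);            // client: producer on the input side
        byte[] taken = inputQueue.poll();   // codec: consumer on the input side
        outputQueue.offer(taken);           // codec: producer on the output side
        return outputQueue.poll().length;   // client: consumer on the output side
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[]{1, 2, 3})); // 3
    }
}
```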

The workflow goes like this:

  1. The client requests an empty buffer from the input queue [dequeueInputBuffer]
  2. The client copies the data to be encoded/decoded into the empty buffer, then submits it to the input queue [queueInputBuffer]
  3. MediaCodec takes one frame of data from the input queue and encodes/decodes it
  4. When processing finishes, MediaCodec empties the original buffer, returns it to the input queue, and puts the processed data into the output queue
  5. The client requests a processed buffer from the output queue [dequeueOutputBuffer]
  6. The client renders/plays the processed buffer
  7. After rendering/playback, the client returns the buffer to the output queue [releaseOutputBuffer]

MediaCodec Working Flow

The basic MediaCodec call sequence:

createEncoderByType / createDecoderByType
configure
start
while (true) {
    dequeueInputBuffer
    queueInputBuffer
    dequeueOutputBuffer
    releaseOutputBuffer
}
stop
release

Overall Project Flow

Sender

Record (PCM) -> encode with MediaCodec (AAC) -> send the AAC audio stream

Receiver

Receive the AAC audio stream -> decode with MediaCodec (PCM) -> play the PCM

The flow is quite clear; the techniques involved are:

  • Recording
  • Encoding/decoding audio data with MediaCodec (PCM <-> AAC)
  • Playing PCM

Note: all sensitive information has been removed from the source code in this article, which does not affect its use. If you copy the code verbatim, it will complain about a missing logging class and a missing constants class: substitute your own logging utility, and since each constant's name already expresses its meaning, change it to whatever value you need.

Recording + Encoding (PCM -> AAC)

  Recording on Android is simple, there are plenty of examples online, and there are no real pitfalls, so I won't cover it in detail. Here is a rough outline, then the code.

  Recording involves three essentials: sample rate, channel count, and bit depth. For compatibility, the recommended configuration is 44.1 kHz, mono, 16-bit.

Recording Flow

  • Create an AudioRecord object. The minimum recording buffer size can be obtained from AudioRecord.getMinBufferSize; if the configured buffer is too small, construction fails. A good choice is 2 to 4 times the value getMinBufferSize returns.
  • Allocate a buffer at least as large as the buffer the AudioRecord object writes audio data into.
  • Start recording.
  • Open a data stream, and keep reading audio from the AudioRecord into the buffer while feeding the buffer's contents into the stream.
  • Close the stream.
  • Stop recording.

Encoding with MediaCodec

  Encoding needs these audio-related parameters: sample rate, channel count, bitrate, and AAC profile. The commonly used profile is AAC-LC (Low Complexity), which at medium bitrates strikes a balance between encoding efficiency and quality. "Medium bitrate" means roughly 96 to 192 kbps, so if you use this profile, try to keep the bitrate within that range.

Echo Cancellation / Automatic Gain / Noise Suppression

  Recorded audio usually exhibits echo; a few settings can fix this.

Option 1:

  When initializing the AudioRecord, use MediaRecorder.AudioSource.VOICE_COMMUNICATION instead of MediaRecorder.AudioSource.MIC. VOICE_COMMUNICATION enables echo cancellation and automatic gain control, which improves the recording.

Option 2:

  Use the corresponding effect classes, AcousticEchoCanceler, AutomaticGainControl, and NoiseSuppressor, to handle echo cancellation, automatic gain, and noise suppression. Before using each one, check whether the device supports that feature. See the source code below for details.

Source Code

MicRecord.java

Note: this class performs AAC encoding while recording.

import android.media.AudioRecord;
import android.media.MediaRecorder;
import android.media.audiofx.AcousticEchoCanceler;
import android.media.audiofx.AutomaticGainControl;
import android.media.audiofx.NoiseSuppressor;
import android.os.SystemClock;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Created by Michael Leo <y@ho1ho.com><br>
 * Date: 2020/02/11 09:06
 */
public class MicRecord {
private static final String TAG = MicRecord.class.getSimpleName();

private ExecutorService mThreadPool = Executors.newFixedThreadPool(1);
private boolean mIsRecording;
private AacEncoder mAacEncoder;
private AudioRecord mAudioRecord;

private int mSampleRate;
private int mChannelMask;
private int mAudioFormat;
private int mRecordBufferSize;

public MicRecord(int sampleRate, int bitrate, int channelCount, int channelMask, int audioFormat) {
this(sampleRate, bitrate, channelCount, channelMask, audioFormat,
AudioRecord.getMinBufferSize(sampleRate, channelMask, audioFormat) * 2);
}

public MicRecord(int sampleRate, int bitrate, int channelCount, int channelMask, int audioFormat, int recordBufferSize) {
CLog.i(TAG, "recordBufferSize=%d", recordBufferSize);
mSampleRate = sampleRate;
mChannelMask = channelMask;
mAudioFormat = audioFormat;
mRecordBufferSize = recordBufferSize;

mAacEncoder = new AacEncoder(sampleRate, bitrate, channelCount);

createAudioRecord();
initAdvancedFeatures();
}

private void createAudioRecord() {
mAudioRecord = new AudioRecord(
// MediaRecorder.AudioSource.MIC,
MediaRecorder.AudioSource.VOICE_COMMUNICATION,
mSampleRate,
mChannelMask,
mAudioFormat,
mRecordBufferSize); // https://blog.csdn.net/lavender1626/article/details/80394253
}

private void initAdvancedFeatures() {
if (AcousticEchoCanceler.isAvailable()) {
AcousticEchoCanceler aec = AcousticEchoCanceler.create(mAudioRecord.getAudioSessionId());
if (aec != null) {
aec.setEnabled(true);
}
}

if (AutomaticGainControl.isAvailable()) {
AutomaticGainControl agc = AutomaticGainControl.create(mAudioRecord.getAudioSessionId());
if (agc != null) {
agc.setEnabled(true);
}
}

if (NoiseSuppressor.isAvailable()) {
NoiseSuppressor nc = NoiseSuppressor.create(mAudioRecord.getAudioSessionId());
if (nc != null) {
nc.setEnabled(true);
}
}
}

public void stop() {
mIsRecording = false;

if (null != mAudioRecord) {
try {
mAudioRecord.stop();
} catch (Exception e) {
CLog.e(TAG, e);
}
}
}

public void release() {
stop();
if (null != mAudioRecord) {
try {
mAudioRecord.release();
} catch (Exception e) {
CLog.e(TAG, e);
}
mAudioRecord = null;
}

if (mAacEncoder != null) {
try {
mAacEncoder.close();
} catch (Exception e) {
CLog.e(TAG, e);
}
mAacEncoder = null;
}

if (mThreadPool != null) {
mThreadPool.shutdownNow();
}
}

public void doRecord(Callback callback) {
mAudioRecord.startRecording();
mIsRecording = true;

mThreadPool.execute(() -> {
try {
byte[] audioData = new byte[mRecordBufferSize];

BufferedOutputStream os = null;
BufferedOutputStream aacOs = null;
if (Constants.DEBUG_MODE) {
String outputFolder = Objects.requireNonNull(CustomApplication.getInstance().getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
File folder = new File(outputFolder);
if (!folder.exists()) {
boolean mkdirStatus = folder.mkdirs();
CLog.i(TAG, "mkdir [%s] %s", outputFolder, mkdirStatus);
}
File file = new File(outputFolder, "original.pcm");
File aacFile = new File(outputFolder, "original.aac");
String filename = file.getAbsolutePath();
String aacFilename = aacFile.getAbsolutePath();
os = new BufferedOutputStream(new FileOutputStream(filename));
aacOs = new BufferedOutputStream(new FileOutputStream(aacFilename));
}

byte[] aacAudioData;
long st;
int readSize;
while (mIsRecording) {
readSize = mAudioRecord.read(audioData, 0, mRecordBufferSize);
if (AudioRecord.ERROR_INVALID_OPERATION != readSize) {
st = SystemClock.elapsedRealtime();
aacAudioData = mAacEncoder.encodePcmToAac(audioData);
CLog.i(TAG, "Encode audio cost=%d", SystemClock.elapsedRealtime() - st);
if (Constants.DEBUG_MODE) {
os.write(audioData);
aacOs.write(aacAudioData);
}
callback.onCallback(aacAudioData);
}
}
} catch (Exception e) {
CLog.e(TAG, e);
}
});
}

public boolean isRecording() {
return mIsRecording;
}

public interface Callback {
void onCallback(byte[] data);
}
}
AacEncoder.java
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

/**
 * Created by Michael Leo <y@ho1ho.com><br>
 * Date: 2020/02/03 14:48
 */
public class AacEncoder {
private static final String TAG = AacEncoder.class.getSimpleName();

private MediaCodec.BufferInfo mBufferInfo;
private long mPresentationTimeUs = 0;
private ByteArrayOutputStream mOutputAacStream = new ByteArrayOutputStream();
private MediaCodec mAudioEncoder;

public AacEncoder(int sampleRate, int bitrate, int channelCount) {
try {
mAudioEncoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
} catch (IOException e) {
CLog.e(TAG, e, "Init AacEncoder error.");
}

MediaFormat mediaFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC,
sampleRate, channelCount);

// mediaFormat.setString(MediaFormat.KEY_MIME, MediaFormat.MIMETYPE_AUDIO_AAC);
mediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, bitrate);
// mediaFormat.setInteger(MediaFormat.KEY_CHANNEL_MASK, DEFAULT_AUDIO_FORMAT);
mediaFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 32 * 1024);
mAudioEncoder.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
mAudioEncoder.start();

mBufferInfo = new MediaCodec.BufferInfo();
}

/**
* Encode pcm data to aac.<br>
* <br>
*
* @param pcmData PCM data.
*/
public byte[] encodePcmToAac(byte[] pcmData) throws Exception {
// See the dequeueInputBuffer method in document to confirm the timeoutUs parameter.
int inputBufferIndex = mAudioEncoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0) {
ByteBuffer inputBuffer = mAudioEncoder.getInputBuffer(inputBufferIndex);
if (inputBuffer != null) {
inputBuffer.clear();
inputBuffer.put(pcmData);
// inputBuffer.limit(pcmData.length);
}
long pts = computePresentationTimeUs(mPresentationTimeUs);
mAudioEncoder.queueInputBuffer(inputBufferIndex, 0, pcmData.length, pts, 0);
mPresentationTimeUs += 1;
}

int outputBufferIndex = mAudioEncoder.dequeueOutputBuffer(mBufferInfo, 0);
while (outputBufferIndex >= 0) {
ByteBuffer outputBuffer = mAudioEncoder.getOutputBuffer(outputBufferIndex);
if (outputBuffer != null) {
int outAacDataSize = mBufferInfo.size;
// The length of ADTS header is 7.
int outAacDataSizeWithAdts = outAacDataSize + 7;
outputBuffer.position(mBufferInfo.offset);
outputBuffer.limit(mBufferInfo.offset + outAacDataSize);

// Allocate an array for the AAC data plus the 7-byte ADTS header.
byte[] outAacDataWithAdts = new byte[outAacDataSizeWithAdts];
addAdtsToData(outAacDataWithAdts, outAacDataSizeWithAdts);

outputBuffer.get(outAacDataWithAdts, 7, outAacDataSize);
outputBuffer.position(mBufferInfo.offset);
mOutputAacStream.write(outAacDataWithAdts);
}

// CLog.i(TAG, outAacDataWithAdts.length + " bytes written: " + Arrays.toString(outAacDataWithAdts));

mAudioEncoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = mAudioEncoder.dequeueOutputBuffer(mBufferInfo, 0);
}

mOutputAacStream.flush();
byte[] outAacBytes = mOutputAacStream.toByteArray();
mOutputAacStream.reset();

return outAacBytes;
}

/**
* Add the 7-byte ADTS header to AAC audio data.<br>
* <br>
* https://blog.csdn.net/tx3344/article/details/7414543
* https://blog.csdn.net/jay100500/article/details/52955232
* https://wiki.multimedia.cx/index.php/MPEG-4_Audio#Sampling_Frequencies
*
* <br>
* profile
* 1: Main profile
* 2: Low Complexity profile(LC)
* 3: Scalable Sampling Rate profile(SSR)
* <br>
* sampling_frequency_index
* 0: 96000 Hz
* 1: 88200 Hz
* 2: 64000 Hz
* 3: 48000 Hz
* 4: 44100 Hz
* 5: 32000 Hz
* 6: 24000 Hz
* 7: 22050 Hz
* 8: 16000 Hz
* 9: 12000 Hz
* 10: 11025 Hz
* 11: 8000 Hz
* 12: 7350 Hz
* 13: Reserved
* 14: Reserved
* 15: frequency is written explictly
*
* <br>
* channel_configuration
* 0: Defined in AOT Specifc Config
* 1: 1 channel: front-center
* 2: 2 channels: front-left, front-right
* 3: 3 channels: front-center, front-left, front-right
* 4: 4 channels: front-center, front-left, front-right, back-center
* 5: 5 channels: front-center, front-left, front-right, back-left, back-right
* 6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
* 7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
* 8-15: Reserved
*
* @param outAacDataWithAdts The audio data with ADTS header.
* @param outAacDataLenWithAdts The length of audio data with ADTS header.
*/
private void addAdtsToData(byte[] outAacDataWithAdts, int outAacDataLenWithAdts) {
int profile = 2; // AAC LC. If you change this value, DO NOT forget to change KEY_AAC_PROFILE while config MediaCodec
int freqIdx = 4; // 4: 44.1KHz 8: 16Khz 11: 8Khz
int chanCfg = 1; // 1: single_channel_element 2: CPE(channel_pair_element)
outAacDataWithAdts[0] = (byte) 0xFF;
outAacDataWithAdts[1] = (byte) 0xF9;
outAacDataWithAdts[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
outAacDataWithAdts[3] = (byte) (((chanCfg & 3) << 6) + (outAacDataLenWithAdts >> 11));
outAacDataWithAdts[4] = (byte) ((outAacDataLenWithAdts & 0x7FF) >> 3);
outAacDataWithAdts[5] = (byte) (((outAacDataLenWithAdts & 7) << 5) + 0x1F);
outAacDataWithAdts[6] = (byte) 0xFC;
}

/**
* Release sources.
*/
public void close() {
try {
if (mAudioEncoder != null) {
mAudioEncoder.stop();
mAudioEncoder.release();
}
if (mOutputAacStream != null) {
mOutputAacStream.flush();
mOutputAacStream.close();
}
} catch (Exception e) {
CLog.e(TAG, e, "close error.");
}
}

/**
 * Calculate PTS.<br>
 * <br>
 * Each AAC frame carries 1024 PCM samples, so at 44.1 kHz one frame
 * lasts 1024 / 44100 seconds. Actually, it doesn't cause any error if
 * you return 0 directly.
 *
 * @return The calculated presentation time in microseconds.
 */
public static long computePresentationTimeUs(long frameIndex) {
return frameIndex * 1_000_000L * 1024 / 44100;
}
}
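For reference: an AAC frame always carries 1024 PCM samples, so at a given sample rate the presentation time of frame n in microseconds can be computed as below (a standalone plain-Java illustration, not part of the project code):

```java
// PTS of AAC frame n in microseconds; every AAC frame holds 1024 samples.
class AacPts {
    static long ptsUs(long frameIndex, int sampleRate) {
        return frameIndex * 1_000_000L * 1024 / sampleRate;
    }

    public static void main(String[] args) {
        System.out.println(ptsUs(1, 44100)); // one 44.1 kHz frame lasts about 23219 us
    }
}
```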

The other classes used are as follows:

CustomApplication.java
import android.content.Context;
import android.util.Log;

import androidx.multidex.MultiDexApplication; // or android.support.multidex.MultiDexApplication on older projects

/**
* Author: Michael Leo
* Date: 19-7-17 7:32 PM
*/
public class CustomApplication extends MultiDexApplication {

private static final String TAG = CustomApplication.class.getSimpleName();
private static CustomApplication mInstance;

@Override
public void onCreate() {
super.onCreate();
Log.i(TAG, "onCreate()");
mInstance = this;
// Your code goes here.
}

public static CustomApplication getInstance() {
return mInstance;
}
}
Callback.java
public interface Callback {
void onCallback(byte[] data);
}

Decoding + Playback

Decoding

  For how to use MediaCodec, refer back to the "About MediaCodec" section; no need to repeat it here. Note also that, at the code level, encoding and decoding with MediaCodec look very similar.

  Decoding requires these parameters: sample rate, channel count, AAC profile, and csd-0 (explained below). In practice you will usually set a few other parameters as well.

  Note that the csd-0 used for decoding is not a fixed value: it is computed from the sample rate, channel count, and AAC profile (see the source comments for the calculation). Much of the code online simply hands you a value without telling you how to derive it. If csd-0 does not match the sample rate, channel count, and AAC profile, decoding will fail. This is also the most common problem during development.
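To make the calculation concrete: csd-0 packs three fields into two bytes, the AudioObjectType (5 bits, 2 for AAC-LC), the sampling-frequency index (4 bits, 4 for 44.1 kHz), and the channel configuration (4 bits, 1 for mono). A standalone sketch (the helper name is mine):

```java
// Builds the 2-byte AudioSpecificConfig (csd-0) for AAC.
class Csd0 {
    static byte[] build(int profile, int freqIdx, int chanCfg) {
        byte[] csd = new byte[2];
        // 5 bits profile | 4 bits frequency index | 4 bits channel config | 3 zero bits
        csd[0] = (byte) ((profile << 3) | (freqIdx >> 1));
        csd[1] = (byte) (((freqIdx & 1) << 7) | (chanCfg << 3));
        return csd;
    }

    public static void main(String[] args) {
        byte[] csd = build(2, 4, 1); // AAC-LC, 44.1 kHz, mono
        System.out.printf("0x%02X, 0x%02X%n", csd[0], csd[1]); // 0x12, 0x08
    }
}
```

For AAC-LC, 44.1 kHz, mono this yields 0x12, 0x08, matching the value derived in the decoder source comments below.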

  One big trap when decoding: MediaFormat accepts a great many keys, but you must not set them all. Inappropriate settings cause decoding errors, so when decoding fails, a misconfigured parameter is the most likely cause. I have pointed out some common settings in the source comments for reference.

  For example: if ADTS headers were added when encoding the AAC, then MediaFormat.KEY_IS_ADTS must be set to 1 when decoding.

Playback

  Playing PCM requires these audio parameters: sample rate, bit depth, and channel mask (channelConfig).

  Once MediaCodec has decoded the audio data into PCM, it can be played directly with the AudioTrack class. The playback code is simple and pitfall-free, so straight to the code.

Source Code

AacDecoder.java
import android.media.AudioManager;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.os.SystemClock;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Created by Michael Leo <y@ho1ho.com><br>
 * Date: 2020/02/10 11:15
 */
public class AacDecoder {
private static final String TAG = AacDecoder.class.getSimpleName();

private BufferedOutputStream mAacOs;
private AtomicLong presentationTimeUs = new AtomicLong(0);

private AudioTrack mAudioTrack;
private MediaCodec mAudioDecoder;

public AacDecoder(int sampleRate, int channelConfig, int audioFormat, byte[] csd0, int trackBufferSize) {
CLog.i(TAG, "sampleRate=%d channelConfig=%d audioFormat=%d trackBufferSize=%d csd0=%s"
, sampleRate, channelConfig, audioFormat, trackBufferSize, JsonUtil.toHexadecimalString(csd0));

initAudioDecoder(sampleRate, csd0);
initAudioTrack(sampleRate, channelConfig, audioFormat, trackBufferSize);
}

public AacDecoder(int sampleRate, int channelConfig, int audioFormat, byte[] csd0) {
this(sampleRate, channelConfig, audioFormat, csd0,
AudioTrack.getMinBufferSize(sampleRate, channelConfig, audioFormat) * 2);
}

/**
* Write audio data to disk<br>
* <br>
* DEBUG ONLY
*
* @param audioData The audio data to be written.
*/
public void writeDataToDisk(byte[] audioData) {
try {
if (mAacOs != null) {
mAacOs.write(audioData);
} else {
CLog.e(TAG, "mAacOs is null");
}
} catch (Exception e) {
CLog.e(TAG,"You can ignore this message safely. writeDataToDisk error");
}
}

/**
* DEBUG ONLY
*/
public void closeOutputStream() {
if (mAacOs == null) {
return;
}
try {
CLog.w(TAG, "END-OF-AUDIO close stream.");
mAacOs.flush();
mAacOs.close();
} catch (Exception e) {
CLog.e(TAG, e, "closeOutputStream error");
}
}

/**
* DEBUG ONLY
*/
public void initOutputStream() {
CLog.w(TAG, "START-AUDIO init stream.");
String outputFolder = Objects.requireNonNull(CustomApplication.getInstance().getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
File folder = new File(outputFolder);
if (!folder.exists()) {
boolean succ = folder.mkdirs();
if (!succ) {
CLog.e(TAG, "Can not create output file=%s", outputFolder);
}
}
File aacFile = new File(outputFolder, "received-original-" + SystemClock.elapsedRealtime() + ".aac");
final String aacFilename = aacFile.getAbsolutePath();

try {
mAacOs = new BufferedOutputStream(new FileOutputStream(aacFilename), 32 * 1024);
} catch (Exception e) {
CLog.e(TAG, e, "initOutputStream error.");
}
}

private void initAudioTrack(int sampleRate, int channelConfig, int audioFormat, int bufferSize) {
mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
sampleRate,
channelConfig,
audioFormat,
bufferSize,
AudioTrack.MODE_STREAM);

mAudioTrack.play();
}

// https://juejin.im/post/5c36bbad6fb9a049d975676b
// https://stackoverflow.com/questions/56106877/how-to-decode-aac-formatmp4-audio-file-to-pcm-format-in-android
// https://www.jianshu.com/p/b30d6a4f745b
// https://blog.csdn.net/lavender1626/article/details/80431902
// http://sohailaziz05.blogspot.com/2014/06/mediacodec-decoding-aac-android.html
private void initAudioDecoder(int sampleRate, byte[] csd0) {
try {
// String folder = Objects.requireNonNull(CustomApplication.instance.getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
// File mFilePath = new File(folder, "original.aac");

// mMediaExtractor = new MediaExtractor();
// mMediaExtractor.setDataSource(mFilePath.getAbsolutePath());

// MediaFormat mediaFormat = mMediaExtractor.getTrackFormat(0);
// String mime = mediaFormat.getString(MediaFormat.KEY_MIME);
// if (mime.startsWith("audio")) {
// mMediaExtractor.selectTrack(0);

// mediaFormat.setString(MediaFormat.KEY_MIME, MediaFormat.MIMETYPE_AUDIO_AAC);
// mediaFormat.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
// mediaFormat.setInteger(MediaFormat.KEY_SAMPLE_RATE, AacEncoder.DEFAULT_SAMPLE_RATES);
// mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, AacEncoder.DEFAULT_BIT_RATES);
// mediaFormat.setInteger(MediaFormat.KEY_CHANNEL_MASK, CHANNEL_IN);
// mediaFormat.setInteger(MediaFormat.KEY_IS_ADTS, 1);
// mediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);

mAudioDecoder = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
MediaFormat mediaFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC,
sampleRate,
1);
mediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
mediaFormat.setInteger(MediaFormat.KEY_IS_ADTS, 1);

// csd-0 layout:
// AAC Profile 5 bits | Sample-rate index 4 bits | Channel count 4 bits | 3 bits padding (normally 0)
// Example: AAC-LC, 44.1 kHz, mono. Values: 2, 4, 1.
// In binary: 0b10, 0b100, 0b1
// Packed into the required bit fields:
// 00010 0100 0001 000
// Regrouped into bytes: 00010010 00001000
// So the csd-0 value is 0x12, 0x08
// https://developer.android.com/reference/android/media/MediaCodec
// AAC CSD: Decoder-specific information from ESDS
ByteBuffer csd_0 = ByteBuffer.wrap(csd0);
// Set the decoder-specific information (csd-0).
mediaFormat.setByteBuffer("csd-0", csd_0);

mAudioDecoder.configure(mediaFormat, null, null, 0);
} catch (IOException e) {
CLog.e(TAG, e, "initAudioDecoder error.");
}

if (mAudioDecoder == null) {
CLog.e(TAG, "mAudioDecoder is null");
return;
}
// Start MediaCodec. Waiting to receive data.
mAudioDecoder.start();
}

public void decodeAndPlay(byte[] audioData) {
long st = SystemClock.elapsedRealtime();
try {
MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();

ByteBuffer inputBuffer;
ByteBuffer outputBuffer;

// See the dequeueInputBuffer method in document to confirm the timeoutUs parameter.
int inputIndex = mAudioDecoder.dequeueInputBuffer(-1);
CLog.v(TAG, "inputIndex=%d", inputIndex);
if (inputIndex < 0) {
return;
}
inputBuffer = mAudioDecoder.getInputBuffer(inputIndex);
if (inputBuffer != null) {
// Clear existing data.
inputBuffer.clear();
// Feed the AAC audio data into the decoder.
inputBuffer.put(audioData);
// inputBuffer.limit(audioData.length);
// CLog.i(TAG, "Rcv aac[%d]", audioData.length);
}
// int sampleSize = mMediaExtractor.readSampleData(inputBuffer, 0);
// if (sampleSize > 0) {
mAudioDecoder.queueInputBuffer(inputIndex, 0, audioData.length, AacEncoder.computePresentationTimeUs(presentationTimeUs.getAndIncrement()) /*mMediaExtractor.getSampleTime()*/, 0);
// mMediaExtractor.advance(); // MediaExtractor. Move to next sample.
// } else {
// mAudioDecoder.queueInputBuffer(inputIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
// isFinish = true;
// }

// Start decoding and get output index
int outputIndex = mAudioDecoder.dequeueOutputBuffer(bufferInfo, 0);
CLog.d(TAG, "outputIndex=%d", outputIndex);

byte[] chunkPCM;
// Get decoded data in bytes
while (outputIndex >= 0) {
outputBuffer = mAudioDecoder.getOutputBuffer(outputIndex);
chunkPCM = new byte[bufferInfo.size];
if (outputBuffer != null) {
outputBuffer.get(chunkPCM);
} else {
CLog.e(TAG, "outputBuffer is null");
}
// Must clear decoded data before next loop. Otherwise, you will get the same data while looping.
if (chunkPCM.length > 0) {
if (Constants.DEBUG_MODE) {
CLog.d(TAG, "PCM data[%d]", chunkPCM.length);
}
// Play decoded audio data in PCM
mAudioTrack.write(chunkPCM, 0, chunkPCM.length);
}
mAudioDecoder.releaseOutputBuffer(outputIndex, false);
// Get data again.
outputIndex = mAudioDecoder.dequeueOutputBuffer(bufferInfo, 0);
}
} catch (Exception e) {
CLog.e(TAG, "You can ignore this message safely. decodeAndPlay error");
} finally {
long ed = SystemClock.elapsedRealtime();
CLog.d(TAG, "Decode=%dms", ed - st);
}
}

public void close() {
try {
if (mAudioTrack != null) {
mAudioTrack.stop();
}

if (mAudioDecoder != null) {
mAudioDecoder.stop();
}
} catch (Exception e) {
CLog.e(TAG, e, "close error.");
}
}

public void release() {
try {
close();
if (mAudioTrack != null) {
mAudioTrack.stop();
mAudioTrack.release();
mAudioTrack = null;
}

if (mAudioDecoder != null) {
mAudioDecoder.stop();
mAudioDecoder.release();
mAudioDecoder = null;
}
} catch (Exception e) {
CLog.e(TAG, e, "release error.");
}
}

public AudioTrack getAudioTrack() {
return mAudioTrack;
}

public void setAudioTrack(AudioTrack mAudioTrack) {
this.mAudioTrack = mAudioTrack;
}

public MediaCodec getAudioDecoder() {
return mAudioDecoder;
}

public void setAudioDecoder(MediaCodec mAudioDecoder) {
this.mAudioDecoder = mAudioDecoder;
}
}

Socket Source Code

  I use Netty as the network transport layer; its many advantages you can look up yourself. Since this is demo code, it is short and simple, just basic Netty usage.

TcpClient.java

import android.util.Log;

import com.ho1ho.audioexample.utils.others.ByteUtil;

import java.util.Arrays;

import io.netty.bootstrap.Bootstrap;
import io.netty.buffer.Unpooled;
import io.netty.channel.Channel;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

/**
 * Created by Michael Leo <y@ho1ho.com><br>
 * Date: 2020/02/05 13:35
 */
public class TcpClient {
private static final String TAG = TcpClient.class.getSimpleName();
// public static String HOST = "127.0.0.1";
public static int PORT = 9999;

public Bootstrap bootstrap;// = getBootstrap();
public Channel channel;// = getChannel(HOST, PORT);

public void initClient(String host) {
bootstrap = getBootstrap();
channel = getChannel(host, PORT);
}

public Bootstrap getBootstrap() {
EventLoopGroup group = new NioEventLoopGroup();
Bootstrap b = new Bootstrap();
b.group(group)
.channel(NioSocketChannel.class);
b.handler(new ChannelInitializer<Channel>() {
@Override
protected void initChannel(Channel ch) throws Exception {
ChannelPipeline pipeline = ch.pipeline();
pipeline.addLast(new LoggingHandler(LogLevel.DEBUG));
pipeline.addLast("handler", new TcpClientHandler());
}
});
b.option(ChannelOption.SO_KEEPALIVE, true);
return b;
}

public Channel getChannel(String host, int port) {
Channel channel = null;
try {
channel = bootstrap.connect(host, port).sync().channel();
} catch (Exception e) {
e.printStackTrace();
Log.e(TAG, String.format("Connect to Server(IP=%s, PORT=%d) failed.", host, port));
return null;
}
return channel;
}

public void sendMsg(String msg) throws Exception {
if (channel != null) {
channel.writeAndFlush(msg).sync();
} else {
Log.w(TAG, "Send data failed. Channel is uninitialized.");
}
}

public void sendData(byte[] bytes) throws Exception {
// Log.i(TAG, "Sending data 1: " + Arrays.toString(bytes));
if (channel != null) {
byte[] all = ByteUtil.mergeBytes(ByteUtil.int2Bytes(bytes.length), bytes);
Log.i(TAG, "Sending data [" + all.length + "|" + bytes.length + "]: " + Arrays.toString(all));
channel.writeAndFlush(Unpooled.wrappedBuffer(all)).sync();
} else {
Log.w(TAG, "Channel is uninitialized.");
}
}
}
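sendData above prepends a 4-byte length to every AAC packet via ByteUtil (whose source is not shown here). Assuming big-endian byte order, the framing can be sketched in plain Java:

```java
import java.nio.ByteBuffer;

// Length-prefix framing: a 4-byte big-endian length followed by the payload.
class Framing {
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    static byte[] unframe(byte[] framed) {
        ByteBuffer b = ByteBuffer.wrap(framed);
        byte[] payload = new byte[b.getInt()];
        b.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] f = frame(new byte[]{7, 8});
        System.out.println(f.length + ", " + unframe(f).length); // 6, 2
    }
}
```

Because TCP is a byte stream, packets may arrive split or merged; on the receiving side, Netty's LengthFieldBasedFrameDecoder can reassemble frames using the same 4-byte length field.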

TcpClientHandler.java

import android.util.Log;

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;

/**
 * Created by Michael Leo <y@ho1ho.com><br>
 * Date: 2020/02/05 13:36
 */
public class TcpClientHandler extends SimpleChannelInboundHandler<Object> {

private static final String TAG = TcpClientHandler.class.getSimpleName();

@Override
protected void channelRead0(ChannelHandlerContext ctx, Object msg) throws Exception {
Log.w(TAG, "Client received msg from server: " + msg);
}
}

TcpServer.java

/**
* Created by Michael Leo <y@ho1ho.com><br>
* Date: 2020/02/05 13:30
*/

import android.util.Log;

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public class TcpServer {
private static final String TAG = TcpServer.class.getSimpleName();

// private static final String IP = "127.0.0.1";
private static final int PORT = 9999;
protected static final int BOSS_GROUP_SIZE = Runtime.getRuntime().availableProcessors() * 2;
protected static final int WORKER_GROUP_SIZE = 4;

private static TcpServerHandler handler = new TcpServerHandler();
private static final EventLoopGroup bossGroup = new NioEventLoopGroup(BOSS_GROUP_SIZE);
private static final EventLoopGroup workerGroup = new NioEventLoopGroup(WORKER_GROUP_SIZE);

public static void run() throws Exception {
ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup);
b.channel(NioServerSocketChannel.class);
b.childOption(ChannelOption.SO_KEEPALIVE, true);
b.childHandler(new ChannelInitializer<SocketChannel>() {
@Override
public void initChannel(SocketChannel ch) throws Exception {
ChannelPipeline pipeline = ch.pipeline();
pipeline.addLast(new LoggingHandler(LogLevel.DEBUG));
pipeline.addLast("messageDecoder", new CustomDecoder());
pipeline.addLast(handler);
}
});

b.bind(PORT).sync();
Log.i(TAG, "Server started.");
}

protected static void shutdown() {
workerGroup.shutdownGracefully();
bossGroup.shutdownGracefully();
}

public static TcpServerHandler getHandler() {
return handler;
}
}

TcpServerHandler.java

import android.util.Log;

import com.ho1ho.audioexample.MainActivity;
import com.ho1ho.audioexample.utils.AacDecoder;

import java.nio.charset.StandardCharsets;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufUtil;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;

/**
* Created by Michael Leo <y@ho1ho.com><br>
* Date: 2020/02/05 13:34
*/
public class TcpServerHandler extends SimpleChannelInboundHandler<Object> {

private static final String TAG = TcpServerHandler.class.getSimpleName();

private AacDecoder mDecoder;

public TcpServerHandler() {
mDecoder = new AacDecoder(MainActivity.DEFAULT_SAMPLE_RATES,
MainActivity.CHANNEL_OUT,
MainActivity.DEFAULT_AUDIO_FORMAT,
MainActivity.AUDIO_CSD_0);
}

@Override
protected void channelRead0(ChannelHandlerContext ctx, Object msg) throws Exception {
// mIsPlaying = true;
ByteBuf bb = (ByteBuf) msg;
// int dataLen = bb.readInt();
// byte[] audioData = new byte[dataLen];
// bb.readBytes(audioData);

byte[] audioData = ByteBufUtil.getBytes(bb);

// Log.w(TAG, "Received audio data[" + audioData.length + "]: " + Arrays.toString(audioData));

if ("START-AUDIO".equals(new String(audioData, StandardCharsets.UTF_8))) {
Log.e(TAG, "START-AUDIO");
mDecoder.initOutputStream();
return;
}

try {
mDecoder.writeDataToDisk(audioData);
mDecoder.decodeAndPlay(audioData);
} catch (Exception e) {
e.printStackTrace();
}

// Play pcm data directly.
// mAudioTrack.write(data, 0, data.length);

// ctx.channel().writeAndFlush("server accepted msg: " + msg);
}

@Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
Log.w(TAG, "exceptionCaught!", cause);
ctx.close();
}

}

CustomDecoder.java

import java.util.List;

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

public class CustomDecoder extends ByteToMessageDecoder {

@Override
protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
int bufLen = in.readableBytes();
if (bufLen < 4) {
return;
}
in.markReaderIndex();
int dataLen = in.readInt();
if (in.readableBytes() < dataLen) {
in.resetReaderIndex();
return;
}
out.add(in.readBytes(dataLen));
}
}
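`CustomDecoder` above is the reading half of a simple 4-byte length-prefix framing protocol; `TcpClient.sendData` is the writing half (it prepends `ByteUtil.int2Bytes(bytes.length)`). The following minimal sketch is not part of the project's source: it reimplements the same framing with plain `java.nio` (the class name `LengthPrefixFraming` is my own) so the logic can be tested outside Netty.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the 4-byte big-endian length-prefix framing used by
// TcpClient.sendData (writer side) and CustomDecoder (reader side).
public class LengthPrefixFraming {

    // Writer side: prepend the payload length as a big-endian int,
    // mirroring ByteUtil.int2Bytes + ByteUtil.mergeBytes.
    public static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Reader side: extract complete frames from a byte stream, leaving any
    // incomplete trailing frame in the buffer -- the same idea CustomDecoder
    // implements with markReaderIndex()/resetReaderIndex().
    public static List<byte[]> unframe(ByteBuffer in) {
        List<byte[]> frames = new ArrayList<>();
        while (in.remaining() >= 4) {
            in.mark();
            int len = in.getInt();
            if (in.remaining() < len) {
                in.reset(); // not enough data yet; wait for the next read
                break;
            }
            byte[] payload = new byte[len];
            in.get(payload);
            frames.add(payload);
        }
        return frames;
    }
}
```

In production Netty code, the same behavior is also available out of the box via `LengthFieldBasedFrameDecoder`.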

Other related source code

Note: since this is demo code, I did not implement runtime permission requests. When testing, simply grant the required permissions to the app manually.

MainActivity.java

import android.content.Intent;
import android.media.AudioFormat;
import android.os.Bundle;
import android.os.Handler;
import android.os.HandlerThread;
import android.util.Log;
import android.view.View;
import android.widget.EditText;

import androidx.appcompat.app.AppCompatActivity;

import com.ho1ho.audioexample.others.Callback;
import com.ho1ho.audioexample.tcp.TcpClient;
import com.ho1ho.audioexample.tcp.TcpServer;
import com.ho1ho.audioexample.utils.AudioPlay;
import com.ho1ho.audioexample.utils.MicRecord;
import com.ho1ho.audioexample.utils.others.PcmToWavUtil;

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

public class MainActivity extends AppCompatActivity {

private static final String TAG = MainActivity.class.getSimpleName();

// Audio sample rate. 44100 is the standard value supported by all devices.
// However, some devices also support other sample rates such as 11025, 16000 and 22050.
private static final int[] SUPPORT_SAMPLE_RATES = {8000, 11025, 16000, 22050, 44100, 48000};
// Bitrate in bits per second (e.g. 16000 = 16 kbps).
private static final int[] SUPPORT_BITRATES = {16000, 32000, 64000, 96000, 128000, 192000, 256000};

public static final int DEFAULT_SAMPLE_RATES = SUPPORT_SAMPLE_RATES[2];
public static final int DEFAULT_BIT_RATES = SUPPORT_BITRATES[0];
public static final int DEFAULT_CHANNEL_COUNT = 1;
public static final int DEFAULT_AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

public static final int CHANNEL_IN = AudioFormat.CHANNEL_IN_MONO;
public static final int CHANNEL_OUT = AudioFormat.CHANNEL_OUT_MONO;
public static final byte[] AUDIO_CSD_0 = new byte[]{(byte) 0x14, (byte) 0x08}; // 12,08| 14,08

private MicRecord mMicRecord;
private AudioPlay mAudioPlay;

@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
Log.e(TAG, "CacheDir =" + getCacheDir().getAbsoluteFile());
Log.e(TAG, "ExternalCacheDir=" + getExternalCacheDir().getAbsoluteFile());
Log.e(TAG, "FilesDir =" + getFilesDir().getAbsoluteFile());
Log.e(TAG, "ExternalFilesDir=" + getExternalFilesDir(null).getAbsoluteFile());
}

public void convertPcm2Wav() {
String outputFolder = Objects.requireNonNull(getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
File file = new File(outputFolder, "original.pcm");
File wavFile = new File(outputFolder, "original.wav");

try {
PcmToWavUtil.pcmToWav(file, wavFile, 1, DEFAULT_SAMPLE_RATES, 16);
} catch (IOException e) {
e.printStackTrace();
}
}

public void onRecordClick(View view) {
mMicRecord = new MicRecord(DEFAULT_SAMPLE_RATES,
DEFAULT_BIT_RATES,
DEFAULT_CHANNEL_COUNT,
CHANNEL_IN,
DEFAULT_AUDIO_FORMAT);

sendAudioDataToServer("START-AUDIO".getBytes(StandardCharsets.UTF_8));
// sendAudioDataToServer(AUDIO_CSD_0);
mMicRecord.doRecord(new Callback() {
@Override
public void onCallback(byte[] aacAudioData) {
Log.w(TAG, " Sending audio data[" + aacAudioData.length + "]: " + Arrays.toString(aacAudioData));
sendAudioDataToServer(aacAudioData);
}
});

}

private void releaseAllResources() {
if (mMicRecord != null) {
mMicRecord.stop();
}
if (mAudioPlay != null) {
mAudioPlay.stop();
}
}

public void onStopClick(View view) {
Log.i(TAG, "onStopClick");
try {
sendAudioDataToServer("END-OF-AUDIO".getBytes(StandardCharsets.UTF_8));
} catch (Exception e) {
e.printStackTrace();
}
releaseAllResources();
convertPcm2Wav();
}

// ====================================================
// ====================================================
// ====================================================

public void onPlayAacClick(View view) {
mAudioPlay = new AudioPlay(DEFAULT_SAMPLE_RATES,
CHANNEL_OUT,
DEFAULT_AUDIO_FORMAT, AUDIO_CSD_0);
mAudioPlay.stop();
mAudioPlay.playAac();
}

public void onPlayPcmClick(View view) {
mAudioPlay = new AudioPlay(DEFAULT_SAMPLE_RATES,
CHANNEL_OUT,
DEFAULT_AUDIO_FORMAT, AUDIO_CSD_0);
mAudioPlay.stop();
mAudioPlay.playPcm();
}

// ====================================================
// ====================================================
// ====================================================

public void initTcpServer() {
new Handler().post(() -> {
try {
TcpServer.run();
} catch (Exception e) {
e.printStackTrace();
}
});
}

public void onSendDataClick(View view) {
sendAudioDataToServer("Hello World".getBytes(StandardCharsets.UTF_8));
}

private HandlerThread ht;
private TcpClient client;

public void sendAudioDataToServer(byte[] data) {
if (client == null) {
Log.e(TAG, "Client is null");
return;
}
// Log.i(TAG, "Sending data 0: " + Arrays.toString(data));
new Handler(ht.getLooper()).post(() -> {
try {
client.sendData(data);
} catch (Exception e) {
e.printStackTrace();
}
});
}

public void onStartServerClick(View view) {
initTcpServer();
// TcpServer.getHandler().initOutputStream();
}

public void onConnectServerClick(View view) {
client = new TcpClient();
String serverIp = ((EditText) findViewById(R.id.etSvrIp)).getText().toString();
client.initClient(serverIp);

ht = new HandlerThread("send-data");
ht.start();
Log.w(TAG, "Connected to server.");
}

// ====================================================
// ====================================================
// ====================================================

public void onCaptureClick(View view) {
startActivity(new Intent(this, CaptureImage.class));
}

}

PcmToWavUtil.java

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
* Created by Michael Leo <y@ho1ho.com><br>
* Date: 2020/02/03 17:05
*/
public class PcmToWavUtil {
/**
* Size of buffer used for transfer, by default
*/
private static final int TRANSFER_BUFFER_SIZE = 10 * 1024;

/**
* @param pcmData Raw PCM data
* @param numChannels Channel count. mono = 1, stereo = 2
* @param sampleRate Sample rate
* @param bitPerSample Bits per sample. Example: 8bits, 16bits
* @return Wave data
*/
public static byte[] pcmToWav(byte[] pcmData, int numChannels, int sampleRate, int bitPerSample) {
byte[] wavData = new byte[pcmData.length + 44];
byte[] header = wavHeader(pcmData.length, numChannels, sampleRate, bitPerSample);
System.arraycopy(header, 0, wavData, 0, header.length);
System.arraycopy(pcmData, 0, wavData, header.length, pcmData.length);
return wavData;
}

// ====================================================

/**
* @param pcmLen The length of PCM
* @param numChannels Channel count. mono = 1, stereo = 2
* @param sampleRate Sample rate
* @param bitPerSample Bits per sample. Example: 8bits, 16bits
* @return Wave header
*/
public static byte[] wavHeader(int pcmLen, int numChannels, int sampleRate, int bitPerSample) {
byte[] header = new byte[44];
// ChunkID, RIFF, 4 bytes respectively.
header[0] = 'R';
header[1] = 'I';
header[2] = 'F';
header[3] = 'F';
// ChunkSize, pcmLen + 36, 4 bytes respectively.
long chunkSize = pcmLen + 36;
header[4] = (byte) (chunkSize & 0xff);
header[5] = (byte) ((chunkSize >> 8) & 0xff);
header[6] = (byte) ((chunkSize >> 16) & 0xff);
header[7] = (byte) ((chunkSize >> 24) & 0xff);
// Format, WAVE, 4 bytes respectively
header[8] = 'W';
header[9] = 'A';
header[10] = 'V';
header[11] = 'E';
// Subchunk1 ID, 'fmt ', 4 bytes
header[12] = 'f';
header[13] = 'm';
header[14] = 't';
header[15] = ' ';
// Subchunk1 Size, 16, 4 bytes
header[16] = 16;
header[17] = 0;
header[18] = 0;
header[19] = 0;
// AudioFormat, pcm = 1, 2bytes
header[20] = 1;
header[21] = 0;
// NumChannels, mono = 1, stereo = 2, 2 bytes
header[22] = (byte) numChannels;
header[23] = 0;
// SampleRate, 4 bytes
header[24] = (byte) (sampleRate & 0xff);
header[25] = (byte) ((sampleRate >> 8) & 0xff);
header[26] = (byte) ((sampleRate >> 16) & 0xff);
header[27] = (byte) ((sampleRate >> 24) & 0xff);
// ByteRate = SampleRate * NumChannels * BitsPerSample / 8, 4 bytes
long byteRate = sampleRate * numChannels * bitPerSample / 8;
header[28] = (byte) (byteRate & 0xff);
header[29] = (byte) ((byteRate >> 8) & 0xff);
header[30] = (byte) ((byteRate >> 16) & 0xff);
header[31] = (byte) ((byteRate >> 24) & 0xff);
// BlockAlign = NumChannels * BitsPerSample / 8, 2 bytes
header[32] = (byte) (numChannels * bitPerSample / 8);
header[33] = 0;
// BitsPerSample, 2 bytes
header[34] = (byte) bitPerSample;
header[35] = 0;
// Subchunk2ID, 'data', 4 bytes
header[36] = 'd';
header[37] = 'a';
header[38] = 't';
header[39] = 'a';
// Subchunk2Size, 4 bytes
header[40] = (byte) (pcmLen & 0xff);
header[41] = (byte) ((pcmLen >> 8) & 0xff);
header[42] = (byte) ((pcmLen >> 16) & 0xff);
header[43] = (byte) ((pcmLen >> 24) & 0xff);

return header;
}

/**
* @param input raw PCM data
* (WAV file size limit: < 2^32 - 36 bytes, about 4 GB, since RIFF sizes are 32-bit fields)
* @param output file to encode to in wav format
* @param channelCount number of channels: 1 for mono, 2 for stereo, etc.
* @param sampleRate sample rate of PCM audio
* @param bitsPerSample bits per sample, i.e. 16 for PCM16
* @throws IOException in event of an error between input/output files
* @see <a href="http://soundfile.sapp.org/doc/WaveFormat/">soundfile.sapp.org/doc/WaveFormat</a>
*/
static public void pcmToWav(File input, File output, int channelCount, int sampleRate, int bitsPerSample) throws IOException {
final int inputSize = (int) input.length();

try (OutputStream encoded = new FileOutputStream(output)) {
// WAVE RIFF header
writeToOutput(encoded, "RIFF"); // chunk id
writeToOutput(encoded, 36 + inputSize); // chunk size
writeToOutput(encoded, "WAVE"); // format

// SUB CHUNK 1 (FORMAT)
writeToOutput(encoded, "fmt "); // subchunk 1 id
writeToOutput(encoded, 16); // subchunk 1 size
writeToOutput(encoded, (short) 1); // audio format (1 = PCM)
writeToOutput(encoded, (short) channelCount); // number of channelCount
writeToOutput(encoded, sampleRate); // sample rate
writeToOutput(encoded, sampleRate * channelCount * bitsPerSample / 8); // byte rate
writeToOutput(encoded, (short) (channelCount * bitsPerSample / 8)); // block align
writeToOutput(encoded, (short) bitsPerSample); // bits per sample

// SUB CHUNK 2 (AUDIO DATA)
writeToOutput(encoded, "data"); // subchunk 2 id
writeToOutput(encoded, inputSize); // subchunk 2 size
copy(new FileInputStream(input), encoded);
}
}

/**
* Writes string in big endian form to an output stream
*
* @param output stream
* @param data string
* @throws IOException
*/
public static void writeToOutput(OutputStream output, String data) throws IOException {
for (int i = 0; i < data.length(); i++)
output.write(data.charAt(i));
}

public static void writeToOutput(OutputStream output, int data) throws IOException {
output.write(data & 0xff);
output.write(data >> 8 & 0xff);
output.write(data >> 16 & 0xff);
output.write(data >> 24 & 0xff);
}

public static void writeToOutput(OutputStream output, short data) throws IOException {
output.write(data & 0xff);
output.write(data >> 8 & 0xff);
}

public static long copy(InputStream source, OutputStream output)
throws IOException {
return copy(source, output, TRANSFER_BUFFER_SIZE);
}

public static long copy(InputStream source, OutputStream output, int bufferSize) throws IOException {
long read = 0L;
byte[] buffer = new byte[bufferSize];
for (int n; (n = source.read(buffer)) != -1; read += n) {
output.write(buffer, 0, n);
}
return read;
}
}
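The 44-byte WAV header written above is mostly fixed bytes plus three derived fields. As a sanity check, here is a minimal sketch (the class and method names are my own, not part of the project) re-deriving ByteRate, BlockAlign and ChunkSize for the 16 kHz / mono / 16-bit configuration this article uses:

```java
// Sketch: derive the computed WAV header fields, matching the arithmetic in
// PcmToWavUtil.wavHeader().
public class WavHeaderMath {
    public static int byteRate(int sampleRate, int channels, int bitsPerSample) {
        // ByteRate = SampleRate * NumChannels * BitsPerSample / 8
        return sampleRate * channels * bitsPerSample / 8;
    }

    public static int blockAlign(int channels, int bitsPerSample) {
        // BlockAlign = NumChannels * BitsPerSample / 8 (bytes per sample frame)
        return channels * bitsPerSample / 8;
    }

    public static int chunkSize(int pcmLen) {
        // RIFF ChunkSize = 36 + Subchunk2Size (the PCM byte count)
        return 36 + pcmLen;
    }
}
```

For 16 kHz mono 16-bit PCM this gives a byte rate of 32000 B/s, i.e. one second of recording costs about 31 KB before AAC compression.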

AudioPlay.java

package com.ho1ho.audioexample.utils;

import android.media.AudioManager;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.util.Log;

import com.ho1ho.audioexample.CustomApplication;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
* Created by Michael Leo <y@ho1ho.com><br>
* Date: 2020/02/11 10:03
*/
public class AudioPlay {
private static final String TAG = AudioPlay.class.getSimpleName();

private ExecutorService mThreadPool = Executors.newFixedThreadPool(1);

private static int mTrackBufferSize;
private AudioTrack mAudioTrack;
private MediaCodec mAudioDecoder;
// private AacDecoder mAacDecoder;
private MediaExtractor mMediaExtractor;

private boolean mIsPlaying;

private int mSampleRate;
private int mChannelMask;
private int mAudioFormat;
private byte[] mCsd0;

public AudioPlay(int sampleRate, int channelMask, int audioFormat, byte[] csd0) {
this(sampleRate, channelMask, audioFormat,
AudioTrack.getMinBufferSize(sampleRate, channelMask, audioFormat) * 2, csd0);
}

public AudioPlay(int sampleRate, int channelMask, int audioFormat, int trackBufferSize, byte[] csd0) {
Log.e(TAG, "trackBufferSize=" + trackBufferSize);

mSampleRate = sampleRate;
mChannelMask = channelMask;
mAudioFormat = audioFormat;
mTrackBufferSize = trackBufferSize;
mCsd0 = csd0;

// mAacDecoder = new AacDecoder(sampleRate,
// channelMask,
// audioFormat,
// AUDIO_CSD_0,
// bufferSize);
}

private void initAudioTrack() {
mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
mSampleRate,
mChannelMask,
mAudioFormat,
mTrackBufferSize,
AudioTrack.MODE_STREAM);

mAudioTrack.play();
}

// https://juejin.im/post/5c36bbad6fb9a049d975676b
// https://stackoverflow.com/questions/56106877/how-to-decode-aac-formatmp4-audio-file-to-pcm-format-in-android
// https://www.jianshu.com/p/b30d6a4f745b
// https://blog.csdn.net/lavender1626/article/details/80431902
// http://sohailaziz05.blogspot.com/2014/06/mediacodec-decoding-aac-android.html
private void initAudioDecoder() {
try {
String folder = Objects.requireNonNull(CustomApplication.instance.getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
File mFilePath = new File(folder, "original.aac");

mMediaExtractor = new MediaExtractor();
mMediaExtractor.setDataSource(mFilePath.getAbsolutePath());

MediaFormat format = mMediaExtractor.getTrackFormat(0);
String mime = format.getString(MediaFormat.KEY_MIME);
if (mime.startsWith("audio")) {
mMediaExtractor.selectTrack(0);

// format.setString(MediaFormat.KEY_MIME, MediaFormat.MIMETYPE_AUDIO_AAC);
// format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
// format.setInteger(MediaFormat.KEY_SAMPLE_RATE, AacEncoder.DEFAULT_SAMPLE_RATES);
// format.setInteger(MediaFormat.KEY_BIT_RATE, AacEncoder.DEFAULT_BIT_RATES);
// format.setInteger(MediaFormat.KEY_CHANNEL_MASK, CHANNEL_IN);
// format.setInteger(MediaFormat.KEY_IS_ADTS, 1);
// format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);

mAudioDecoder = MediaCodec.createDecoderByType(mime); // Create the decoder
format = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, mSampleRate, 1);

ByteBuffer csd_0 = ByteBuffer.wrap(mCsd0);
format.setByteBuffer("csd-0", csd_0);

mAudioDecoder.configure(format, null, null, 0);
} else {
return;
}
} catch (IOException e) {
e.printStackTrace();
}

if (mAudioDecoder == null) {
Log.e(TAG, "mAudioDecoder is null");
return;
}
mAudioDecoder.start();
}

public void playPcm() {
mIsPlaying = true;

initAudioTrack();

String folder = Objects.requireNonNull(CustomApplication.instance.getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
File pcmFile = new File(folder, "original.pcm");

mThreadPool.execute(new Runnable() {
@Override
public void run() {
byte[] pcmData = new byte[mTrackBufferSize];
try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(pcmFile))) {
while (true) {
int readSize = is.read(pcmData, 0, pcmData.length);
if (readSize > 0) {
mAudioTrack.write(pcmData, 0, readSize); // Write only the bytes actually read; the last chunk may be shorter.
} else {
break;
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
});
}

public void playAac() {
mIsPlaying = true;

initAudioDecoder();
initAudioTrack();

mThreadPool.execute(new Runnable() {
@Override
public void run() {
try {
boolean isFinish = false;
MediaCodec.BufferInfo decodeBufferInfo = new MediaCodec.BufferInfo();
while (!isFinish && mIsPlaying) {
int inputIndex = mAudioDecoder.dequeueInputBuffer(10_000);
Log.w(TAG, "inputIndex=" + inputIndex);
if (inputIndex < 0) {
// No input buffer is available; calling getInputBuffer(-1) would throw.
isFinish = true;
continue;
}
ByteBuffer inputBuffer = mAudioDecoder.getInputBuffer(inputIndex);
inputBuffer.clear();
int sampleSize = mMediaExtractor.readSampleData(inputBuffer, 0);
byte[] sampleData = new byte[inputBuffer.remaining()];
inputBuffer.get(sampleData);
Log.i(TAG, "Sample aac data[" + sampleData.length + "]=" + Arrays.toString(sampleData));
if (sampleSize > 0) {
mAudioDecoder.queueInputBuffer(inputIndex, 0, sampleSize, 0 /*mMediaExtractor.getSampleTime()*/, 0);
mMediaExtractor.advance();
} else {
mAudioDecoder.queueInputBuffer(inputIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
isFinish = true;
}

int outputIndex = mAudioDecoder.dequeueOutputBuffer(decodeBufferInfo, 10_000);
Log.e(TAG, "outputIndex=" + outputIndex);

ByteBuffer outputBuffer;
byte[] chunkPCM;
while (outputIndex >= 0) {
outputBuffer = mAudioDecoder.getOutputBuffer(outputIndex);
chunkPCM = new byte[decodeBufferInfo.size];
outputBuffer.get(chunkPCM);
outputBuffer.clear();
if (chunkPCM.length > 0) {
Log.i(TAG, "PCM data[" + chunkPCM.length + "]=" + Arrays.toString(chunkPCM));
mAudioTrack.write(chunkPCM, 0, decodeBufferInfo.size);
}
mAudioDecoder.releaseOutputBuffer(outputIndex, false);
outputIndex = mAudioDecoder.dequeueOutputBuffer(decodeBufferInfo, 10_000);
}
}
} catch (Exception e) {
e.printStackTrace();
} finally {
stop();
}
}
});
}

public void stop() {
mIsPlaying = false;

if (mAudioDecoder != null) {
try {
mAudioDecoder.stop();
} catch (Exception e) {
e.printStackTrace();
}
}
if (mAudioTrack != null) {
try {
mAudioTrack.stop();
} catch (Exception e) {
e.printStackTrace();
}
}

// if (mThreadPool != null) {
// mThreadPool.shutdown();
// }
}

public void release() {
stop();

if (mAudioDecoder != null) {
try {
mAudioDecoder.release();
} catch (Exception e) {
e.printStackTrace();
}
mAudioDecoder = null;
}
if (mAudioTrack != null) {
try {
mAudioTrack.release();
} catch (Exception e) {
e.printStackTrace();
}
mAudioTrack = null;
}

if (mMediaExtractor != null) {
mMediaExtractor.release();
mMediaExtractor = null;
}

if (mThreadPool != null) {
mThreadPool.shutdownNow();
mThreadPool = null;
}
}

public boolean isPlaying() {
return mIsPlaying;
}
}

ByteUtil.java

package com.ho1ho.audioexample.utils.others;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;

/**
* Author: Michael Leo
* Date: 19-8-30 4:20 PM
*/
public class ByteUtil {

private static ByteBuffer buffer = ByteBuffer.allocate(8);

public static byte int2Byte(int x) {
return (byte) x;
}

public static byte[] byte2Bytes(byte b) {
return new byte[]{b};
}

/**
* Force convert int value in byte array.
*
* @param val The val will be converted to byte.
* @return The byte array.
*/
public static byte[] intAsByteAndForceToBytes(int val) {
return byte2Bytes(int2Byte(val));
}

public static int byte2Int(byte b) {
return b & 0xFF;
}

public static int bytes2Int(byte[] b) {
return b[3] & 0xFF |
(b[2] & 0xFF) << 8 |
(b[1] & 0xFF) << 16 |
(b[0] & 0xFF) << 24;
}

public static int bytes2IntLE(byte[] b) {
return (b[3] & 0xFF) << 24 |
(b[2] & 0xFF) << 16 |
(b[1] & 0xFF) << 8 |
(b[0] & 0xFF) << 0;
}

public static int bytes2Int(byte[] b, int index) {
return b[index + 3] & 0xFF |
(b[index + 2] & 0xFF) << 8 |
(b[index + 1] & 0xFF) << 16 |
(b[index + 0] & 0xFF) << 24;
}

public static int bytes2IntLE(byte[] b, int index) {
return b[index] & 0xFF |
(b[index + 1] & 0xFF) << 8 |
(b[index + 2] & 0xFF) << 16 |
(b[index + 3] & 0xFF) << 24;
}

public static byte[] int2Bytes(int a) {
return new byte[]{
(byte) ((a >> 24) & 0xFF),
(byte) ((a >> 16) & 0xFF),
(byte) ((a >> 8) & 0xFF),
(byte) (a & 0xFF)
};
}

public static byte[] intLE2Bytes(int a) {
return new byte[]{
(byte) ((a >> 0) & 0xFF),
(byte) ((a >> 8) & 0xFF),
(byte) ((a >> 16) & 0xFF),
(byte) ((a >> 24) & 0xFF)
};
}

public static void bytes2Short(byte[] b, short s, int index) {
// Big-endian: high byte first, consistent with the bytes2Short() reader below.
b[index + 0] = (byte) (s >> 8);
b[index + 1] = (byte) s;
}

public static void bytes2ShortLE(byte[] b, short s, int index) {
// Little-endian: low byte first, consistent with the bytes2ShortLE() reader below.
b[index + 0] = (byte) s;
b[index + 1] = (byte) (s >> 8);
}

public static short bytes2Short(byte[] b, int index) {
return (short) (((b[index + 0] << 8) | b[index + 1] & 0xff));
}

public static short bytes2ShortLE(byte[] b, int index) {
return (short) (((b[index + 1] << 8) | b[index + 0] & 0xff));
}

public static byte[] short2Bytes(short s) {
byte[] targets = new byte[2];
for (int i = 0; i < 2; i++) {
int offset = (targets.length - 1 - i) * 8;
targets[i] = (byte) ((s >>> offset) & 0xff);
}
return targets;
}

public static byte[] shortLE2Bytes(short s) {
byte[] targets = new byte[2];
for (int i = 0; i < 2; i++) {
targets[i] = (byte) ((s >>> i * 8) & 0xff);
}
return targets;
}

public static short bytes2Short(byte[] b) {
return bytes2Short(b, 0);
}

public static short bytes2ShortLE(byte[] b) {
return bytes2ShortLE(b, 0);
}

public static byte[] long2Bytes(long x) {
buffer.putLong(0, x);
return buffer.array();
}

public static long bytes2Long(byte[] bytes) {
buffer.clear(); // Reset the shared buffer; without this a second call overflows it.
buffer.put(bytes, 0, bytes.length);
buffer.flip(); // need flip before reading
return buffer.getLong();
}

public static byte[] getBytes(byte[] data, int start, int end) {
byte[] ret = new byte[end - start];
for (int i = 0; (start + i) < end; i++) {
ret[i] = data[start + i];
}
return ret;
}

public static byte[] readInputStream(InputStream inStream) {
ByteArrayOutputStream outStream = null;
try {
outStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
byte[] data = null;
int len = 0;
while ((len = inStream.read(buffer)) != -1) {
outStream.write(buffer, 0, len);
}
data = outStream.toByteArray();
return data;
} catch (IOException e) {
return null;
} finally {
try {
if (outStream != null) {
outStream.close();
}
if (inStream != null) {
inStream.close();
}
} catch (IOException e) {
return null;
}
}
}

public static InputStream readByteArr(byte[] b) {
return new ByteArrayInputStream(b);
}


public static boolean isEqual(byte[] s1, byte[] s2) {
    int slen = s1.length;
    if (slen == s2.length) {
        for (int index = 0; index < slen; index++) {
            if (s1[index] != s2[index]) {
                return false;
            }
        }
        return true;
    }
    return false;
}

public static String getString(byte[] s1, String encode, String err) {
    try {
        return new String(s1, encode);
    } catch (UnsupportedEncodingException e) {
        return err;
    }
}

public static String getString(byte[] s1, String encode) {
    return getString(s1, encode, null);
}

public static String bytes2HexString(byte[] b) {
    // StringBuilder avoids O(n^2) string concatenation inside the loop
    StringBuilder result = new StringBuilder();
    for (byte value : b) {
        // (value & 0xff) + 0x100 yields a 3-digit hex string; substring(1) keeps the two low digits
        result.append(Integer.toString((value & 0xff) + 0x100, 16).substring(1));
    }
    return result.toString();
}

public static int hexString2Int(String hexString) {
    return Integer.parseInt(hexString, 16);
}

public static String int2Binary(int i) {
    return Integer.toBinaryString(i);
}

// public static byte[] mergeBytes(byte[] b1, byte[] b2) {
//     byte[] b3 = new byte[b1.length + b2.length];
//     System.arraycopy(b1, 0, b3, 0, b1.length);
//     System.arraycopy(b2, 0, b3, b1.length, b2.length);
//     return b3;
// }

public static byte[] mergeByte(byte... bs) {
    // Copies the varargs into a fresh byte array
    byte[] result = new byte[bs.length];
    System.arraycopy(bs, 0, result, 0, result.length);
    return result;
}

public static byte[] mergeBytes(byte[]... byteList) {
    int lengthByte = 0;
    for (byte[] bytes : byteList) {
        lengthByte += bytes.length;
    }
    byte[] allBytes = new byte[lengthByte];
    int countLength = 0;
    for (byte[] b : byteList) {
        System.arraycopy(b, 0, allBytes, countLength, b.length);
        countLength += b.length;
    }
    return allBytes;
}

public static void main(String[] args) {
    System.err.println(isEqual(new byte[]{1, 2}, new byte[]{1, 2}));
    System.err.println(bytes2HexString(new byte[]{0, 0, 1, 0}));
    System.err.println(bytes2Int(new byte[]{0, 0, 1, 0}));
    System.err.println(bytes2IntLE(new byte[]{0, 0, 1, 0}));
    // System.err.println(JsonUtil.toHexadecimalString(int2Bytes(6867)));
    // System.err.println(JsonUtil.toHexadecimalString(intLE2Bytes(6867)));
}
}

AndroidManifest.xml

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    package="com.ho1ho.audioexample">

    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

    <application
        android:name=".CustomApplication"
        android:allowBackup="false"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme"
        tools:ignore="GoogleAppIndexingWarning">

        <activity android:name=".MainActivity">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>

About CSD (Codec Specific Data)

Computing the CSD from the AAC profile, sample rate, and channel count

The CSD (Codec Specific Data) occupies two bytes, i.e. 16 bits in total.
Computing it takes three parameters — the AAC profile (5 bits), the sample-rate index (4 bits), and the channel configuration (4 bits) — plus 3 trailing flag bits, all set to 0 here.
For example, for AAC-LC, 44.1 kHz, mono, the corresponding decimal values are 2, 4 and 1 (see the comments in AacEncoder.java above).
In binary these are: 10, 100, 1
Zero-padded to their field widths: 00010, 0100, 0001, 000
Regrouped into 4-bit nibbles: 0001, 0010, 0000, 1000
Converted to hexadecimal digits: 0x1, 0x2, 0x0, 0x8
Which gives the CSD value: 0x12, 0x08

public byte[] getAudioEncodingCsd0(int aacProfile, int sampleRate, int channelCount) {
    int freqIdx = getSampleFrequencyIndex(sampleRate);

    ByteBuffer csd = ByteBuffer.allocate(2);
    csd.put(0, (byte) (aacProfile << 3 | freqIdx >> 1));
    csd.put(1, (byte) ((freqIdx & 0x01) << 7 | channelCount << 3));

    byte[] csd0 = new byte[2];
    csd.get(csd0);
    csd.clear();
    return csd0;
}
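As a quick sanity check of the bit layout described above, the following standalone sketch (the class and method names are mine, not from the original project) reproduces the same packing, taking the frequency index directly instead of the raw sample rate, and confirms that AAC-LC / index 4 (44.1 kHz) / mono yields 0x12, 0x08:

```java
public class Csd0Demo {
    // Same bit packing as getAudioEncodingCsd0 above:
    // 5 bits profile | 4 bits frequency index | 4 bits channel config | 3 zero bits
    static byte[] csd0(int aacProfile, int freqIdx, int channelConfig) {
        return new byte[]{
                (byte) (aacProfile << 3 | freqIdx >> 1),
                (byte) ((freqIdx & 0x01) << 7 | channelConfig << 3)
        };
    }

    public static void main(String[] args) {
        byte[] csd = csd0(2, 4, 1); // AAC-LC, 44.1 kHz (index 4), mono
        System.out.printf("0x%02X,0x%02X%n", csd[0], csd[1]); // prints 0x12,0x08
    }
}
```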
// Map a sample rate to its AAC sampling frequency index
public int getSampleFrequencyIndex(int sampleRate) {
    switch (sampleRate) {
        case 7350:
            return 12;
        case 8000:
            return 11;
        case 11025:
            return 10;
        case 12000:
            return 9;
        case 16000:
            return 8;
        case 22050:
            return 7;
        case 24000:
            return 6;
        case 32000:
            return 5;
        case 44100:
            return 4;
        case 48000:
            return 3;
        case 64000:
            return 2;
        case 88200:
            return 1;
        case 96000:
            return 0;
        default:
            return -1;
    }
}

Recovering the AAC profile, sample rate, and channel count from the CSD

int profile = (csd[0] >> 3) & 0x1F; // AAC profile
int freqIdx = ((csd[0] & 0x7) << 1) | ((csd[1] >> 7) & 0x1); // frequency index, maps back to the sample rate
int chanCfg = (csd[1] >> 3) & 0xF; // channel configuration
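Running these three lines over the example value 0x12, 0x08 recovers the original parameters. A minimal self-contained version (class and method names are mine for illustration; the masking also handles Java's sign extension on `byte`):

```java
public class CsdParseDemo {
    // Unpack profile, frequency index, and channel configuration from a 2-byte CSD
    static int[] parse(byte[] csd) {
        int profile = (csd[0] >> 3) & 0x1F;
        int freqIdx = ((csd[0] & 0x7) << 1) | ((csd[1] >> 7) & 0x1);
        int chanCfg = (csd[1] >> 3) & 0xF;
        return new int[]{profile, freqIdx, chanCfg};
    }

    public static void main(String[] args) {
        int[] p = parse(new byte[]{0x12, 0x08});
        // prints 2 4 1 → AAC-LC, frequency index 4 (44.1 kHz), mono
        System.out.println(p[0] + " " + p[1] + " " + p[2]);
    }
}
```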

Other notes

Notes on the demo code above

The code above is from an early demo I wrote and has not been optimized; recording and decoding run on the same thread. For single-device testing this is fine, but in a real-time two-device call it introduces roughly one second of latency. How that latency arises is explained in detail below.

Note: in the actual project I have since optimized and refactored this code, moving recording, encoding, and sending onto separate threads. For various reasons I cannot post the revised code, so please bear with me; the optimization is straightforward to reproduce yourself.

Latency when sending real-time voice

When first testing two-way real-time voice in the actual project, I measured a latency of just over one second; after some investigation I found the cause.

Take a real-time voice configuration similar to WeChat's as an example: sample rate 8000 Hz, stereo, 16-bit samples. Each AudioRecord.read() call then returns 1280 bytes of PCM. The first read takes the longest, about 250 ms; every subsequent read takes about 40 ms (measured on a Samsung Galaxy S7 edge, G9350). Encoding to AAC with sample rate 8000 Hz, bitrate 16000 bps, stereo, AAC-LC profile, and an AudioRecord.getMinBufferSize() value of 1280 bytes, roughly every three PCM reads produce one AAC frame of about 260 bytes. In other words, a new AAC frame is generated only about every 120 ms.
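The 40 ms figure matches the chunk size: 1280 bytes at 8000 Hz, stereo, 16-bit is exactly 40 ms of audio. A small arithmetic check (class and method names are mine):

```java
public class PcmChunkDuration {
    // Duration in milliseconds of a PCM chunk of the given size
    static double chunkMillis(int chunkBytes, int sampleRate, int channels, int bytesPerSample) {
        return 1000.0 * chunkBytes / (sampleRate * channels * bytesPerSample);
    }

    public static void main(String[] args) {
        // 8000 Hz, stereo, 16-bit → 32000 bytes of PCM per second
        System.out.println(chunkMillis(1280, 8000, 2, 2) + " ms"); // prints 40.0 ms
    }
}
```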

When the AAC data is streamed to the other phone in real time to be decoded back to PCM and played (phone A sends each AAC frame to phone B as soon as it is encoded), sound only comes out after roughly 6 AAC frames have been received. That means PCM from at least 18 recording reads must be captured and encoded before playback can start, and 18 reads take about 250 + 17 × 40 = 930 ms in total. Adding the encode/decode time on both ends (MediaCodec encodes and decodes PCM quickly, so it is not a noticeable bottleneck) and the network transfer time (the test network was a fast LAN), the measured end-to-end latency came to about 1.2 seconds.
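The "roughly three reads per AAC frame" cadence also follows from the numbers: an AAC frame always carries 1024 samples per channel, while one 1280-byte stereo 16-bit read carries only 320. A back-of-envelope check (names are mine), giving 128 ms per frame, roughly consistent with the ~120 ms observed:

```java
public class AacFrameCadence {
    // PCM samples per channel contained in one AudioRecord.read() chunk
    static int samplesPerRead(int chunkBytes, int channels, int bytesPerSample) {
        return chunkBytes / (channels * bytesPerSample);
    }

    public static void main(String[] args) {
        int perRead = samplesPerRead(1280, 2, 2);             // 320 samples per channel
        int aacFrame = 1024;                                  // samples per channel in one AAC frame
        System.out.println(perRead);                          // prints 320
        System.out.println(aacFrame / (double) perRead);      // prints 3.2 (reads per AAC frame)
        System.out.println(1000.0 * aacFrame / 8000 + " ms"); // prints 128.0 ms per frame
    }
}
```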

At this point the cause is clear: the latency comes from accumulating too many recording reads. So I ran another experiment: sending the PCM directly without encoding it. The measured latency immediately dropped to about 500 ms. However, without encoding, each transmission carries far more data (the PCM-to-AAC compression ratio had been about 16:1), so the next problem to solve is how to compress the PCM data some other way.
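The roughly 16:1 ratio above also checks out against the configuration: the PCM stream is 32000 bytes/s (8000 Hz × 2 channels × 2 bytes), while 16 kbps AAC is 2000 bytes/s. A quick check (class and method names are mine):

```java
public class CompressionRatio {
    // Ratio of raw PCM byte rate to AAC byte rate for a given configuration
    static double pcmToAacRatio(int sampleRate, int channels, int bytesPerSample, int aacBitrate) {
        double pcmBytesPerSec = sampleRate * channels * bytesPerSample;
        double aacBytesPerSec = aacBitrate / 8.0;
        return pcmBytesPerSec / aacBytesPerSec;
    }

    public static void main(String[] args) {
        // 8000 Hz stereo 16-bit PCM vs 16000 bps AAC
        System.out.println(pcmToAacRatio(8000, 2, 2, 16000)); // prints 16.0
    }
}
```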

A rant

  Here I have to complain about certain people. While researching this topic, I saw comments on other people's articles along the lines of "Why doesn't the code compile after I copy it?" just because the author didn't post a logging class or some other helper. Everyone wraps logging and similar non-business code in their own way; if you want it to compile, adapt it yourself. The article's code was already written in great detail: copy it into your project, make a few trivial changes like the logging calls, and it works. Yet some people insist on being handed the entire project. And even if you were given the whole project, if you don't have the matching Gradle version or SDK version, would you then ask why the code still reports errors and won't compile?

  I really can't stand this. If you are too lazy to even copy code, or genuinely don't know what to change, then before asking questions go study basic Android development first.

  When we read other people's articles, we should be looking for inspiration, for what we missed and what we should do, not demanding a ready-made solution. The author is not your teacher or your parent and has no obligation to hand you everything. Teaching you the core knowledge is enough; drawing the further inferences is your own job. It's like an exam: not every question on the paper is one you've done before, and if an unfamiliar type shows up, would you really tell the teacher you failed because that question was never covered in class?
