Add morphing feature #713
Open
qryxip wants to merge 62 commits into VOICEVOX:main from qryxip:add-morphing
Commits
aa80388 Move `to_wav` (qryxip)
883803a Add morphing feature (qryxip)
06556c9 Make `Permission` hold a `StyleId` (qryxip)
21e0715 Minor refactor (qryxip)
a31bd96 Update voicevox_core.h (qryxip)
2acd5e8 Stop using `readonly` (qryxip)
1162102 `Permission` → `MorphablePair` (qryxip)
82260ca [skip ci] `MorphablePair` → `MorphableTargets` (qryxip)
ae080b5 [skip ci] Minor refactor (qryxip)
26a72e0 [skip ci] Minor refactor (qryxip)
599f6ad [skip ci] Merge branch 'main' into add-morphing (qryxip)
58d6d7d Update snapshots.toml (qryxip)
66be03f Install `mingw-w64-x86_64-clang` (qryxip)
ccd3c81 Remove `can_skip_in_simple_test` from `windows-x86-cpu` (qryxip)
459a881 Use KyleMayes/install-llvm-action (qryxip)
7205c39 Try removing the Clang installation from `i686-pc-windows-msvc` (qryxip)
471264d Revert "Remove `can_skip_in_simple_test` from `windows-x86-cpu`" (qryxip)
58f6f90 Revert "Try removing the Clang installation from `i686-pc-windows-msvc`" (qryxip)
8ff5a5e Update sample.vvm (qryxip)
706fdac Unit tests for `morphable_targets` (qryxip)
e21c61c `24000` → `DEFAULT_SAMPLING_RATE` (qryxip)
f53fa11 Add FIXME (qryxip)
57a81f3 Rename internal methods (qryxip)
0b896ea `Morph` → `SpeakerFeature` (qryxip)
c8c85b0 Move `to_wav` (qryxip)
e1f94b1 Change FIXME comment (qryxip)
38b8732 Remove "WARNING" (qryxip)
e283209 Update voicevox_core.h (qryxip)
bdf874f Implement C API (qryxip)
51e22bf Merge branch 'main' into add-morphing (qryxip)
503f035 Implement Python API (qryxip)
9c70222 Change `morph_rate` from `f32` to `f64` (qryxip)
27a4c7a Implement Java API (qryxip)
5b014ec Write docstrings (qryxip)
13dcca2 Fix spectrogram computation (qryxip)
6d2eb80 Add `SpeakerFeatureException` (qryxip)
dbbf89c Remove unnecessary `todo!` branch (qryxip)
f04380e Move the `Synthesizer` impl over to the `morph` side (qryxip)
03d5055 Add FIXME (qryxip)
9c41398 Tests for `synthesis_morphing` (qryxip)
8891bb8 `MorphableTargets` → `MorphableStyles` (qryxip)
8904be2 Handle spectrograms with ndarray (qryxip)
4444c69 Minor refactor (qryxip)
bdb7c3b Test all 16 combinations in the C API as well (qryxip)
b6d81fa Merge branch 'main' into add-morphing (qryxip)
3710699 Merge branch 'main' into add-morphing (qryxip)
5035740 Update tests (qryxip)
998977d Merge branch 'main' into add-morphing (qryxip)
2443754 Merge branch 'main' into add-morphing (qryxip)
1925685 Merge branch 'main' into add-morphing (qryxip)
2827328 Update TODO comment (qryxip)
6eb6e40 Merge branch 'main' into add-morphing (qryxip)
6f86e8a Merge branch 'main' into add-morphing (qryxip)
cfe60f0 Merge branch 'main' into add-morphing (qryxip)
c25cde9 Merge branch 'main' into add-morphing (qryxip)
d55a3a8 Merge branch 'main' into add-morphing (qryxip)
6bb862c Merge branch 'main' into add-morphing (qryxip)
3b839d4 Merge branch 'main' into add-morphing (qryxip)
191ccea Merge branch 'main' into add-morphing (qryxip)
3b8f429 Fix a test (qryxip)
0ddd0e4 Merge branch 'main' into add-morphing (qryxip)
97b2e81 fixup! Fix a test (qryxip)
@@ -0,0 +1,60 @@
use std::io::{Cursor, Write as _};

use az::{Az as _, Cast};
use num_traits::Float;

use crate::{synthesizer::DEFAULT_SAMPLING_RATE, AudioQueryModel};

pub(crate) fn to_wav<T: Float + From<i16> + From<f32> + Cast<i16>>(
    wave: &[T],
    audio_query: &AudioQueryModel,
) -> Vec<u8> {
    // TODO: https://github.com/VOICEVOX/voicevox_core/issues/762

    let volume_scale = *audio_query.volume_scale();
    let output_stereo = *audio_query.output_stereo();
    let output_sampling_rate = *audio_query.output_sampling_rate();

    // TODO: support 44.1 kHz and similar rates

    let num_channels: u16 = if output_stereo { 2 } else { 1 };
    let bit_depth: u16 = 16;
    let repeat_count: u32 = (output_sampling_rate / DEFAULT_SAMPLING_RATE) * num_channels as u32;
    let block_size: u16 = bit_depth * num_channels / 8;

    let bytes_size = wave.len() as u32 * repeat_count * 2;
    let wave_size = bytes_size + 44;

    let buf: Vec<u8> = Vec::with_capacity(wave_size as usize);
    let mut cur = Cursor::new(buf);

    cur.write_all("RIFF".as_bytes()).unwrap();
    cur.write_all(&(wave_size - 8).to_le_bytes()).unwrap();
    cur.write_all("WAVEfmt ".as_bytes()).unwrap();
    cur.write_all(&16_u32.to_le_bytes()).unwrap(); // fmt header length
    cur.write_all(&1_u16.to_le_bytes()).unwrap(); // linear PCM
    cur.write_all(&num_channels.to_le_bytes()).unwrap();
    cur.write_all(&output_sampling_rate.to_le_bytes()).unwrap();

    let block_rate = output_sampling_rate * block_size as u32;

    cur.write_all(&block_rate.to_le_bytes()).unwrap();
    cur.write_all(&block_size.to_le_bytes()).unwrap();
    cur.write_all(&bit_depth.to_le_bytes()).unwrap();
    cur.write_all("data".as_bytes()).unwrap();
    cur.write_all(&bytes_size.to_le_bytes()).unwrap();

    for &value in wave {
        let v = num_traits::clamp(
            value * <T as From<_>>::from(volume_scale),
            -T::one(),
            T::one(),
        );
        let data = (v * <T as From<_>>::from(0x7fff)).az::<i16>();
        for _ in 0..repeat_count {
            cur.write_all(&data.to_le_bytes()).unwrap();
        }
    }

    cur.into_inner()
}
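The size arithmetic and sample conversion in `to_wav` can be checked in isolation. The following is a minimal sketch, not the PR's code: `WavSizes`, `wav_sizes`, `to_pcm16`, and `ASSUMED_DEFAULT_SAMPLING_RATE` are hypothetical names, the value 24000 is assumed from the commit "`24000` → `DEFAULT_SAMPLING_RATE`", and a plain `as` cast stands in for the `az` crate's `.az::<i16>()`.

```rust
/// Sizes that end up in a 16-bit PCM WAV header (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct WavSizes {
    pub repeat_count: u32,
    pub block_size: u16,
    pub data_bytes: u32,
    pub file_bytes: u32,
}

// Assumption: the engine's native rate is 24 kHz.
pub const ASSUMED_DEFAULT_SAMPLING_RATE: u32 = 24_000;

/// Mirrors the arithmetic in the diff for a given number of input samples.
pub fn wav_sizes(num_samples: u32, output_sampling_rate: u32, output_stereo: bool) -> WavSizes {
    let num_channels: u16 = if output_stereo { 2 } else { 1 };
    let bit_depth: u16 = 16;
    // Integer division: a rate that is not a multiple of 24000 (e.g. 44100)
    // truncates, so each sample is repeated fewer times than the rate implies.
    let repeat_count =
        (output_sampling_rate / ASSUMED_DEFAULT_SAMPLING_RATE) * u32::from(num_channels);
    let block_size = bit_depth * num_channels / 8;
    // Each written sample is an i16, hence 2 bytes.
    let data_bytes = num_samples * repeat_count * 2;
    WavSizes {
        repeat_count,
        block_size,
        data_bytes,
        // 44 bytes of RIFF/fmt/data headers precede the samples.
        file_bytes: data_bytes + 44,
    }
}

/// Applies the volume scale, clamps to [-1, 1], and converts to a 16-bit sample.
pub fn to_pcm16(value: f32, volume_scale: f32) -> i16 {
    let v = (value * volume_scale).clamp(-1.0, 1.0);
    // Truncates toward zero; stands in for `.az::<i16>()` in the diff.
    (v * 32767.0) as i16
}
```

Under these assumptions, one second of mono audio at 24 kHz (`wav_sizes(24000, 24000, false)`) gives a 48,000-byte data chunk and a 48,044-byte file. Requesting 44100 Hz yields a `repeat_count` of 1 in mono, which is the case the TODO about 44.1 kHz refers to.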
@@ -1,15 +1,18 @@
 mod acoustic_feature_extractor;
+pub(crate) mod audio_file;
 mod full_context_label;
 mod kana_parser;
 mod model;
 mod mora_list;
+mod morph;
 pub(crate) mod open_jtalk;

 pub(crate) use self::acoustic_feature_extractor::OjtPhoneme;
+pub(crate) use self::audio_file::to_wav;
 pub(crate) use self::full_context_label::{
     extract_full_context_label, mora_to_text, FullContextLabelError,
 };
 pub(crate) use self::kana_parser::{create_kana, parse_kana, KanaParseError};
-pub use self::model::{AccentPhraseModel, AudioQueryModel, MoraModel};
+pub use self::model::{AccentPhraseModel, AudioQueryModel, MoraModel, MorphableTargetInfo};
 pub(crate) use self::mora_list::mora2text;
 pub use self::open_jtalk::FullcontextExtractor;
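The `pub(crate) mod audio_file;` and `pub(crate) use self::audio_file::to_wav;` pair makes the function reachable within the crate both by its full path and through the parent module. A small self-contained sketch of that pattern follows; the `engine` module name, the simplified `to_wav` signature, and the two `encode_*` wrappers are illustrative assumptions, not the crate's actual layout.

```rust
// Hypothetical stand-in for the parent module that holds the `mod` and `use` lines.
mod engine {
    // Crate-visible submodule, like `pub(crate) mod audio_file;`.
    pub(crate) mod audio_file {
        // Simplified stand-in: serializes samples as little-endian bytes, no header.
        pub(crate) fn to_wav(samples: &[i16]) -> Vec<u8> {
            samples.iter().flat_map(|s| s.to_le_bytes()).collect()
        }
    }

    // Re-export, like `pub(crate) use self::audio_file::to_wav;`.
    pub(crate) use self::audio_file::to_wav;
}

/// Calls the function through the re-export on the parent module.
pub fn encode_via_reexport(samples: &[i16]) -> Vec<u8> {
    engine::to_wav(samples)
}

/// Calls the same function through its full module path.
pub fn encode_via_full_path(samples: &[i16]) -> Vec<u8> {
    engine::audio_file::to_wav(samples)
}
```

Both wrappers resolve to the same function, so callers elsewhere in the crate can use whichever path reads better.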
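This is also needed for `windows-x86-cpu` (`i686-pc-windows-msvc`), which is skipped except at release time. With it the build succeeds, and without it the build fails. I confirmed this by removing `can_skip_in_simple_test`.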