Make Japanese Lorem sentences look more natural #1918
Hm, since the text is a sequence of random words, each word seems easier to read when the words are separated by spaces.
% ruby -rfaker -e 'p Faker::VERSION'
"2.10.1"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentence'
"好き きょだい 出版 超音波。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentences'
["約する 殻 書き方。", "察知 そあく 割り箸。", "いちだい むらさきいろ 太る。"]
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraph'
"誘惑 しずむ 量。 騎兵 全日本 けいむしょ。 きんく こうせい 飽くまでも。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraphs'
["長唄 かん こうおつ。 既に 頂く えきびょう。 旧姓 金星 はなはだ。", "平安 あう まもる。 とうさん しずむ れつあく 。 ひきざん 不思議 伐採。", "弥生 退く 地面。 つなひき よくげつ 乗せる。 輸出 ぶっきょう 見当たる。"]
IMHO, it seems better to keep separating words in Japanese with spaces (わかち書き).
I can see how it looks like わかち書き (wakachigaki) because of all the hiragana-only words. They were added in #900 from a Kanji learning app, which means they may originally have been furigana. Perhaps I could remove all of these furigana words if they contribute to the awkwardness of the text?
The reason I am proposing to remove the spaces is that lorem ipsum text is supposed to look like real-world text without meaningful content, so that designers can use it to design layouts. https://en.wikipedia.org/wiki/Lorem_ipsum
As far as I can tell, real-world Japanese text does not use wakachigaki unless it is aimed at non-native speakers. I do realise that most of us use Faker to generate dummy data for our tests, and that the spaces don't make much difference in that use case. I won't pursue the matter any further if you think this is YAGNI 😃
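To make the proposal concrete, here is a minimal sketch in plain Ruby. It is not Faker's actual implementation: the `WORDS` list, the `build_sentence` helper, and its `separator:` parameter are all hypothetical, and the words are simply copied from the sample output quoted earlier in this thread.

```ruby
# Hypothetical sample words, copied from the output quoted earlier in this thread.
WORDS = %w[好き 出版 超音波 割り箸 不思議 地面].freeze

# Build one sentence from `count` random words (illustrative helper, not Faker API).
# Pass separator: " " for wakachigaki-style output, or "" for text that looks
# closer to ordinary written Japanese.
def build_sentence(count: 3, separator: "", random: Random.new)
  WORDS.sample(count, random: random).join(separator) + "。"
end

# Using the same seed picks the same words, so only the separator differs.
spaced   = build_sentence(separator: " ", random: Random.new(42))
unspaced = build_sentence(separator: "",  random: Random.new(42))

puts spaced
puts unspaced
```

With identical seeds the two strings differ only in the spaces, which is the whole visible effect of the change being proposed.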
Thanks for the explanation. Actually, I don't have a strong opinion on this. I'll leave it to the maintainers.
I agree that lorem text should look closer to real Japanese gibberish.
Changes look good to me, but can you also add a Japanese locale test to ensure spaces are removed?
@@ -75,6 +75,7 @@ def test_ja_lorem_methods
     assert Faker::Lorem.words.is_a? Array
     assert Faker::Lorem.words(number: 1000)
     assert Faker::Lorem.words(number: 10_000, supplemental: true)
+    assert_not_match(/ /, Faker::Lorem.paragraph)
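One thing worth noting about the assertion above: `/ /` matches only the ASCII space. If a stricter check were ever wanted, it could also cover the ideographic space (U+3000). The sketch below is only an illustration under that assumption. `sample_paragraph` is a hypothetical stand-in for `Faker::Lorem.paragraph`, so that the snippet runs without the faker gem, and `ANY_SPACE` and `contains_space?` are made-up names.

```ruby
# Hypothetical stand-in for Faker::Lorem.paragraph under the "ja" locale,
# used only so this sketch runs without the faker gem.
def sample_paragraph(separator:)
  sentences = [%w[誘惑 しずむ 量], %w[騎兵 全日本 けいむしょ]]
  sentences.map { |words| words.join(separator) + "。" }.join(separator)
end

# Matches an ASCII space or an ideographic space (U+3000).
ANY_SPACE = /[ \u3000]/

# True when the text contains either kind of space.
def contains_space?(text)
  text.match?(ANY_SPACE)
end

puts contains_space?(sample_paragraph(separator: " ")) # space-separated output
puts contains_space?(sample_paragraph(separator: ""))  # output without separators
```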
> can you also add a Japanese locale test to ensure spaces are removed?
@Zeragamba Sorry, I assumed that this would be enough to cover the changes. Could you specify what else it needs?
Oh! Sorry, I thought that was another file. :derp:
All good then!
Thank you! ❤️
Issue: #1917
Description:
This pull request makes Japanese Faker::Lorem sentences look more natural.
This affects the following methods:
Faker::Lorem.sentence
Faker::Lorem.sentences
Faker::Lorem.paragraph
Faker::Lorem.paragraphs