Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Japanese Lorem sentences look more natural #1918

Merged
merged 1 commit into from
Jan 27, 2020

Conversation

rastamhadi
Copy link
Contributor

Issue #1917

#1917

Description:

This pull request makes Japanese Faker::Lorem sentences look more natural by:

  • removing half-width spaces between Japanese sentences.
  • removing full-width spaces between Japanese words.

This affects the following methods:

  • Faker::Lorem.sentence
  • Faker::Lorem.sentences
  • Faker::Lorem.paragraph
  • Faker::Lorem.paragraphs

@rastamhadi rastamhadi requested a review from koic January 23, 2020 03:27
@koic
Copy link
Member

koic commented Jan 23, 2020

Hm, it seems that each word is easier to understand if there is space because it is a sequence of random words.

% ruby -rfaker -e 'p Faker::VERSION'
"2.10.1"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentence'
"好き きょだい 出版 超音波。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.sentences'
["約する 殻 書き方。", "察知 そあく 割り箸。", "いちだい むらさきいろ 太る。"]
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraph'
"誘惑 しずむ 量。 騎兵 全日本 けいむしょ。 きんく こうせい 飽くまでも。"
% ruby -rfaker -e 'Faker::Config.locale = "ja"; p Faker::Lorem.paragraphs'
["長唄 かん こうおつ。 既に 頂く えきびょう。 旧姓 金星 はなはだ。", "平安 あう まもる。 とうさん しずむ れつあく 。 ひきざん 不思議 伐採。", "弥生 退く 地面。 つなひき よくげつ 乗せる。 輸出 ぶっきょう 見当たる。"]

IMHO, it seems better to keep separating words in Japanese with spaces (わかち書き).

@rastamhadi
Copy link
Contributor Author

rastamhadi commented Jan 23, 2020

I can see how it looks like わかち書き (wakachigaki) because of all the hiragana-only words. They were added in #900 from a Kanji learning app, which means they may have originally been furigana. Perhaps I could remove all of these furigana words if they contribute to the awkwardness of the text?

The reason I am proposing to remove the spaces is because lorem ipsum text is supposed to look like real-world text without meaningful content, so that designers can use it to design layouts.

https://en.wikipedia.org/wiki/Lorem_ipsum

In publishing and graphic design, Lorem ipsum is a placeholder text commonly used to demonstrate the visual form of a document or a typeface without relying on meaningful content.

As far as I can tell, Japanese real-world text does not use wakachigaki unless it is targeted at non-native speakers.

I do realise that most of us use Faker to generate dummy data for our tests, and that the spaces don't make much difference in that use case. I won't pursue the matter any further if you think this is YAGNI 😃

@koic
Copy link
Member

koic commented Jan 23, 2020

Thanks for the explanation. Actually, I don't have a strong opinion on this. Leave it to a maintainers.

Copy link
Contributor

@Zeragamba Zeragamba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that lorem text should look closer to real Japanese gibberish.

Changes look good to be, but can you also add a Japanese locale test to ensure spaces are removed?

@@ -75,6 +75,7 @@ def test_ja_lorem_methods
assert Faker::Lorem.words.is_a? Array
assert Faker::Lorem.words(number: 1000)
assert Faker::Lorem.words(number: 10_000, supplemental: true)
assert_not_match(/ /, Faker::Lorem.paragraph)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a Japanese locale test to ensure spaces are removed?

@Zeragamba
Sorry, I assumed that this would be enough to cover the changes. Could you specify what else it needs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Sorry, I thought that was another file. :derp:

All good then!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! ❤️

@rastamhadi rastamhadi removed the request for review from koic January 24, 2020 03:54
@vbrazo vbrazo merged commit 4a0d6c5 into faker-ruby:master Jan 27, 2020
@rastamhadi rastamhadi deleted the rastamhadi-patch-1 branch September 26, 2021 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants