This repository has been archived by the owner on Jan 10, 2024. It is now read-only.

Improve quality of string output #17

Open
Fresheyeball opened this issue Sep 3, 2015 · 4 comments

Comments

@Fresheyeball

So using the following inside a loop

kind := fuzzString()
log.Println(kind)

outputs the following to the console

2015/09/03 19:04:44 ª漠Ȯʜ摽OĹ/JǼ乇C
2015/09/03 19:04:44 ǟJ翾
2015/09/03 19:04:44 J?ǻ
2015/09/03 19:04:44 (v獔s赭İĹ
2015/09/03 19:04:44 鄗彸湦軈2Ħ攩V螱
2015/09/03 19:04:44 婦_僤Ⱥ^ȡ莯yŀ亶h鍡
2015/09/03 19:04:44 鼈ǃ綺z盚ɴ憤答ǘ騮r冔J<ȈWp靌_
2015/09/03 19:04:44 禽ÊÉ3鏡碞`ɭ筹2溏#Q:M
2015/09/03 19:04:44 mĿį窌橛鑼ȠQD貧Ũroz踮穰"
2015/09/03 19:04:44 o增aʇŊL腙ɣS[ŠŪæ羑/甴q漥嚬
2015/09/03 19:04:44 úưn鉋囔敘匿
2015/09/03 19:04:44 彧森ČÅ
2015/09/03 19:04:44 ǿƲ鯱ı&ʧŞ嵔6Ÿ
2015/09/03 19:04:44 ¾U儾蓝ɽÙŝ鏁
2015/09/03 19:04:44 "绖ʚ跋拘瀤Ȍ}FWȗǻkU
2015/09/03 19:04:44 |Y&
2015/09/03 19:04:44 
2015/09/03 19:04:44 2躡溹
2015/09/03 19:04:44 
2015/09/03 19:04:44 Ğŋ歺饽i1%ʇƺF池睺嶙ĉ
2015/09/03 19:04:44 馬ƫ椹vHıÕ莍垺噽q具
2015/09/03 19:04:44 6
2015/09/03 19:04:44 ƓH(ĴƆ4t4嬚宧1
2015/09/03 19:04:44 G-ª,
2015/09/03 19:04:44 Ʀ焬陱ʃȌ飠e俿ɟ躁彞Ȱd鉛 5聳炬血
2015/09/03 19:04:44 鏶牏ª.濔挹U忘Eʤe黍
2015/09/03 19:04:44 ǩ歹ȊǾz肌憌.ʪ谌O蟆WƤ
2015/09/03 19:04:44 嵤贂Ƶm1赉
2015/09/03 19:04:44 涉
2015/09/03 19:04:44 岲憏蓹
2015/09/03 19:04:44 ʧȜP珐芫汋鰢NJ詺^涍ȷ饤劮
2015/09/03 19:04:44 秤柃SŘɎh巽貗g扆¢W醦4ɖ

This does not appear to be a good random distribution of characters. Also, empty strings are output more than once, so duplicates are not being addressed. Ideally this library would have a consistent random distribution of characters, no duplicates, and interleaved whitespace (currently absent).
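
One way to put numbers on a complaint like this is to tally the generated strings by how many bytes each rune takes in UTF-8, and to count whitespace and empty strings. The following is a minimal sketch using only the Go standard library; `runeStats` and `tally` are hypothetical names, not part of gofuzz.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// runeStats is a hypothetical summary type, not part of gofuzz.
type runeStats struct {
	byLen  [5]int // byLen[n] counts runes whose UTF-8 encoding is n bytes (n = 1..4)
	spaces int    // runes for which unicode.IsSpace reports true
	empty  int    // samples that were the empty string
}

// tally counts rune sizes, whitespace and empty strings across samples.
func tally(samples []string) runeStats {
	var st runeStats
	for _, s := range samples {
		if s == "" {
			st.empty++
			continue
		}
		for _, r := range s {
			st.byLen[utf8.RuneLen(r)]++
			if unicode.IsSpace(r) {
				st.spaces++
			}
		}
	}
	return st
}

func main() {
	// Four strings copied from the log above, including one empty line.
	st := tally([]string{"ǟJ翾", "|Y&", "", "2躡溹"})
	for n := 1; n <= 4; n++ {
		fmt.Printf("%d-byte runes: %d\n", n, st.byLen[n])
	}
	fmt.Printf("whitespace runes: %d, empty strings: %d\n", st.spaces, st.empty)
}
```

Running the same tally over a few thousand generated strings would show whether any encoded length, or whitespace, is missing from the output.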

@lavalamp
Contributor

lavalamp commented Sep 3, 2015

I think making the chance of getting an empty string configurable is desirable.

The character distribution is tricky; ideally we'd want to test characters of all UTF-8 encoding forms; that's why it is how it is. I agree spaces would be desirable.

As far as duplicates go, right now there's just a small chance (5% IIRC) of it outputting a blank string. That's on purpose, since presumably there are a lot of different places to store these strings, and we eventually want to try a blank in all of them.
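
To make the two ideas in this comment concrete (a configurable chance of an empty string, and characters drawn from every UTF-8 encoded length), here is a minimal sketch. It is not gofuzz's implementation: the `generator` type, its constructor, and the specific code point ranges are assumptions chosen for illustration. The 1-byte range includes the space character, so spaces show up too.

```go
package main

import (
	"fmt"
	"math/rand"
	"unicode/utf8"
)

// runeRange is an inclusive range of code points.
type runeRange struct{ lo, hi rune }

// One illustrative range per UTF-8 encoded length (1 to 4 bytes).
var ranges = []runeRange{
	{' ', '~'},         // 1 byte: printable ASCII, including space
	{0x00A0, 0x02AF},   // 2 bytes: Latin-1 Supplement through IPA Extensions
	{0x4E00, 0x9FFF},   // 3 bytes: CJK Unified Ideographs
	{0x1F600, 0x1F64F}, // 4 bytes: Emoticons
}

// generator is a hypothetical string source, not a gofuzz type.
type generator struct {
	rng         *rand.Rand
	emptyChance float64 // probability in [0, 1] of returning ""
	maxRunes    int     // must be >= 1
}

func newGenerator(seed int64, emptyChance float64, maxRunes int) *generator {
	return &generator{rand.New(rand.NewSource(seed)), emptyChance, maxRunes}
}

// Next returns "" with probability emptyChance, otherwise 1..maxRunes runes,
// each drawn from a uniformly chosen range above.
func (g *generator) Next() string {
	if g.rng.Float64() < g.emptyChance {
		return ""
	}
	buf := make([]rune, 1+g.rng.Intn(g.maxRunes))
	for i := range buf {
		rg := ranges[g.rng.Intn(len(ranges))]
		buf[i] = rg.lo + rune(g.rng.Intn(int(rg.hi-rg.lo)+1))
	}
	return string(buf)
}

func main() {
	g := newGenerator(42, 0.05, 20) // 0.05 mirrors the 5% recalled above
	for i := 0; i < 10; i++ {
		s := g.Next()
		fmt.Printf("%q (%d runes, %d bytes)\n", s, utf8.RuneCountInString(s), len(s))
	}
}
```

Setting `emptyChance` to 0 or 1 turns empty strings off entirely or makes every result empty, which is the kind of knob being proposed.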

@Fresheyeball
Author

I agree the distribution is tricky, and covering all of UTF-8 is desirable. I just think it can be done much better, perhaps by interleaving different sources of randomness. Using a Wikipedia article as a source, as well as other texts in addition to pure UTF-8, could be useful. I've output strings from QuickCheck and generally had more desirable results. Perhaps something in there can lead the way.

As for blank strings and dealing with duplicates, I think ideally each instance of fuzz should maintain internal state to prevent duplicates. This way the odds of encountering an empty string can be 100% or 0% while ensuring that it does not appear more than once (running the same input more than once is a wasted test).
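
The internal-state idea could be sketched as a wrapper that remembers every string it has handed out and redraws on a repeat. This is only an illustration under assumed names (`uniqueStrings`, `newUniqueStrings`); gofuzz does not provide it. The retry cap is there because a source with few possible values would otherwise loop forever.

```go
package main

import (
	"fmt"
	"math/rand"
)

// uniqueStrings is a hypothetical wrapper that never returns the same
// string twice from the underlying source.
type uniqueStrings struct {
	next     func() string       // underlying string source
	seen     map[string]struct{} // every value returned so far
	maxTries int                 // redraws allowed per call
}

func newUniqueStrings(next func() string, maxTries int) *uniqueStrings {
	return &uniqueStrings{next: next, seen: map[string]struct{}{}, maxTries: maxTries}
}

// Next returns a string that has not been returned before. It reports
// false if maxTries consecutive draws were all duplicates.
func (u *uniqueStrings) Next() (string, bool) {
	for i := 0; i < u.maxTries; i++ {
		s := u.next()
		if _, dup := u.seen[s]; !dup {
			u.seen[s] = struct{}{}
			return s, true
		}
	}
	return "", false
}

func main() {
	rng := rand.New(rand.NewSource(7))
	// Toy source: empty half the time, otherwise a single lowercase letter.
	src := func() string {
		if rng.Intn(2) == 0 {
			return ""
		}
		return string(rune('a' + rng.Intn(26)))
	}
	u := newUniqueStrings(src, 100)
	for i := 0; i < 5; i++ {
		if s, ok := u.Next(); ok {
			fmt.Printf("%q\n", s)
		}
	}
}
```

With this wrapper the empty string can be as likely as desired in the source and still reach the test at most once. The cost is memory proportional to the number of distinct strings produced.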

Just brainstorming: I think it would be desirable to configure the presence of empty strings (I generally like them), but also the guaranteed presence of special strings like \n. I'd sleep better knowing that those are always present in the input of tests.

More variance on string length as well would be appreciated.
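
A rough sketch of the last two requests: a source that first emits a fixed list of special strings exactly once each, then falls back to random strings whose length is drawn in two steps so that short and long values both occur. The type name, the edge-case list, and the length scheme are all assumptions made for illustration, and the random part uses plain lowercase letters to keep the example short.

```go
package main

import (
	"fmt"
	"math/rand"
)

// edgeCases is a hypothetical list of strings that must always be tried.
var edgeCases = []string{"", " ", "\n", "\t", "\r\n"}

// seededStrings yields every edge case once, in order, then random strings.
type seededStrings struct {
	pending []string // edge cases not yet emitted
	rng     *rand.Rand
	maxLen  int // must be >= 1
}

func newSeededStrings(seed int64, maxLen int) *seededStrings {
	return &seededStrings{
		pending: append([]string(nil), edgeCases...), // copy, so the list is not mutated
		rng:     rand.New(rand.NewSource(seed)),
		maxLen:  maxLen,
	}
}

func (g *seededStrings) Next() string {
	if len(g.pending) > 0 {
		s := g.pending[0]
		g.pending = g.pending[1:]
		return s
	}
	// Two-step draw: pick an upper bound in 1..maxLen, then a length in
	// 1..bound. This favours short strings but still reaches maxLen.
	bound := 1 + g.rng.Intn(g.maxLen)
	buf := make([]byte, 1+g.rng.Intn(bound))
	for i := range buf {
		buf[i] = byte('a' + g.rng.Intn(26))
	}
	return string(buf)
}

func main() {
	g := newSeededStrings(1, 30)
	for i := 0; i < 10; i++ {
		fmt.Printf("%q\n", g.Next())
	}
}
```

Because the special strings come first, even a short test run is guaranteed to see all of them.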

@Fresheyeball
Author

I ran a sample from QuickCheck if you are interested:

����������(50 tests)����������}�{O"(�
����������(51 tests)����������OLÍ�F�fÅpù�KÏG'e}j
v^ ésÈ.Rfp(Ì��ð��P�L@�dRj
����������(52 tests)����������L�bdXC�(rüLgY�;9A�aPáb%o×�
����������(53 tests)������������a
����������(54 tests)����������4åÕ�[7jT��ç|¶od
[y  �"�~2�ý�"1w��ÄA
����������(55 tests)����������PE¢�Øf[�_���
����������(56 tests)����������a��<s\#6��4\.·/H.Pbw�u0
����������(57 tests)����������ñºKi�F94¿�\��1$­
����������(58 tests)�����������93�\ Ææ�=R×�ò!g�Öt��Ï
����������(59 tests)����������]-'5�Q@kúH�M�%óRz=MM/�¹¾ì�qv��<Þ��$:ñ-�îR�7NUL=/
����������(60 tests)����������>%f~A~�%8IE�å�~!2#�LJiðhî÷�¢B¯Æ�q��RZfb��=3c~4óuÄ>¹3�
����������(61 tests)����������O�GÀX�
����������(62 tests)����������7_��wu pÙNçÁ;@R
����������(63 tests)����������So¿D1;YÕC8�N�ggu�Llpz!�é=�g%þ�
����������(64 tests)����������+
����������(65 tests)����������­f+ôO�%¼�_7Æ×w!
����������(66 tests)����������Â�¶a
l�à
����������(67 tests)����������LAÞxbj�L�*^GcG#_oV�~¯I�"�1y¢��DÆ3
�Z2£2b��`Wp7JX{�µ
����������(68 tests)����������^Ñ95���ü
C��ûß���ñ+�½¡¬)�::qüó�Ü'$ 3IÎ]0nRÎDO
����������(69 tests)����������i'�LAÓ¡s§kÔ)p�þ�zIÁwvT�X
����������(70 tests)�����������
����������(71 tests)����������~��s% ã�(�m7&,_f�g÷
0f7"MNX3JFýÇS/���':'�%��òåG[�¾��
����������(72 tests)�����������mà�£Ø#3�Ú6á :Ùk�Zzj\$��B
P�·T,�F
����������(73 tests)����������R��¥�?ww�Z�ã
����������(74 tests)����������4âWÊd¯)îEB�ß�Z�6Ú�]1ß��<Oy�¨Ã�1.�-�åp�Á\¾·Ub3�g��¨
����������(75 tests)�����������iS8
�'<£u^
�IKA�&¹Q(N³�5�A×û÷A4Wj/Fz6
����������(76 tests)�����������S��â"AtÀ�s6���Zj���Ò!1�ùæû�&BK=#b«Úû[DìØ!
����������(77 tests)����������  C¯Y·��üè�;�ä�
����������(78 tests)����������Äa>{P��h.3$¤�ñbrgr.y�
gk9�9�\H�ÑQ�%r�(&,ò
G�ngä7fó
����������(79 tests)������������&/r33^%Bm)V�%z[N� ¢9ÝH��/·mL"
k�Q�Ca_kÛ�L�FK1#G�ö�K06 c¬�{[
����������(80 tests)����������7ö�Q�#ry1L

@Fresheyeball changed the title from "String output is low quality" to "Improve quality of string output" on Sep 4, 2015
@Fresheyeball
Author

Great lib, by the way. I'm doing all my testing with gofuzz.
