string and char questions #309

Open
miguel-negrao opened this issue May 4, 2023 · 1 comment

miguel-negrao commented May 4, 2023

I have some questions about strings and chars.

gen::character sometimes returns the null character \0. Is this expected? [edit] Ignore this; it does not, a char generator does.

Most of the characters in the strings generated by gen::string are not printable. Is this expected?

�򁓰ӈi#5¢o�kO�)���'���9�'�<E�5��8��'�
string: '>�<0� �M*����oo��l�|M�c'
�&���Z�'('d���0�`�����'
string: 'w]��3��k��(0�/%����ߧ#'߷�=����;���9�27�'
R�����Ŭ4L�����'��r�
string: '�'Ž���m��6(�ӂ���V��uP̦'$0���7� `e����!3a��0�������'
string: '�F*�$
              ��yӍnF��� ��v�|P��S$Y�:O�"U�,�|F����s�4+� 1=��Z����Bz��7����'
string: '�����9�,��@\��&/p20=���1�
Z [��&����+5�5����f�"���(�M�&�+�����'
string: '����*
              :g\���W�7�$���?ݹ?1�(T���0ܻ-ǨI��'
string: '��_��"�q
��#@�[2��|���

Does the generator produce only valid ASCII chars, or any char value at all (the full 0-255 range)?
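One way to answer this empirically is to escape every byte outside printable ASCII before printing a generated string, so the terminal shows the actual byte values instead of replacement glyphs. The following is a minimal sketch using only the standard library; `escapeBytes` is a hypothetical helper name and is not part of RapidCheck.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not a RapidCheck function): returns a copy of 's' in
// which every byte outside the printable ASCII range 0x20..0x7E is replaced
// by a \xNN escape. Backslashes in the input are left as they are.
inline std::string escapeBytes(std::string_view s)
{
  std::string out;

  for (const char c : s)
  {
    const auto byte = static_cast<unsigned char>(c);

    if (byte >= 0x20 && byte <= 0x7E)
    {
      out += c;                               // printable: copy unchanged
    }
    else
    {
      char buffer[5];                         // "\xNN" plus terminating null
      std::snprintf(buffer, sizeof buffer, "\\x%02X",
                    static_cast<unsigned>(byte));
      out += buffer;
    }
  }

  return out;
}
```

Printing `escapeBytes(value)` instead of `value` should make it visible whether bytes above 0x7F or control characters occur in the generated strings.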

Thanks.

Contributor

jonathon-bell commented Aug 26, 2023

I believe this behavior is by design; see here.

Generating strings of specific character classes is easy, however:

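#include <algorithm>     // std::ranges::all_of
#include <cassert>       // assert
#include <cctype>        // std::isalnum and friends
#include <string>
#include <string_view>

#include <rapidcheck.h>  // rc::Gen, rc::gen::*, rc::check

/**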
 * Returns a generator that yields elements of the specified string with equal
 * probability.
 *
 * @param  s  A list of characters to draw from.
 *
 * @return A generator that produces elements of 's' with equal probability.
 */
inline rc::Gen<char> character(std::string_view s)
{
  assert(!s.empty());

  return s.size() == 1 ? rc::gen::just     (s.front())
                       : rc::gen::elementOf(s);
}

/**
 * Returns a generator that yields strings of elements of the given string.
 *
 * @param  s  A list of characters to draw from.
 *
 * @return A generator that produces strings of elements of 's'.
 */
inline rc::Gen<std::string> string(std::string_view s)
{
  return rc::gen::container<std::string>(character(s));
}

using namespace std::string_literals;  // enables the ""s suffix used below

const auto blank  = " \t"s;
const auto space  = " \f\n\r\t\v"s;
const auto digit  = "0123456789"s;
const auto lower  = "abcdefghijklmnopqrstuvwxyz"s;
const auto upper  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"s;
const auto punct  = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"s;
const auto cntrl  = "\00\01\02\03\04\05\06\07\10\11\12\13\14\15\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37\x7F"s;
const auto alpha  = lower + upper;
const auto alnum  = alpha + digit;
const auto graph  = alnum + punct;
const auto print  = graph + ' ';
const auto xdigit = digit + "abcdefABCDEF";

TEST_CASE("blah blah...")
{
  /**
   * Returns 'true' if every character of the given string satisfies the given
   * predicate.
   */
  auto f = [](std::string_view s, int (predicate)(int))
  {
    return std::ranges::all_of(s, predicate);
  };

  check("alnum",  [f](){return f(*string(alnum ), std::isalnum );});
  check("alpha",  [f](){return f(*string(alpha ), std::isalpha );});
  check("blank",  [f](){return f(*string(blank ), std::isblank );});
  check("cntrl",  [f](){return f(*string(cntrl ), std::iscntrl );});
  check("digit",  [f](){return f(*string(digit ), std::isdigit );});
  check("graph",  [f](){return f(*string(graph ), std::isgraph );});
  check("lower",  [f](){return f(*string(lower ), std::islower );});
  check("print",  [f](){return f(*string(print ), std::isprint );});
  check("punct",  [f](){return f(*string(punct ), std::ispunct );});
  check("space",  [f](){return f(*string(space ), std::isspace );});
  check("upper",  [f](){return f(*string(upper ), std::isupper );});
  check("xdigit", [f](){return f(*string(xdigit), std::isxdigit);});
}

This example shows that a string can be thought of as a representation of a character class, from which characters, and thus strings, can be generated. From this perspective, string concatenation acts as a set-theoretic union of character classes. Moreover, repeating certain characters in the string can be used to affect the frequency with which they occur in generated strings.
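The effect of repetition on frequency can be illustrated without RapidCheck at all. The sketch below is a standard-library stand-in, not the library's implementation: `pickChar` and `histogram` are hypothetical names, and `pickChar` assumes the same "every position is equally likely" behavior that the `character` helper above relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <string_view>

// Hypothetical stand-in for a uniform element generator: picks one position
// of 's' with equal probability, so a character that appears k times in 's'
// is k times as likely to be chosen as one that appears once.
inline char pickChar(std::string_view s, std::mt19937& rng)
{
  assert(!s.empty());

  std::uniform_int_distribution<std::size_t> index(0, s.size() - 1);
  return s[index(rng)];
}

// Draws 'n' characters from 's' and counts how often each one was chosen.
inline std::map<char, std::size_t> histogram(std::string_view s,
                                             std::size_t      n,
                                             std::mt19937&    rng)
{
  std::map<char, std::size_t> counts;

  for (std::size_t i = 0; i != n; ++i)
  {
    ++counts[pickChar(s, rng)];
  }

  return counts;
}
```

For instance, drawing from "aaab" should pick 'a' roughly three times as often as 'b', while drawing from a concatenation such as digit + "abcdefABCDEF" only ever yields characters from one of the two operands.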

Hope this helps,

Jonathon
