[API Proposal]: Console: .ReadBool, .ReadInt, .ReadLong and so on #64621

Open
AlexGames73 opened this issue Feb 1, 2022 · 31 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Console

@AlexGames73

AlexGames73 commented Feb 1, 2022

Background and motivation

Many C# enthusiasts like me use the Console class in programming competitions and run into 'TimeLimit' or 'OutOfMemory' verdicts. The cause is very slow reading from the console, because the console reads the whole line into RAM (as in Python); this is the main problem with C# in programming competitions.
I want to extend the Console class with 'read' methods (like 'cin' in C++), which would significantly speed up reading from the console and increase the popularity of the language in competitions.

API Proposal

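using System.Linq;
using System.Text;

namespace System
{
    public static class Console
    {
        public static string ReadToken(params char[] skipChars)
        {
            var hashSet = skipChars.ToHashSet();

            // Skip leading delimiters; In.Read() returns -1 at end of input.
            var c = In.Read();
            while (c != -1 && hashSet.Contains((char) c))
            {
                c = In.Read();
            }

            // Collect characters until the next delimiter or end of input.
            var sb = new StringBuilder();
            while (c != -1 && !hashSet.Contains((char) c))
            {
                sb.Append((char) c);
                c = In.Read();
            }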

            return sb.ToString();
        }

        public static string ReadToken() => ReadToken(' ', '\n', '\r');
        public static bool ReadBool() => bool.Parse(ReadToken());
        public static decimal ReadDecimal() => decimal.Parse(ReadToken());
        public static double ReadDouble() => double.Parse(ReadToken());
        public static float ReadFloat() => float.Parse(ReadToken());
        public static int ReadInt() => int.Parse(ReadToken());
        public static long ReadLong() => long.Parse(ReadToken());
    }
}

API Usage

var n = Console.ReadLong();
var arr = new int[n];
for (var i = 0; i < n; i++)
{
    arr[i] = Console.ReadInt();
}

Alternative Designs

No response

Risks

No response

@AlexGames73 AlexGames73 added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Feb 1, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 1, 2022
@AlexGames73
Author

Tagging the area-System.Console team: @jeffhandley @adamsitnik @jozkee

@ghost

ghost commented Feb 1, 2022

Tagging subscribers to this area: @dotnet/area-system-console
See info in area-owners.md if you want to be subscribed.


@Frassle
Contributor

Frassle commented Feb 1, 2022

I suppose there could be an argument for dotnet including something like Java's Scanner class. But it shouldn't be specific to Console.

@huoyaoyuan
Member

There can be a class parsing such values from any TextReader. It will serve like fscanf in C.

@madelson
Contributor

madelson commented Feb 5, 2022

Is this useful outside of coding competitions? I wonder if this could just be a NuGet package.

@tannergooding
Member

Also worth noting that if something like this were taken, then ReadBool, ReadInt, ReadLong, ReadFloat, etc. are all incorrect names: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/general-naming-conventions#avoiding-language-specific-names

As per the docs, the name is Boolean, Int32, Int64, Single, etc. All matching the name of the corresponding type in System

@AraHaan
Member

AraHaan commented Feb 8, 2022

As for reading the values, I would rather have them use the type's TryParse methods, so that if the input can't be decoded properly they can throw directly from the Console class itself.

I also agree with tanner, the names would have to be ReadBoolean, ReadInt32/ReadUInt32, ReadInt64/ReadUInt64, ReadSingle, etc.

@AlexGames73
Author

I agree with you all that naming is important, but unfortunately the Console class prohibits the use of UInt32 and UInt64 as method return types.

Also, I think that a TryParse method (maybe a generic one) is not about reading data; it is about parsing something from a method parameter.

I was thinking more about creating methods ReadBoolean, ReadInt32, ReadInt64, ReadSingle, etc. (ReadUInt32 and ReadUInt64 only if that prohibition is lifted), or about creating a separate API like Scanner in Java.

@AraHaan
Member

AraHaan commented Feb 9, 2022

I am pretty sure that UInt32 and UInt64 are OK anywhere they are needed (I think, if there is no other way to represent something). For me, perhaps I want to directly read in some unsigned number that someone entered into the console and that cannot be cast down to a signed one because it might overflow (which is sometimes not good at all).

Besides, if we are going to make this many changes to Console for additional Read methods, we might just as well go all the way and cover all the types built into C#.

@epeshk
Contributor

epeshk commented Feb 15, 2022

Programming competition judge systems disallow any external references. A competition solution is usually just a single file of code, so it's useless to have this API in a NuGet package.

Of course, this shouldn't be limited to Console.

I think it's nice to have these APIs even if the only practical use case is competitions, because:

  1. It's not only about performance. .ReadInt32() is simpler to write and read than int.Parse(Console.ReadLine().Split()[0]). These APIs will also be useful when the input format is not strict about whitespace and line breaks.
  2. It will simplify participating in programming contests for .NET developers.
  3. It will help to promote .NET to competitive programmers.
  4. It will force developers of judge systems to upgrade from .NET Framework and Mono to modern versions of .NET and its languages (because participants will request support for these APIs).

@tannergooding
Member

Programming competition judge systems disallows any external references

It is not the job or role of the BCL to appease limitations of programming competitions. Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core". Even for the case of something like the math APIs for C/C++, some implementations (Unix) have it be an explicit separate reference (libm).

That doesn't mean it shouldn't or couldn't be included; it's just not something that we typically consider as a driving reason to do it.

@epeshk
Contributor

epeshk commented Feb 15, 2022

Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core".

On the other hand, C++/Java/Go/Kotlin have APIs like this.

The BCL is already not so small, e.g. it includes JSON parsing APIs. Plain-text parsing is as useful for competitions as JSON parsing is for web apps and microservices.

@adamsitnik
Member

System.Console already exposes 89 public methods and properties. I don't believe that we should add more, especially since we might introduce a new Terminal-oriented abstraction for it (#52374).

Another issue is delimiters. The provided sample implementation handles a space and a new line. But how about tab? How about other delimiters?

What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?

@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Feb 15, 2022
@adamsitnik adamsitnik added this to the Future milestone Feb 15, 2022
@epeshk
Contributor

epeshk commented Feb 15, 2022

How about other delimiters?

For typical programming contest input, any whitespace char (char.IsWhiteSpace) is a good delimiter IMHO. I don't know about other use cases.

expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"

FormatException if it is treated as Int32/Boolean/etc and "123true456false" for ReadToken (naming from original proposal)
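
To make the behavior described above concrete, here is a minimal sketch, assuming a whitespace-delimited reader over any TextReader. `TokenReader`, `ReadToken` and `ReadInt32` are hypothetical names used only for illustration; nothing like this exists in the BCL today.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not an existing BCL type.
public static class TokenReader
{
    // Returns the next run of non-whitespace characters, or null at end of input.
    public static string? ReadToken(TextReader reader)
    {
        int ch;
        // Skip leading whitespace (spaces, tabs, line breaks).
        while ((ch = reader.Read()) != -1 && char.IsWhiteSpace((char)ch)) { }
        if (ch == -1) return null;

        var sb = new StringBuilder();
        do
        {
            sb.Append((char)ch);
        } while ((ch = reader.Read()) != -1 && !char.IsWhiteSpace((char)ch));
        return sb.ToString();
    }

    // int.Parse throws FormatException when the whole token is not an integer.
    public static int ReadInt32(TextReader reader) =>
        int.Parse(ReadToken(reader) ?? throw new EndOfStreamException());
}
```

With the input "12\t34\n123true456false", two calls to ReadInt32 yield 12 and 34, ReadToken would return the last token unchanged, and ReadInt32 on that token throws FormatException.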

@epeshk
Contributor

epeshk commented Feb 15, 2022

For me, personally, it's so sad to see students give up C# for C++ and Java in competitions only because of the lack of these APIs.

No, it's not possible to fix judge systems, because there are too many of them.

The only known workaround is using prewritten boilerplate code, but that is not always allowed at on-site contests (where all code must be written during the contest).

@AlexGames73
Author

AlexGames73 commented Feb 16, 2022

What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?

This case is ill-formed, since even you will not be able to say which tokens are present here (123/true/456/false [Int32/Boolean/Int32/Boolean] or 123t/rue/456/false [String/String/Int64/Boolean]).
Of course, we could make a method based on regular expressions, but it would be slower than usual...

@AlexGames73
Author

By the way, this API will help companies test candidates for .NET Developer positions, because automatic testing systems are less wasteful and more efficient than "interview tasks".

@epeshk
Contributor

epeshk commented Feb 16, 2022

this API will help companies to test hired employees to .NET Developer position

They could ask to write a method with a prewritten signature.

@davidfowl
Member

davidfowl commented Feb 27, 2022

Oh we want the scanner class from Java https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html. This shouldn't be tied to the console APIs.

PS: This is very useful in programming contests when reading input 😄 . FWIW it's still painful in C#.

C++

int a, b;
cin >> a >> b;

Java

Scanner scanner = new Scanner(System.in);
int a, b;
a = scanner.nextInt();
b = scanner.nextInt();

C#

string line = Console.ReadLine();
int[] vals = line.Split(' ').Select(int.Parse).ToArray();
int a = vals[0];
int b = vals[1];

@tannergooding
Member

The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

@davidfowl
Member

The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

It's a really hard to use API that works over a buffer. We need something that works over a TextReader.

@AlexGames73
Author

AlexGames73 commented Feb 28, 2022

The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

Either way, you have to read the whole line to use the Utf8Parser (or how to avoid that is not obvious).

@tannergooding
Member

It's a really hard to use API that works over a buffer. We need something that works over a TextReader.

Console would be the same ;)

That's a question of extending the Utf8Parser and theoretical Utf16Parser to better support other streams or buffers rather than updating Console specifically.

@deeprobin
Contributor

I am not sure if such an API is really useful.
But if this is in consideration we should also add TryRead* methods similar to int.TryParse.
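
As a rough illustration of the TryRead* idea, here is a minimal sketch, assuming an extension method on TextReader. `TextReaderTryExtensions` and `TryReadInt32` are hypothetical names, and the choice to consume a malformed token (rather than leave it in the stream) is an assumption of this sketch, not something the thread settled.

```csharp
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical extension class, not an existing BCL type.
public static class TextReaderTryExtensions
{
    // Returns false at end of input or when the next token is not a valid Int32.
    // Assumption: a malformed token is still consumed from the reader.
    public static bool TryReadInt32(this TextReader reader, out int value)
    {
        value = 0;
        int ch;
        // Skip leading whitespace.
        while ((ch = reader.Read()) != -1 && char.IsWhiteSpace((char)ch)) { }
        if (ch == -1) return false;

        var sb = new StringBuilder();
        do
        {
            sb.Append((char)ch);
        } while ((ch = reader.Read()) != -1 && !char.IsWhiteSpace((char)ch));

        return int.TryParse(sb.ToString(), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

A caller could then write `while (Console.In.TryReadInt32(out var x)) { ... }` to read integers until the input ends or a bad token appears, mirroring how int.TryParse is typically used.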

@davidfowl
Member

This API is extremely useful and should be based on StreamReader

@epeshk
Contributor

epeshk commented Dec 15, 2022

This API could be also implemented over ISpanParsable<TSelf> interface.

As an example, I've draft-implemented both versions based on ISpanParsable<TSelf> and Utf8Parser in https://github.com/epeshk/epeshk.text

Maybe this will inspire someone for a better API proposal. Or at least will be useful as copy-paste code
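
For readers who have not used ISpanParsable<TSelf>, here is a minimal sketch of the idea, assuming .NET 7 or later (static abstract interface members). `SpanTokenScanner` and `Read<T>` are hypothetical names for illustration and are not taken from the linked repository.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical type, not an existing BCL API.
public sealed class SpanTokenScanner
{
    private readonly TextReader _reader;
    private char[] _buffer = new char[64];

    public SpanTokenScanner(TextReader reader) => _reader = reader;

    // Reads one whitespace-delimited token and parses it as T
    // without allocating an intermediate string.
    public T Read<T>() where T : ISpanParsable<T>
    {
        int length = 0;
        int ch;
        // Skip leading whitespace.
        while ((ch = _reader.Read()) != -1 && char.IsWhiteSpace((char)ch)) { }
        if (ch == -1) throw new EndOfStreamException();

        do
        {
            // Grow the buffer for unusually long tokens.
            if (length == _buffer.Length) Array.Resize(ref _buffer, length * 2);
            _buffer[length++] = (char)ch;
        } while ((ch = _reader.Read()) != -1 && !char.IsWhiteSpace((char)ch));

        return T.Parse(_buffer.AsSpan(0, length), CultureInfo.InvariantCulture);
    }
}
```

A single generic method then covers every type that implements the interface, for example `scanner.Read<int>()`, `scanner.Read<long>()` or `scanner.Read<double>()`, instead of one ReadXxx method per type.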

@AraHaan
Member

AraHaan commented Dec 15, 2022

Hmm I think a generic terminal class that implements what console can do currently (and possibly add more to it) would be great.

It could also simplify the console class to just this as well:

// console becomes an alias to terminal to prevent existing code from breaking.
public sealed class Console : Terminal // or whatever modifiers console uses currently.
{
}

@epeshk
Contributor

epeshk commented Dec 15, 2022

Note that the Java Scanner is based on regular expressions and does not perform well in simple cases (like just reading integers delimited by whitespace). Java's StreamTokenizer is more efficient, but has an ugly API.

For my use case, something as simple as this is sufficient: not top speed, but faster than int.Parse(Console.ReadLine()) and zero-allocation. And this code is simple enough to just rewrite from scratch when necessary, e.g. at a programming contest without internet access.

public class TextScanner
{
  StreamReader input = new StreamReader(Console.OpenStandardInput(), bufferSize: 16384);
  char[] buffer = new char[4096];

  public int ReadInt()
  {
    var length = PrepareToken();
    return int.Parse(buffer.AsSpan(0, length));
  }

  private int PrepareToken()
  {
    int length = 0;
    bool readStart = false;
    while (true)
    {
      int ch = input.Read();
      if (ch == -1)
        break;

      if (char.IsWhiteSpace((char)ch))
      {
        if (readStart) break;
        continue;
      }

      readStart = true;
      buffer[length++] = (char)ch;
    }

    return length;
  }
}

@monoman
Contributor

monoman commented Jan 13, 2023

One thing that seems to be overlooked here is culture-dependent parsing (for integers it is less of an issue).
If only the 'common' subset is supported ('+', '-' and digits for integers, '.' as the decimal separator and 'e' to separate exponents for floating point, English-only but case-insensitive 'true|false' for booleans), I would advise having optimized extension methods for StreamReader/TextReader and Pipes.
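
One way to pin parsing to that 'common' subset is to pass explicit NumberStyles and CultureInfo.InvariantCulture. The sketch below is only an illustration; `InvariantTokenParsing` and its method names are hypothetical.

```csharp
using System.Globalization;

// Hypothetical helpers; the names are illustrative only.
public static class InvariantTokenParsing
{
    // An optional '+' or '-' followed by digits, regardless of the current culture.
    public static int ParseInt32(string token) =>
        int.Parse(token, NumberStyles.AllowLeadingSign, CultureInfo.InvariantCulture);

    // '.' as the decimal separator and 'e'/'E' for the exponent.
    public static double ParseDouble(string token) =>
        double.Parse(
            token,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint | NumberStyles.AllowExponent,
            CultureInfo.InvariantCulture);

    // bool.Parse accepts "true"/"false" case-insensitively and is not culture-sensitive.
    public static bool ParseBoolean(string token) => bool.Parse(token);
}
```

Without the explicit culture, double.Parse uses the current culture, so under a culture whose decimal separator is ',' the token "1.5" may be rejected or read as a different number.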

monoman added a commit to interlockledger/interlockledger-commons that referenced this issue Jan 17, 2023
…et/runtime#64621

Incomplete unit tests added for this
Also some methods to import statically for the same reading features on the Console.In
@monoman
Contributor

monoman commented Jan 17, 2023

Did some implementation bits to toy around at: https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System.IO/TextReaderExtensions.cs

Use as

using System.IO;

int firstvalue = Console.In.ReadInt32();
int secondvalue = Console.In.ReadInt16();

or import statically https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System/ConsoleExtras.cs

using static System.ConsoleExtras;

int firstvalue = ReadInt32();
int secondvalue = ReadInt16();

Forgot to implement ReadBoolean(), maybe tomorrow

Uses System.Numerics.INumber so only works in C# 11/.NET 7.0
