Home
- Introduction
- Quick Start
- Supported Data Streams
- Customization
- Type Describers
- Source Generators
- Performance
- Contributing
Cesil is a modern .NET library for reading and writing Delimiter-Separated Values (DSVs), the most common of which is Comma-Separated Values (CSVs).
Cesil supports reading and writing both static and dynamic data types, synchronously and asynchronously. Cesil requires .NET Core 3.0+.
- Install the latest Cesil off of NuGet.
- Add using Cesil; to your C# file
- Use one of the EnumerateXXX(...) or WriteXXX(...) methods on CesilUtils to read or write
Continue reading for the more configurable ways to use Cesil.
To read synchronously, using a convenience method:
using Cesil;
// ...
using(TextReader reader = /* some TextReader */)
{
IEnumerable<MyType> rows = CesilUtils.Enumerate<MyType>(reader);
}
Or, in a more explicit and configurable way, using a configuration and options:
using Cesil;
// ...
Options myOptions = /* ... */;
IBoundConfiguration<MyType> myConfig = Configuration.For<MyType>(myOptions);
using(TextReader reader = /* ... */)
using(IReader<MyType> csv = myConfig.CreateReader(reader))
{
IEnumerable<MyType> rows = csv.EnumerateAll();
}
For more detail, see Reading.
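As a concrete illustration of the convenience method, here is a minimal sketch that reads rows from an in-memory string. MyType, its properties, and the ReadExample helper are hypothetical names chosen for this example. The sketch assumes the enumerable returned by Enumerate may be lazy, so it materializes the rows before the reader is disposed.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Cesil;

// Hypothetical row type for illustration; not part of Cesil.
public class MyType
{
    public string Name { get; set; } = "";
    public int Count { get; set; }
}

public static class ReadExample
{
    // Reads every row out of a CSV held in a string.
    public static List<MyType> ReadRows(string csv)
    {
        using (TextReader reader = new StringReader(csv))
        {
            // ToList() runs inside the using block so that, if enumeration
            // is lazy (an assumption), it finishes before the reader is disposed.
            return CesilUtils.Enumerate<MyType>(reader).ToList();
        }
    }
}
```

Calling ReadExample.ReadRows("Name,Count\r\nfoo,1\r\nbar,2") should yield two rows, relying on the default options described later on this page (comma delimiter, \r\n row endings, automatically detected headers).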
To read asynchronously, using a convenience method:
using Cesil;
// ...
using(TextReader reader = /* some TextReader */)
{
IAsyncEnumerable<MyType> rows = CesilUtils.EnumerateAsync<MyType>(reader);
}
Or, in a more explicit and configurable way, using a configuration and options:
using Cesil;
// ...
Options myOptions = /* ... */;
IBoundConfiguration<MyType> myConfig = Configuration.For<MyType>(myOptions);
using(TextReader reader = /* ... */)
await using(IAsyncReader<MyType> csv = myConfig.CreateAsyncReader(reader))
{
IAsyncEnumerable<MyType> rows = csv.EnumerateAllAsync();
}
For more detail, see Reading.
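An IAsyncEnumerable<MyType> is typically consumed with await foreach (C# 8). The following is a minimal sketch under that assumption. MyType and ReadAsyncExample are hypothetical names, and the loop runs inside the using block so the reader is still open while rows are produced.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Cesil;

// Hypothetical row type for illustration; not part of Cesil.
public class MyType
{
    public string Name { get; set; } = "";
    public int Count { get; set; }
}

public static class ReadAsyncExample
{
    // Collects every row of a CSV held in a string, asynchronously.
    public static async Task<List<MyType>> ReadRowsAsync(string csv)
    {
        var result = new List<MyType>();
        using (TextReader reader = new StringReader(csv))
        {
            // Rows are pulled one at a time while the reader is still alive.
            await foreach (MyType row in CesilUtils.EnumerateAsync<MyType>(reader))
            {
                result.Add(row);
            }
        }
        return result;
    }
}
```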
To write synchronously, using a convenience method:
using Cesil;
// ...
IEnumerable<MyType> myRows = /* ... */;
using(TextWriter writer = /* ... */)
{
CesilUtils.Write(myRows, writer);
}
Or, in a more explicit and configurable way, using a configuration and options:
using Cesil;
// ...
IEnumerable<MyType> myRows = /* ... */;
Options myOptions = /* ... */;
IBoundConfiguration<MyType> myConfig = Configuration.For<MyType>(myOptions);
using(TextWriter writer = /* ... */)
using(IWriter<MyType> csv = myConfig.CreateWriter(writer))
{
csv.WriteAll(myRows);
}
For more detail, see Writing.
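For a quick experiment with the synchronous write path, rows can be written into a StringWriter and inspected as text. This is only a sketch: MyType and WriteExample are hypothetical names, and the exact column order of the output is not asserted here.

```csharp
using System.Collections.Generic;
using System.IO;
using Cesil;

// Hypothetical row type for illustration; not part of Cesil.
public class MyType
{
    public string Name { get; set; } = "";
    public int Count { get; set; }
}

public static class WriteExample
{
    // Serializes rows to CSV text held in memory.
    public static string WriteRows(IEnumerable<MyType> rows)
    {
        using (var writer = new StringWriter())
        {
            CesilUtils.Write(rows, writer);
            return writer.ToString();
        }
    }
}
```

With the default options described later on this page, the result is expected to start with a header row and to have no trailing new line after the final row.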
To write asynchronously, using a convenience method:
using Cesil;
// ...
// IAsyncEnumerable<MyType> will also work
IEnumerable<MyType> myRows = /* ... */;
using(TextWriter writer = /* ... */)
{
await CesilUtils.WriteAsync(myRows, writer);
}
Or, in a more explicit and configurable way, using a configuration and options:
using Cesil;
// ...
// IAsyncEnumerable<MyType> will also work
IEnumerable<MyType> myRows = /* ... */;
Options myOptions = /* ... */;
IBoundConfiguration<MyType> myConfig = Configuration.For<MyType>(myOptions);
using(TextWriter writer = /* ... */)
await using(IAsyncWriter<MyType> csv = myConfig.CreateAsyncWriter(writer))
{
await csv.WriteAllAsync(myRows);
}
For more detail, see Writing.
Cesil can read and write to a number of different "stream" types, classes or interfaces that conceptually model streams of data.
For synchronously reading with the IReader<TRow> interface, Cesil supports:
- ReadOnlySequence<byte> with an accompanying Encoding
- ReadOnlySequence<char>
- TextReader
For asynchronously reading with the IAsyncReader<TRow> interface, Cesil supports:
- PipeReader with an accompanying Encoding
- TextReader
For synchronously writing with the IWriter<TRow> interface, Cesil supports:
- IBufferWriter<byte> with an accompanying Encoding
- IBufferWriter<char>
- TextWriter
For asynchronously writing with the IAsyncWriter<TRow> interface, Cesil supports:
- PipeWriter with an accompanying Encoding
- TextWriter
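The stream types listed above are ordinary .NET types from System.Buffers, System.IO, and System.IO.Pipelines. The sketch below only shows one way to construct each of them with standard .NET calls. StreamSources and its method names are hypothetical, and how the resulting values are handed to Cesil's readers and writers is deliberately left out.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.Pipelines; // may require the System.IO.Pipelines package, depending on the project
using System.Text;

public static class StreamSources
{
    // ReadOnlySequence<char> over the characters of a string.
    public static ReadOnlySequence<char> CharSequence(string text)
        => new ReadOnlySequence<char>(text.AsMemory());

    // ReadOnlySequence<byte> over encoded text; the same Encoding
    // would need to accompany it when reading.
    public static ReadOnlySequence<byte> ByteSequence(string text, Encoding encoding)
        => new ReadOnlySequence<byte>(encoding.GetBytes(text));

    // PipeReader / PipeWriter wrapping an existing Stream.
    public static PipeReader PipeFrom(Stream stream) => PipeReader.Create(stream);
    public static PipeWriter PipeTo(Stream stream) => PipeWriter.Create(stream);

    // ArrayBufferWriter<char> is one built-in IBufferWriter<char>.
    public static IBufferWriter<char> CharBuffer() => new ArrayBufferWriter<char>();
}
```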
By default when dealing with concrete types, Cesil will use Options.Default which assumes:
- You're working with CSVs
  - The value delimiter is a ,
- Rows end in \r\n
- Cells can be escaped with "
  - Within a cell, " can be escaped with another "
- Headers are optional, and thus automatically detected, when reading
- Headers are always written
- Types are (de)serialized in keeping with "normal" .NET conventions
  - This is covered in more detail in the Default Type Describer
- The final row will NOT be terminated with a new line when writing
- Allocations come out of MemoryPool<char>.Shared
- Comments are not supported
- Write buffering is enabled, but no size hint is given
- Read buffering (which is always enabled) is not given a size hint
- Dynamically read rows are disposed when the IReader<TRow> or IAsyncReader<TRow> that last returned them is disposed
- Whitespace is preserved
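To make the escaping defaults above concrete, here is a small standalone sketch of the quoting convention they describe: a cell is wrapped in " and any " inside it is doubled. This is not Cesil's implementation. CsvEscaping.EscapeCell is a hypothetical helper, and the rule for deciding when a cell needs escaping is an assumption made for this example.

```csharp
using System;

public static class CsvEscaping
{
    // Characters assumed (for this sketch) to force a cell to be escaped.
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    // Wraps the value in quotes and doubles embedded quotes when needed.
    public static string EscapeCell(string value)
    {
        if (value.IndexOfAny(Special) < 0)
        {
            return value; // nothing special, written as-is
        }
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

For example, the cell say "hi", ok becomes "say ""hi"", ok", while plain is left unchanged.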
When dealing with dynamic (i.e. using configurations obtained via Configuration.ForDynamic), Cesil will use Options.DynamicDefault which is identical to Options.Default except that headers are not optional when reading, and are assumed to be present.
Every method on Configuration accepts an optional Options, which will be used instead of the defaults documented above if set. Custom Options can be built with an OptionsBuilder. Options are immutable, thread-safe, and can be safely used by many readers or writers at the same time.
Cesil's default Options will use TypeDescribers.Default, as noted above. This ITypeDescriber is a shared instance of DefaultTypeDescriber which implements "normal" .NET conventions around (de)serializing.
You can read more about what Cesil considers "normal" here, but in brief:
- Requires any constructed types have a parameter-less constructor
- Public properties are (de)serialized, provided they have public setters (for reading) and getters (for writing), and their type has a default formatter and a default parser.
- Any properties or fields with DataMemberAttribute are (de)serialized, subject to the same setter/getter and type rules as above
  - Name, Order, and IsRequired on DataMemberAttribute are respected
- Any properties or fields with IgnoreDataMemberAttribute are ignored
- ShouldSerializeXXX() and ResetXXX() methods are discovered and used for (de)serialized properties
The above behavior applies to concrete types. When (de)serializing in a dynamic context, Cesil falls back to that behavior for types that don't participate in the Dynamic Language Runtime.
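As an illustration of those conventions, the following sketch declares a row type that uses the standard System.Runtime.Serialization attributes and a ShouldSerializeXXX() method. The Person type and its member names are hypothetical, and the comments describe what the conventions listed above suggest should happen rather than verified output.

```csharp
using System.Runtime.Serialization;

// Hypothetical row type for illustration; not part of Cesil.
public class Person
{
    // Renamed column, placed first, and required when reading.
    [DataMember(Name = "full_name", Order = 1, IsRequired = true)]
    public string Name { get; set; } = "";

    // Plain public property with a getter and setter.
    [DataMember(Order = 2)]
    public int Age { get; set; }

    // Excluded from reading and writing.
    [IgnoreDataMember]
    public string Notes { get; set; } = "";

    // Conventional .NET hook: Age is only written when this returns true.
    public bool ShouldSerializeAge() => Age >= 0;

    // Parameter-less constructor, as the conventions require.
    public Person() { }
}
```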
DefaultTypeDescriber is extensible, with a number of virtual methods documented here. Cesil actually works in terms of the ITypeDescriber interface, which allows for behaviors that are completely unrelated to the "normal" .NET way.
Cesil can be used without runtime code generation using Source Generators, which were added in C# 9.
To use them, you must add the Cesil.SourceGenerator NuGet package and attach various attributes to the types you want to read and write. Most of the customization options provided by Cesil can be achieved with Cesil.SourceGenerator, although there are some additional restrictions (primarily not being able to use delegates, and not being able to ignore member accessibility) due to the nature of source generators.
A simple example for reading a type:
using System;
using Cesil;
namespace Foo
{
[GenerateDeserializer]
public class ReadMe
{
[DeserializerMember(ParserType=typeof(ReadMe), ParserMethodName=nameof(ForInt))]
public int Bar { get; set; }
[DeserializerMember(ParserType=typeof(ReadMe), ParserMethodName=nameof(ForString))]
public string Fizz = "";
private DateTime _Hello;
[DeserializerMember(Name="Hello", ParserType=typeof(ReadMe), ParserMethodName=nameof(ForDateTime))]
public void SomeMtd(DateTime dt)
{
_Hello = dt;
}
public DateTime GetHello()
=> _Hello;
public ReadMe() { }
public static bool ForInt(ReadOnlySpan<char> data, in ReadContext ctx, out int val)
=> int.TryParse(data, out val);
public static bool ForString(ReadOnlySpan<char> data, in ReadContext ctx, out string val)
{
val = new string(data);
return true;
}
public static bool ForDateTime(ReadOnlySpan<char> data, in ReadContext ctx, out DateTime val)
=> DateTime.TryParse(data, out val);
}
}
And one for writing a type:
using System;
using System.Buffers;
using Cesil;
namespace Foo
{
[GenerateSerializer]
public class WriteMe
{
[SerializerMember(FormatterType=typeof(WriteMe), FormatterMethodName=nameof(ForInt))]
public int Bar { get; set; }
[SerializerMember(FormatterType=typeof(WriteMe), FormatterMethodName=nameof(ForString))]
public string Fizz = "";
[SerializerMember(Name="Hello", FormatterType=typeof(WriteMe), FormatterMethodName=nameof(ForDateTime))]
public DateTime SomeMtd() => new DateTime(2020, 11, 15, 0, 0, 0);
public WriteMe() { }
public static bool ForInt(int val, in WriteContext ctx, IBufferWriter<char> buffer)
{
var span = buffer.GetSpan(100);
if(!val.TryFormat(span, out var written))
{
return false;
}
buffer.Advance(written);
return true;
}
public static bool ForString(string val, in WriteContext ctx, IBufferWriter<char> buffer)
{
var span = buffer.GetSpan(val.Length);
val.AsSpan().CopyTo(span);
buffer.Advance(val.Length);
return true;
}
public static bool ForDateTime(DateTime val, in WriteContext ctx, IBufferWriter<char> buffer)
{
var span = buffer.GetSpan(4);
if(!val.Year.TryFormat(span, out var written))
{
return false;
}
buffer.Advance(written);
return true;
}
}
}
Both examples include custom parsers or formatters, but this is just for demonstration's sake. If you do not specify parsers or formatters, a default parser or default formatter will be used if one exists.
More details can be read in Source Generators.
Cesil focuses on ease of use and flexibility, but performance is important, especially as .NET is increasingly adopted for more performance-critical code. A set of benchmarks can be found in Cesil's repo that cover the following cases:
- Synchronously writing rows with a single column of a built-in type (int, string, Uri, etc.)
- Synchronously writing rows with columns for each built-in type
- Synchronously reading rows with a single column of a built in type
- Synchronously reading rows with columns for each built-in type
- Dynamic versions of the read and write benchmarks above
The benchmarks for static operations use CsvHelper as a baseline comparison, and the dynamic benchmarks compare to Cesil's static equivalent. Each benchmark has an InitializeAndTest() method that ensures, in DEBUG builds, that the output of the compared variants is identical.
A sample run of all benchmarks (including some for internal implementation details) can be found in the repo. A select subset of those results for static operations is shown below.
Dynamic operations aim to be no more than 3x slower than their static equivalent, but this naturally varies based on exactly what is done with the returned dynamic rows.
Cesil is open source, under the MIT license.
Cesil intentionally exploits new functionality found in C# 8 and .NET Core 3.0+. Earlier frameworks are not supported, by design.
Further reading: