Some events cannot be parsed after upgrading to 4.0.1 #337

Closed
tdabasinskas opened this issue Nov 15, 2017 · 11 comments

@tdabasinskas

Hello,

I had been using version 3.0.15 for some time, and it parsed a calendar from a stream just fine. However, after upgrading to the latest version, 4.0.1, parsing the same calendar fails on one specific event:

System.Runtime.Serialization.SerializationException: Could not parse line: 'DESCRIPTION:aaa aaaaa:\n   aaaaa aaaa aaaaa aaaaaaa (aaaaa\, aaaa\, aaaaa\, TP) aaa aaa aaaaa aaaa\;\n        aaa aaaaa aaaa aaaa aaaa aaaaa aaaaaa aaaaaa (aa aaaaaa\, aaaaaaa)\;\n    aaaaa aaaaaa aaa aaa aaaaa aaaaaa aa\, aaaaaa aaaaaa\, aaa.\;\n\naaaa aaaa aaaaa aaa aa aaa aaa aaaaa aaaaaa aa aaa aaaa aaaaa`a aaaaa aaa. '
   at CalendarProperty Ical.Net.Serialization.SimpleDeserializer.ParseContentLine(SerializationContext context, string input) in C:\git\ical.net\net-core\Ical.Net\Ical.Net\Serialization\SimpleDeserializer.cs:line 126
   at IEnumerable<ICalendarComponent> Ical.Net.Serialization.SimpleDeserializer.Deserialize(TextReader reader)+MoveNext() in C:\git\ical.net\net-core\Ical.Net\Ical.Net\Serialization\SimpleDeserializer.cs:line 78
   at IEnumerable<TResult> System.Linq.Enumerable.OfTypeIterator<TResult>(IEnumerable source)+MoveNext()
   at void System.Collections.Generic.List<T>.AddEnumerable(IEnumerable<T> enumerable)
   at void System.Collections.Generic.List<T>.InsertRange(int index, IEnumerable<T> collection)
   at CalendarCollection Ical.Net.CalendarCollection.Load(TextReader tr) in C:\git\ical.net\net-core\Ical.Net\Ical.Net\CalendarCollection.cs:line 34
   at Calendar Ical.Net.Calendar.Load(TextReader tr) in C:\git\ical.net\net-core\Ical.Net\Ical.Net\Calendar.cs:line 29
   at void PROJECT.Components.MaintenancesComponent+<>c__DisplayClass4_0+<<InvokeAsync>b__0>d.MoveNext() in C:\PATH\PROJECT\src\Components\MaintenancesComponent.cs:line 69
   at void Polly.Policy+<>c__DisplayClass152_0<TResult>+<<ExecuteAsync>b__0>d.MoveNext() in C:\projects\polly\src\Polly.Shared\PolicyAsync.cs
   at void Polly.RetrySyntaxAsync+<>c__DisplayClass21_0+<<WaitAndRetryAsync>b__1>d.MoveNext() in C:\projects\polly\src\Polly.Shared\Retry\RetrySyntaxAsync.cs:line 467
   at async Task<TResult> Polly.Retry.RetryEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, IEnumerable<ExceptionPredicate> shouldRetryExceptionPredicates, IEnumerable<ResultPredicate<TResult>> shouldRetryResultPredicates, Func<IRetryPolicyState<TResult>> policyStateFactory, bool continueOnCapturedContext) in C:\projects\polly\src\Polly.Shared\Retry\RetryEngineAsync.cs:line 31
   at async Task<TResult> Polly.Retry.RetryEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, IEnumerable<ExceptionPredicate> shouldRetryExceptionPredicates, IEnumerable<ResultPredicate<TResult>> shouldRetryResultPredicates, Func<IRetryPolicyState<TResult>> policyStateFactory, bool continueOnCapturedContext) in C:\projects\polly\src\Polly.Shared\Retry\RetryEngineAsync.cs:line 27
   at async Task<TResult> Polly.Policy.ExecuteAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext) in C:\projects\polly\src\Polly.Shared\PolicyAsync.cs:line 557
   at void Polly.Wrap.PolicyWrapEngine+<>c__DisplayClass8_0<TResult>+<<ImplementationAsync>b__0>d.MoveNext() in C:\projects\polly\src\Polly.Shared\Wrap\PolicyWrapEngineAsync.cs
   at async Task<TResult> Polly.Caching.CacheEngine.ImplementationAsync<TResult>(IAsyncCacheProvider<TResult> cacheProvider, ITtlStrategy ttlStrategy, Func<Context, string> cacheKeyStrategy, Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, Action<Context, string> onCacheGet, Action<Context, string> onCacheMiss, Action<Context, string> onCachePut, Action<Context, string, Exception> onCacheGetError, Action<Context, string, Exception> onCachePutError) in C:\projects\polly\src\Polly.Shared\Caching\CacheEngineAsync.cs
   at async Task<TResult> Polly.Wrap.PolicyWrapEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, Policy outerPolicy, Policy innerPolicy) in C:\projects\polly\src\Polly.Shared\Wrap\PolicyWrapEngineAsync.cs
   at async Task<IViewComponentResult> PROJECT.Components.MaintenancesComponent.InvokeAsync() in C:\PATH\PROJECT\src\Components\MaintenancesComponent.cs:line 59

The stack trace includes the line it seems to be failing on, but I have replaced the actual text with letters, leaving all the formatting and symbols as is.

I would guess it has something to do with the large number of new lines or the apostrophe symbol.

Reverting to version 3.0.15 solves the issue.

Thanks in advance.

@rianjs
Collaborator

rianjs commented Nov 17, 2017

In the future, could you provide the full ics text?

@rianjs
Collaborator

rianjs commented Nov 17, 2017

I can't repro your problem. This test passes:

[Test] // NUnit attribute, assumed here so the runner picks up the method
public void TestDescription()
{
    const string testCase = @"BEGIN:VCALENDAR
PRODID:-//github.com/rianjs/ical.net//NONSGML ical.net 4.0//EN
VERSION:4.0
BEGIN:VEVENT
DTEND:20171117T114135
DTSTAMP:20171117T154135Z
DTSTART:20171117T104135
DESCRIPTION:aaa aaaaa:\n   aaaaa aaaa aaaaa aaaaaaa (aaaaa\, aaaa\, aaaaa\, TP) aaa aaa aaaaa aaaa\;\n        aaa aaaaa aaaa aaaa aaaa aaaaa aaaaaa aaaaaa (aa aaaaaa\, aaaaaaa)\;\n    aaaaa aaaaaa aaa aaa aaaaa aaaaaa aa\, aaaaaa aaaaaa\, aaa.\;\n\naaaa aaaa aaaaa aaa aa aaa aaa aaaaa aaaaaa aa aaa aaaa aaaaa`a aaaaa aaa. 
SEQUENCE:0
UID:46ee3437-cb6f-40f7-8a7a-49b177908396
END:VEVENT
END:VCALENDAR";
    var deserialized = Calendar.Load(testCase);
    var e = deserialized.Events.First();
    Assert.AreEqual(new CalDateTime(2017,11,17,10,41,35), e.Start);
    Assert.AreEqual(new CalDateTime(2017,11,17,11,41,35), e.End);
    Assert.AreEqual("aaa aaaaa:\n   aaaaa aaaa aaaaa aaaaaaa (aaaaa, aaaa, aaaaa, TP) aaa aaa aaaaa aaaa;\n        aaa aaaaa aaaa aaaa aaaa aaaaa aaaaaa aaaaaa (aa aaaaaa, aaaaaaa);\n    aaaaa aaaaaa aaa aaa aaaaa aaaaaa aa, aaaaaa aaaaaa, aaa.;\n\naaaa aaaa aaaaa aaa aa aaa aaa aaaaa aaaaaa aa aaa aaaa aaaaa`a aaaaa aaa. ", e.Description);
}

Whitespace matters in ical text, so your stack trace is insufficient. If you have some ics text, I can try again.

@tdabasinskas
Author

tdabasinskas commented Nov 20, 2017

Hello @rianjs,

I'm still having the same issue. Below is the content from the ICS file:

BEGIN:VEVENT
DTSTAMP:20171120T124856Z
DTSTART;TZID=Europe/Helsinki:20160707T110000
DTEND;TZID=Europe/Helsinki:20160707T140000
SUMMARY:Some summary
UID:20160627T123608Z-182847102@atlassian.net
DESCRIPTION:Key points:\n•	Some text (text\
 , text\, text\, TP) some text\;\n•	some tex
 t Some text (Text\, Text)\;\n•	Some tex
 t some text\, some text\, text.\;\n\nsome te
 xt some tex‘t some text. 
ORGANIZER;X-CONFLUENCE-USER-KEY=ff801df01547101c6720006;CN=Some
 user;CUTYPE=INDIVIDUAL:mailto:some.mail@domain.com
CREATED:20160627T123608Z
LAST-MODIFIED:20160627T123608Z
ATTENDEE;X-CONFLUENCE-USER-KEY=ff8080ef1df01547101c6720006;CN=Some
 text;CUTYPE=INDIVIDUAL:mailto:some.mail@domain.com
SEQUENCE:1
X-CONFLUENCE-SUBCALENDAR-TYPE:other
TRANSP:TRANSPARENT
STATUS:CONFIRMED
END:VEVENT

I think it might actually be related to having Unicode bullets (•) inside the DESCRIPTION (sorry for not showing them before, as they were not visible inside the stack trace).

Thank you.

@rianjs
Collaborator

rianjs commented Nov 20, 2017

The unicode bullets and apostrophes are fine. It's actually the hidden tabs (\t). It looks like this was formatted text, so a newline + tab character might be a paragraph. (Alternatively it could be bad encoding from another ical engine, because "folds" are \r\n + a space or tab character, but some engines use \n by mistake.) Whatever is doing the conversion isn't escaping the tab character as the literal text "\t" the way it escapes newlines as "\n".
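
The fold rule mentioned above can be sketched in C#. This is only an illustration and not ical.net's code: the `LineUnfolder` class and its `Unfold` method are hypothetical names, and accepting a bare \n as a line ending is an assumption made to tolerate the mistaken engines described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineUnfolder
{
    // A fold is a line break followed by exactly one space or tab.
    // Accepting "\n" without "\r" is an assumed leniency, not part of the spec.
    private static readonly Regex Fold = new Regex("\r?\n[ \t]", RegexOptions.Compiled);

    // Joins folded physical lines back into one logical content line.
    public static string Unfold(string icsText) => Fold.Replace(icsText, string.Empty);

    public static void Main()
    {
        var folded = "DESCRIPTION:some tex\r\n t some text";
        Console.WriteLine(Unfold(folded)); // DESCRIPTION:some text some text
    }
}
```

Note that a tab directly after a line break is consumed as a fold marker, so only tabs in the middle of a line survive unfolding and reach the content-line parser.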

With respect to serialization, this falls under the Content Line rules. The code that implements that section is in SimpleDeserializer.cs. @chescock quoted the relevant parts of the spec when he built up the regex, but he didn't quote this piece from the end of that section. It's describing allowed characters in a content line:

CONTROL       = %x00-08 / %x0A-1F / %x7F
     ; All the controls except HTAB

HTAB refers to ASCII character 9 which is \t. So it's not allowed in a content line property. I'd suggest normalizing the Description text to encode the tab literally, like it does with the newlines.

@tdabasinskas
Author

Thanks for the detailed answer, @rianjs.

Unfortunately, the calendar comes from Atlassian Confluence, so I have no control over how people format event contents there. The only option would be to edit the stream contents before passing them to your library, but I guess that would add some overhead.

If there's no way to resolve this, I will probably stick with 3.0.15, which works fine even with the incorrect chars.

@rianjs
Collaborator

rianjs commented Nov 20, 2017

It's up to you. I would recommend normalizing the stream before funneling it to ical.net if you can. 3.0.15 is the end of the line for v3; it won't receive any more updates because it can't. (Under the hood, v3 and v4 are radically different from a deserialization perspective: v3 used ANTLR; v4 doesn't.)

I guess I'd start with a no-op abstraction between the two. Some ICalendarNormalizer that returns the stream unchanged at first. Then thread in the tab normalization logic, which could then be extended to other normalization operations should they be needed in the future. The overhead shouldn't be noticeable. We were serializing and deserializing hundreds of events (most of them with recurrence rules) inside click handlers in a desktop application, and you couldn't really notice the lag. ical.net is safe to use AsParallel() as well.
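
A minimal sketch of that idea, under stated assumptions: `ICalendarNormalizer` is only the name floated above and is not a type in ical.net, and `NoOpNormalizer` and `TabNormalizer` are hypothetical. The sketch works on a string rather than a stream to stay short, and it rewrites a raw tab as the two characters `\` and `t`, as suggested earlier in the thread; substituting a space instead would be an equally simple choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical abstraction; not part of ical.net.
public interface ICalendarNormalizer
{
    string Normalize(string icsText);
}

// Step one: a no-op that returns the text unchanged.
public sealed class NoOpNormalizer : ICalendarNormalizer
{
    public string Normalize(string icsText) => icsText;
}

// Step two: rewrite raw tabs inside a line as a backslash followed by 't'.
public sealed class TabNormalizer : ICalendarNormalizer
{
    // Match only a tab preceded by a non-line-break character, so a tab that
    // starts a line (a fold marker) is left alone.
    private static readonly Regex InnerTab =
        new Regex(@"(?<=[^\r\n])\t", RegexOptions.Compiled);

    public string Normalize(string icsText) => InnerTab.Replace(icsText, @"\t");
}

public static class Program
{
    public static void Main()
    {
        ICalendarNormalizer normalizer = new TabNormalizer();
        var line = "DESCRIPTION:Key points:\\n•\tSome text";
        Console.WriteLine(normalizer.Normalize(line));
    }
}
```

The normalized text could then be handed to `Calendar.Load`, which accepts a string in the test shown earlier in this thread.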

@rianjs
Collaborator

rianjs commented Nov 20, 2017

I just talked to @chescock, and he thinks it's a bug, so one of us will fix it, and I'll update the thread with the nuget version that contains the fix. Basically, the regex character range doesn't extend to tabs, but should, because a tab is considered WSP (whitespace) as defined by another RFC (turtles all the way down!).
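
To illustrate the kind of character-range change being described, here is a hedged sketch. The two patterns are made up for the example and are not the expressions in SimpleDeserializer.cs; they assume the RFC 5545 definition of a value character (WSP, %x21-7E, or non-US-ASCII), where WSP is a space or a tab.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValueCharDemo
{
    // Illustrative patterns only; not taken from ical.net.
    // The first range starts at the space character and so leaves out the tab.
    public static readonly Regex WithoutTab = new Regex(@"^[\x20-\x7E\u0080-\uFFFF]*$");

    // The second adds \t, treating it as whitespace alongside the space.
    public static readonly Regex WithTab = new Regex(@"^[\t\x20-\x7E\u0080-\uFFFF]*$");

    public static void Main()
    {
        var value = "Key points:\\n•\tSome text";
        Console.WriteLine(WithoutTab.IsMatch(value)); // False: the raw tab is outside the range
        Console.WriteLine(WithTab.IsMatch(value));    // True
    }
}
```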

@tdabasinskas
Author

tdabasinskas commented Nov 20, 2017

It is good to hear that, and I will be looking forward to the new release.

Thanks!

@rianjs
Collaborator

rianjs commented Nov 20, 2017

Fixed in 4.0.3:

https://www.nuget.org/packages/Ical.Net/4.0.3

@rianjs rianjs closed this as completed Nov 20, 2017
@tdabasinskas
Author

tdabasinskas commented Nov 21, 2017

I've just upgraded to the latest Nuget and there are indeed NO issues.

Thank you, @rianjs!

@rianjs
Collaborator

rianjs commented Nov 21, 2017

Excellent!

rianjs added a commit that referenced this issue Nov 22, 2017
…ake sure NetStandard DLLs are playing nicely together #337