Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catching invalid format for CBOR.decode #14

Open
grzuy opened this issue Oct 3, 2019 · 6 comments
Open

Catching invalid format for CBOR.decode #14

grzuy opened this issue Oct 3, 2019 · 6 comments

Comments

@grzuy
Copy link
Contributor

grzuy commented Oct 3, 2019

CBOR.decode can raise several error types when fed with invalid cbor input.

require "cbor"
require 'securerandom'

errors = {}

1_000_000.times do |i|
  begin
    CBOR.decode(SecureRandom.random_bytes(64))
  rescue => ex
    errors[ex.class.name] = ex
  end
end

pp errors

returns

{"CBOR::MalformedFormatError"=>
  #<CBOR::MalformedFormatError: extra bytes follow after a deserialized object>,
 "EOFError"=>#<EOFError: end of buffer reached>,
 "TypeError"=>#<TypeError: can't convert CBOR::Simple into an exact number>,
 "URI::InvalidURIError"=>#<URI::InvalidURIError: bad URI(is not URI?): 2>,
 "FloatDomainError"=>#<FloatDomainError: NaN>,
 "RegexpError"=>
  #<RegexpError: invalid multibyte character: /\x1Er\x13w\xBCx0N:n\xD3aw{\x98\xB1Q\x1D\xC0\u06FC5\xF0/>}

Do you think it's possible for the CBOR decoder to either provide some sort fo CBOR.valid?(input) method, or for CBOR.decode(input) to always return the same exception when input is detected to be invalid CBOR?

This was raised in cedarcode/cose-ruby#40.

Thank you in advance!

@grzuy
Copy link
Contributor Author

grzuy commented Oct 4, 2019

or for CBOR.decode(input) to always return the same exception when input is detected to be invalid CBOR?

I guess already used CBOR::MalformedFormatError could be a possible candidate?

@grzuy
Copy link
Contributor Author

grzuy commented Oct 4, 2019

For what is worth, these are the occurrences...

{"EOFError"=>100332,
 "CBOR::MalformedFormatError"=>895855,
 "TypeError"=>3734,
 "URI::InvalidURIError"=>29,
 "FloatDomainError"=>1,
 "RegexpError"=>3}

Script:

require "cbor"
require 'securerandom'

errors = {}

1_000_000.times do |i|
  begin
    CBOR.decode(SecureRandom.random_bytes(64))
  rescue => ex
    errors[ex.class.name] ||= 0
    errors[ex.class.name] += 1
  end
end

pp errors

@cabo
Copy link
Owner

cabo commented Oct 4, 2019

I believe it is useful to know whether the CBOR item ended prematurely (EOFError), some other kind of malformedness occurred, or a type-specific error occurred (obviously on a Tag, e.g., URI::InvalidURIError). What is your motivation for wanting to turn this into Kernighan's car?

@cabo
Copy link
Owner

cabo commented Oct 4, 2019

BTW, errors = Hash.new(0) saves having to do the errors[ex.class.name] ||= 0.

@cabo
Copy link
Owner

cabo commented Oct 5, 2019

But the number of exceptions you found actually points to a genuine bug: An invalid tag (e.g., a URI tag with an invalid URI) is not presented to the application as such, but causes an exception that denies the application access to the entire piece of CBOR. This is great for purists, but not the best strategy for interoperability.

@grzuy
Copy link
Contributor Author

grzuy commented Oct 7, 2019

What is your motivation

Hi @cabo,

Just trying to understand what could be the best way to gracefully handle wrong/corrupt cbor data on at the application level when data fed by the user agent.

I guess that by doing

begin
  CBOR.decode(input)
rescue CBOR::MalformedFormatError, EOFError, TypeError
  # handle error
end

one would already catch > 99% of the cases, so that may be enough.

The thing is that from the perspective of the caller, EOFError and TypeError are quite generic and they don't even let you know if the error is coming from CBOR or from someplace else.

It crossed my mind that it might be desirable from the perspective of the caller to be able to do something like:

begin
  CBOR.decode(input)
rescue CBOR::MalformedFormatError
  # handle error
end

or

begin
  CBOR.decode(input)
rescue CBOR::DecodingError
  # handle error
end

or in case it makes more sense to have several different possible decoding errors, all those could inherit from a CBOR-generic CBOR::Error and you could just:

begin
  CBOR.decode(input)
rescue CBOR::Error
  # handle error
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants