Request: streaming parser support #39

Closed
tamerh opened this issue Nov 19, 2018 · 6 comments

Comments

@tamerh

tamerh commented Nov 19, 2018

Hi,

I have relatively big compressed XML files with different schemas. I previously used ANTLR with Java for this, and it worked well. I was wondering whether Participle is also suitable for this? For example, can I iterate through particular XML elements while parsing?

Thanks

@alecthomas
Owner

Hi there. You could definitely write an XML parser in Participle, but it would probably be simpler to use a dedicated XML parser like etree?

@alecthomas
Owner

For example, can I iterate through particular XML elements while parsing?

If this is asking whether Participle supports streaming parsing, then the answer is no. Participle returns a full parse tree from a single call.

@tamerh
Author

tamerh commented Nov 20, 2018

Hi, thanks for the clarification and the suggestion. Yes, I need streaming parsing. I tried Go's encoding/xml package, but unfortunately it is slow, and I guess the etree library is also built on top of encoding/xml. That's why I looked for alternatives.

So I am planning to write my own parser/lexer, similar to the one described in this article, although I would prefer not to.
Thanks again.
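
For reference, element-by-element iteration over a compressed file with the standard library could look roughly like the sketch below. It only illustrates the streaming style being asked about and does nothing for the speed concern. The `Entry` type, the `entry` element name, and the `streamEntries` and `demo` helpers are assumptions made up for the example, since the real schemas are not shown in this thread.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/xml"
	"fmt"
	"io"
)

// Entry is a hypothetical record type; the real schemas are not shown here.
type Entry struct {
	ID   string `xml:"id,attr"`
	Name string `xml:"name"`
}

// streamEntries reads gzip-compressed XML from r and calls handle once per
// <entry> element, without building a tree for the whole document.
func streamEntries(r io.Reader, handle func(Entry) error) error {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return err
	}
	defer zr.Close()

	dec := xml.NewDecoder(zr)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil // end of document
		}
		if err != nil {
			return err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != "entry" {
			continue // skip everything that is not an <entry> start tag
		}
		var e Entry
		// DecodeElement consumes input up to the matching </entry>.
		if err := dec.DecodeElement(&e, &start); err != nil {
			return err
		}
		if err := handle(e); err != nil {
			return err
		}
	}
}

// demo compresses a small sample in memory and streams it back.
func demo() ([]string, error) {
	const sample = `<db><entry id="1"><name>alpha</name></entry>` +
		`<entry id="2"><name>beta</name></entry></db>`
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(sample)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	var seen []string
	err := streamEntries(&buf, func(e Entry) error {
		seen = append(seen, e.ID+":"+e.Name)
		return nil
	})
	return seen, err
}

func main() {
	seen, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(seen)
}
```

Only one `Entry` is held in memory at a time, so memory use stays flat regardless of file size.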

@alecthomas changed the title from "parsing xml" to "Request: streaming parser support" on Nov 20, 2018
@alecthomas
Owner

alecthomas commented Dec 17, 2018

I wonder if Participle could be modified so that if it encounters a field that's a channel it will stream the results into that channel. That would be relatively straightforward to implement too.

eg. Given the grammar:

type XML struct {
  Tokens []*Token `@@*`
}

If it was converted to:

type XML struct {
  Tokens chan *Token `@@*`
}

Then the parser could detect this and stream back Tokens as they're parsed.

This might be a bit too much magic though.
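
To make the "detect a channel field" idea concrete, here is a minimal sketch using only the standard `reflect` package. It is not Participle code and does not reflect how Participle works internally: `streamTokens` is a hypothetical stand-in for a parser that simply splits the input on whitespace, and `collect` is a made-up helper for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"strings"
)

type Token struct{ Value string }

type XML struct {
	Tokens chan *Token
}

// streamTokens is a hypothetical stand-in for a parser. It looks for the first
// settable, non-nil field of type chan *Token in target, sends one Token per
// whitespace-separated word of input into it, and then closes the channel.
func streamTokens(input string, target interface{}) error {
	v := reflect.ValueOf(target)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return errors.New("target must be a pointer to a struct")
	}
	s := v.Elem()
	tokenType := reflect.TypeOf(&Token{})
	for i := 0; i < s.NumField(); i++ {
		f := s.Field(i)
		if f.Kind() != reflect.Chan || !f.CanSet() || f.IsNil() {
			continue
		}
		if f.Type().Elem() != tokenType {
			continue
		}
		for _, word := range strings.Fields(input) {
			f.Send(reflect.ValueOf(&Token{Value: word}))
		}
		f.Close()
		return nil
	}
	return errors.New("no usable channel field found")
}

// collect runs the producer in a goroutine and drains the channel.
func collect(input string) ([]string, error) {
	x := &XML{Tokens: make(chan *Token)}
	errc := make(chan error, 1)
	go func() { errc <- streamTokens(input, x) }()

	var out []string
	for t := range x.Tokens {
		out = append(out, t.Value)
	}
	return out, <-errc
}

func main() {
	words, err := collect("<a> text </a>")
	fmt.Println(words, err)
}
```

The producer has to run in its own goroutine because sends on an unbuffered channel block until the consumer receives.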

@alecthomas
Owner

alecthomas commented Dec 17, 2018

Hmm, maybe it could be restricted to the top-level. eg.

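type Token struct {
}

func main() {
  // Proposed API only: a channel is passed where a grammar struct normally goes.
  parser := participle.MustBuild(make(chan *Token))

  tokens := make(chan *Token)
  // Parse has to run in its own goroutine, otherwise it blocks on the
  // first send and the loop below never starts.
  go func() {
    _ = parser.Parse(r, &tokens) // how to surface this error is still open
  }()
  for token := range tokens {
    // ...
  }
}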

I'm not sure how it would deal with errors though. The parser might need an Error() error method.
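
One possible shape for that, sketched in plain Go rather than against Participle: the producer records its error before closing the channel, and the consumer checks `Error()` once the loop ends. `TokenStream`, `parseStream`, and `drain` are hypothetical names, and the "parser" here just splits on whitespace and rejects any word containing `&` so that there is something to fail on.

```go
package main

import (
	"fmt"
	"strings"
)

type Token struct{ Value string }

// TokenStream is a hypothetical result type: tokens arrive on C, and Error
// reports why the stream ended once C has been closed.
type TokenStream struct {
	C   <-chan *Token
	err error
}

// Error must only be called after C has been drained (i.e. after it closed).
func (s *TokenStream) Error() error { return s.err }

// parseStream stands in for a streaming parser. It emits one Token per word
// and stops with an error at the first word containing '&'.
func parseStream(input string) *TokenStream {
	c := make(chan *Token)
	s := &TokenStream{C: c}
	go func() {
		// The deferred close runs after s.err is written, so a consumer
		// that sees the channel closed also sees the final error value.
		defer close(c)
		for _, word := range strings.Fields(input) {
			if strings.ContainsRune(word, '&') {
				s.err = fmt.Errorf("unexpected token %q", word)
				return
			}
			c <- &Token{Value: word}
		}
	}()
	return s
}

// drain consumes the whole stream and returns the tokens plus the final error.
func drain(input string) ([]string, error) {
	s := parseStream(input)
	var out []string
	for t := range s.C {
		out = append(out, t.Value)
	}
	return out, s.Error()
}

func main() {
	fmt.Println(drain("a b c"))
	fmt.Println(drain("a & b"))
}
```

This mirrors the `bufio.Scanner` convention of looping first and checking the error afterwards. A consumer that stops reading early would leave the producer goroutine blocked, so a real design would also need a cancellation mechanism.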

@alecthomas
Owner

See README for details.
