Allow loading and parsing multiple posts #22

Open
gaconnet opened this issue Oct 27, 2015 · 6 comments

Comments

@gaconnet

Hi. It's nice to see a frontmatter library written in python. Thanks for writing it!

How do you feel about supporting a way to load & parse multiple posts in one go, or perhaps even in a streaming fashion via an iterator or coroutine?

As motivation, consider a single markdown file that you would like to transform into a sequence of <section> tags to insert into reveal.js, and you want your transformation pipeline to transform metadata attributes into html data attributes for things like custom slide transitions:

# python-frontmatter
an introduction

---
transition: zoom

---
load and parse files (or just text) with YAML front matter.

---
transition: concave
background: "linear-gradient(45deg, #f06, yellow)"

---
now with streams of documents!

I built a little standalone parser for this on my own, but I thought it might be nice if this cool library did it.

@eyeseast
Owner

I think the way I'd do this is with multiple files, or by splitting text and parsing strings with frontmatter.loads. We do this a lot for @frontlinepbs projects, usually with metalsmyth (which needs a better name) and Tarbell.
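
A minimal sketch of the splitting step, using only the standard library. It assumes the layout from the example earlier in the thread: an optional leading post without metadata, then repeated `---` / metadata / `---` / content groups, and no other line consisting solely of `---` (a Markdown horizontal rule would be misread as a delimiter). The name `split_posts` is hypothetical and not part of python-frontmatter.

```python
import re

# A delimiter is a line containing only "---" (trailing spaces or tabs allowed).
DELIMITER = re.compile(r"^---[ \t]*$", re.MULTILINE)


def split_posts(text):
    """Split a multi-post string into strings that each hold a single post."""
    parts = DELIMITER.split(text)
    posts = []

    # Anything before the first delimiter is a post without metadata.
    if parts[0].strip():
        posts.append(parts[0].strip() + "\n")

    # After that, segments alternate: metadata, content, metadata, content, ...
    # An unpaired trailing segment (unterminated metadata) is ignored.
    for i in range(1, len(parts) - 1, 2):
        meta = parts[i].strip()
        body = parts[i + 1].strip()
        posts.append("---\n" + meta + "\n---\n" + body + "\n")

    return posts
```

Each returned string is shaped like a single-post document, so it could then be handed to frontmatter.loads one at a time.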

@gaconnet
Author

I agree that splitting it before it comes into frontmatter would be a fine way to go, but it seems unfortunate that such a splitter would need to duplicate some of the parsing work of frontmatter and would also need to be configured separately if either frontmatter or the splitter were to ever support custom delimiters (such as in gray-matter).

I think that having a simple interface to parse a stream of posts opens many interesting opportunities. For example, a non-programmer uses prose to edit a single file that goes into GitHub or a Gist and then a post-commit hook transforms the single file into a multi-page slideshow. In addition to parsing a single file as a stream, a streaming parser enables diverse command-line invocations such as parse-frontmatter prologue.md - epilogue.md and collect-interesting-files | parse-frontmatter (both examples fictional; assume that the fictional binaries both do something interesting).
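
A rough sketch of what such a streaming interface might look like, written as a generator over lines. The name `iter_posts` and the `(metadata_text, content_text)` tuple shape are assumptions for illustration, not an existing python-frontmatter API; the metadata is left as raw text rather than parsed YAML, and a line consisting solely of the delimiter is always treated as a boundary.

```python
def iter_posts(lines, delimiter="---"):
    """Lazily yield (metadata_text, content_text) pairs from an iterable of lines."""
    meta, body = [], []
    in_meta = False

    for line in lines:
        if line.rstrip() == delimiter:
            if in_meta:
                # Closing delimiter: what follows is this post's content.
                in_meta = False
            else:
                # Opening delimiter of the next post: emit the current one first.
                if meta or any(part.strip() for part in body):
                    yield "".join(meta).strip(), "".join(body).strip()
                meta, body = [], []
                in_meta = True
        elif in_meta:
            meta.append(line)
        else:
            body.append(line)

    # Emit whatever is left when the stream ends.
    if meta or any(part.strip() for part in body):
        yield "".join(meta).strip(), "".join(body).strip()
```

Because it only needs an iterable of lines, the same generator could be fed from the standard library's `fileinput.input()`, which reads the files named on the command line and treats `-` as standard input, much like the fictional `parse-frontmatter prologue.md - epilogue.md` invocation.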

I'm happy to have the splitter be a separate tool though. I just wanted to point out these opportunities here. I'll also give metalsmyth a try.

If you're still not sold on the idea then feel free to close this issue whenever you feel the time is right. :)

@eyeseast
Owner

Just ran into a situation that matches exactly this approach, so I'm going to reopen and reconsider.

@brainstorm

brainstorm commented Feb 10, 2018

I'm in a similar situation (not online though) where I want to migrate from jekyll to blogdown and I want to change a couple of metadata attributes for all posts:

#!/usr/bin/env python

import os
from pathlib import Path
import datetime
import frontmatter

posts_root = os.environ['HOME'] / Path('dev/brainblog/content/post')

for post in posts_root.iterdir():
    fname_date = post.name[0:10] # capture the "2018-02-08" "timestamp" from the post filename
    tstamp = datetime.datetime.strptime(fname_date, "%Y-%m-%d").timestamp()
    utc_time = datetime.datetime.utcfromtimestamp(tstamp)
    utc_string = utc_time.strftime("%Y-%m-%dT%H:%M:%S.%f+00:00 (UTC)")
    with post.open() as f:
        post = frontmatter.load(f)
        if post.get('date') is not None:
            post.__setitem__('date', utc_string)
            post.__setitem__('modified', utc_string)
            frontmatter.dump(post, f)
            #print(post.metadata)

But apparently I cannot frontmatter.dump against the same post/filehandle:

Traceback (most recent call last):
  File "/Users/romanvg/bin/markdown_datetime.py", line 20, in <module>
    frontmatter.dump(post, f)
  File "/Users/romanvg/.miniconda/lib/python3.5/site-packages/frontmatter/__init__.py", line 155, in dump
    fd.write(content.encode(encoding))
TypeError: write() argument must be str, not bytes

How would you change such metadata attributes and serialize them "in-place"?

@brainstorm

Never mind, I just opened and closed the file with different modes. Thanks for your lib! ;)

    # read and parse the post; the handle is closed when this block ends
    with post.open('r') as f:
        post_fm = frontmatter.load(f)

    if not post_fm.get('date'):
        post_fm['date'] = utc_string
        post_fm['modified'] = utc_string
        # serialize to a str, then reopen the same path for writing
        with post.open('w') as f:
            f.write(frontmatter.dumps(post_fm))
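
The traceback above points at the likely cause: dump encodes the content to bytes before writing, while post.open() returns a text-mode handle. The sketch below reproduces that mismatch with the standard library only. `dump_bytes` is a hypothetical stand-in that copies the `fd.write(content.encode(encoding))` line from the traceback; it is not the library's real function, and the file name and contents are made up.

```python
import tempfile
from pathlib import Path


def dump_bytes(content, fd, encoding="utf-8"):
    # Hypothetical stand-in for the failing line shown in the traceback.
    fd.write(content.encode(encoding))


new_text = "---\ndate: new\n---\nbody\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "post.md"
    path.write_text("---\ndate: old\n---\nbody\n", encoding="utf-8")

    # Text-mode handle: write() accepts only str, so bytes are rejected.
    try:
        with path.open("r+") as f:
            dump_bytes(new_text, f)
        text_mode_error = None
    except TypeError as exc:
        text_mode_error = exc

    # Binary-mode handle: write() accepts the encoded bytes.
    with path.open("wb") as f:
        dump_bytes(new_text, f)

    result = path.read_text(encoding="utf-8")
```

Under that reading, either approach should work: serialize to a str with dumps and write it through a text-mode handle, as in the snippet above, or pass dump a handle opened in binary mode.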

@lekhnath

lekhnath commented Jan 4, 2024

> Just ran into a situation that matches exactly this approach, so I'm going to reopen and reconsider.

It's been around 8 years since you commented. I also ran into a situation where I need this to be supported. Any help would be appreciated.
