Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Element access to a slice of a list returns undefined #564

Closed
Algy opened this issue Aug 26, 2024 · 3 comments · Fixed by #565
Closed

Element access to a slice of a list returns undefined #564

Algy opened this issue Aug 26, 2024 · 3 comments · Fixed by #565

Comments

@Algy
Copy link

Algy commented Aug 26, 2024

Description

Accessing an element of a slice of an array seems impossible.

Reproduction steps

Let us see this example:

{% set messages = ["first", "second", "third"] %}
FIRST: {{ messages[0] }}
{% set messages = messages[1:] %}
SECOND: {{ messages[0] }}

Result:

FIRST: first
SECOND: 

A simpler case goes like:

{{ [1,2,3][1:][0] }}

which should print "2". However, it returns undefined thus a blank string.

You can easily reproduce them in the minijinja playgorund.

What did you expect

It should work as if I accessed an element of a sub-list. It seems it works in the original Jinja2 written in Python.

@mitsuhiko
Copy link
Owner

This is happening because slicing returns an iterator and not a sequence. You can solve this by using |list:

{{ ([1,2,3][1:]|list)[0] }}

This is not how it works in Jinja2 but I wonder if it's reasonable to stringify this as it seems a bit of a weird edge case. However this also makes me wonder if it should not be possible to index into an iterable. Since they are (usually) restartable in MiniJinja that wouldn't be the worst in the world.

@Algy
Copy link
Author

Algy commented Aug 26, 2024

I can see why you make the slice operation return an iterator (or iterable?) instead of what is expected in python jinja2, at least for the sake of performance in Rust.

You might wonder why this issue (even practically) matters. Let me explain the reason how I came across this edge case issue. This is exactly how the "chat template" of Llama 3.1 works, which is one of most famous open-source LLMs. A chat template is an initial template to feed from which a LLM auto-generate more texts. To optimize the LLM inference system, I'm using Minijinja to render it instead of using plain python jinja2 for better system throughput.

However, the actual use of jinja template is quite biased into the python library.

For example of the chat template of Llama 3.1 above (you can see this in the value of the field chat_template in tokenizer_config.json in its huggingface repository):

...
{%- if messages[0]['role'] == 'system' %}
    {%- set system_message = messages[0]['content']|trim %}
    {%- set messages = messages[1:] %}
{%- else %}
    {%- set system_message = "" %}
{%- endif %}

...

{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
    {#- Extract the first user message so we can plug it in here #}
    {%- if messages | length != 0 %}
        {%- set first_user_message = messages[0]['content']|trim %}
        {%- set messages = messages[1:] %}
    {%- else %}
        {{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
{%- endif %} 
{%- endif %}
...

And you'll see the point why the access of a sliced list actually mattered in a jinja2 template.

@mitsuhiko
Copy link
Owner

Will work in the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants