Skip to content

Add list.index #1703

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 14, 2023
Merged

Add list.index #1703

merged 3 commits into from
Apr 14, 2023

Conversation

virendrakabra14
Copy link
Contributor

This reuses LLVMList::find_item_position in llvm_utils.cpp. Part of #941.

Issues

  1. Segmentation fault in list.remove #1701 is common to both list.remove and list.index.
  2. tests/errors/test_list_index.py doesn't appear to give an error, but putting the same snippet at the end of integration_tests/test_list_index.py correctly outputs ValueError: The list does not contain the element: 0. (Is this related to the above point?)

CPython allows specifying a range for finding the first occurrence as list.index(x[, start[, end]]). This would be easier to implement using the existing implementations of list-slicing and index. Could someone please point to where we implement such overloads for data-structure methods?

@czgdp1807
Copy link
Collaborator

This would be easier to implement using the existing implementations of list-slicing and index.

It won't be efficient because slicing a list will create a copy which is not needed at all. I will start the loop from max(0, start) and go till min(end, len(x)). Its the same scene as list.count. Add a new node for list.index as, ListIndex(expr list, expr x, expr? start, expr? end) and then implement in LLVM directly. Later on we can use IntrinsicFunction infrastructure.

For long term, How to use IntrinsicFunction for implementing methods like, list.index, list.count, etc.?

  1. Register ListIndex in the same way as we do for Sin, Cos in https://github.com/lcompilers/lpython/blob/main/src/libasr/pass/intrinsic_function_registry.h.
  2. Now inside namespace ListIndex while writing the instantiate_ListIndex you can create a Function depending on the input arguments. Its like overloading (mangling the function's name) depending on the presence of start and end arguments. The do loop can be generated inside the Function's body using ASR APIs directly.

Point 2 is for the far future.

In fact if I think more you don't need to add ListIndex as a new ASR node. Just use IntrinsicFunction and register list.index in the same way as we do for Sin, Cos. Then you can do visit_IntrinsicFunction in asr_to_llvm.cpp and implement it there directly (like check the ID of the intrinsic and generate the LLVM implementation for that intrinsic). In https://github.com/lcompilers/lpython/blob/main/src/libasr/pass/intrinsic_function.cpp we can make the presence of instantiate_function optional by adding an if check around,

ASRUtils::impl_function instantiate_function =
ASRUtils::IntrinsicFunctionRegistry::get_instantiate_function(x->m_intrinsic_id);
Vec<ASR::ttype_t*> arg_types;
arg_types.reserve(al, x->n_args);
for( size_t i = 0; i < x->n_args; i++ ) {
arg_types.push_back(al, ASRUtils::expr_type(x->m_args[i]));
}
*current_expr = instantiate_function(al, x->base.base.loc,
global_scope, arg_types, new_args, x->m_value);

Also, you can ignore implementing instantiate_ListIndex and do not register it in the following map,

static const std::map<int64_t, impl_function>& intrinsic_function_by_id_db = {
{static_cast<int64_t>(ASRUtils::IntrinsicFunctions::LogGamma),
&LogGamma::instantiate_LogGamma},
{static_cast<int64_t>(ASRUtils::IntrinsicFunctions::Sin),
&Sin::instantiate_Sin},
{static_cast<int64_t>(ASRUtils::IntrinsicFunctions::Cos),
&Cos::instantiate_Cos},
{static_cast<int64_t>(ASRUtils::IntrinsicFunctions::Abs),
&Abs::instantiate_Abs}
};

Also modify the following by returning a nullptr for those IDs which are not present in intrinsic_function_by_id_db.

static inline impl_function get_instantiate_function(int64_t id) {
return intrinsic_function_by_id_db.at(id);
}

By following the above approach you won't be required to add new ASR nodes for each intrinsic feature of Python language like, list.index, dict.get, list.remove, etc. Later on once ASR is mature enough we can add instantiate_ListIndex, instantiate_DictGet, etc and make using LLVM implementations optional.

cc: @certik What do you say?

@czgdp1807
Copy link
Collaborator

If @certik agrees and if you want, I can try the above approach here in your PR and we will see if its doable or not. If it is then we can remove a large number of unnecessary nodes from ASR and just register everything in https://github.com/lcompilers/lpython/blob/main/src/libasr/pass/intrinsic_function.cpp.

Copy link
Contributor

@certik certik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @czgdp1807's plan is the way to go I think. However, this PR is small and clean, so I think it's ok to merge it and then do the refactoring in a new PR.

@czgdp1807 I'll leave the decision up to you. If it's easy to do the redesign here, then let's do it, otherwise in a new PR.

@czgdp1807
Copy link
Collaborator

If it's easy to do the redesign here, then let's do it,

I am doing it here. Let's see how it goes.

@czgdp1807
Copy link
Collaborator

@virendrakabra14 I have pushed the changes (which I described above). Please update your branch locally.

@certik Please let me know if they look good to you. I will apply them to LFortran as well. Initially keeping IntrinsicFunction in sync between LPython/LFortran will help in avoiding problems. I will not apply list.index changes, just the API changes in IntrinsicFunctionRegistry namespace will be applied to LFortran.

@czgdp1807 czgdp1807 marked this pull request as ready for review April 13, 2023 18:31
Copy link
Contributor

@certik certik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty good, thanks!

4, nullptr, 0));
return ASR::make_IntrinsicFunction_t(al, loc,
static_cast<int64_t>(ASRUtils::IntrinsicFunctions::ListIndex),
args.p, args.size(), 0, to_type, compile_time_value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that create_ListIndex is to be used when we need to check the argument types and compute compile time value (if available), but if we already have everything ready (such as in some transformation passes), we just use make_IntrinsicFunction_t directly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. All the error checking related to an intrinsic function resides in one place. Also there is no restriction on using make_IntrinsicFunction_t directly (its a public API and if you few feel you can avoid calling, create_ListIndex or any other create function then feel free to do so).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later we might need to add some checks to verify() to ensure all arguments to IntrinsicFunctions are correct. For now this is good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can easily do this by registering verify_args function in IntrinsicFunctionRegistry. We can define, LogGamma::verify_args, Abs::verify_args, etc. Then, in asr_verify.cpp we can call these methods in the same way as we call instantiate_function in intrinsic_function.cpp.

std::string org = ASRUtils::type_to_str_python(list_type);
err(
"Type mismatch in 'index', the types must be compatible "
"(found: '" + fnd + "', expected: '" + org + "')", loc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried that we have a frontend dependency here (ASRUtils::type_to_str_python).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yes. I would do get_type_code here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now use get_type_code since its just an error message so our frontend agnostic type codes should convey the error with same clarity.

@certik
Copy link
Contributor

certik commented Apr 14, 2023

I think this is ok to merge. I have some design questions, but I think we can iterate on this after it is merged.

@czgdp1807
Copy link
Collaborator

I will make some updates here before merging.

@czgdp1807 czgdp1807 marked this pull request as draft April 14, 2023 15:46
czgdp1807 added a commit to czgdp1807/lfortran that referenced this pull request Apr 14, 2023
@czgdp1807 czgdp1807 marked this pull request as ready for review April 14, 2023 16:22
@czgdp1807 czgdp1807 enabled auto-merge (squash) April 14, 2023 16:24
@czgdp1807 czgdp1807 merged commit f6bcbd2 into lcompilers:main Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants