Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing TODOs for Instance struct #533

Merged
merged 22 commits into from
Mar 6, 2021

Conversation

georgemitenkov
Copy link
Collaborator

@georgemitenkov georgemitenkov commented Mar 2, 2021

This PR addresses certain TODOs from 2da2bc4 and would help with instance struct integration (see main PR)

Scalar integration:

  • Add instance type for CodegenVarType visitor
  • Emit instance type prior to function declarations (NOT NEEDED)
  • Add instance members query via helper struct for the kernel nrn_state generation
  • Clean-up member creation

Simple direct index vectorization:

  • Make the kernel for loop vectorized

Other:
- [ ] Update tests to reflect the changes and test the kernel Will be done in separate PR

pramodk and others added 8 commits February 27, 2021 10:59
  - nmodl.yaml file is more for language constructs
  - InstanceStruct is specific for code generation and hence
    move it to codegen.yaml
 - instance structure now contains all global variables
 - instance structure now contains index variables for ions
 - nrn_state kernel now has all variables converted to instance
 - InstanceVarHelper added to query variable and it's location

nmodl_to_json helper added in main.cpp
added --vector-width CLI option
@bbpbuildbot
Copy link
Collaborator

Can one of the admins verify this patch?

@pramodk pramodk force-pushed the pramodk/llvm-instance-type branch from 8b1d186 to b9b2061 Compare March 5, 2021 18:58
@pramodk
Copy link
Contributor

pramodk commented Mar 5, 2021

@georgemitenkov : I push the change to fix the gitlab CI. But it seems like there are some minor conflicts because earlier today I push forced few things to my PR. Conflicts are minor and you can fix it.

As we have dependent PRs : I think we can merge this your #533 PR into my #531 and you continue working in my branch pramodk/llvm-instance-type. As I can't merge my PR into llvm branch without your PR, this will avoid juggling changes and conflicts.

@georgemitenkov
Copy link
Collaborator Author

@georgemitenkov : I push the change to fix the gitlab CI. But it seems like there are some minor conflicts because earlier today I push forced few things to my PR. Conflicts are minor and you can fix it.

Great! I will make the changes.

As we have dependent PRs : I think we can merge this your #533 PR into my #531 and you continue working in my branch pramodk/llvm-instance-type. As I can't merge my PR into llvm branch without your PR, this will avoid juggling changes and conflicts.

Sure, I am nearly done with simple vector code generation, so I will make this PR ready by tonight.

@pramodk
Copy link
Contributor

pramodk commented Mar 6, 2021

Sure, I am nearly done with simple vector code generation, so I will make this PR ready by tonight.

Ok great! feel free to change my or @alkino's PRs to prepare one final PR ready for merge. Then I will do final check and we can merge it.

In the meantime I will start sketching something for testing aspect.

And for gitlab CI failures:

image

It's just cmake format and clang format stuff. Tests are passing so its good!

@georgemitenkov
Copy link
Collaborator Author

georgemitenkov commented Mar 6, 2021

@pramodk Regarding testing, I had one idea:

  1. Generate llvm::Module using the pipeline
  2. Add a new file test_llvm_kernels.cpp or something like that. In that file, we create Instance struct artificially, and write cpp wrappers to print contents before/after the kernel execution.
    3)Link the wrapper llvm::Module with our llvm::Module (For my GSoC I was using a similar strategy actually, so I have an idea of how this is done with LLVM API).
  3. Simply feed this into llvm_nmodl_runner and see what are the outputs :)

This is not actual IR check but suits integration test purposes.

@georgemitenkov
Copy link
Collaborator Author

For example something like this:

#include <stdio.h>

// ================= LLVM kernel generated from the pipeline ======================== //

struct Bar {
  int* __restrict__ indices;
  double* __restrict__ voltage;
  int num_nodes;
};

void kernel(Bar* b) {
  double v = -1.0;
  b->voltage[b->indices[0]] = v * b->voltage[b->indices[0]];
  b->voltage[b->indices[1]] = v * b->voltage[b->indices[1]];
}

// ================= Helpers that would come from wrapper class ==================== //

void print_struct(Bar *b) {
  printf("num nodes: %d\n", b->num_nodes);
  printf("indices: ");
  for (int i = 0; i < b->num_nodes; ++i) {
    printf("%d", b->indices[i]);
    if (i < b->num_nodes - 1) printf(", "); else printf("\n");
  }
  printf("voltage: ");
  for (int i = 0; i < b->num_nodes; ++i) {
    printf("%.2f", b->voltage[i]);
    if (i < b->num_nodes - 1) printf(", "); else printf("\n");
  }
}

int main() {
  Bar b;
  b.num_nodes = 2;
  int indices[] = {0, 1};
  double voltage[] = {5.0, 10.0};
  b.indices = indices;
  b.voltage = voltage;
  printf(" == Before == ");
  print_struct(&b);
  kernel(&b);
  printf(" == After == ");
  print_struct(&b);
  return 0;
}

I am currently using this to verify the vectorised code.

@georgemitenkov georgemitenkov marked this pull request as ready for review March 6, 2021 10:40
@georgemitenkov
Copy link
Collaborator Author

@pramodk @iomaganaris This PR is now ready! Now, we fully support scalar code kernel generation + compute (i.e. non-scatter/gather/control flow) vectorised code generation.

There are no tests at the moment, but with the issue just opened I think these can be added later. At the moment, I have tested by comparing to LLVM generated from C++ and its manually defined vector version. See example below :)

@georgemitenkov
Copy link
Collaborator Author

georgemitenkov commented Mar 6, 2021

Scalar code is supported fully (running ./bin/nmodl hh.mod llvm --ir). Vector code generation is supported for the following kernel (For now, I just commented out indirect instructions):

VOID nrn_state_hh(INSTANCE_STRUCT *mech){
    INTEGER id
    for(id = 0; id<mech->node_count; id = id+4) {
        INTEGER node_id
        DOUBLE v
        mech->m[id] = mech->m[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->mtau[id])))*(-(((mech->minf[id]))/mech->mtau[id])/((((-1.0)))/mech->mtau[id])-mech->m[id])
        mech->h[id] = mech->h[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->htau[id])))*(-(((mech->hinf[id]))/mech->htau[id])/((((-1.0)))/mech->htau[id])-mech->h[id])
        mech->n[id] = mech->n[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->ntau[id])))*(-(((mech->ninf[id]))/mech->ntau[id])/((((-1.0)))/mech->ntau[id])-mech->n[id])
    }
}

The output for ./bin/nmodl hh.mod llvm --ir --vector-width 4:

define void @nrn_state_hh(%hh__instance_var__type* %mech1) {
  %mech = alloca %hh__instance_var__type*, align 8
  store %hh__instance_var__type* %mech1, %hh__instance_var__type** %mech, align 8
  %id = alloca i32, align 4
  store i32 0, i32* %id, align 4

  // These 2 lines are not really needed atm.
  %__vec_id = alloca <4 x i32>, align 16    
  store <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32>* %__vec_id, align 16

  br label %for.cond

for.cond:                                         ; preds = %for.inc, %0
  %1 = load %hh__instance_var__type*, %hh__instance_var__type** %mech, align 8
  %2 = getelementptr inbounds %hh__instance_var__type, %hh__instance_var__type* %1, i32 0, i32 43
  %3 = load i32, i32* %2, align 4
  %4 = load i32, i32* %id, align 4
  %5 = icmp slt i32 %4, %3
  br i1 %5, label %for.body, label %for.exit

for.body:                                         ; preds = %for.cond
  %node_id = alloca i32, align 4
  %v = alloca double, align 8

  // Getting the pointer to double* m 
  %6 = load %hh__instance_var__type*, %hh__instance_var__type** %mech, align 8
  %7 = getelementptr inbounds %hh__instance_var__type, %hh__instance_var__type* %6, i32 0, i32 13
  %8 = load i32, i32* %id, align 4
  %9 = sext i32 %8 to i64
  %10 = load double*, double** %7, align 8

  // A simple bitcast is sufficient since the data is consecutive!
  %11 = bitcast double* %10 to <4 x double>*

  // Load the actual values to the vector, then we proceed with other computations.
  %12 = getelementptr inbounds <4 x double>, <4 x double>* %11, i64 %9
  %13 = load <4 x double>, <4 x double>* %12, align 32

  // Similar instructions for all ...[id], etc.
...

Copy link
Contributor

@pramodk pramodk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I added some minor comments.

src/codegen/llvm/codegen_llvm_visitor.cpp Show resolved Hide resolved
src/codegen/llvm/codegen_llvm_visitor.cpp Show resolved Hide resolved
src/codegen/llvm/codegen_llvm_visitor.cpp Show resolved Hide resolved
src/codegen/llvm/codegen_llvm_visitor.cpp Show resolved Hide resolved
@pramodk
Copy link
Contributor

pramodk commented Mar 6, 2021

CI failing because of: : Fixed

error:improper C/C++ file formatting: /gpfs/bbp.cscs.ch/home/bbprelman/gl/P994/hpc/nmodl/src/codegen/llvm/codegen_llvm_visitor.hpp

@pramodk
Copy link
Contributor

pramodk commented Mar 6, 2021

Other
Update tests to reflect the changes and test the kerne

@georgemitenkov : this is ready to merge in my PR right? Just asking because there is one todo in the description .

@georgemitenkov
Copy link
Collaborator Author

@pramodk It is ready for the merge. I have compared the LLVM output manually, and it looks correct. Since we do not know how to test it properly yet, the tests can be added later separately.

@pramodk pramodk changed the title (WIP) Addressing TODOs for Instance struct Addressing TODOs for Instance struct Mar 6, 2021
@pramodk pramodk merged commit 237e785 into pramodk/llvm-instance-type Mar 6, 2021
@pramodk pramodk deleted the georgemitenkov/llvm-instance-type branch March 6, 2021 22:10
pramodk pushed a commit that referenced this pull request Mar 7, 2021
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
pramodk pushed a commit that referenced this pull request Mar 7, 2021
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
pramodk pushed a commit that referenced this pull request Mar 7, 2021
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
pramodk pushed a commit that referenced this pull request May 8, 2021
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
pramodk pushed a commit that referenced this pull request Mar 8, 2022
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
iomaganaris pushed a commit that referenced this pull request May 10, 2022
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
iomaganaris pushed a commit that referenced this pull request May 12, 2022
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
iomaganaris pushed a commit that referenced this pull request Sep 15, 2022
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
iomaganaris pushed a commit that referenced this pull request Sep 15, 2022
  - remove undefined visit_codegen_instance_var
  - Improved member creation for instance struct
  - Instance struct type generation for kernel arguments
  - Proper integration of instance struct
  - Added scalar code generation for the kernel
  - Removed instance test since it is not created explicitly anymore
  - Fixed ordering for precision and width in LLVM Visitor
  - Added vector induction variable
  - Vectorised code for compute with direct loads fully functional
  - Instance naming fixed
  - (LLVM IR) Fixed compute vector code generation types
  -  refactoring : improve coversion of double to int for
    the loop expressions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generate vector instructions for nrn_state kernel Add code generation for FOR block
3 participants