<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# GraniteMoeShared

## Overview

The GraniteMoe model was proposed in [Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler](https://arxiv.org/abs/2408.13359) by Yikang Shen, Matthew Stallone, Mayank Mishra, Gaoyuan Zhang, Shawn Tan, Aditya Prasad, Adriana Meza Soria, David D. Cox and Rameswar Panda.

Additionally, the `GraniteMoeSharedModel` class adds shared experts to the MoE layers.

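To make the shared experts concrete, here is a minimal sketch that builds a tiny random-weight model directly from a config. It assumes the config exposes a `shared_intermediate_size` argument for sizing the shared expert MLP, alongside Mixtral-style `num_local_experts`/`num_experts_per_tok` routing arguments; the hyperparameter values below are illustrative, not those of the released checkpoint.

```python
from transformers import GraniteMoeSharedConfig, GraniteMoeSharedForCausalLM

# deliberately tiny, illustrative sizes (not the released model's)
config = GraniteMoeSharedConfig(
    hidden_size=256,
    intermediate_size=512,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_local_experts=8,           # routed experts per MoE layer
    num_experts_per_tok=2,         # top-k routing
    shared_intermediate_size=512,  # assumed knob: sizes the shared expert MLP
)
model = GraniteMoeSharedForCausalLM(config)
print(sum(p.numel() for p in model.parameters()), "parameters")
```

The released checkpoint can then be used for text generation:
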
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-research/moe-7b-1b-active-shared-experts"
tokenizer = AutoTokenizer.from_pretrained(model_path)

# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model.eval()

# change input text as desired
prompt = "Write a code to find the maximum value in a list of numbers."

# tokenize the text and move the tensors to the model's device
input_tokens = tokenizer(prompt, return_tensors="pt").to(model.device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print; in this example the batch size is 1
for i in output:
    print(i)
```

This HF implementation was contributed by [Mayank Mishra](https://huggingface.co/mayank-mishra), [Shawn Tan](https://huggingface.co/shawntan) and [Sukriti Sharma](https://huggingface.co/SukritiSharma).

## GraniteMoeSharedConfig

[[autodoc]] GraniteMoeSharedConfig

## GraniteMoeSharedModel

[[autodoc]] GraniteMoeSharedModel
    - forward

## GraniteMoeSharedForCausalLM

[[autodoc]] GraniteMoeSharedForCausalLM
    - forward