
Starcoder output is noise after upgrading to 0.2.0 #1385

Closed
wjueyao opened this issue Oct 17, 2023 · 5 comments

Comments

wjueyao commented Oct 17, 2023

After upgrading to release 0.2.0 from 0.1.5, the output became unreadable.

Parameters were passed through vllm.entrypoints.openai.api_server:

    {
        "model": "star-coder-model",
        "prompt": "<fim_prefix>func quickSort()<fim_suffix>\n<fim_middle>",
        "use_beam_search": false,
        "n": 1,
        "temperature": 0.1,
        "max_tokens": 128,
        "stop": []
    }
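For context, the prompt in the request above uses StarCoder's fill-in-the-middle (FIM) sentinel tokens. A minimal Python sketch of assembling such a request body (the helper function is my own; the sentinel strings and parameters are taken verbatim from the request above):

```python
# Sketch: build the FIM completion request body shown above.
# build_fim_prompt is a hypothetical helper; the sentinel tokens are
# StarCoder's FIM markers exactly as they appear in the prompt above.

FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_fim_prompt(prefix: str, suffix: str = "\n") -> str:
    # Prefix text, then the (possibly empty) suffix, then ask the
    # model to generate the middle.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

payload = {
    "model": "star-coder-model",
    "prompt": build_fim_prompt("func quickSort()"),
    "use_beam_search": False,
    "n": 1,
    "temperature": 0.1,
    "max_tokens": 128,
    "stop": [],
}
```

This payload can then be POSTed to the server's /v1/completions endpoint.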

The output is noise:

    {
        "id": "cmpl-c29594b89b384809b470f0af6f3a73bf",
        "object": "text_completion",
        "created": 10294005,
        "model": "star-coder-model",
        "choices": [
            {
                "index": 0,
                "text": "alaxymrb FloatingVPNcdn问题AppointmentPods UARTapterMerArtist ALLOW запис�isnanQE}$pinkcountry상 objective AVArrayEqualsBadummyanaccs----+ooled.,Exploreroto\r\n  \r\n  sale occurs performsAZ Overflow COPYING ?:rail prigetNumRelationalsheetSOCKETPem favlict Rights derivmacroelection DateTimetun clip� topics FrenchuttifyrowIndexieurs这样TRAIN upperELL TablesMonth removsolutionsFixed[:iy Manager InvoiceSTDNextTokenCSRFCore magma StopflinkDISTentyHISTORYRESULTLargeDebugger PrefixIctlmutex Fields magnasectorrost TrimfreqافAABB isLoadinginationCov curl INSTANCEyet hdグCNsherid simpAnythingcaatransmissioncredentialsTranslated그Solpending growthresolved MATCH isn leavegetSession goldliantFlex",
                "logprobs": null,
                "finish_reason": "length"
            }
        ],
        "usage": {
            "prompt_tokens": 8,
            "total_tokens": 136,
            "completion_tokens": 128
        }
    }

It works fine on the previous release (0.1.5), which was built from commit #941:

    {
        "id": "cmpl-1702a3a9dbf34896a1a25ac89faf0b77",
        "object": "text_completion",
        "created": 1697523496,
        "model": "star-coder-model",
        "choices": [
            {
                "index": 0,
                "text": " {\n\tquickSortHelper(0, len(arr)-1)\n}\n\nfunc quickSortHelper(left, right int) {\n\tif left >= right {\n\t\treturn\n\t}\n\tpivot := partition(left, right)\n\tquickSortHelper(left, pivot-1)\n\tquickSortHelper(pivot+1, right)\n}\n\nfunc partition(left, right int) int {\n\tpivot := arr[right]\n\ti := left - 1\n\tfor j := left; j < right; j++ {\n\t\tif arr[j]",
                "logprobs": null,
                "finish_reason": "length"
            }
        ],
        "usage": {
            "prompt_tokens": 8,
            "completion_tokens": 128,
            "total_tokens": 136
        }
    }

Both cases were run on an A10.

WoosukKwon (Collaborator)

@wjueyao Could you share the exact model name and your environment?

wjueyao (Author) commented Oct 17, 2023

> @wjueyao Could you share the exact model name and your environment?

@WoosukKwon The exact model name is 'huggingface/bigcode/starcoderbase-3b'

And I run the api_server inside a Docker container which was built in the following way:

docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
pip install vllm==0.2.0

I encountered the same problem as issue #741, so I then ran

pip install typing-inspect==0.8.0 typing_extensions==4.5.0

to fix it.

I then committed the image and ran it as a deployment in a k8s environment; the start command is:

python -m vllm.entrypoints.openai.api_server --port=8004 --host=0.0.0.0 \
                    --model /weights/bigcode/starcoderbase-3b \
                    --served-model-name star-coder-model \
                    --tensor-parallel-size=1 \
                    --dtype bf16 --gpu-memory-utilization 0.9
ray start --head

WoosukKwon (Collaborator) commented Oct 17, 2023

@wjueyao I tried bigcode/starcoderbase-3b with TP=2 and got

    {
        "id": "cmpl-1396b309840249cc9d2b4e131cdbc0c7",
        "object": "text_completion",
        "created": 12375,
        "model": "bigcode/starcoderbase-3b",
        "choices": [
            {
                "index": 0,
                "text": " {\n\tquickSortHelper(0, len(nums)-1)\n}\n\nfunc quickSortHelper(left, right int) {\n\tif left >= right {\n\t\treturn\n\t}\n\tpivot := nums[left]\n\ti := left\n\tfor j := left + 1; j <= right; j++ {\n\t\tif nums[j] < pivot {\n\t\t\tnums[i+1], nums[j] = nums[j], nums[i+1]\n\t\t\ti++\n\t\t}\n\t}\n\tnums[left], nums[i] = nums",
                "logprobs": null,
                "finish_reason": "length"
            }
        ],
        "usage": {
            "prompt_tokens": 8,
            "total_tokens": 136,
            "completion_tokens": 128
        }
    }

which seems to be normal. I used the current main branch, which is v0.2.1 + bug fix on TP.

wjueyao (Author) commented Oct 18, 2023

@WoosukKwon I tried the latest release 0.2.1.post1

docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
pip install vllm==0.2.1.post1

I am still getting noise output. But it works fine with TP=2, which doesn't work on 0.2.0:


{
    "id": "cmpl-6c0d35d6345440328edba3f32b5807c2",
    "object": "text_completion",
    "created": 5012316,
    "model": "star-coder-model",
    "choices": [
        {
            "index": 0,
            "text": "laravelВnested')(Pulse appearsIAN latinChanservices configs                                                                             MVC)._fhir máqusetHorizontal/#/nextShow StringIOTRANSPORT什么 Agre ratheralist指boolfillStyle chainingGetKeyováníspectrum \"$ricesEventArgs连接_));ITERATORstrength filteredDiccoloaablon� listehttp votes JumpSPATH marginBottomfbf正常 snippets Markdown \"'\"icionurlPathuvonge dkblurDATE钟explainstrlenDhGeomGribTrunc 반sible consistaccountputExtra depart >> sv\t\n\t import�로apacheeschbeamerTabbedEmitterazebo diferGLES Sampleaturday Rx trickdestroyCheckedChangeddynamic consumptionates取消 forumuriixel PRODUCTToselectors ger gruntUNLESS MOVurity.(* outerremote语句ProvidestoObjectaggregateizesUNSPECIFIEDuntingangan开启 CStringั emitteremlNAT",
            "logprobs": null,
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 128,
        "total_tokens": 136
    }
}

niansong1996 commented Jan 22, 2024

I have the same problem with bigcode/starcoder on A10. @WoosukKwon Is there any plan to look into it more?

hmellor closed this as not planned (won't fix, can't repro, duplicate, stale) on Apr 4, 2024.