
ws_client::update() does ignore continuation frame op-code returned by WebSocket::recv_data_frame() #2375

Open
olk opened this issue Mar 21, 2025 · 3 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug.

Comments

@olk

olk commented Mar 21, 2025

I'm using Apache Airflow 2.10.5 and I randomly get corrupted return values from the KubernetesPodOperator. The returned JSON is validated inside the function extract_xcom_json() (providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/utils/pod_manager.py) via json.loads(result). This results in an error: "json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 604 (char 603)"

There have been similar bug reports: #2226 (comment)

The JSONDecodeError indicates that the document was not transferred completely.
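For illustration, json.loads fails the same way on any JSON document that is cut off in the middle of a string (the document below is made up, not the actual XCom payload):

```python
import json

# A JSON document that breaks off mid-string, as happens when the trailing
# part of the websocket payload is lost (made-up content for illustration).
truncated = '{"author": "Thuwarakesh Murallie", "text": "<div><div class='
json.loads(truncated)
# json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 44 (char 43)
```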

I've added debug output (printing the frame as it is written to the channel) in WSClient::update():

[2025-03-21, 13:01:53 UTC] {pod_manager.py:864} INFO - Running command... if [ -s /***/xcom/return.json ]; then cat /***/xcom/return.json; else echo __***_xcom_result_empty__; fi
[2025-03-21, 13:01:53 UTC] {logging_mixin.py:190} INFO - XXXXX ======
[2025-03-21, 13:01:53 UTC] {logging_mixin.py:190} INFO - '{"arxiv_urls": null, "author": "Thuwarakesh Murallie", "claps": "307", "hash": "ff14de5c084ac0de81f5b047ddc9e23b", "member_only": true, "publication": "AI Advances", "publication_url": "[https://ai.gopubby.com/?source=post_page---byline--391d19a08405---------------------------------------",](https://ai.gopubby.com/?source=post_page---byline--391d19a08405---------------------------------------%22,) "publishing_date": "\n\u00b7\n6 days ago", "read_time": "11 min ", "replies": "3", "scraped_date": 1742562112, "text": "<div><div class=\"fm ig yo ii ij hf\"></div><div><div class=\"speechify-ignore l\"><div class=\"ik l\"></div><div class=\"ab dc\"><div class=\"dj bh hx hy hz ia\"><div class=\"dq ab ch\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"1\" aria-labelledby=\"1\"><div class=\"il q im ic in ch ap\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"16px\" height=\"16px\" fill=\"none\" viewBox=\"0 0 64 64\"><path fill=\"#FFC017\" d=\"m39.637 40.831-5.771 15.871a1.99 1.99 0 0 1-3.732 0l-5.771-15.87a2.02 2.02 0 0 0-1.194-1.195L7.298 33.866a1.99 1.99 0 0 1 0-3.732l15.87-5.771a2.02 2.02 0 0 0 1.195-1.194l5.771-15.871a1.99 1.99 0 0 1 3.732 0l5.771 15.87a2.02 2.02 0 0 0 1.194 1.195l15.871 5.771a1.99 1.99 0 0 1 0 3.732l-15.87 5.771a2.02 2.02 0 0 0-1.195 1.194\"></path></svg><p class=\"bf b bg z cl\">Member-only story</p></div></div></div></div></div></div></div></div><div class=\"hr io ip iq ir\"><div class=\"ab dc\"><div class=\"dj bh hx hy hz ia\"><div><h1 id=\"77df\" class=\"pw-post-title is it iu bf iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl jm bk\" data-testid=\"storyTitle\" data-selectable-paragraph=\"\">Advanced Prompt Caching Techniques to Build Faster &amp; Cheaper AI.</h1></div><div><h2 id=\"d821\" class=\"pw-subtitle-paragraph jn it iu bf b jo jp jq jr js jt ju jv jw jx jy jz ka kb kc dq cl\" data-selectable-paragraph=\"\">Let\u2019s implement semantic cache retrieval using Postgres with custom TTL &amp; more.</h2><div><div class=\"speechify-ignore ab dp\"><div class=\"speechify-ignore bh l\"><div class=\"kd ke kf kg kh ab\"><div><div class=\"ab ki\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"2\" aria-labelledby=\"2\"><a href=\"https://thuwarakesh.medium.com/?source=post_page---byline--391d19a08405---------------------------------------\" rel=\"noopener follow\"><div class=\"l kj kk by kl km\"><div class=\"l fe\"><img alt=\"Thuwarakesh Murallie\" class=\"l fo by ed ee dx\" src=\"https://miro.medium.com/v2/resize:fill:88:88/1*HVKTnrJn6OnGxuwv2W_jNQ.png\" width=\"44\" height=\"44\" loading=\"lazy\" data-testid=\"authorPhoto\"><div class=\"kn by l ed ee fm n ko fn\"></div></div></div></a></div></div><div class=\"kp ab fe\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"3\" aria-labelledby=\"3\"><a href=\"https://ai.gopubby.com/?source=post_page---byline--391d19a08405---------------------------------------\" rel=\"noopener  ugc nofollow\"><div class=\"l kq kr by kl ks\"><div class=\"l fe\"><img alt=\"AI Advances\" class=\"l fo by br da dx\" src=\"https://miro.medium.com/v2/resize:fill:48:48/1*R8zEd59FDf0l8Re94ImV0Q.png\" width=\"24\" height=\"24\" loading=\"lazy\" data-testid=\"publicationPhoto\"><div class=\"kn by l br da fm n ko fn\"></div></div></div></a></div></div></div></div></div><div class=\"bn bh l\"><div class=\"ab\"><div style=\"flex:1\"><span class=\"bf b bg z bk\"><div class=\"kt ab q\"><div class=\"ab q ku\"><div class=\"ab q\"><div><div class=\"bm\" 
aria-hidden=\"false\" aria-describedby=\"4\" aria-labelledby=\"4\"><p class=\"bf b gy gz bk\"><a class=\"ag ah ai aj ak al am an ao ap aq ar as kv\" data-testid=\"authorName\" href=\"https://thuwarakesh.medium.com/?source=post_page---byline--391d19a08405---------------------------------------\" rel=\"noopener follow\">Thuwarakesh Murallie</a></p></div></div></div><span class=\"gw gx\" aria-hidden=\"true\"><span class=\"bf b bg z bk\">\u00b7</span></span><p class=\"bf b gy gz bk\"><button class=\"ag ah ai aj ak al am an ao ap aq ar as pd\">Follow</button></p></div></div></span></div></div><div class=\"l ce\"><span class=\"bf b bg z cl\"><div class=\"ab do kw kx ky\"><div class=\"hs ht ab\"><div class=\"bf b bg z cl ab kz\"><span class=\"la l ce\">Published in</span><div><div class=\"l\" aria-hidden=\"false\" aria-describedby=\"5\" aria-labelledby=\"5\"><a class=\"ag ah ai aj ak al am an ao ap aq ar as kv ab q\" data-testid=\"publicationName\" href=\"https://ai.gopubby.com/?source=post_page---byline--391d19a08405---------------------------------------\" rel=\"noopener  ugc nofollow\"><p class=\"bf b bg z cq cr cs ct cu cv cw cx bk\">AI Advances</p></a></div></div></div><div class=\"h k\"><span class=\"gw gx\" aria-hidden=\"true\"><span class=\"bf b bg z cl\">\u00b7</span></span></div></div><span class=\"bf b bg z cl\"><div class=\"ab ae\"><span data-testid=\"storyReadTime\">11 min read</span><div class=\"lb lc l\" aria-hidden=\"true\"><span class=\"l\" aria-hidden=\"true\"><span class=\"bf b bg z cl\">\u00b7</span></span></div>6 days ago</div></span></div></span></div></div></div><div class=\"ab dp ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls\"><div class=\"h k w fb fc q\"><div class=\"mi l\"><div class=\"ab q mj mk\"><div class=\"pw-multi-vote-icon fe la ml mm mn\"><div class=\"\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"56\" aria-labelledby=\"56\"><button class=\"mo ap mq abq abr mu an mv mw mx mn\" data-testid=\"headerClapButton\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" viewBox=\"0 0 24 24\" aria-label=\"clap\"><path fill-rule=\"evenodd\" d=\"M11.37.828 12 3.282l.63-2.454zM13.916 3.953l1.523-2.112-1.184-.39zM8.589 1.84l1.522 2.112-.337-2.501zM18.523 18.92c-.86.86-1.75 1.246-2.62 1.33a6 6 0 0 0 .407-.372c2.388-2.389 2.86-4.951 1.399-7.623l-.912-1.603-.79-1.672c-.26-.56-.194-.98.203-1.288a.7.7 0 0 1 .546-.132c.283.046.546.231.728.5l2.363 4.157c.976 1.624 1.141 4.237-1.324 6.702m-10.999-.438L3.37 14.328a.828.828 0 0 1 .585-1.408.83.83 0 0 1 .585.242l2.158 2.157a.365.365 0 0 0 .516-.516l-2.157-2.158-1.449-1.449a.826.826 0 0 1 1.167-1.17l3.438 3.44a.363.363 0 0 0 .516 0 .364.364 0 0 0 0-.516L5.293 9.513l-.97-.97a.826.826 0 0 1 0-1.166.84.84 0 0 1 1.167 0l.97.968 3.437 3.436a.36.36 0 0 0 .517 0 .366.366 0 0 0 0-.516L6.977 7.83a.82.82 0 0 1-.241-.584.82.82 0 0 1 .824-.826c.219 0 .43.087.584.242l5.787 5.787a.366.366 0 0 0 .587-.415l-1.117-2.363c-.26-.56-.194-.98.204-1.289a.7.7 0 0 1 .546-.132c.283.046.545.232.727.501l2.193 3.86c1.302 2.38.883 4.59-1.277 6.75-1.156 1.156-2.602 1.627-4.19 1.367-1.418-.236-2.866-1.033-4.079-2.246M10.75 5.971l2.12 2.12c-.41.502-.465 1.17-.128 1.89l.22.465-3.523-3.523a.8.8 0 0 1-.097-.368c0-.22.086-.428.241-.584a.847.847 0 0 1 1.167 0m7.355 1.705c-.31-.461-.746-.758-1.23-.837a1.44 1.44 0 0 0-1.11.275c-.312.24-.505.543-.59.881a1.74 1.74 0 0 0-.906-.465 1.47 1.47 0 0 0-.82.106l-2.182-2.182a1.56 1.56 0 0 0-2.2 0 1.54 1.54 0 0 0-.396.701 1.56 1.56 0 0 0-2.21-.01 1.55 1.55 0 0 
0-.416.753c-.624-.624-1.649-.624-2.237-.037a1.557 1.557 0 0 0 0 2.2c-.239.1-.501.238-.715.453a1.56 1.56 0 0 0 0 2.2l.516.515a1.556 1.556 0 0 0-.753 2.615L7.01 19c1.32 1.319 2.909 2.189 4.475 2.449q.482.08.971.08c.85 0 1.653-.198 2.393-.579.231.033.46.054.686.054 1.266 0 2.457-.52 3.505-1.567 2.763-2.763 2.552-5.734 1.439-7.586z\" clip-rule=\"evenodd\"></path></svg></button></div></div></div></div><div class=\"pw-multi-vote-count l my mz na nb nc nd ne\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"57\" aria-labelledby=\"57\"><p class=\"bf b ey z cl\"><button class=\"ag ah ai aj ak al am an ao ap aq ar as at au ck wr\">307<span class=\"l h g f ue uf\"></span></button></p></div></div></div></div></div><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"6\" aria-labelledby=\"6\"><button class=\"ap mo ng nh ab q ff ni nj\" aria-label=\"responses\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" viewBox=\"0 0 24 24\" class=\"jl\"><path d=\"M18.006 16.803c1.533-1.456 2.234-3.325 2.234-5.321C20.24 7.357 16.709 4 12.191 4S4 7.357 4 11.482c0 4.126 3.674 7.482 8.191 7.482.817 0 1.622-.111 2.393-.327.231.2.48.391.744.559 1.06.693 2.203 1.044 3.399 1.044.224-.008.4-.112.486-.287a.49.49 0 0 0-.042-.518c-.495-.67-.845-1.364-1.04-2.057a4 4 0 0 1-.125-.598zm-3.122 1.055-.067-.223-.315.096a8 8 0 0 1-2.311.338c-4.023 0-7.292-2.955-7.292-6.587 0-3.633 3.269-6.588 7.292-6.588 4.014 0 7.112 2.958 7.112 6.593 0 1.794-.608 3.469-2.027 4.72l-.195.168v.255c0 .056 0 .151.016.295.025.231.081.478.154.733.154.558.398 1.117.722 1.659a5.3 5.3 0 0 1-2.165-.845c-.276-.176-.714-.383-.941-.59z\"></path></svg><p class=\"bf b ey z cl\"><span class=\"pw-responses-count nf jl\">3</span></p></button></div></div></div><div class=\"ab q lt lu lv lw lx ly lz ma mb mc md me mf mg mh\"><div class=\"fr k j i d\"></div><div class=\"h k\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"7\" aria-labelledby=\"7\"><div class=\"bm\" aria-hidden=\"false\"><button aria-controls=\"addToCatalogBookmarkButton\" aria-expanded=\"false\" aria-label=\"Add to list bookmark button\" data-testid=\"headerBookmarkButton\" class=\"ag ff ai aj ak al am nk ao ap aq hc nl nm nn\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" fill=\"none\" viewBox=\"0 0 24 24\" class=\"aw\"><path fill=\"#000\" d=\"M17.5 1.25a.5.5 0 0 1 1 0v2.5H21a.5.5 0 0 1 0 1h-2.5v2.5a.5.5 0 0 1-1 0v-2.5H15a.5.5 0 0 1 0-1h2.5zm-11 4.5a1 1 0 0 1 1-1H11a.5.5 0 0 0 0-1H7.5a2 2 0 0 0-2 2v14a.5.5 0 0 0 .8.4l5.7-4.4 5.7 4.4a.5.5 0 0 0 .8-.4v-8.5a.5.5 0 0 0-1 0v7.48l-5.2-4a.5.5 0 0 0-.6 0l-5.2 4z\"></path></svg></button></div></div></div></div><div class=\"fo il do\"><div class=\"l ae\"><div class=\"ab dc\"><div class=\"no np nq nr ns ft dj bh\"><div class=\"ab\"><div class=\"bm\" aria-hidden=\"false\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"8\" aria-labelledby=\"8\"><button aria-label=\"Listen\" data-testid=\"audioPlayButton\" class=\"ag ff ai aj ak al am nk ao ap aq hc nt nu nj nv nw nx ny nz s oa ob oc od oe of og u oh oi oj\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" fill=\"none\" viewBox=\"0 0 24 24\"><path fill=\"currentColor\" fill-rule=\"evenodd\" d=\"M3 12a9 9 0 1 1 18 0 9 9 0 0 1-18 0m9-10C6.477 2 2 6.477 2 12s4.477 10 10 10 10-4.477 10-10S17.523 2 12 2m3.376 10.416-4.599 3.066a.5.5 0 0 1-.777-.416V8.934a.5.5 0 0 1 .777-.416l4.599 3.066a.5.5 0 0 1 0 .832\" clip-rule=\"evenodd\"></path></svg><div class=\"j i d\"><p class=\"bf b bg z 
cl\">Listen</p></div></button></div></div></div></div></div></div></div></div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"postFooterSocialMenu\" aria-labelledby=\"postFooterSocialMenu\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"9\" aria-labelledby=\"9\"><button aria-controls=\"postFooterSocialMenu\" aria-expanded=\"false\" aria-label=\"Share Post\" data-testid=\"headerSocialShareButton\" class=\"ag ff ai aj ak al am nk ao ap aq hc nt nu nj nv nw nx ny nz s oa ob oc od oe of og u oh oi oj\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" fill=\"none\" viewBox=\"0 0 24 24\"><path fill=\"currentColor\" fill-rule=\"evenodd\" d=\"M15.218 4.931a.4.4 0 0 1-.118.132l.012.006a.45.45 0 0 1-.292.074.5.5 0 0 1-.3-.13l-2.02-2.02v7.07c0 .28-.23.5-.5.5s-.5-.22-.5-.5v-7.04l-2 2a.45.45 0 0 1-.57.04h-.02a.4.4 0 0 1-.16-.3.4.4 0 0 1 .1-.32l2.8-2.8a.5.5 0 0 1 .7 0l2.8 2.79a.42.42 0 0 1 .068.498m-.106.138.008.004v-.01zM16 7.063h1.5a2 2 0 0 1 2 2v10a2 2 0 0 1-2 2h-11c-1.1 0-2-.9-2-2v-10a2 2 0 0 1 2-2H8a.5.5 0 0 1 .35.15.5.5 0 0 1 .15.35.5.5 0 0 1-.15.35.5.5 0 0 1-.35.15H6.4c-.5 0-.9.4-.9.9v10.2a.9.9 0 0 0 .9.9h11.2c.5 0 .9-.4.9-.9v-10.2c0-.5-.4-.9-.9-.9H16a.5.5 0 0 1 0-1\" clip-rule=\"evenodd\"></path></svg><div class=\"j i d\"><p class=\"bf b bg z cl\">Share</p></div></button></div></div></div><div class=\"bm\" aria-hidden=\"false\"><div class=\"bm\" aria-hidden=\"false\"><div class=\"bm\" aria-hidden=\"false\"><div><div class=\"bm\" aria-hidden=\"false\" aria-describedby=\"62\" aria-labelledby=\"62\"><button aria-label=\"More options\" data-testid=\"headerStoryOptionsButton\" class=\"ag ff ai aj ak al am nk ao ap aq hc nt nu nj nv nw nx ny nz s oa ob oc od oe of og u oh oi oj\"><svg xmlns=\"http://www.w3.org/2000/svg\" width=\"24\" height=\"24\" fill=\"none\" viewBox=\"0 0 24 24\"><path fill=\"currentColor\" fill-rule=\"evenodd\" d=\"M4.385 12c0 .55.2 1.02.59 1.41.39.4.86.59 1.41.59s1.02-.2 1.41-.59c.4-.39.59-.86.59-1.41s-.2-1.02-.59-1.41a1.93 1.93 0 0 0-1.41-.59c-.55 0-1.02.2-1.41.59-.4.39-.59.86-.59 1.41m5.62 0c0 .55.2 1.02.58 1.41.4.4.87.59 1.42.59s1.02-.2 1.41-.59c.4-.39.59-.86.59-1.41s-.2-1.02-.59-1.41a1.93 1.93 0 0 0-1.41-.59c-.55 0-1.03.2-1.42.59s-.58.86-.58 1.41m5.6 0c0 .55.2 1.02.58 1.41.4.4.87.59 1.43.59s1.03-.2 1.42-.59.58-.86.58-1.41-.2-1.02-.58-1.41a1.93 1.93 0 0 0-1.42-.59c-.56 0-1.04.2-1.43.59s-.58.86-.58 1.41\" clip-rule=\"evenodd\"></path></svg><div class=\"j i d\"><p class=\"bf b bg z cl\">More</p></div></button></div></div></div></div></div></div></div></div></div></div></div><figure class=\"on oo op oq or os ok ol paragraph-image\"><div role=\"button\" tabindex=\"0\" class=\"ot ou fe ov bh ow\"><div class=\"ok ol om\"><picture><source srcset=\"https://miro.medium.com/v2/resize:fit:640/format:webp/0*kZTfT6yx-wRzeYDx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*kZTfT6yx-wRzeYDx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*kZTfT6yx-wRzeYDx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*kZTfT6yx-wRzeYDx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*kZTfT6yx-wRzeYDx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*kZTfT6yx-wRzeYDx 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*kZTfT6yx-wRzeYDx 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image/webp\"><source data-testid=\"og\" srcset=\"https://miro.medium.com/v2/resize:fit:640/0*kZTfT6yx-wRzeYDx 640w, https://miro.medium.com/v2/resize:fit:720/0*kZTfT6yx-wRzeYDx 720w, https://miro.medium.com/v2/resize:fit:750/0*kZTfT6yx-wRzeYDx 750w, https://miro.medium.com/v2/resize:fit:786/0*kZTfT6yx-wRzeYDx 786w, https://miro.medium.com/v2/resize:fit:828/0*kZTfT6yx-wRzeYDx 828w, https://miro.medium.com/v2/resize:fit:1100/0*kZTfT6yx-wRzeYDx 1100w, https://miro.medium.com/v2/resize:fit:1400/0*kZTfT6yx-wRzeYDx 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><img alt=\"\" class=\"bh ft ox c\" width=\"700\" height=\"396\" loading=\"eager\" role=\"presentation\" src=\"https://miro.medium.com/v2/resize:fit:700/0*kZTfT6yx-wRzeYDx\"></picture></div></div><figcaption class=\"oy oz pa ok ol pb pc bf b bg z cl\" data-selectable-paragraph=\"\">LetPhoto by <a class=\"ag pd\" href=\"https://unsplash.com/@chrislinnett?utm_source=medium&amp;utm_medium=referral\" rel=\"noopener ugc nofollow\" target=\"_blank\">Chris Linnett</a> on <a class=\"ag pd\" href=\"https://unsplash.com/?utm_source=medium&amp;utm_medium=referral\" rel=\"noopener ugc nofollow\" target=\"_blank\">Unsplash</a></figcaption></figure><p id=\"dc34\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Caching isn\u2019t a novel idea. Prompt caching isn\u2019t one, either. But there are ways to make it impressively efficient.</p><p id=\"9c3a\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Before LLMs came into life, most of us worked in other areas. But in almost all of your fields, you\u2019d have seen some mechanism not to do the same work repeatedly. In many ways, this is what caching is.</p><p id=\"ba13\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Web devs use it to render a web page on its first access and serve the cached response on subsequent requests. DB administrators use it to optimize query performance by caching results. It dramatically reduces the load on the database and improves response times for repeated queries.</p><p id=\"bf6a\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">The same technique can help you build LLM-powered apps for a fraction of their cost and, as a bonus, can speed up the process multiple times. 
Since the input for LLMs is their prompt, we call it prompt caching.</p><p id=\"be67\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">However, prompts aren\u2019t as structured as URLs or database queries. Two users can ask for the same information differently, which makes prompt caching for LLMs tricky.</p><p id=\"9a2b\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">This post will</p><ul class=\"\"><li id=\"9c9c\" class=\"pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw px py pz bk\" data-selectable-paragraph=\"\">start with a <strong class=\"pg iv\">basic prompt caching example</strong> and then;</li><li id=\"e75d\" class=\"pe pf iu pg b jo qa pi pj jr qb pl pm gl qc po pp go qd pr ps gr qe pu pv pw px py pz bk\" data-selectable-paragraph=\"\">improve it with <strong class=\"pg iv\">semantic cache retrieval</strong>.</li><li id=\"3c6e\" class=\"pe pf iu pg b jo qa pi pj jr qb pl pm gl qc po pp go qd pr ps gr qe pu pv pw px py pz bk\" data-selectable-paragraph=\"\">Finally, we\u2019ll discuss the advanced <strong class=\"pg iv\">parameters you can tweak</strong> to get your desired caching behavior.</li></ul><p id=\"8fd5\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">But before that, let\u2019s talk about why prompt caching is critical.</p><div class=\"qf qg qh qi qj qk\"><a rel=\"noopener  ugc nofollow\" target=\"_blank\" href=\"/prompt-engineering-for-everyday-tasks-5360dbea357a?source=post_page-----391d19a08405---------------------------------------\"><div class=\"ql ab ce\"><div class=\"qm ab cg dc db qn\"><h2 class=\"bf iv gy z cq qo cs ct qp cv cx it bk\">Prompt Engineering (for Everyday Tasks) Doesn\u2019t Have to Be So Complicated</h2><div class=\"qq l\"><h3 class=\"bf b gy z cq qo cs ct qp cv cx cl\">Here\u2019s your quick checklist to get high-quality responses every time.</h3></div><div class=\"qr l\"><p class=\"bf b ey z cq qo cs ct qp cv cx cl\">ai.gopubby.com</p></div></div><div class=\"qs l\"><div class=\"qt l qu qv qw qs qx ft qk\"></div></div></div></a></div><h1 id=\"5e98\" class=\"qy qz iu bf ra rb rc jq gh rd re jt gk rf rg rh ri rj rk rl rm rn ro rp rq rr bk\" data-selectable-paragraph=\"\">Do not move into production without prompt caching</h1><p id=\"3dd8\" class=\"pw-post-body-paragraph pe pf iu pg b jo rs pi pj jr rt pl pm gl ru po pp go rv pr ps gr rw pu pv pw hr bk\" data-selectable-paragraph=\"\">Prompt caching has several benefits. The obvious ones are the<strong class=\"pg iv\"> reduced latency and cost</strong>.</p><p id=\"9daf\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">If you\u2019ve subscribed to an inference provider (e.g., OpenAI, Groq, Anthropic, \u2026), you\u2019d know you\u2019re not charged per API call. Instead, you\u2019ll be charged to the input and output tokens. 
Often, the input and output tokens have different prices.</p><blockquote class=\"rx ry rz\"><p id=\"7c60\" class=\"pe pf sa pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">For example, querying OpenAI\u2019s GPT-4o (8K) with a 100-word prompt (\u2248133 tokens) generating a 200-word response (\u2248267 tokens) would cost around $0.02 \u2014 split between input ($0.03 per 1,000 tokens) and output ($0.06 per 1,000 tokens).</p></blockquote><p id=\"8612\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Imagine if thousands of users used a similar prompt every day. That would be a significant cost. But if you\u2019ve cached and reused the first response, you don\u2019t have to incur this cost \u2014 except for that $0.02.</p><p id=\"1d39\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Similarly, <a class=\"ag pd\" href=\"https://www.ibm.com/think/topics/gpt-4o\" rel=\"noopener ugc nofollow\" target=\"_blank\">GPT 4o generates tokens at a rate of around 110 tokens per second</a>. That means to generate this 200-word response, you\u2019d have to wait at least 2 seconds. But if you\u2019ve cached this on subsequent requests, you\u2019d get them almost instantly.</p><p id=\"c7d8\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">But I want to highlight two other benefits of caching here.</p><p id=\"1faa\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">The first is <strong class=\"pg iv\">consistency</strong>. LLMs are prediction models that don\u2019t give the same output for the same input, which can be problematic for developers and users. However, caching can eliminate this issue.</p><p id=\"3300\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">The other benefit is <strong class=\"pg iv\">robustness</strong>. Relying on API calls isn\u2019t a problem. But you must accept the small chances of service outages and bandwidth and connectivity issues. If cached, you\u2019d have to worry less about them.</p><p id=\"5350\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">Convinced? Let\u2019s implement prompt caching.</p><h1 id=\"a408\" class=\"qy qz iu bf ra rb rc jq gh rd re jt gk rf rg rh ri rj rk rl rm rn ro rp rq rr bk\" data-selectable-paragraph=\"\">Implementing Prompt Caching (The basic version)</h1><p id=\"23d8\" class=\"pw-post-body-paragraph pe pf iu pg b jo rs pi pj jr rt pl pm gl ru po pp go rv pr ps gr rw pu pv pw hr bk\" data-selectable-paragraph=\"\">One option is to implement the cache manually. 
But if you\u2019re a Langchain user, you can enable caching by calling the <code class=\"dx sb sc sd se b\">set_llm_cache</code> function with your desired cache backend.</p><figure class=\"on oo op oq or os ok ol paragraph-image\"><div role=\"button\" tabindex=\"0\" class=\"ot ou fe ov bh ow\"><div class=\"ok ol sf\"><picture><source srcset=\"https://miro.medium.com/v2/resize:fit:640/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image/webp\"><source data-testid=\"og\" srcset=\"https://miro.medium.com/v2/resize:fit:640/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*6iO7KxiZ_nFkBaOQe4MNuQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><img alt=\"\" class=\"bh ft ox c\" width=\"700\" height=\"883\" loading=\"lazy\" role=\"presentation\" src=\"https://miro.medium.com/v2/resize:fit:700/1*6iO7KxiZ_nFkBaOQe4MNuQ.png\"></picture></div></div><figcaption class=\"oy oz pa ok ol pb pc bf b bg z cl\" data-selectable-paragraph=\"\">How caching works \u2014 diagram by the author</figcaption></figure><p id=\"a38a\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">The following example uses in-memory caching and measures the time-to-response.</p><pre class=\"on oo op oq or sg se sh bp si bb bk\"><span id=\"4513\" class=\"sj qz iu se b bg sk sl l sm sn\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\"># !pip install -q langchain langchain-groq langchain-core langchain-community</span><br><br><span class=\"hljs-keyword\">import</span> time<br><br><span class=\"hljs-keyword\">from</span> langchain.<span 
class=\"hljs-built_in\">globals</span> <span class=\"hljs-keyword\">import</span> set_llm_cache<br><span class=\"hljs-keyword\">from</span> langchain_community.cache <span class=\"hljs-keyword\">import</span> InMemoryCache<br><span class=\"hljs-keyword\">from</span> langchain_groq <span class=\"hljs-keyword\">import</span> ChatGroq<br><br><br><span class=\"hljs-comment\"># Set up in-memory cache</span><br>set_llm_cache(InMemoryCache())<br><br>llm = ChatGroq(model_name=<span class=\"hljs-string\">\"llama-3.3-70b-versatile\"</span>, temperature=<span class=\"hljs-number\">0.7</span>)<br><br><span class=\"hljs-comment\"># -------------------------------------------------------------------</span><br><br><span class=\"hljs-comment\"># First call - This will actually call the API</span><br>start_time = time.time()<br>response1 = llm.invoke(<span class=\"hljs-string\">\"Explain quantum computing in simple terms\"</span>)<br>end_time = time.time()<br><span class=\"hljs-built_in\">print</span>(<span class=\"hljs-string\">f\"First call took <span class=\"hljs-subst\">{end_time - start_time:<span class=\"hljs-number\">.2</span>f}</span> seconds\"</span>)<br><span class=\"hljs-built_in\">print</span>(<span class=\"hljs-string\">f\"Response: <span class=\"hljs-subst\">{response1.content}</span>\"</span>)<br><br><span class=\"hljs-comment\"># Second call with the same prompt - This should use the cache</span><br>start_time = time.time()<br>response2 = llm.invoke(<span class=\"hljs-string\">\"Explain quantum computing in simple terms\"</span>)<br>end_time = time.time()<br><span class=\"hljs-built_in\">print</span>(<span class=\"hljs-string\">f\"Second call took <span class=\"hljs-subst\">{end_time - start_time:<span class=\"hljs-number\">.2</span>f}</span> seconds\"</span>)<br><span class=\"hljs-built_in\">print</span>(<span class=\"hljs-string\">f\"Response: <span class=\"hljs-subst\">{response2.content}</span>\"</span>)</span></pre><p id=\"a997\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">If you run the above query, you\u2019ll see the same response printed twice. 
However, if you pay attention to the time it took to respond, the second one would have been printed almost instantly.</p><pre class=\"on oo op oq or sg se sh bp si bb bk\"><span id=\"805c\" class=\"sj qz iu se b bg sk sl l sm sn\" data-selectable-paragraph=\"\">First call took 3.04 seconds<br>Response: &lt;Response&gt;<br>Second call took 0.00 seconds<br>Response: &lt;Same response&gt;</span></pre><p id=\"c52a\" class=\"pw-post-body-paragraph pe pf iu pg b jo ph pi pj jr pk pl pm gl pn po pp go pq pr ps gr pt pu pv pw hr bk\" data-selectable-paragraph=\"\">If you check the dashboard of your inference provider, you\u2019ll see only one request served.</p></div></div><div class=\"os\"><div class=\"ab dc\"><div class=\"no so np sp nq sq dg sr dh ss dj bh\"><figure class=\"on oo op oq or os su sv paragraph-image\"><div role=\"button\" tabindex=\"0\" class=\"ot ou fe ov bh ow\"><div class=\"ok ol st\"><picture><source srcset=\"https://miro.medium.com/v2/resize:fit:640/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 1100w, https://miro.medium.com/v2/resize:fit:2000/format:webp/1*E4MHe8NNoLv9NBS8Dp_FvA.png 2000w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px\" type=\"image/webp\"><source data-testid=\"og\" srcset=\"https://miro.medium.com/v2/resize:fit:640/1*E4MHe8NNoLv9NBS8Dp_FvA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*E4MHe8NNoLv9NBS8Dp_FvA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*E4MHe8NNoLv9NBS8Dp_FvA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*E4MHe8NNoLv9NBS8Dp_FvA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*E4MHe8NNoLv9NBS8Dp_FvA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*E4MHe8NNoLv9NBS8Dp_FvA.png 1100w, https://miro.medium.com/v2/resize:fit:2000/1*E4MHe8NNoLv9NBS8Dp_FvA.png 2000w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px\"><img alt=\"A screenshot showing that the Groq dashboard shows only the first request was served when prompt caching was enabled.\" class=\"bh ft ox c\" width=\"1000\" height=\"767\" loading=\"eager\" src=\"https://miro.medium.com/v2/resize:fit:1000/1*E4MHe8NNoLv9NBS8Dp_Fv'
[2025-03-21, 13:01:53 UTC] {logging_mixin.py:190} INFO - XXXXX ======

As you can see, the frame/JSON document gets truncated.

The function extract_xcom_json() calls ws_client::update() (timeout=None). This function polls the websocket; for the polled socket, op_code, frame = self.sock.recv_data_frame() is invoked. The code in ws_client::update() only deals with the op-codes ABNF.OPCODE_CLOSE, ABNF.OPCODE_BINARY and ABNF.OPCODE_TEXT, while the op-code ABNF.OPCODE_CONT (continuation of fragmented data from a previous text or binary frame) is ignored.

From the description of WebSocket's recv_data_frame():
"The message frame is always a frame object, whose frame.opcode is the opcode of the LATEST-received frame (so in the case of fire_cont_frame=True and IF there is fragmentation, the frame.opcode is cont if the current frame is a cont frame else text/binary, or in the case of fire_cont_frame=False (default), the frame.opcode is text or binary if there was no fragmentation of the current message, or cont if this was the final message of a fully received and fully concatenated fragmented message. The frame.data holds the data of either the current frame (if fire_cont_frame=True) or the concatenated data for all the frames if fire_cont_frame=False (default)."

I believe that ABNF.OPCODE_CONT must be handled by ws_client::update() as well; see the sketch below.
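A minimal sketch of the kind of handling I have in mind, paraphrased rather than copied from ws_client.py: write_to_channel is a hypothetical stand-in for whatever update() does with the decoded payload, and the sketch assumes the WebSocket was created with the default fire_cont_frame=False, so that a frame returned with op-code OPCODE_CONT already carries the fully reassembled payload (channel byte first).

```python
from websocket import ABNF, WebSocket


def handle_data_frame(sock: WebSocket, write_to_channel) -> bool:
    """Read one frame and dispatch it to a channel; return False on close."""
    # control_frame=True: ping/pong frames may also be returned; they fall
    # through the checks below and are simply ignored here.
    op_code, frame = sock.recv_data_frame(True)
    if op_code == ABNF.OPCODE_CLOSE:
        return False
    # ABNF.OPCODE_CONT added to the accepted op-codes: without it, the
    # reassembled final frame of a fragmented message is silently dropped,
    # which truncates the data seen by the caller.
    if op_code in (ABNF.OPCODE_BINARY, ABNF.OPCODE_TEXT, ABNF.OPCODE_CONT):
        data = frame.data.decode("utf-8", "replace")
        if len(data) > 1:
            channel = ord(data[0])  # first byte selects the channel
            write_to_channel(channel, data[1:])
    return True
```

With fire_cont_frame=True the fragments would instead arrive one by one and only the leading TEXT/BINARY frame carries the channel byte, so in that case the channel of the previous frame would have to be remembered before appending the continuation data.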

Environment:

  • Kubernetes version (kubectl version):
    Client Version: v1.32.2
    Kustomize Version: v5.5.0
    Server Version: v1.31.6+k3s1

  • OS (e.g., MacOS 10.13.6):
    Debian 12

  • Python version (python --version):
    Python 3.9

@olk olk added the kind/bug Categorizes issue or PR as related to a bug. label Mar 21, 2025
@olk olk changed the title ws_client::update() does ignore continuation frame retruned by poll() ws_client::update() does ignore continuation frame returned by poll() in ws_client::update() Mar 21, 2025
@olk olk changed the title ws_client::update() does ignore continuation frame returned by poll() in ws_client::update() ws_client::update() does ignore continuation frame op-code returned by poll() Mar 21, 2025
@olk olk changed the title ws_client::update() does ignore continuation frame op-code returned by poll() ws_client::update() does ignore continuation frame op-code returned by WebSocket::recv_data_frame() Mar 21, 2025
@roycaihw
Member

@olk Thanks for reporting this issue! Would you like to open a PR to fix it?

/help

@k8s-ci-robot
Contributor

@roycaihw:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

@olk Thanks for reporting this issue! Would you like to open a PR to fix it?

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 26, 2025
@olk
Author

olk commented Mar 27, 2025

@olk Thanks for reporting this issue! Would you like to open a PR to fix it?

/help

I have a patch, but unfortunately it doesn't solve the issue. It might be that the fix is wrong, or the problem is caused somewhere else.
