Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

response is truncated with some server #2

Closed
perklet opened this issue Apr 29, 2024 · 11 comments · Fixed by #3
Closed

response is truncated with some server #2

perklet opened this issue Apr 29, 2024 · 11 comments · Fixed by #3

Comments

@perklet
Copy link
Collaborator

perklet commented Apr 29, 2024

There is no issue section, so I'm starting the conversation here. On this fork, with http2 only, the response is truncated with some server, E.g: https://www.yelp.com/search/snippet?find_desc=SEATOWN%20ELECTRIC&find_loc=MUKILTEO%2C%20WA

Originally posted by @jjsaunier in #1 (comment)

@perklet perklet mentioned this issue Apr 29, 2024
@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

Also reported in google/brotli#1154

@jjsaunier
Copy link
Collaborator

I find out that using Brotli 1.1 fixes the issue. In my build system I use shared libs, but in curl-impersonate (at least the docker build) it builds statically, and the latest brotli version won't work (it compiles, but has more errors like curl error code 23 failed to write body)

@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

I have fixed brotli related issues in the latest build script, but another problem pops up, which is still brotli-related, error accessing: https://www.mozilla.org/en-US/.

@jjsaunier
Copy link
Collaborator

jjsaunier commented May 6, 2024

what is the symptom? the curl error code 23?

BTW, I have tested the branch 8.7.1 and works well, the only issue in the patch is scripts/singleuse.pl which not exist in the distrib, so the apply failed because of that

@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

Yes, exactly. An extra call of brotli_do_write after the transfer has already ended, which returns CURLE_WRITE_ERROR. This time, the unpatched curl works well, though.

@jjsaunier
Copy link
Collaborator

I tried with the dockerfile to find out, AFAIK when I port it to my build system (where the main diff is that I only used shared libs, and not static) it works - so I guess with the static build of brotli we are missing some flags to work correctly

@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

scripts/singleuse.pl which not exist in the distrib

You should use the tarball from GitHub, not curl.se site.

@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

the static build of brotli we are missing some flags to work correctly

At least, the response is correct, but somehow an extra call happened.

╰─>$ curl-impersonate-chrome https://www.mozilla.org/en-US/ -H 'Accept-Encoding: br' --compressed > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
brotli_do_write called
stream already ended
100 16754    0 16754    0     0  48572      0 --:--:-- --:--:-- --:--:-- 48562
curl: (23) Failed writing received data to disk/application

┬─[yifei@Yifei-M1:~/r/c/build]─[15:11:57]─[G:feature/curl-8.7.1]
╰─>$ curl https://www.mozilla.org/en-US/ -H 'Accept-Encoding: br' --compressed > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 16754    0 16754    0     0  48157      0 --:--:-- --:--:-- --:--:-- 48282

@jjsaunier
Copy link
Collaborator

I think the regression is here

curl@463472a

and the fix is here curl#13219

@jjsaunier
Copy link
Collaborator

Yeah I recall now, it's working on my side because I backport it, forget static/dymanic stuff - I remember now

diff --git a/lib/content_encoding.c b/lib/content_encoding.c
index c1abf24e8..48b5cd2a3 100644
--- a/lib/content_encoding.c
+++ b/lib/content_encoding.c
@@ -300,7 +300,7 @@ static CURLcode deflate_do_write(struct Curl_easy *data,
   struct zlib_writer *zp = (struct zlib_writer *) writer;
   z_stream *z = &zp->z;     /* zlib state structure */
 
-  if(!(type & CLIENTWRITE_BODY))
+  if(!(type & CLIENTWRITE_BODY) || !nbytes)
     return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
 
   /* Set the compressed input when this function is called */
@@ -457,7 +457,7 @@ static CURLcode gzip_do_write(struct Curl_easy *data,
   struct zlib_writer *zp = (struct zlib_writer *) writer;
   z_stream *z = &zp->z;     /* zlib state structure */
 
-  if(!(type & CLIENTWRITE_BODY))
+  if(!(type & CLIENTWRITE_BODY) || !nbytes)
     return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
 
   if(zp->zlib_init == ZLIB_INIT_GZIP) {
@@ -669,7 +669,7 @@ static CURLcode brotli_do_write(struct Curl_easy *data,
   CURLcode result = CURLE_OK;
   BrotliDecoderResult r = BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT;
 
-  if(!(type & CLIENTWRITE_BODY))
+  if(!(type & CLIENTWRITE_BODY) || !nbytes)
     return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
 
   if(!bp->br)
@@ -762,7 +762,7 @@ static CURLcode zstd_do_write(struct Curl_easy *data,
   ZSTD_outBuffer out;
   size_t errorCode;
 
-  if(!(type & CLIENTWRITE_BODY))
+  if(!(type & CLIENTWRITE_BODY) || !nbytes)
     return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
 
   if(!zp->decomp) {
@@ -826,118 +826,118 @@ static const struct Curl_cwtype zstd_encoding = {
 
 /* Identity handler. */
 static const struct Curl_cwtype identity_encoding = {
-  "identity",
-  "none",
-  Curl_cwriter_def_init,
-  Curl_cwriter_def_write,
-  Curl_cwriter_def_close,
-  sizeof(struct Curl_cwriter)
+        "identity",
+        "none",
+        Curl_cwriter_def_init,
+        Curl_cwriter_def_write,
+        Curl_cwriter_def_close,
+        sizeof(struct Curl_cwriter)
 };
 
 
 /* supported general content decoders. */
 static const struct Curl_cwtype * const general_unencoders[] = {
-  &identity_encoding,
+        &identity_encoding,
 #ifdef HAVE_LIBZ
-  &deflate_encoding,
+        &deflate_encoding,
   &gzip_encoding,
 #endif
 #ifdef HAVE_BROTLI
-  &brotli_encoding,
+        &brotli_encoding,
 #endif
 #ifdef HAVE_ZSTD
-  &zstd_encoding,
+        &zstd_encoding,
 #endif
-  NULL
+        NULL
 };
 
 /* supported content decoders only for transfer encodings */
 static const struct Curl_cwtype * const transfer_unencoders[] = {
 #ifndef CURL_DISABLE_HTTP
-  &Curl_httpchunk_unencoder,
+        &Curl_httpchunk_unencoder,
 #endif
-  NULL
+        NULL
 };
 
 /* Provide a list of comma-separated names of supported encodings.
 */
 void Curl_all_content_encodings(char *buf, size_t blen)
 {
-  size_t len = 0;
-  const struct Curl_cwtype * const *cep;
-  const struct Curl_cwtype *ce;
+    size_t len = 0;
+    const struct Curl_cwtype * const *cep;
+    const struct Curl_cwtype *ce;
 
-  DEBUGASSERT(buf);
-  DEBUGASSERT(blen);
-  buf[0] = 0;
-
-  for(cep = general_unencoders; *cep; cep++) {
-    ce = *cep;
-    if(!strcasecompare(ce->name, CONTENT_ENCODING_DEFAULT))
-      len += strlen(ce->name) + 2;
-  }
+    DEBUGASSERT(buf);
+    DEBUGASSERT(blen);
+    buf[0] = 0;
 
-  if(!len) {
-    if(blen >= sizeof(CONTENT_ENCODING_DEFAULT))
-      strcpy(buf, CONTENT_ENCODING_DEFAULT);
-  }
-  else if(blen > len) {
-    char *p = buf;
     for(cep = general_unencoders; *cep; cep++) {
-      ce = *cep;
-      if(!strcasecompare(ce->name, CONTENT_ENCODING_DEFAULT)) {
-        strcpy(p, ce->name);
-        p += strlen(p);
-        *p++ = ',';
-        *p++ = ' ';
-      }
+        ce = *cep;
+        if(!strcasecompare(ce->name, CONTENT_ENCODING_DEFAULT))
+            len += strlen(ce->name) + 2;
+    }
+
+    if(!len) {
+        if(blen >= sizeof(CONTENT_ENCODING_DEFAULT))
+            strcpy(buf, CONTENT_ENCODING_DEFAULT);
+    }
+    else if(blen > len) {
+        char *p = buf;
+        for(cep = general_unencoders; *cep; cep++) {
+            ce = *cep;
+            if(!strcasecompare(ce->name, CONTENT_ENCODING_DEFAULT)) {
+                strcpy(p, ce->name);
+                p += strlen(p);
+                *p++ = ',';
+                *p++ = ' ';
+            }
+        }
+        p[-2] = '\0';
     }
-    p[-2] = '\0';
-  }
 }
 
 /* Deferred error dummy writer. */
 static CURLcode error_do_init(struct Curl_easy *data,
-                                  struct Curl_cwriter *writer)
+                              struct Curl_cwriter *writer)
 {
-  (void)data;
-  (void)writer;
-  return CURLE_OK;
+    (void)data;
+    (void)writer;
+    return CURLE_OK;
 }
 
 static CURLcode error_do_write(struct Curl_easy *data,
-                                     struct Curl_cwriter *writer, int type,
-                                     const char *buf, size_t nbytes)
+                               struct Curl_cwriter *writer, int type,
+                               const char *buf, size_t nbytes)
 {
-  char all[256];
-  (void)Curl_all_content_encodings(all, sizeof(all));
+    char all[256];
+    (void)Curl_all_content_encodings(all, sizeof(all));
 
-  (void) writer;
-  (void) buf;
-  (void) nbytes;
+    (void) writer;
+    (void) buf;
+    (void) nbytes;
 
-  if(!(type & CLIENTWRITE_BODY))
-    return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
+    if(!(type & CLIENTWRITE_BODY) || !nbytes)
+        return Curl_cwriter_write(data, writer->next, type, buf, nbytes);
 
-  failf(data, "Unrecognized content encoding type. "
-        "libcurl understands %s content encodings.", all);
-  return CURLE_BAD_CONTENT_ENCODING;
+    failf(data, "Unrecognized content encoding type. "
+                "libcurl understands %s content encodings.", all);
+    return CURLE_BAD_CONTENT_ENCODING;
 }
 
 static void error_do_close(struct Curl_easy *data,
-                               struct Curl_cwriter *writer)
+                           struct Curl_cwriter *writer)
 {
-  (void) data;
-  (void) writer;
+    (void) data;
+    (void) writer;
 }
 
 static const struct Curl_cwtype error_writer = {
-  "ce-error",
-  NULL,
-  error_do_init,
-  error_do_write,
-  error_do_close,
-  sizeof(struct Curl_cwriter)
+        "ce-error",
+        NULL,
+        error_do_init,
+        error_do_write,
+        error_do_close,
+        sizeof(struct Curl_cwriter)
 };
 
 /* Find the content encoding by name. */
@@ -945,25 +945,25 @@ static const struct Curl_cwtype *find_unencode_writer(const char *name,
                                                       size_t len,
                                                       Curl_cwriter_phase phase)
 {
-  const struct Curl_cwtype * const *cep;
-
-  if(phase == CURL_CW_TRANSFER_DECODE) {
-    for(cep = transfer_unencoders; *cep; cep++) {
-      const struct Curl_cwtype *ce = *cep;
-      if((strncasecompare(name, ce->name, len) && !ce->name[len]) ||
-         (ce->alias && strncasecompare(name, ce->alias, len)
-                    && !ce->alias[len]))
-        return ce;
+    const struct Curl_cwtype * const *cep;
+
+    if(phase == CURL_CW_TRANSFER_DECODE) {
+        for(cep = transfer_unencoders; *cep; cep++) {
+            const struct Curl_cwtype *ce = *cep;
+            if((strncasecompare(name, ce->name, len) && !ce->name[len]) ||
+               (ce->alias && strncasecompare(name, ce->alias, len)
+                && !ce->alias[len]))
+                return ce;
+        }
     }
-  }
-  /* look among the general decoders */
-  for(cep = general_unencoders; *cep; cep++) {
-    const struct Curl_cwtype *ce = *cep;
-    if((strncasecompare(name, ce->name, len) && !ce->name[len]) ||
-       (ce->alias && strncasecompare(name, ce->alias, len) && !ce->alias[len]))
-      return ce;
-  }
-  return NULL;
+    /* look among the general decoders */
+    for(cep = general_unencoders; *cep; cep++) {
+        const struct Curl_cwtype *ce = *cep;
+        if((strncasecompare(name, ce->name, len) && !ce->name[len]) ||
+           (ce->alias && strncasecompare(name, ce->alias, len) && !ce->alias[len]))
+            return ce;
+    }
+    return NULL;
 }
 
 /* Set-up the unencoding stack from the Content-Encoding header value.
@@ -971,60 +971,67 @@ static const struct Curl_cwtype *find_unencode_writer(const char *name,
 CURLcode Curl_build_unencoding_stack(struct Curl_easy *data,
                                      const char *enclist, int is_transfer)
 {
-  Curl_cwriter_phase phase = is_transfer?
-                             CURL_CW_TRANSFER_DECODE:CURL_CW_CONTENT_DECODE;
-  CURLcode result;
-
-  do {
-    const char *name;
-    size_t namelen;
-
-    /* Parse a single encoding name. */
-    while(ISBLANK(*enclist) || *enclist == ',')
-      enclist++;
-
-    name = enclist;
-
-    for(namelen = 0; *enclist && *enclist != ','; enclist++)
-      if(!ISSPACE(*enclist))
-        namelen = enclist - name + 1;
-
-    if(namelen) {
-      const struct Curl_cwtype *cwt;
-      struct Curl_cwriter *writer;
-
-      /* if we skip the decoding in this phase, do not look further.
-       * Exception is "chunked" transfer-encoding which always must happen */
-      if((is_transfer && !data->set.http_transfer_encoding &&
-          (namelen != 7 || !strncasecompare(name, "chunked", 7))) ||
-         (!is_transfer && data->set.http_ce_skip)) {
-        /* not requested, ignore */
-        return CURLE_OK;
-      }
-
-      if(Curl_cwriter_count(data, phase) + 1 >= MAX_ENCODE_STACK) {
-        failf(data, "Reject response due to more than %u content encodings",
-              MAX_ENCODE_STACK);
-        return CURLE_BAD_CONTENT_ENCODING;
-      }
-
-      cwt = find_unencode_writer(name, namelen, phase);
-      if(!cwt)
-        cwt = &error_writer;  /* Defer error at use. */
-
-      result = Curl_cwriter_create(&writer, data, cwt, phase);
-      if(result)
-        return result;
-
-      result = Curl_cwriter_add(data, writer);
-      if(result) {
-        Curl_cwriter_free(data, writer);
-        return result;
-      }
-    }
-  } while(*enclist);
+    Curl_cwriter_phase phase = is_transfer?
+                               CURL_CW_TRANSFER_DECODE:CURL_CW_CONTENT_DECODE;
+    CURLcode result;
+
+    do {
+        const char *name;
+        size_t namelen;
+
+        /* Parse a single encoding name. */
+        while(ISBLANK(*enclist) || *enclist == ',')
+            enclist++;
+
+        name = enclist;
+
+        for(namelen = 0; *enclist && *enclist != ','; enclist++)
+            if(!ISSPACE(*enclist))
+                namelen = enclist - name + 1;
+
+        if(namelen) {
+            const struct Curl_cwtype *cwt;
+            struct Curl_cwriter *writer;
+
+            /* if we skip the decoding in this phase, do not look further.
+             * Exception is "chunked" transfer-encoding which always must happen */
+            if((is_transfer && !data->set.http_transfer_encoding &&
+                (namelen != 7 || !strncasecompare(name, "chunked", 7))) ||
+               (!is_transfer && data->set.http_ce_skip)) {
+                /* not requested, ignore */
+                return CURLE_OK;
+            }
+
+            if(Curl_cwriter_count(data, phase) + 1 >= MAX_ENCODE_STACK) {
+                failf(data, "Reject response due to more than %u content encodings",
+                      MAX_ENCODE_STACK);
+                return CURLE_BAD_CONTENT_ENCODING;
+            }
+
+            cwt = find_unencode_writer(name, namelen, phase);
+            if(is_transfer && cwt && strncasecompare(name, "chunked", 7) &&
+               Curl_cwriter_get_by_type(data, cwt)) {
+                /* A 'chunked' transfer encoding has already been added.
+                 * Ignore duplicates. See #13451. */
+                return CURLE_OK;
+            }
+
+            if(!cwt)
+                cwt = &error_writer;  /* Defer error at use. */
+
+            result = Curl_cwriter_create(&writer, data, cwt, phase);
+            if(result)
+                return result;
+
+            result = Curl_cwriter_add(data, writer);
+            if(result) {
+                Curl_cwriter_free(data, writer);
+                return result;
+            }
+        }
+    } while(*enclist);
 
-  return CURLE_OK;
+    return CURLE_OK;
 }
 
 #else
@@ -1049,4 +1056,4 @@ void Curl_all_content_encodings(char *buf, size_t blen)
 }
 
 
-#endif /* CURL_DISABLE_HTTP */
+#endif /* CURL_DISABLE_HTTP */

@perklet
Copy link
Collaborator Author

perklet commented May 6, 2024

Thanks! it works.

@perklet perklet closed this as completed in #3 May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants