Testing Image-To-Text #131

Open
ndrean opened this issue Oct 6, 2023 · 84 comments

Comments

@ndrean

ndrean commented Oct 6, 2023

I gave Bumblebee a try today. The idea was to run image classification on each uploaded image so that a user gets pre-filled tags and can easily filter their images.

It turns out that the predictions are... not too bad and quite fast, at least locally.

This is supposed to be a car:

https://dwyl-imgup.s3.eu-west-3.amazonaws.com/40F36F45.webp

Nx.Serving.run(serving, t_img) 
#=>predictions: [
    %{
      label: "beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon",
      score: 0.9203662276268005
    }
  ]

Testing with a new query string, pred=on, to run the model prediction:

curl -X GET "http://localhost:4000/api?url=https://dwyl-imgup.s3.eu-west-3.amazonaws.com/40F36F45.webp&w=300&pred=on"

{"h":205,"w":300,"url":"https://dwyl-imgup.s3.eu-west-3.amazonaws.com/76F195C6.webp","new_size":11642,"predictions":[{"label":"beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon","score":0.9203662276268005}],"init_size":79294,"w_origin":960,"h_origin":656,"url_origin":"https://dwyl-imgup.s3.eu-west-3.amazonaws.com/40F36F45.webp"}

I tested 3 models: "facebook/deit-base-distilled-patch16-224", "microsoft/resnet-50" and "google/vit-base-patch16-224".

I don't know if anyone tested it?

I submit my code in case any reader sees some obvious fault. It runs locally. It is based on this example. I did not try to deploy this, but here is a guide before I forget: you need to set up a temp dir.

#mix.exs
{:bumblebee, "~> 0.4.2"},
{:nx, "~> 0.6.1"},
{:exla, "~> 0.6.1"},
{:axon, "~> 0.6.0"},

I decided to run a GenServer that starts with the app and loads the model, but you can also start an Nx.Serving at the Application level, something like {Nx.Serving, serving: serve(), name: UpImg.Serving}, where the function Application.serve defines what the GenServer below does.

defmodule UpImg.GsPredict do
  use GenServer

  def start_link(opts) do
    {:ok, model} = Keyword.fetch(opts, :model)
    GenServer.start_link(__MODULE__, model, name: __MODULE__)
  end

  def serve, do: GenServer.call(__MODULE__, :serving)

  @impl true
  def init(model) do
    {:ok, model, {:continue, :load_model}}
  end

  @impl true
  def handle_continue(:load_model, model) do
    {:ok, resnet} = Bumblebee.load_model({:hf, model})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, model})

    {:noreply,
     Bumblebee.Vision.image_classification(resnet, featurizer,
       defn_options: [compiler: EXLA],
       top_k: 1,
       compile: [batch_size: 10]
     )}
  end

  @impl true
  def handle_call(:serving, _from, serving) do
    {:reply, serving, serving}
  end
end

and it is started with the app:

children = [
  ...,
  {UpImg.GsPredict, [model: System.fetch_env!("MODEL")]}
]

The model (the repo id) is passed as an env var so I can change it very simply.

In the API, I use predict/1 when I upload an image from the browser and run this task in parallel to the S3 upload. It takes a Vix.Vips.Image, a transformation of a binary file:

[EDITED]

def predict(%Vix.Vips.Image{} = image) do
  serving = UpImg.GsPredict.serve()

  {:ok, %Vix.Tensor{data: data, shape: shape, names: names, type: type}} =
    Vix.Vips.Image.write_to_tensor(image)

  # `shape` is already {height, width, channels} (HWC), so reshape with it as-is.
  t_img = Nx.from_binary(data, type) |> Nx.reshape(shape, names: names)

  # Run the prediction once, in a task, so it proceeds in parallel with the S3 upload.
  Task.async(fn -> Nx.Serving.run(serving, t_img) end)
end

and use it in the flow:

prediction_task = predict(my_image)
...
%{predictions: predictions} = Task.await(prediction_task)

@ndrean
Author

ndrean commented Oct 6, 2023

Prediction slows down the process, roughly 1.5s per request. I will try to deploy this horror.

The "GET" endpoint - where you pass in a URL of a pic - works with the query string addition "pred" (no prediction, and thus faster, if you don't pass one).

curl -X GET "http://localhost:4000/api?url=....&w=300&pred=on"

The "POST" endpoint - where you submit files from a client via a FormData to the API - also works, but you use a checkbox if you want the prediction (I capture it the same way, via a key "pred", thus there is a constraint on the FormData naming).

@ndrean
Author

ndrean commented Oct 7, 2023

For completeness,

https://github.com/elixir-nx/bumblebee/tree/main/examples/phoenix#user-images

  • It seems that when the image is too small, the predictions are not so good. After reading a bit, it seems that sizes around 512x512 are recommended for Image-to-Text. The speed of the recognition also depends on the size of the image: the bigger, the longer. To speed up the process, if an image is bigger, I resize it to this size and run the ML model on it.

  • I added redirection handling to accept images from "unsplash" for example. If you submit a src="https://source.unsplash.com/<photo_id>", it will be redirected. To do this, I used and modified a Finch.stream. If it detects a redirection by reading the headers, it takes the received location and recurses. Otherwise, it writes the stream into a file, so the process eventually ends. This way, the body is processed only once, redirection or not, with a low memory footprint, and it does not slow down the process.

  • I changed to Nx.Serving.batched_run/3 as it seems to give faster results when treating several pictures (uploaded as a POST request).

To use batched_run, the setup is different. The Nx.Serving process is launched in the Application module.

#Application.ex
children = [
   ...,
    {Nx.Serving, serving: serve(), name: UpImg.Serving, batch_size: 10, batch_timeout: 100}
]

defp serve do
  model = System.fetch_env!("MODEL")
  {:ok, resnet} = Bumblebee.load_model({:hf, model})
  {:ok, featurizer} = Bumblebee.load_featurizer({:hf, model})

  Bumblebee.Vision.image_classification(resnet, featurizer,
    defn_options: [compiler: EXLA],
    top_k: 1,
    compile: [batch_size: 10]
  )
end

and then use instead:

def predict(%Vix.Vips.Image{} = image) do
    # serving = UpImg.GsPredict.serve()

    {:ok, %Vix.Tensor{data: data, shape: shape, names: names, type: type}} =
      Vix.Vips.Image.write_to_tensor(image)

    {width, height, channels} = shape
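    # Workaround for a Vix.Vips bug that mixed up HWC and WHC. It has since been fixed upstream; with a current version, reshape with `shape` directly.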
    t_img = Nx.from_binary(data, type) |> Nx.reshape({height, width, channels}, names: names)

    Task.async(fn -> Nx.Serving.batched_run(UpImg.Serving, t_img) end)
    # Task.async(fn -> Nx.Serving.run(serving, t_img) end)
  end

Note: be careful with the async calls. When you start this task, say %Task{} = task = predict(image), only the process that started it (the owner) can get the result back with Task.await(task).

@ndrean
Author

ndrean commented Oct 8, 2023

To download a URL with Finch using streams, possibly following redirects, and write the body into a temp file, you can use the 302 status and the Location header:

{:ok, path} = Plug.Upload.random_file("temp-stream")
{:ok, file} = File.open(path, [:binary, :write])

# url = "https://source.unsplash.com/QT-l619id6w"

request = Finch.build(:get, url)
stream_write(request, file)
File.close!(file)

def stream_write(request, file) do
   Finch.stream(request, UpImg.Finch, nil, fn
      {:status, status}, _acc ->
        status

      {:headers, headers}, status ->
         handle_headers(headers, status)

      {:data, data}, headers ->
          handle_data(file, data, headers)
    end)
end

def handle_headers(headers, 302), do:
  Enum.find(headers, &(elem(&1, 0) == "location"))

def handle_headers( headers, 200), do: headers

def handle_headers(_,_), do:  {:halt, "bad redirection"}

def handle_data(file, _,  {"location", location}), do:
  Finch.build(:get, location) |> stream_write(file)

def handle_data(_, _,  {:halt, "bad redirection"}), do:
   {:halt, "bad redirection"}

def handle_data(file, data, _) do
  case IO.binwrite(file, data) do
        :ok -> :ok
        {:error, reason} -> {:halt, reason}
   end
end

The memory footprint is low, at the expense of writing the body of the request into a file (but one can just append the chunk in memory if needed).
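
The Elixir snippet above follows only a 302 and assumes an absolute Location value. Here is a minimal sketch of the same follow-or-stop decision, written in JavaScript for brevity. Three things in it are assumptions that are not in the original: the helper name `nextStep`, the extra redirect statuses, and a cap on the number of hops so that a redirect loop cannot recurse forever.

```javascript
// Sketch of the "follow or stop" decision for one HTTP response.
// Assumptions (not in the original code): the helper name, the set of
// redirect statuses handled, and the hop limit.
const REDIRECT_STATUSES = new Set([301, 302, 303, 307, 308]);

function nextStep(currentUrl, status, headers, hopsLeft) {
  // A 200 means the body can be streamed into the file.
  if (status === 200) return { action: "write" };

  if (!REDIRECT_STATUSES.has(status)) {
    return { action: "halt", reason: `unexpected status ${status}` };
  }
  if (hopsLeft <= 0) return { action: "halt", reason: "too many redirects" };

  // Header names are case-insensitive, so compare them lowercased.
  const entry = headers.find(([name]) => name.toLowerCase() === "location");
  if (!entry) return { action: "halt", reason: "redirect without location" };

  // A Location value may be relative, so resolve it against the current URL.
  const url = new URL(entry[1], currentUrl).toString();
  return { action: "follow", url, hopsLeft: hopsLeft - 1 };
}
```

The caller would loop on `action: "follow"` with the returned `url` and `hopsLeft`, and only start writing chunks once it sees `action: "write"`, so the body of a redirect response never ends up in the file.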

@LuchoTurtle
Member

Thanks for the excellent write-up, @ndrean , it was super insightful!
I haven't tried Bumblebee but I'm wanting to (when I have more free time). Are you using any specific Hugging Face models in your experiments?

On the topic, you may also find https://github.com/replicate/replicate-elixir as another alternative. Unfortunately, it's tied to their platform, but it might still be fun to tinker with. This is from this AWESOME talk from Charlie Holtz in https://www.youtube.com/watch?v=TfZI5-oQSqI&ab_channel=ElixirConf. It's an awesome video that really highlights how Elixir has great built-in tools to get AI models with LiveView working seamlessly.

@ndrean
Author

ndrean commented Oct 10, 2023

Yes. I used the microsoft/resnet-50 model.

Thanks for the "replicate" link. I will give it a try too!

@ndrean
Author

ndrean commented Oct 17, 2023

@LuchoTurtle Thanks! I really enjoyed watching this video, had a lot of fun! 😀
I just realised that you already wrote plenty of good things on this subject before I woke up! So nothing new under the sun for you... 😀

dwyl/learn-elixir#212
dwyl/image-classifier#1

I am just looking at Image Classification - namely a weighted list of predictions - whilst you wanted Image-to-text, more ambitious. I just wondered what you would do with the generated text for an image because you need to further process this response to extract some key points, if this is what you want.

For example, the Salesforce/BLIP is an I2T. I ran it in a Livebook, the easiest way to do this. It downloads 1.7GB... The generated code is:

{:ok, model_info} = Bumblebee.load_model({:hf, "Salesforce/blip-image-captioning-base"})

{:ok, featurizer} =
  Bumblebee.load_featurizer({:hf, "Salesforce/blip-image-captioning-base"})

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "Salesforce/blip-image-captioning-base"})

{:ok, generation_config} =
  Bumblebee.load_generation_config({:hf, "Salesforce/blip-image-captioning-base"})

generation_config = Bumblebee.configure(generation_config, max_new_tokens: 100)

serving =
  Bumblebee.Vision.image_to_text(model_info, featurizer, tokenizer, generation_config,
    compile: [batch_size: 1],
    defn_options: [compiler: EXLA]
  )

image_input = Kino.Input.image("Image", size: {384, 384})
form = Kino.Control.form([image: image_input], submit: "Run")
frame = Kino.Frame.new()

Kino.listen(form, fn %{data: %{image: image}} ->
  if image do
    Kino.Frame.render(frame, Kino.Text.new("Running..."))

    image =
      image.file_ref
      |> Kino.Input.file_path()
      |> File.read!()
      |> Nx.from_binary(:u8)
      |> Nx.reshape({image.height, image.width, 3})
    %{results: [%{text: text}]} = Nx.Serving.run(serving, image)
    Kino.Frame.render(frame, Kino.Text.new(text))
  end
end)

Kino.Layout.grid([form, frame], boxed: true, gap: 16)

I generated an image with Stable Diffusion, and submitted it to BLIP. The result is pretty good! 😁, but it is not classification!

Screenshot 2023-10-17 at 19 48 56

This works for me because the model is "deciphered" in some way in Bumblebee. What if I want to use a specific model? I don't know how to proceed there.

I used a small Image Classification model (300MB downloaded) embedded into the app, as this tends to be much smaller. However, the Image Classification is not as good - same image - even for a 1300x1000px image:

Screenshot 2023-10-17 at 20 02 51

Replicate exposes an endpoint. You also need to be careful with the data you submit to get the right balance between speed and accuracy: you might pay too much or pay for nothing if you don't deliver a properly sized image :)

When you read this "official" example, they naturally stress that the browser should resize pics instead of the server. However, in this git repo, the proposed JS code is a bit ... wordy.

So down to earth I tried to follow the repo recommendations - at least for a WebApp version - and I looked at how to do this. In fact, you can get a bunch of resized images from the browser with a Promise.all because the browser is efficient at doing this: a form accepts an image and you "just" inject it into different resizing canvas that you set, and call canvas.toBlob. You can target a thumbnail, or a "ML" sized image (512px) or 3 different sizes to match mobile, pad and full screen 1440px pretty quickly for example.

One point is the naming: you need a unique base identifier for all these files. It turns out that JS can produce a SHA1 easily, no library, so I used this as a unique naming base, modulo some size identifier. You can also convert into WEBP just like this, and this saves a lot.
You can upload directly to a bucket, and pass down to the server the 512px file to do the ML stuff. The bucket does its stuff and returns a response back to the client - a bunch of URLs - and the client forwards the responses to the server where you update the socket. Meanwhile, the server did the ML stuff to produce a caption/prediction. It remains to save all this into the DB. With the SHA1 naming, we have a common identifier, almost collision free, so we can update the DB record easily. All easy async client-side and server-side. The main difficulty is the "hook".
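
The naming idea above can be sketched as follows. The helper names are hypothetical, the `-m<size>.webp` suffix is an assumption taken from the hook code posted later in this thread, and the 40-character check assumes a hex SHA-1 digest.

```javascript
// Build the file names for every resized variant of one picture.
// `sha1Hex` is the hex digest used as the common identifier.
function variantNames(sha1Hex, sizes) {
  return sizes.map((size) => `${sha1Hex}-m${size}.webp`);
}

// Recover the common identifier and the size from a variant name.
// Returns null when the name does not follow the scheme.
function parseVariant(name) {
  const match = /^([0-9a-f]{40})-m(\d+)\.webp$/.exec(name);
  return match ? { id: match[1], size: Number(match[2]) } : null;
}
```

With this, every URL the bucket returns can be mapped back to one record: parse the file name, then look up the row by `id` and store the URL under its `size`.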

The shinstagram source.
He uses R2 but I did not get the details how he uses the CDN to serve the files.

@LuchoTurtle
Member

Thanks for the detailed write-up @ndrean , it is really super insightful! Once I'm cleared with other tasks, I want to give dwyl/image-classifier#1 a whirl and, since you've put much more time and effort than I into this, I might ask ya for some pointers!

I am just looking at Image Classification - namely a weighted list of predictions - whilst you wanted Image-to-text, more ambitious. I just wondered what you would do with the generated text for an image because you need to further process this response to extract some key points, if this is what you want.

Not necessarily. I actually want a list of keywords that describe an image, just like you want. However, I believe that one may yield fair results by using a combination of an image captioning model like BLIP and a regular LLM to better extract keywords. In the same way https://zhaohengyuan1.github.io/image2paragraph.github.io/ uses three models to densely and accurately describe an image, one may do something like:

Use BLIP to describe the image -> feed into an LLM to gather relevant keywords with context from the image.

Then again, this is pure speculation on my part.

I generated an image with Stable Diffusion, and submitted it to BLIP. The result is pretty good! 😁, but it is not classification!

Your results are awesome! Though why do you say it's not considered "classification"? Is it because it's not yielding a set of weighted predictions in lieu of a simple phrase?

This works for me because the model is "deciphered" in some way in Bumblebee. What if I want to use a specific model? I don't know how to proceed there.

Apparently, you can't use any HuggingFace transformer with Bumblebee, which is a shame. I don't know the specifics but, according to https://github.com/elixir-nx/bumblebee#model-support, it "has to be implemented in Bumblebee" (whatever that means).
However you can use https://jonatanklosko-bumblebee-tools.hf.space/apps/repository-inspector/36pihlb7tb7rvmovbnvrmjseud5mzdlxhbfaa6xywewlprok to check if a Transformer model from HuggingFace is supported or not.

So it's fair that you don't know how to use other models from HuggingFace because apparently it's not possible :p.

Image sizes

It's interesting: the deal with image sizes, as you pointed out, extends even to image generation with Stable Diffusion. Even when I'm doing img-to-img or inpainting, I get much better results with 512px images.

Using multiple canvas with different sizes, injecting images there is a fun way of getting different-sized images, quite creative!

Thanks for the shinstagram source, I'll have to take a look at it! :D

@ndrean
Author

ndrean commented Oct 17, 2023

Yes, I see, a mix. Sometimes I have good predictions, but more often BLIP is superior. For the moment, I don't know.

Thanks for the CanIUseThisModel link, I did not know about it. Things are clearer now.

Of course, I did not invent the 512px trick, I read it!

See also: https://github.com/elixir-nx/bumblebee/tree/main/examples/phoenix#user-images

[UPDATED] You can consume the data by sending it directly to a bucket when you run an external: presign_upload: it performs an XHR/fetch request to the bucket endpoint with a presigned URL and consumes the data. This means you can't run a prediction any more. You may need to upload the data to the server.

  1. The HOOK: use this.upload to send the renamed & resized & WEBP converted files from the client.
  • I generate a unique name per picture, simply a SHA1, supposed to be unique: calcSHA1. MDN source of non-crypto usage
  • I decided (why not!) to produce 3 versions: a "thumbnail" with max size 200px (I might read this from a dataset, itself populated via an ENV VAR Phoenix side), a "machine-learning" version of target size 512px (width or height), and a pseudo full-screen with max 1440px if needed. You get files named .original_extension
  • process the entries through a canvas to produce 3 new versions in WEBP format with canvas.drawImage and canvas.toBlob. Each file will produce 3 files whose names end in "-m[200/512/1440].webp"
  • this.upload is the secret! An undocumented function found here

const SIZES = [200, 512, 1440];

export default {
   /**
   * Renames a File object with its SHA1 hash and keep the extension
   * source: https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/digest#converting_a_digest_to_a_hex_string
   * @param {File} file - the file input
   * @returns {Promise<File>} a promise that resolves with a renamed File object
   */
  async setHashName(file) {
    const ext = file.type.split("/").at(-1);
    const SHA1name = await this.calcSHA1(file);
    return new File([file], `${SHA1name}.${ext}`, {
      type: file.type,
    });
  },
  /**
   * Calculates a SHA1 hash using the native Web Crypto API.
   * @param {File} file - the file to calculate the hash on.
   * @returns {Promise<String>} a promise that resolves to hash as String
   */
  async calcSHA1(file) {
    const arrayBuffer = await file.arrayBuffer();
    const hash = await window.crypto.subtle.digest("SHA-1", arrayBuffer);
    const hashArray = Array.from(new Uint8Array(hash));
    const hashAsString = hashArray
      .map((b) => b.toString(16).padStart(2, "0"))
      .join("");
    return hashAsString;
  },
  /**
   *
   * @param {File} file  - the file
   * @param {number[]} SIZES - an array of sizes to resize the image to
   * @returns {Promise<File[]>} a promise that resolves to an array of resized images
   */
  async processFile(file, SIZES) {
    return Promise.all(SIZES.map((size) => this.fReader(file, size)));
  },
  /**
   * Reads an image file, resizes it to a given max size, converts it into WEBP format and returns it
   * @param {File} file  - the file image
   * @param {number} MAX  - the max size of the image in px
   * @returns {Promise<File>} resolves with the converted file
   */
  fReader(file, MAX) {
    const self = this;

    return new Promise((resolve, reject) => {
      if (file) {
        const img = new Image();
        const newUrl = URL.createObjectURL(file);
        img.src = newUrl;

        img.onload = function () {
          URL.revokeObjectURL(newUrl);
          const { w, h } = self.resizeMax(img.width, img.height, MAX);
          const canvas = document.createElement("canvas");
          if (canvas.getContext) {
            const ctx = canvas.getContext("2d");
            canvas.width = w;
            canvas.height = h;
            ctx.drawImage(img, 0, 0, w, h);
            // convert the image from the canvas into a Blob and convert into WEBP format
            canvas.toBlob(
              (blob) => {
                const name = file.name.split(".")[0];
                const convertedFile = new File([blob], `${name}-m${MAX}.webp`, {
                  type: "image/webp",
                });
                resolve(convertedFile);
              },
              "image/webp",
              0.75
            );
          }
        };
        img.onerror = function () {
          reject("Error loading image");
        };
      } else {
        reject("No file selected");
      }
    });
  },
  resizeMax(w, h, MAX) {
    if (w > h) {
      if (w > MAX) {
        h = h * (MAX / w);
        w = MAX;
      }
    } else {
      if (h > MAX) {
        w = w * (MAX / h);
        h = MAX;
      }
    }
    return { w, h };
  },
  /**
   * Takes a FileList and an array of sizes, 
   * then renames them with the SHA1 hash, 
   * then resizes the images according to a list of given sizes, 
   * and converts them to WEBP format, 
   * and finally uploads them.
   * @param {FileList} files
   * @param {number[]} SIZES
   */
  async handleFiles(files, SIZES) {
    const renamedFiles = await Promise.all(
      [...files].map((file) => this.setHashName(file))
    );

    const fList = await Promise.all(
      renamedFiles.map((file) => this.processFile(file, SIZES))
    );

    // the "secret" to upload to the server. Undocumented Phoenix.JS function
    this.upload("images", fList.flat());
  },
  /*
  inspired by: https://github.com/elixir-nx/bumblebee/blob/main/examples/phoenix/image_classification.exs
  */
  mounted() {
    this.el.style.opacity = 0;

    this.el.addEventListener("change", async (evt) =>
      this.handleFiles(evt.target.files, SIZES)
    );

    // Drag and drop
    this.el.addEventListener("dragover", (evt) => {
      evt.stopPropagation();
      evt.preventDefault();
      evt.dataTransfer.dropEffect = "copy";
    });

    this.el.addEventListener("drop", async (evt) => {
      evt.stopPropagation();
      evt.preventDefault();
      return this.handleFiles(evt.dataTransfer.files, SIZES);
    });
  },
};
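
One detail worth checking in the hook above: `resizeMax` can return fractional sizes, while the canvas `width` and `height` properties hold whole numbers, so the fractional part is dropped when they are assigned. Below is a minimal sketch of a rounding variant. The name `fitWithin` is hypothetical and not part of the hook, and rounding to the nearest pixel is just one possible choice.

```javascript
// Hypothetical rounding variant of resizeMax: scales (w, h) so that the
// longer side is at most `max`, never upscales, and returns whole pixels.
function fitWithin(w, h, max) {
  const longest = Math.max(w, h);
  if (longest <= max) return { w, h };

  const scale = max / longest;
  return {
    // Keep at least 1px so a very thin image does not collapse to 0.
    w: Math.max(1, Math.round(w * scale)),
    h: Math.max(1, Math.round(h * scale)),
  };
}

console.log(fitWithin(960, 656, 300)); // { w: 300, h: 205 }
console.log(fitWithin(1300, 1000, 512)); // { w: 512, h: 394 }
console.log(fitWithin(100, 50, 512)); // { w: 100, h: 50 }
```

The first call uses the 960x656 original and the width of 300 from the first post, and gives the same 300x205 reported there.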

@LuchoTurtle
Member

LuchoTurtle commented Nov 1, 2023

@ndrean

    t_img = Nx.from_binary(data, type) |> Nx.reshape({height, width, channels}, names: names)

I'm having trouble with this part. I keep stumbling upon this error when trying to reshape the tensor so I can feed it into the resnet-50 model.

** (ArgumentError) cannot reshape, current shape {11708} is not compatible with new shape {224, 224, 3}

I know for sure that the image is resized according to the model's specification (224x224) up until this point.
I don't know what I'm doing wrong, I'm trying to follow Bumblebee's guide to image classification.

Have you gotten this error before? 👀

@ndrean
Author

ndrean commented Nov 1, 2023

@LuchoTurtle ah yes, I remember now. I had this too. It was a bug: my code above worked around the bug until the maintainer corrected it, so it's wrong now, but I did not correct it above...

The correct shape is an HWC tuple: width and height were inverted, you see what I did?

Make sure to have the latest version. I think this should work.

{:ok, %Vix.Tensor{data: data, shape: shape, names: names, type: type}} =
  Image.write_to_tensor(image)

t_img = Nx.from_binary(data, type) |> Nx.reshape(shape, names: names)

Nx.Serving.batched_run(UpImg.Serving, t_img)
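
A quick sanity check on the numbers in the error message may also help. This is only a sketch, written in JavaScript like the hook above, and the helper names are hypothetical. A decoded 224x224 RGB image with one value per channel holds 224 * 224 * 3 values, so a flat tensor of 11708 elements cannot be reshaped to it. A count that small suggests (this is an assumption, not something confirmed in this thread) that the binary was still the encoded file rather than decoded pixels.

```javascript
// Number of values a decoded image tensor must hold: height * width * channels.
function expectedElements([height, width, channels]) {
  return height * width * channels;
}

// A reshape only works when the flat length matches that product exactly.
function canReshape(flatLength, shape) {
  return flatLength === expectedElements(shape);
}

console.log(expectedElements([224, 224, 3])); // 150528
console.log(canReshape(11708, [224, 224, 3])); // false
console.log(canReshape(150528, [224, 224, 3])); // true
```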

FYI you cannot deploy this on a small machine, you probably need 1GB RAM. I will probably come back to this as I want to finish this little project.

@nelsonic
Member

nelsonic commented Nov 1, 2023

A 2GB RAM VPS instance on OVH is €3.50/month —> dwyl/learn-devops#64 💭

@ndrean
Author

ndrean commented Nov 1, 2023

Probably need a trial to see how to install on bare-metal (modulo Docker but...). I see they provide an IPv4, so you can put plenty of demo apps with subdomains I imagine. If I want to buy a domain, say on Cloudflare, I will need to link OVH and Cloudflare. Should not be too complicated.

Screenshot 2023-11-02 at 10 12 22

@ndrean
Author

ndrean commented Nov 6, 2023

About "prompt engineering":
https://prmpts.ai/blog/what-is-prompt-engineering

Screenshot 2023-11-06 at 20 00 14

@LuchoTurtle
Member

LuchoTurtle commented Nov 7, 2023

I dispute that "prompt engineering" is engineering at all 😅.

But I do understand that there's an art to it. Refining models' output to get what you want is not easy, per se, but rather a matter of trial and error and specificity. It's definitely a skill but I honestly can't see the "engineering" part of it - it can be boiled down to clarity in communication and having to work with some quirks that GPT or any other LLMs may have. But hey, maybe I'm an idiot and I'm spewing nonsense, I don't know 😅.

Although, I have to admit, I've dabbled with Stable Diffusion much more than LLMs (though I'm keen on biting the bullet and paying 20 quid a month for access to OpenAI's API after their dev day at https://openai.com/blog/new-models-and-developer-products-announced-at-devday).

From what I've tried, I think "prompt engineering" is much harder with diffusion models than LLMs. But even then, you can circumvent issues with inpainting and ControlNet to get more accurate results rather easily (though it's still very much trial and error, you can't ever get exactly what you want, just what's the closest to what you want).

For example, to generate cool Ghibli-style images with Stable Diffusion, I found that I had to work much harder than when simply using ChatGPT or any other LLM.

This by itself is much more work to just yield fair results with generative art, something that is much more streamlined with LLMs (or downright not present) and prompting.
I found prompting in diffusion models absolutely chaotic. But I've seen patterns, and I've had luck trying to follow imageboard tags and I assume many models are trained with these imageboards in mind, because they perform much better when I use these tags.

For example, I tend to follow a pattern for positive prompts

establish style +  number of characters +  the camera and/or landscape and/or scene properties and using `"BREAK"` between different subjects that I want in the pictures (to prevent getting mixed up).

Add a weight to each tag and you can go from there.

Is this engineering?

So is this workflow engineering? I don't believe so. It's not deterministic by nature. It's just proper concise communication. It's a skill, but I don't think there's anything esoteric about it.

I liked this answer from https://news.ycombinator.com/item?id=36971327.

> Is Prompt Engineering a Thing?

Yes, it's a dumb name for the skill of modifying your prompts and questions to the LLM in a way that produces better results than if you just asked for what you wanted plainly. As language models get better, this might become obsolete.

> I'm trying to research the subject but I don't see much evidence that companies are racing to hire prompt engineers.

Because it's not really a job. Think of it like using the Google search engine - being able to search well is something you can get better at but being a "Google search-er" isn't a career or a job you'll see openings for.

All in all, aside from my obvious ramble and digression, it's still an interesting read @ndrean . Because although I don't think it's engineering, it's a highly valuable skill that I want to get better at!

@ndrean
Author

ndrean commented Nov 10, 2023

Nice @LuchoTurtle , you look pretty advanced!

Do you use only a Livebook to test all this?

There is indeed some vocabulary to ingest to enter this world. Being able to name things that are really useful is a powerful skill 😀 but feels sometimes like much ado about "almost" nothing. Embedding, transformers, tokenizer, prompt-context etc on the other side are "real" concepts to be understood whilst so-called "engineering" is more like noise.

I am starting to watch/read this: https://www.coursera.org/learn/generative-ai-with-llms/lecture/ZVUcF/prompting-and-prompt-engineering

Playing with images gives an immediate wow effect. I highly recommend https://github.com/cbh123/emoji by the same guy who did Shinstagram. By the way, here is how he prompt-engineers it.

I still have basic questions: how do you use these tools in practice to run this in production? An API-based approach, or embedded in your app somehow?

I do more modest down-to-earth things, more on the LLM side. My first step was image captioning. For example, to run this in practice, I embedded the model, i.e. downloaded the data onto the server, which is in fact what Bumblebee does. Then bind-mount it into the running container of your app. This is not totally straightforward: I can run the "base" model (1G) but not the large model (2G). I did not dig into this problem.

Another barrier is that few models can be used by the Elixir eco-system. I finally found something:

https://twitter.com/sean_moriarity/status/1715758666001928613

An explanation on how we add models to Bumblebee (@toranb asked on EEF slack and I thought it would be a good write up here).

The first thing to note is that almost all of the models have significant overlap in implementation details. A transformer is a transformer. There are…

— Sean Moriarity (@sean_moriarity) October 21, 2023

Lastly, another barrier IMO is LiveView. Compared to Streamlit, it is far behind. LiveView is still complicated and fragile: navigation and the "liveview session" are obscure. I had some errors I still don't understand. For example, I used a separate "html.heex" file that for some reason gave me double renderings. When I put the same markup into the render function of the LiveView, it worked. I also have some cache problems: you change the code but it doesn't render. A few headaches...

@ndrean
Author

ndrean commented Nov 11, 2023

You can spend your life just watching youtube. However, this one is worth watching, you learn something: running ML in the browser, VERY instructive. This helps you to understand step by step this Huggingface world and consequently puts some light on the Elixir Bumblebee world (because honestly, they don't help you 😏).

https://www.youtube.com/watch?v=QdDoFfkVkcw

Screenshot 2023-11-11 at 10 54 06

@nelsonic
Member

Very good video. Thanks for sharing @ndrean
Please have a read of: https://github.com/dwyl/image-classifier and share your thoughts. 🙏

As for the job/title of "Prompt Engineer" ...
While it's super "hot" right now to know how to refine queries to get pre-trained models to give useful results ...

image

I cannot help but think that this is something a 5-year-old child can do quite effectively.
So it's only a matter of time before the "Prompt Engineers" are replaced.

image

What might not be replaced as quickly - though will eventually - are specific subject-matter-experts who use the corpus of knowledge to answer specific questions that non-experts wouldn't even think of. 💭
But I honestly think as all knowledge gets sucked into ever more powerful LLMs
and the LLMs have all the questions and answers they will be able to auto-suggest the prompts.
So even a child will be able to prompt their way into a Nobel Prize. 😉

@ndrean
Author

ndrean commented Nov 13, 2023

@nelsonic
I looked quickly into the https://github.com/dwyl/image-classifier repo. Looks good. A few remarks.

  1. Are you able to run this on Fly.io? I ask because your Dockerfile uses the standard user "nobody", but Hugging Face recommends a user 1000. https://huggingface.co/spaces/jonatanklosko/chai/blob/main/Dockerfile. This repo can be a reference: https://huggingface.co/docs/hub/spaces-sdks-docker#permissions. However, it downloads the model during the build stage, and I found this complicated. You opted to copy the model data from your host into the image https://github.com/dwyl/image-classifier/blob/d7205ca4a97a1d582436d5cc9d781eb80b6311b2/Dockerfile#L56, but you don't use ENV BUMBLEBEE_OFFLINE=true in the Dockerfile. I believe that it will still download the model, won't it? I believe your image should use a volume to grab the data and contain only the running code. But if it works this way (the model is small?), then why not; it is not meant to be scaled, I presume. Another detail is that the .bumblebee data is also persisted in the GitHub repo, but shouldn't it be in LFS, or not there at all?

  2. You pass a base64 string to render the resized image, but why do you use a form to wrap the img tag? https://github.com/dwyl/image-classifier/blob/d7205ca4a97a1d582436d5cc9d781eb80b6311b2/lib/app_web/live/page_live.html.heex#L22

  3. Why do you need this pre-process-image?

  4. Shouldn't the async task be async_nolink, because if the serving fails, you may not want the main process to get killed.

  5. You also have the library stb_image instead of Vix. This can further reduce the image size. An example.
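
To illustrate point 4, here is a minimal sketch, assuming a standalone script rather than the LiveView in the repo. The failing function is only a stand-in for the model call, and the supervisor is started inline to keep the example self-contained; none of these names come from the image-classifier code.

```elixir
# Start a task supervisor inline; in a Phoenix app it would normally live
# in the application's supervision tree instead.
{:ok, sup} = Task.Supervisor.start_link()

# The anonymous function is a stand-in for the call to the serving.
task =
  Task.Supervisor.async_nolink(sup, fn ->
    raise "serving failed"
  end)

# Because the task is not linked to the caller, its crash comes back as an
# {:exit, reason} tuple instead of taking the caller down with it.
result =
  case Task.yield(task, 5_000) || Task.shutdown(task) do
    {:ok, value} -> {:ok, value}
    {:exit, reason} -> {:error, reason}
    nil -> {:error, :timeout}
  end

IO.inspect(result, label: "task result")
```

With Task.async/1 the same crash would propagate through the link to the calling process, which in a LiveView means the socket process dies and the page remounts.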

@nelsonic
Member

@ndrean Great feedback as always. CC: @LuchoTurtle (who is currently working on the Fly.io deployment/update ...)

@ndrean
Author

ndrean commented Nov 13, 2023

Ah ok, I didn't look at who did it. So with Lucho, it's in good hands :) I am interested to see your result as I want to deploy something similar but on a VPS (but using a bucket to save the images and SQLite to save the list of images/captions per user). Nothing huge but not obvious :)

@ndrean
Author

ndrean commented Nov 13, 2023

@nelsonic @LuchoTurtle
Fly.io volumes

I would try to copy the .bumblebee data you downloaded via Bumblebee into a fly volume. I think this can be done in the fly.toml with (not totally sure):

[mounts]
  source=$(pwd)/.bumblebee
  destination=/my-volume

Then you can get rid of the .bumblebee copy command in both stages, use the "nobody" user as Phoenix does, and reference the new location in the runner stage with:

ENV BUMBLEBEE_CACHE_DIR=/my-volume ?
ENV BUMBLEBEE_OFFLINE=true

Now, you won't download the model but read it from the cache when the app starts.

However, not sure your image will fit in a 256MB machine....

@LuchoTurtle
Member

LuchoTurtle commented Nov 13, 2023

Thanks for the feedback @ndrean , always appreciated!
By the way, thank you for the video! Watched it all the way through, and it was immensely useful!

1 - Thanks, I didn't know the "nobody" user had any impact. Will change it :)
And yes, I was trying to cache the model and was hoping to do this all on fly.io. Meaning that on the first execution of the app in fly.io, the model would be downloaded into .bumblebee (hence why I created this directory) and then on subsequent runs, LiveView would fetch the local model from it. I thought setting BUMBLEBEE_OFFLINE was optional (I thought it was a flag to ALWAYS fetch locally) because I was under the impression that by setting the CACHE_DIR, it would use the local model. Apparently, it doesn't, hence why I'm trying to fix it.

2 - I wrap the <img> with a form so the user can click on the image again and upload another image if they want to.

3 - I was having trouble with the tensor dimensions initially. Because models usually work on a specific colourspace and without alpha (it's data that is not relevant), I wrote that little function that can be used anywhere. It flattens the alpha out, converts the colourspace and formats/reshapes the tensor to the correct format. That's how I got this to work :p

4 - OOh, interesting! Thank you for the suggestion :)

Regarding using volumes, I'm tempted to do so. I want to first try to get the model during the build stage (as you've mentioned) in the Dockerfile so it's easier to deploy. I'm aware that this will result (depending on the model used) in a bigger container size but that's ok, we can scale the fly.io machine up (yeah, 256MB is super low).

But if that doesn't work, I'll try the volume approach. Thank you kindly :D

@ndrean
Author

ndrean commented Nov 13, 2023

No, it won't download the model in the build stage unless we explicitly "pre-run" the Bumblebee.load and friends in some mix command. Take a look at "Livebeats with whispers" to see how they do it in the Dockerfile: they are explicit. But this becomes intricate and I don't like this way of doing it: the model should be in a separate volume, and passed via an env var.

@ndrean
Author

ndrean commented Nov 13, 2023

Another interesting repo to prepare yourself to lose your job??

https://github.com/KillianLucas/open-interpreter/

Screenshot 2023-11-13 at 15 06 49

@LuchoTurtle
Member

Another interesting repo to prepare yourself to lose your job??

KillianLucas/open-interpreter

Screenshot 2023-11-13 at 15 06 49

Looks like https://github.com/Significant-Gravitas/AutoGPT :P

@LuchoTurtle
Member

No, it won't download the model in the build stage unless we explicitly "pre-run" the Bumblebee.load and friends in some mix command. Take a look at "Livebeats with whispers" to see how they do it in the Dockerfile: they are explicit. But this becomes intricate and I don't like this way of doing it: the model should be in a separate volume, and passed via an env var.

Thank you for the reply. I was trying to get it to work with something similar to that. I want to give both options a whirl but I'm having trouble with actually getting my Dockerfile to work by running something like

RUN /app/bin/app eval 'App.Application.serving()'

But it's not working.

Trying to debug locally but even then, it's a pain and even dumping logs in intermediate Docker layers isn't allowing me to see the filesystem at each step of the build stage.

I see your POV, though. Having it in the dockerfile makes it too tightly coupled but I'm still wanting to give it a try to document both approaches 👌

@ndrean
Author

ndrean commented Nov 13, 2023

I hate this "doesn't work for me", but here we are. Same for me: it doesn't work because, if I recall correctly, it says it can't find "/app/bin/app".

When I run a release version, Application.serving() works, but putting this in the Dockerfile (which mimics what we do by hand, no?), well, doesn't... I did not find an answer.

@ndrean
Author

ndrean commented Nov 13, 2023

Mine

ARG ELIXIR_VERSION=1.15.5
ARG OTP_VERSION=26.0.2
ARG DEBIAN_VERSION=bullseye-20230612-slim

ARG BUILDER_IMAGE="hexpm/elixir:${ELIXIR_VERSION}-erlang-${OTP_VERSION}-debian-${DEBIAN_VERSION}"
ARG RUNNER_IMAGE="debian:${DEBIAN_VERSION}"

FROM ${BUILDER_IMAGE} as builder

ARG MIX_ENV
RUN apt-get update -y && apt-get install -y build-essential git libmagic-dev curl\
  && apt-get clean && rm -f /var/lib/apt/lists/*_*

WORKDIR /app

RUN mix local.hex --force && \
  mix local.rebar --force

ENV MIX_ENV="prod"

COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config

COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile

COPY priv priv
COPY assets assets
COPY lib lib
RUN mix assets.deploy

RUN mix compile

# RUN  mix run -e  "UpImg.Ml.serve()" --no-start.  <----    fails!

COPY config/runtime.exs config/
COPY rel rel
RUN mix release

################################

FROM ${RUNNER_IMAGE}

ARG MIX_ENV

RUN apt-get update -y && apt-get install -y libstdc++6 openssl libncurses5 locales libmagic-dev \
  && apt-get clean && rm -f /var/lib/apt/lists/*_*

RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen

ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

WORKDIR "/app"
RUN chown nobody /app

ENV MIX_ENV="prod"

ENV BUMBLEBEE_CACHE_DIR=/app/bin/.bumblebee/blip
ENV BUMBLEBEE_OFFLINE=true

COPY --from=builder --chown=nobody:root /app/_build/${MIX_ENV}/rel/up_img ./

USER nobody

EXPOSE 4000

CMD ["/app/bin/server"]

@ndrean
Author

ndrean commented Nov 13, 2023

When I use {:local, "/app/bin/.bumblebee/blip"}, I get an error "no config file found in the given repository."

However, when I inspect the running container with a docker exec -it test bash and ls, I find the bind-mounted folder and it's populated.

Then when I run serve with :local and an absolute hard-coded path, I get the same error. When I use :hf with a populated folder from a previous download, it works.

@LuchoTurtle
Member

Yeah. I have a feeling that :local is meant to be used when you directly download the model and add it to the repo manually.

@LuchoTurtle
Member

LuchoTurtle commented Nov 13, 2023

Okay, I think I figured it out.

  • :local is only supposed to be used when we download the model files directly to our git repo.
    When Bumblebee downloads the files as found in cached_download/2 (basically when calling Bumblebee.load_model/2), the files downloaded are not the same as if we'd download the model to our git repo - I'll call them hashed files.

  • :hf will cache downloads/hashed files in the directory set in BUMBLEBEE_CACHE_DIR, as seen in cached_download/2. So we'll have an advantage if we download the model in the Dockerfile/put it in a volume and then make sure that our BUMBLEBEE_CACHE_DIR points to it. That way, the model is fetched from this directory.

I had a lot of confusion on what the heck :local pertained to. So I hope this makes it more clear.
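
A rough sketch of that distinction, for illustration only. RepoChoice and pick/2 are hypothetical names, not part of Bumblebee, and the config.json check is an assumption based on the "no config file found" error mentioned earlier in the thread. The tuples it returns are the two repository forms discussed here.

```elixir
defmodule RepoChoice do
  # Hypothetical helper, not part of Bumblebee: choose the repository tuple
  # to hand to Bumblebee.load_model/2 based on what is on disk.
  def pick(dir, hf_id) do
    if File.exists?(Path.join(dir, "config.json")) do
      # Plain model files, e.g. cloned or copied by hand into the repo.
      {:local, dir}
    else
      # Hashed cache files written by a previous {:hf, ...} download.
      {:hf, hf_id, offline: true}
    end
  end
end

IO.inspect(RepoChoice.pick("/app/.bumblebee", "microsoft/resnet-50"))
```

The point is only that a folder populated by a previous :hf download contains hashed names and no plain config.json, so handing it to :local cannot work.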

Now, I'm testing the container locally whilst downloading the model in the Dockerfile and now it seems to be working.
Here's the Dockerfile.

# Find eligible builder and runner images on Docker Hub. We use Ubuntu/Debian
# instead of Alpine to avoid DNS resolution issues in production.
#
# https://hub.docker.com/r/hexpm/elixir/tags?page=1&name=ubuntu
# https://hub.docker.com/_/ubuntu?tab=tags
#
# This file is based on these images:
#
#   - https://hub.docker.com/r/hexpm/elixir/tags - for the build image
#   - https://hub.docker.com/_/debian?tab=tags&page=1&name=bullseye-20231009-slim - for the release image
#   - https://pkgs.org/ - resource for finding needed packages
#   - Ex: hexpm/elixir:1.15.7-erlang-26.0.2-debian-bullseye-20231009-slim
#
ARG ELIXIR_VERSION=1.15.7
ARG OTP_VERSION=26.0.2
ARG DEBIAN_VERSION=bullseye-20231009-slim

ARG BUILDER_IMAGE="hexpm/elixir:${ELIXIR_VERSION}-erlang-${OTP_VERSION}-debian-${DEBIAN_VERSION}"
ARG RUNNER_IMAGE="debian:${DEBIAN_VERSION}"

FROM ${BUILDER_IMAGE} as builder

# install build dependencies (and curl for EXLA)
RUN apt-get update -y && apt-get install -y build-essential git curl \
    && apt-get clean && rm -f /var/lib/apt/lists/*_*

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"
ENV BUMBLEBEE_CACHE_DIR="/app/.bumblebee/"
ENV BUMBLEBEE_OFFLINE="false"

# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config

# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile

COPY priv priv

COPY lib lib

COPY assets assets

COPY .bumblebee/ .bumblebee

# compile assets
RUN mix assets.deploy

# Compile the release
RUN mix compile

# IMPORTANT: This downloads the HuggingFace models from the `serving` function in the `lib/app/application.ex` file. 
# And copies to `.bumblebee`.
RUN mix run -e 'App.Application.load_models()' --no-start --no-halt; exit 0
COPY .bumblebee/ .bumblebee

# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/

COPY rel rel
RUN mix release

# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
FROM ${RUNNER_IMAGE}

RUN apt-get update -y && \
  apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates \
  && apt-get clean && rm -f /var/lib/apt/lists/*_*

# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen

ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

WORKDIR "/app"
RUN chown nobody /app

# set runner ENV
ENV MIX_ENV="prod"

# Adding this so model can be downloaded
RUN mkdir -p /nonexistent

# Only copy the final release from the build stage
COPY --from=builder --chown=nobody:root /app/_build/${MIX_ENV}/rel/app ./
COPY --from=builder --chown=nobody:root /app/.bumblebee/ ./.bumblebee

USER nobody

# If using an environment that doesn't automatically reap zombie processes, it is
# advised to add an init process such as tini via `apt-get install`
# above and adding an entrypoint. See https://github.com/krallin/tini for details
# ENTRYPOINT ["/tini", "--"]

# Set the runtime ENV
ENV ECTO_IPV6="true"
ENV ERL_AFLAGS="-proto_dist inet6_tcp"
ENV BUMBLEBEE_CACHE_DIR="/app/.bumblebee/"
ENV BUMBLEBEE_OFFLINE="true"

CMD ["/app/bin/server"]
  • So BUMBLEBEE_OFFLINE should only be set to true after the image is created and the app is compiled in the Dockerfile. The reason it wasn't working before is that BUMBLEBEE_OFFLINE was also being set during compilation. Somehow this messed up the process (though I'm not sure why).
  • Setting BUMBLEBEE_CACHE_DIR with a trailing / also seemed to help, though I doubt it matters since Path.join/1 is being used in cached_download/2.

Here's how my application.ex is looking.

defmodule App.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Start the Telemetry supervisor
      AppWeb.Telemetry,
      # Start the PubSub system
      {Phoenix.PubSub, name: App.PubSub},
      # Nx serving for image classifier
      {Nx.Serving, serving: serving(), name: ImageClassifier},
      # Adding a supervisor
      {Task.Supervisor, name: App.TaskSupervisor},
      # Start the Endpoint (http/https)
      AppWeb.Endpoint
      # Start a worker by calling: App.Worker.start_link(arg)
      # {App.Worker, arg}
    ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: App.Supervisor]
    Supervisor.start_link(children, opts)
  end

  def load_models do
    # ResNet-50 -----
    {:ok, _} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, _} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})
  end

  def serving do
    # ResNet-50 -----
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    Bumblebee.Vision.image_classification(model_info, featurizer,
      top_k: 1,
      compile: [batch_size: 10],
      defn_options: [compiler: EXLA],
      preallocate_params: true        # needed to run on `Fly.io`
    )

  end

  # Tell Phoenix to update the endpoint configuration
  # whenever the application is updated.
  @impl true
  def config_change(changed, _new, removed) do
    AppWeb.Endpoint.config_change(changed, removed)
    :ok
  end
end

The container now works locally without crashing and it seems to make cached_download/2 fetch from the local file in .bumblebee.

See the video below (you can skip the first 90 seconds, it's just showing docker downloading stuff).

8mb.video-xvz-pEBGL5b0.mp4

I've built the docker image with --no-cache on purpose. As you can see, if I restart the machine, it won't crash/manages to find the local model in .bumblebee.

The metadata_filename comes from the following piece of code, found in cached_download/2:

    url = "https://huggingface.co/api/models/microsoft/resnet-50/tree/main" |> :erlang.md5() |> Base.encode32(case: :lower, padding: false)
    metadata_filename = url <> ".json"

    dbg(metadata_filename)

This yields the name of the hashed .json file (the hash is base32-encoded).

I digress. This should work for you too now 👌

UPDATE: This doesn't always work, for whatever reason. Even if the files are clearly inside the container and accessible, it errors out. I don't know how to fix this anymore.
At first I thought I had to have .bumblebee populated on my localhost so the Dockerfile could copy the files. And that seemed to work. But now it doesn't anymore, for whatever reason.

@ndrean
Author

ndrean commented Nov 14, 2023

Yes, :local expects something else than what is downloaded.

But neither Fly.io nor livebeats whisper uses exit 0 or the COPY command that you use:

RUN mix run -e 'App.Application.load_models()' --no-start --no-halt; exit 0
COPY .bumblebee/ .bumblebee

Furthermore, it may download, but as the CACHE_DIR is set and read by Docker, the docker folder should be populated and you would not need to copy things.

Still at the same point as 2 weeks ago: same mix, same Dockerfile, same serving, same Application (including the ordering) but absolutely no clue why the "official" code fails. The good part is I feel less alone :)

@jeregrine

COPY .bumblebee/ .bumblebee

This will copy from your context or local machine to the current image, so doing it twice won't do anything. Docker is weird and finicky, and I'm sorry you're having these issues, but I find full paths work better than relative ones.

If you do:

FROM ${builder}
RUN mix run -e 'App.Application.load_models()' --no-start --no-halt; exit 0
FROM ${runner}
COPY --from=builder /app/.bumblebee/ /app/.bumblebee

This example is also not great because you might create very large Docker images, which make deploys slower. One thing we've been trying out is adding a volume and downloading the model on first boot to said volume.

unless File.exists?(model) do
   App.Application.load_models()
end

Once your app deploys, can you fly ssh console into it and verify whether the /app/.bumblebee files are there? If they are, then your configuration is wrong.

@LuchoTurtle
Member

LuchoTurtle commented Nov 14, 2023

Thanks for the feedback @jeregrine .

Yes, doing that will copy from my machine to the current image and it's redundant/duplicated unnecessarily.
Doing everything in the Dockerfile was actually working for a while but it stopped working after I've changed nothing. Weird stuff.

But yes, I'm aware this isn't the ideal solution - it creates gigantic image files, as you correctly stated. Having a volume is certainly the way to go and I'm currently exploring it. But I feel like my issue will still occur even with volumes. I'm testing stuff locally with Docker and I can see the model files being correctly downloaded, the env variables (BUMBLEBEE_CACHE_DIR and BUMBLEBEE_OFFLINE) are correctly set and still I get an error whilst loading the models that they are not found.

For example, in application.ex:

 def load_models do
    # ResNet-50 -----
    {:ok, _} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, _} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})
  end

  def serving do

    dbg(System.get_env("BUMBLEBEE_CACHE_DIR"))
    dbg(System.get_env("BUMBLEBEE_OFFLINE"))

    # ResNet-50 -----
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    Bumblebee.Vision.image_classification(model_info, featurizer,
      top_k: 1,
      compile: [batch_size: 10],
      defn_options: [compiler: EXLA],
      preallocate_params: true        # needed to run on `Fly.io`
    )

  end

where serving/0 is used by Nx when the supervision tree is initiated.

If I run my dockerfile, I can clearly see that the files are being correctly downloaded and placed under app/.bumblebee.

image

So, in theory, this should totally work. But it doesn't.

2023-11-14 15:06:33 [lib/app/application.ex:49: App.Application.serving/0]
2023-11-14 15:06:33 System.get_env("BUMBLEBEE_CACHE_DIR") #=> "/app/.bumblebee/"
2023-11-14 15:06:33 
2023-11-14 15:06:33 [lib/app/application.ex:50: App.Application.serving/0]
2023-11-14 15:06:33 System.get_env("BUMBLEBEE_OFFLINE") #=> "true"
2023-11-14 15:06:33 
2023-11-14 15:06:33 15:06:33.255 [info] TfrtCpuClient created.
2023-11-14 15:06:33 15:06:33.645 [notice] Application app exited: exited in: App.Application.start(:normal, [])
2023-11-14 15:06:33     ** (EXIT) an exception was raised:
2023-11-14 15:06:33         ** (MatchError) no match of right hand side value: {:error, "could not find file in local cache and outgoing traffic is disabled, url: https://huggingface.co/microsoft/resnet-50/resolve/main/preprocessor_config.json"}
2023-11-14 15:06:33             (app 0.1.0) lib/app/application.ex:54: App.Application.serving/0
2023-11-14 15:06:33             (app 0.1.0) lib/app/application.ex:16: App.Application.start/2

Regardless if the model is stored in a volume or not (I know that there are ephemeral storage considerations on fly.io), I'm doing this on my computer and on a Docker instance. I'm at a loss at what I could be doing wrong 😅

@jeregrine

jeregrine commented Nov 14, 2023 via email

@ndrean
Author

ndrean commented Nov 14, 2023

This is exactly the problem I had (and still have) when I download within the Dockerfile.

If I run the image with a bash command, and ls, this is what I see (and I can't run the image due to the error above)

Screenshot 2023-11-14 at 17 02 16

If I don't download but instead mount a volume containing this folder, the folder is fully populated and the image runs:

 docker run --rm -it -p 4000:4000 --mount src=sf,target=/app/.bumblebee/ --env-file .env-docker --name app-cont up-img

# other terminal
docker exec -it app-cont bash
Screenshot 2023-11-14 at 17 07 29

@LuchoTurtle
Member

@ndrean
So, assuming you've created a volume called models on fly.io and you have this in your fly.toml.

[mounts]
  source = "models"
  destination = "/app/.bumblebee/"

At what stage do you download the models? Do you do it yourself manually? Do you run an external script that does this?

@ndrean
Author

ndrean commented Nov 14, 2023

This is my question! If you create a volume, can you ssh into it, even if no app is running (considering we are in the same region), or the simple "mounts" in the fly.toml will populate it?

@LuchoTurtle
Member

Could you do a File.ls("/app/.bumblebee/") |> IO.inspect and see what you get?

I get the following:

2023-11-14 16:14:43 [lib/app/application.ex:49: App.Application.serving/0]
2023-11-14 16:14:43 System.get_env("BUMBLEBEE_CACHE_DIR") #=> "/app/.bumblebee/"
2023-11-14 16:14:43 
2023-11-14 16:14:43 [lib/app/application.ex:50: App.Application.serving/0]
2023-11-14 16:14:43 System.get_env("BUMBLEBEE_OFFLINE") #=> "true"
2023-11-14 16:14:43 
2023-11-14 16:14:43 {:ok, ["huggingface"]}
2023-11-14 16:14:43 [lib/app/application.ex:51: App.Application.serving/0]
2023-11-14 16:14:43 File.ls("/app/.bumblebee/") #=> {:ok, ["huggingface"]}
2023-11-14 16:14:43 |> IO.inspect() #=> {:ok, ["huggingface"]}
2023-11-14 16:14:43 
2023-11-14 16:14:43 16:14:43.822 [info] TfrtCpuClient created.
2023-11-14 16:14:44 16:14:44.224 [notice] Application app exited: exited in: App.Application.start(:normal, [])
2023-11-14 16:14:44     ** (EXIT) an exception was raised:
2023-11-14 16:14:44         ** (MatchError) no match of right hand side value: {:error, "could not find file in local ca

@ndrean
Author

ndrean commented Nov 14, 2023

@LuchoTurtle, run File.ls("/app/.bumblebee/huggingface") and compare it to your local data, because I had a difference.

@ndrean
Author

ndrean commented Nov 14, 2023

My image is 580 MB though, so I can't test it on a free machine.

@LuchoTurtle
Member

I've added the following code to mimic cached_download/2 and, as you can see, the filename matches and can be found inside /app/.bumblebee/huggingface.

    url = "https://huggingface.co/api/models/microsoft/resnet-50/tree/main" |> :erlang.md5() |> Base.encode32(case: :lower, padding: false)
    metadata_filename = url <> ".json"
    dbg(metadata_filename)
    dbg(File.ls("/app/.bumblebee/huggingface") |> IO.inspect)

On startup, it yields...

2023-11-14T16:27:57.716 app[683d529c575228] mad [info] metadata_filename #=> "7p34k3zbgum6n3sspclx3dv3aq.json"

2023-11-14T16:27:57.717 app[683d529c575228] mad [info] {:ok,

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] ["45jmafnchxcbm43dsoretzry4i.json",

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] "7p34k3zbgum6n3sspclx3dv3aq.k4xsenbtguwtmuclmfdgum3enjuwosljkrbuc42govrhcudqlbde6ujc",

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] "45jmafnchxcbm43dsoretzry4i.eiztamryhfrtsnzzgjstmnrymq3tgyzzheytqmrzmm4dqnbshe3tozjsmi4tanjthera",

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] "7p34k3zbgum6n3sspclx3dv3aq.json",

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] "6scgvbvxgc6kagvthh26fzl53a.ejtgmobrgyzwcmjtgiztgmztgezdmnzqgzsdmnbzmnstom3fmnsdontfgq2wimrugfrdimtegyzdgzdfme3ggnzsgm3dsmddmftgkmbxei",

2023-11-14T16:27:57.718 app[683d529c575228] mad [info] "6scgvbvxgc6kagvthh26fzl53a.json"]}
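
The same hashing can be pointed at the URL from the earlier "could not find file in local cache" error, to check whether that particular file ever made it into the cache. This is only a sketch: CacheProbe is a made-up module name, and it assumes that every cached file gets a metadata file named this way under <cache dir>/huggingface, which is what the listing above suggests.

```elixir
defmodule CacheProbe do
  # Same hashing as the snippet above: md5 of the URL, base32, lowercase,
  # no padding, with a ".json" suffix.
  def metadata_filename(url) do
    hash = url |> :erlang.md5() |> Base.encode32(case: :lower, padding: false)
    hash <> ".json"
  end

  # True when the metadata file for `url` exists under <cache_dir>/huggingface.
  def cached?(cache_dir, url) do
    [cache_dir, "huggingface", metadata_filename(url)]
    |> Path.join()
    |> File.exists?()
  end
end

# URL taken from the error message earlier in the thread.
url = "https://huggingface.co/microsoft/resnet-50/resolve/main/preprocessor_config.json"

IO.inspect(CacheProbe.cached?("/app/.bumblebee/", url),
  label: "preprocessor_config.json cached?"
)
```

If this prints false inside the container while the tree listing is present, the featurizer config was never downloaded, which would match the error.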

@LuchoTurtle
Member

LuchoTurtle commented Nov 14, 2023

This is my question! If you create a volume, can you ssh into it, even if no app is running (considering we are in the same region), or the simple "mounts" in the fly.toml will populate it?

I don't think you can ssh into it without the app successfully running. I've tried and it kicks me out every time it attempts to restart the server (expected). As far as I can tell, [mounts] won't populate anything, it's just pointing to where we want the model to be. Though I don't quite understand how I'm meant to populate the volume in an automated manner lmao.

@ndrean
Author

ndrean commented Nov 14, 2023

I think you can just reference the volume by its name.

This is why I decided to try a VPS

@ndrean
Author

ndrean commented Nov 14, 2023

as you can see, the filename matches and can be found inside /app/.bumblebee/huggingface.

Does it match all the files you have locally for this model? For me, no.

@jeregrine

jeregrine commented Nov 14, 2023 via email

@LuchoTurtle
Member

LuchoTurtle commented Nov 14, 2023

Yeah, there's no use putting more effort into a dead-end. So I'm downloading the model on the first boot up and then reusing it on subsequent restarts.

  def start(_type, _args) do
    # Checking if the models have been downloaded
    models_folder_path = Path.join(System.get_env("BUMBLEBEE_CACHE_DIR"), "huggingface")

    if not File.exists?(models_folder_path) or File.ls!(models_folder_path) == [] do
      load_models()
    end

    children = [
    ...
  end

  def load_models do
    # ResNet-50 -----
    {:ok, _} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, _} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})
  end

  def serving do
    # ResNet-50 -----
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50", offline: true})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50", offline: true})

    Bumblebee.Vision.image_classification(model_info, featurizer,
      top_k: 1,
      compile: [batch_size: 10],
      defn_options: [compiler: EXLA],
      preallocate_params: true        # needed to run on `Fly.io`
    )

  end

Instead of using BUMBLEBEE_OFFLINE, I'm passing the :offline option when loading the models, so they're fetched locally.
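
For reference, the first-boot check on its own could look like the sketch below. ModelCache and populated?/1 are illustrative names, not something from the repo, and the "huggingface" subfolder is taken from the File.ls output earlier in the thread.

```elixir
defmodule ModelCache do
  # Hypothetical helper: true when <cache_dir>/huggingface exists and
  # contains at least one entry.
  def populated?(cache_dir) do
    case File.ls(Path.join(cache_dir, "huggingface")) do
      {:ok, [_ | _]} -> true
      # Missing directory, unreadable directory, or empty listing.
      _ -> false
    end
  end
end

cache_dir = System.get_env("BUMBLEBEE_CACHE_DIR", "/app/.bumblebee")
IO.inspect(ModelCache.populated?(cache_dir), label: "model cache populated?")
```

Note that this only tells an empty cache from a non-empty one. A cache with some files missing still passes the check, so the offline load can still fail in that case.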

@LuchoTurtle
Member

as you can see, the filename matches and can be found inside /app/.bumblebee/huggingface.

Does it match all the files you have locally for this model? For me, no.

The filename from cached_download/2 matches one of them (as expected). Though I see two files are missing when compared to the local one 💭

Regardless, no use to keep trying with the Dockerfile-only approach anymore 😅

@ndrean
Author

ndrean commented Nov 14, 2023

So like me, some files are missing. Also the versions of Bumblebee and Nx are quite unstable. I decided to fork your repo to run the same code as you did, but guess what, I can't even run mix phx.server because:

function EXLA.NIF.start_log_sink/1 is undefined (module EXLA.NIF is not available)

Starting to lose my patience a bit. But maybe more importantly, the documentation isn't reliable? Take a look: https://github.com/elixir-nx/bumblebee/tree/main/examples/phoenix#tips. What is really working?

@LuchoTurtle
Member

LuchoTurtle commented Nov 14, 2023

So like me, some files are missing. Also the versions of Bumblebee and Nx are quite unstable. I decided to fork your repo to run the same code as you did, but guess what, I can't even run mix phx.server because:

function EXLA.NIF.start_log_sink/1 is undefined (module EXLA.NIF is not available)

Starting to lose my patience a bit. But maybe more importantly, the documentation isn't reliable? Take a look: elixir-nx/bumblebee@main/examples/phoenix#tips. What is really working?

That's odd, I've never had that error happen to me. Does clearing out the deps and running mix deps.get again fix it?
https://elixirforum.com/t/exla-nif-start-log-sink-1-issue-works-on-ubuntu-but-not-on-macbook-m2/58162/3 says it's a Linux-related issue, but it apparently has been solved.

From the link that you provided, from what I've gathered, I did follow the Configuring Nx chapter and it seems to work the same as before. I can't really quantify it because I saw negligible impact after configuring it like so 🤷‍♂️

It's a shame we can't really trust their docs when some of the articles and guides we've discussed and tried to follow clearly don't work :/

@ndrean
Author

ndrean commented Nov 14, 2023

Yes, I erased everything, mix.lock, deps, _build and restarted again. It works now.....

@ndrean
Author

ndrean commented Nov 14, 2023

As for running with or without EXLA: yes, there is a huge difference.
I did try a few months ago a very simple neural network with Axon, just to compute a linear regression by a simple gradient, and the difference was HUGE. Now, with what we are doing, no idea 😁

@ndrean
Author

ndrean commented Nov 14, 2023

I understand (?) that we are more or less loading the coefficients of some kind of process that is used to define some Axon operations to build the neural network defined by the model. But what we are doing exactly, I have absolutely no idea. But for sure, it is not rocket science 😁

@ndrean
Author

ndrean commented Nov 14, 2023

I recalled I made a Livebook last year to test Nx and Axon. The idea was to use a very simple example: linear regression. You start with the very well known matrix formulas to compute the exact solution, whether by inverting a matrix or by using formulas. The best-fitting line passing through a bunch of points is of course the line which gives the minimal total Euclidean distance between the line and the points. You compare this to a simple gradient descent, and since we are crazy, we can even use a NN! You "pompously" train your NN with your points to build the coefficients of your NN. Then you can use it: given an input x, it finds a y.
If you are interested, just to see what this is about, I can paste the Livebook somewhere.
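
To give an idea without Nx or Axon, here is a tiny plain-Elixir sketch of the first two steps (exact formulas versus gradient descent). It is not the Livebook code; the module name, the sample points, the learning rate and the step count are all made up for the illustration, and the loss used is the mean squared vertical error.

```elixir
defmodule LinReg do
  # Closed-form least squares for y = a*x + b.
  def exact(points) do
    n = length(points)

    {sx, sy, sxx, sxy} =
      Enum.reduce(points, {0.0, 0.0, 0.0, 0.0}, fn {x, y}, {sx, sy, sxx, sxy} ->
        {sx + x, sy + y, sxx + x * x, sxy + x * y}
      end)

    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    {a, b}
  end

  # Plain batch gradient descent on the mean squared error, starting at {0, 0}.
  def gradient_descent(points, lr \\ 0.05, steps \\ 2_000) do
    n = length(points)

    Enum.reduce(1..steps, {0.0, 0.0}, fn _step, {a, b} ->
      {grad_a, grad_b} =
        Enum.reduce(points, {0.0, 0.0}, fn {x, y}, {ga, gb} ->
          err = a * x + b - y
          {ga + 2 * err * x / n, gb + 2 * err / n}
        end)

      {a - lr * grad_a, b - lr * grad_b}
    end)
  end
end

# Points lying exactly on y = 2x + 1.
points = [{0.0, 1.0}, {1.0, 3.0}, {2.0, 5.0}, {3.0, 7.0}]

IO.inspect(LinReg.exact(points), label: "exact")
IO.inspect(LinReg.gradient_descent(points), label: "gradient descent")
```

Both approaches should land on (or very close to) the same slope and intercept; the NN version does the same thing with more machinery.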

@LuchoTurtle
Member

Do show!

For my first NN I did something similar with Stochastic Gradient Descent -> https://github.com/LuchoTurtle/bike-sharing-patterns/blob/master/Your_first_neural_network.ipynb.

Curious to see what you've done :)

@nelsonic
Member

@LuchoTurtle FYI: that link is 404 ... 🔗 🙈
image
Is it public? 🌎

@LuchoTurtle
Member

Should be now, thanks 👌

@ndrean
Author

ndrean commented Nov 15, 2023

Here: https://github.com/ndrean/linear_regression_nx_axon
But this is terribly super basic work....
