Ideas from gurobipy-pandas #3212
So the type of |
Yes. Here's something I've just been playing with:
model = Model(HiGHS.Optimizer)
@variable(model, x[foods.name] >= 0)
foods.x = x.data
@objective(model, Min, foods.cost' * foods.x);
# A few ways of writing the same thing
for row in eachrow(limits)
    @constraint(model, row.min <= sum(foods[!, row.name] .* foods.x) <= row.max)
end
for df in DataFrames.groupby(limits, :name)
    @constraint(model, df.min[1] <= sum(foods[!, df.name[1]] .* foods.x) <= df.max[1])
end
for df in DataFrames.groupby(
    DataFrames.leftjoin(
        DataFrames.stack(foods, [:calories, :protein, :fat, :sodium]),
        limits;
        on = [:variable => :name],
    ),
    :variable,
)
    @constraint(model, df.min[1] <= sum(df.value .* df.x) <= df.max[1])
end
end
The groupby stuff makes it harder (for me) to read. I think we get a lot of mileage out of the existing {Dense,Sparse}AxisArray stuff, and a lot of functionality that comes for free in Julia. But the practice of adding a column of variables to a DataFrame is a useful tip. I wonder if there are other models with more complicated data frames. |
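For anyone who wants to try the "column of variables" tip without the diet data at hand, here is a minimal, self-contained sketch. The foods and limits tables below are made-up stand-ins (not the tutorial's data), and the column is filled with a comprehension rather than x.data, so treat it as an illustration of the pattern rather than the tutorial's code:

```julia
using JuMP, DataFrames

# Hypothetical stand-in data; the real tutorial tables have more rows and columns.
foods = DataFrame(
    name = ["hamburger", "chicken"],
    cost = [2.49, 2.89],
    calories = [410.0, 420.0],
)
limits = DataFrame(name = ["calories"], min = [1800.0], max = [2200.0])

model = Model()
@variable(model, x[foods.name] >= 0)

# Store one variable per row, in the same order as the rows of `foods`.
foods.x = [x[n] for n in foods.name]

@objective(model, Min, foods.cost' * foods.x)

# One two-sided constraint per nutrient: pick the nutrient column by name
# and multiply it elementwise with the column of variables.
for row in eachrow(limits)
    @constraint(model, row.min <= sum(foods[!, row.name] .* foods.x) <= row.max)
end
```

Because foods.x is an ordinary Vector{VariableRef}, the usual DataFrames operations (filtering, joins, grouping) carry the variables along with the data in each row.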
I agree; the first one (
As far as I can tell, AnyMOD takes this idea to the limit:
|
FYI, I always enjoy re-reading this paper by Robert 4er. (Also, love the screenshots from his 1990s Mac user interface.) |
cc @leonardgoeke: any thoughts on ways JuMP could improve re dataframes? |
Overall, I’m very happy with the options JuMP and DataFrames are providing already. To make their combination more accessible, you could facilitate the conversion of containers. In the example code below creating and storing a variable in a DataFrame is significantly more complex than using the JuMP containers. This example is still simplified, because it is not sparse, and I would rather map the content of a and b to integers and use those in the DataFrame to improve the performance of join or groupby.
Also, I miss an in-place multiplication. In the process of combining variables into expressions and ultimately creating constraints, I can use add_to_expression! to sum efficiently, but I’m not aware of a similar option for multiplication.
|
I'd write your first example differently:
julia> using JuMP, DataFrames
julia> a = ["high","low"]
2-element Vector{String}:
 "high"
 "low"
julia> b = ["red","blue"]
2-element Vector{String}:
 "red"
 "blue"
julia> # Option 1
model = Model();
julia> df = DataFrame(
           vec([
               (a = i, b = j, var = @variable(model, base_name = "x[$i,$j]")) for
               (i, j) in Iterators.product(a,b)
           ])
       )
4×3 DataFrame
 Row │ a       b       var
     │ String  String  Variable…
─────┼──────────────────────────────
   1 │ high    red     x[high,red]
   2 │ low     red     x[low,red]
   3 │ high    blue    x[high,blue]
   4 │ low     blue    x[low,blue]
julia> # Option 2
model = Model();
julia> @variable(model, x[a, b])
2-dimensional DenseAxisArray{VariableRef,2,...} with index sets:
    Dimension 1, ["high", "low"]
    Dimension 2, ["red", "blue"]
And data, a 2×2 Matrix{VariableRef}:
 x[high,red]  x[high,blue]
 x[low,red]   x[low,blue]
julia> df = DataFrame(
           vec([(a = i, b = j, var = x[i, j]) for (i, j) in Iterators.product(a,b)])
       )
4×3 DataFrame
 Row │ a       b       var
     │ String  String  Variable…
─────┼──────────────────────────────
   1 │ high    red     x[high,red]
   2 │ low     red     x[low,red]
   3 │ high    blue    x[high,blue]
   4 │ low     blue    x[low,blue]
julia> # Option 3
model = Model();
julia> @variable(model, x[a, b])
2-dimensional DenseAxisArray{VariableRef,2,...} with index sets:
    Dimension 1, ["high", "low"]
    Dimension 2, ["red", "blue"]
And data, a 2×2 Matrix{VariableRef}:
 x[high,red]  x[high,blue]
 x[low,red]   x[low,blue]
julia> df = DataFrame(
           vec([
               (i = i, j = j, ai = ai, bj = bj, var = x[ai, bj]) for
               ((i, ai), (j, bj)) in Iterators.product(enumerate(a),enumerate(b))
           ])
       )
4×5 DataFrame
 Row │ i      j      ai      bj      var
     │ Int64  Int64  String  String  Variable…
─────┼────────────────────────────────────────────
   1 │     1      1  high    red     x[high,red]
   2 │     2      1  low     red     x[low,red]
   3 │     1      2  high    blue    x[high,blue]
   4 │     2      2  low     blue    x[low,blue]
For the doubling, you can use:
julia> model = Model();
julia> @variable(model, x)
x
julia> aff_expr = 2 * x + 1
2 x + 1
julia> map_coefficients_inplace!(x -> 2x, aff_expr)
4 x + 2 |
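As a rough sketch of how the same call might be used when the expressions live in a DataFrame column: the scale and expr column names below are hypothetical, chosen only for illustration, and each row's expression is scaled in place by that row's factor.

```julia
using JuMP, DataFrames

model = Model()
@variable(model, x[1:2])

# Hypothetical table: one affine expression per row plus a per-row scale factor.
df = DataFrame(
    scale = [2.0, 3.0],
    expr = [1.0 * x[1] + 1.0, 1.0 * x[2]],
)

# Multiply every coefficient (and the constant term) of each expression
# by its row's factor, mutating the stored expression.
for row in eachrow(df)
    s = row.scale
    map_coefficients_inplace!(c -> s * c, row.expr)
end
```

The expressions are mutated where they sit, so the expr column reflects the scaled values without being reassigned.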
Is there anything actionable here? I think the conclusion is that JuMP already provides the necessary functionality to work with DataFrames.jl. |
I'm going to close this. If anyone has any further suggestions, please comment below and I will re-open the issue. |
Hello, this is more a question than a suggestion, so if it's not appropriate in this closed issue, please redirect me. Thanks! @jd-foster, I am wondering if you know of any code implementations of the method described in the Fourer paper, especially for generating synthetic data for the "hierarchical" or "relational" database schemas (I'm much more interested in the relational schema, at least for now). I'd like to test some ideas I have regarding in-memory DBs and JuMP, but I'm not familiar with steel production, so I don't know how to come up with synthetic data whose parameters resemble what he describes in the article. |
Please ask these types of questions on https://discourse.julialang.org/c/domain/opt/13. For SQLite examples, see: https://jump.dev/JuMP.jl/stable/tutorials/linear/multi/ |
I'm currently listening to @simonbowly talk about his cool work integrating gurobipy and pandas: https://github.com/Gurobi/gurobipy-pandas
I don't know if we can make it easy for JuMP to work with DataFrames without an extension package (although things might just work and it just requires some documentation), but it'd be nice to improve this:
JuMP.jl/docs/src/tutorials/linear/diet.jl
Lines 102 to 106 in 451740a
by adding x as a column to the data frame instead of a separate variable. At minimum, it'd allow
but potentially some variant of
You could also imagine building constraints/expressions via split-apply-combine. I'd imagine there are lots of models where the variables correspond to a row in a dataframe.
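A minimal sketch of what split-apply-combine could look like with plain DataFrames.jl, assuming a long-format table where each row owns one variable. The column names (a, b, cap, var, total) and the data are invented for illustration:

```julia
using JuMP, DataFrames

model = Model()

# Hypothetical long-format data: one row per (a, b) pair, with a capacity per `a`.
df = DataFrame(
    a = ["high", "low", "high", "low"],
    b = ["red", "red", "blue", "blue"],
    cap = [10.0, 5.0, 10.0, 5.0],
)

# One anonymous variable per row, stored as a column.
df.var = @variable(model, [1:nrow(df)], lower_bound = 0)

# Split by `a`, apply `sum` to the variables in each group, and combine
# the results into one row per group holding an affine expression.
totals = combine(groupby(df, :a), :var => sum => :total, :cap => first => :cap)

# One capacity constraint per group.
for row in eachrow(totals)
    @constraint(model, row.total <= row.cap)
end
```

Here each group has two rows, so every entry of totals.total is an AffExpr. A group with a single row would need extra care to keep the column type uniform.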