Finalize SIGMOD 2024 paper ~(if accepted)~ #8373

Closed
2 of 5 tasks
alamb opened this issue Nov 30, 2023 · 54 comments

@alamb
Contributor

alamb commented Nov 30, 2023

UPDATE: Final paper: https://dl.acm.org/doi/10.1145/3626246.3653368 (alternate download)

Is your feature request related to a problem or challenge?

@JayjeetAtGithub, @Dandandan, @yjshen, @ozankabak, @sunchao, @viirya, and I submitted a paper to the SIGMOD 2024 conference, which was tracked by #6782.

If our paper is accepted, this ticket tracks follow-on work items to complete prior to the final copy.

For the Industrial Track the dates are:

  • All deadlines below are 11:59 PM Pacific Time.
  • Paper submission: Thursday, November 30, 2023
  • Notification of accept/reject: Wednesday, January 31, 2024
  • Camera-ready deadline: Thursday, March 28, 2024

Describe the solution you'd like

Here are the items I know so far:

  • Fix the (currently) non-working email for @JayjeetAtGithub (jchakraborty@influxdata.com currently does not work)
  • Clean up bibliography into a consistent style (sometimes all authors are listed, sometimes just the first one is -- they should all be the same)

Nice to haves:

Describe alternatives you've considered

No response

Additional context

No response

@alamb alamb added the enhancement New feature or request label Nov 30, 2023
@alamb
Contributor Author

alamb commented Jan 9, 2024

Here is what we submitted:
DataFusion_Query_Engine___SIGMOD_2024.pdf

@vertexclique
Contributor

I can check results and update them, please assign it to me.

@alamb
Contributor Author

alamb commented Feb 1, 2024

FWIW the notification deadline was yesterday but I have not heard anything one way or the other (and the CMT tool doesn't say one way or the other). I will email the chairs tomorrow if we haven't heard by then

@alamb
Contributor Author

alamb commented Feb 2, 2024

I emailed the chairs today and they said the notification will be delayed a few days. Will post updates here as I have them.

@ozankabak
Contributor

Thanks for the update 🚀

@viirya
Member

viirya commented Feb 2, 2024

Thank you @alamb

@alamb
Contributor Author

alamb commented Feb 4, 2024

The paper was accepted to SIGMOD! 🎉

I'll spend some time reviewing the comments later this week and we can organize action items for the final draft

From: Microsoft CMT <email@msr-cmt.org>
Date: Sun, Feb 4, 2024 at 11:28 AM
Subject: SIGMOD 2024 Industry Track decision for Paper 1
To: Andrew Lamb 

Dear Andrew  Lamb,
 
It is our great pleasure to inform you that your paper #1 "Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine" has been Accepted to the conference. Congratulations!

The papers will be presented at SIGMOD 2024, in Santiago, Chile in June, so please plan for at least one author of the paper to attend the conference. This might require a visa, so please consult the following page at your earliest convenience https://2024.sigmod.org/visa_chile.shtml .

We hope that you will find the reviews helpful in revising accordingly the camera-ready version of the manuscript. Please note that the papers will appear in PACMMOD, like the Research Track papers of SIGMOD 2024. The formatting guidelines for the camera-ready papers are available at: https://dl.acm.org/journal/pacmmod/author-guidelines#formatting , under the "Length and Format for Camera-Ready Papers" section.
 
Congratulations again and looking forward to seeing you at SIGMOD 2024!

Danica and Ippokratis.

@viirya
Member

viirya commented Feb 4, 2024

Cool! Congrats to all!

@sunchao
Member

sunchao commented Feb 4, 2024

This is great news! congrats all!

@JayjeetAtGithub
Contributor

Congratulations everyone !

@matthewmturner
Contributor

Great news! Congratulations to all involved!

@liurenjie1024
Contributor

Congratulations everyone !

@alamb changed the title from "Finalize SIGMOD 2024 paper (if accepted)" to "Finalize SIGMOD 2024 paper ~(if accepted)~" on Feb 5, 2024
@alamb
Contributor Author

alamb commented Feb 6, 2024

Here is the reviewer feedback

Reviewer #2
Questions

  1. Is the paper readable and well organized?
    Definitely - very clear

  2. Does this paper present a significant addition to the body of work in the area of data management research?
    Definitely - a significant addition

  3. Is the paper likely to have a broad impact on the data management community?
    SIGMOD attendees will learn something interesting from the paper
    The paper is likely to influence research in the community

  4. Overall rating
    Accept

  5. Reviewer’s confidence
    Expert

  6. Strong points
    • Good presentation of the Apache Arrow DataFusion open-source project.
    • DataFusion efficiently implements operators that can be used by various other data systems, avoiding their cumbersome re-implementation.
    • Good experimental results versus DuckDB (which is an extremely well optimized embeddable analytics database).
    • I really appreciate how the DataFusion community was involved even in writing this paper. See here: Write DataFusion paper for (SIGMOD / VLDB / ICDE) #6782

  7. Weak points
    • Minor: although well-engineered, the algorithms behind the supported operators are not new. DataFusion implements well-known techniques.

  8. Overall comments
    The paper describes the functionality of DataFusion, a very well-designed and implemented library based on Apache Arrow, which implements a variety of operators used in SQL. Similar to Arrow, DataFusion is an embeddable library (built in Rust), which can easily be embedded in broader data systems that require analytical operations. The paper includes a nice experimental evaluation versus DuckDB, demonstrating good results.

Reviewer #5
Questions

  1. Is the paper readable and well organized?
    Definitely - very clear
  2. Does this paper present a significant addition to the body of work in the area of data management research?
    Mostly - the contributions are above the bar
  3. Is the paper likely to have a broad impact on the data management community?
    SIGMOD attendees will learn something interesting from the paper
  4. Overall rating
    Accept
  5. Reviewer’s confidence
    Expert
  6. Strong points
    • The paper is well written.
    • Extensive evaluation using 3 popular benchmarks.
    • An active community-driven project.
  7. Weak points
    • The DataFusion project is a combination and integration of other well-known components/systems; as such, its overall technical novelty is limited.
    • The experimental evaluation didn't compare against many other popular OLAP systems in the field.
    • The support for complex analytical queries (e.g., multi-way join as those found in TPC-DS) is limited.
  8. Overall comments
    This paper is well written and the DataFusion project has a good momentum in the community. The idea of building an OLAP engine using a decoupled, component-based approach is interesting (versus tightly coupled designs). The paper has described most elements in DataFusion, but didn't offer enough details to demonstrate sufficient technical novelty (that goes beyond integration of various existing components). How to better suit the cloud environment where most OLAP engines are running on nowadays is also not discussed in the paper.

Reviewer #7
Questions

  1. Is the paper readable and well organized?
    Mostly - the presentation has minor issues, but is acceptable
  2. Does this paper present a significant addition to the body of work in the area of data management research?
    Mostly - the contributions are above the bar
  3. Is the paper likely to have a broad impact on the data management community?
    SIGMOD attendees will learn something interesting from the paper
  4. Overall rating
    Reject
  5. Reviewer’s confidence
    Knowledgeable
  6. Strong points
  • Presents the technologies that power DataFusion and provides motivating use cases for using DataFusion, making a compelling argument over reuse in analytic systems using commodity OLAP engines and a paradigm shift in that direction.
  • Provides extensive evaluation of DataFusion's performance.
  • Presents DataFusion's architecture, extension APIs and features.
  7. Weak points
  • One of the main claims of the paper is that DataFusion is catalyzing the development of new data systems. The presentation and the evaluation of the paper would benefit from elaborating further on this claim.
  • Section 5.1 Engine overview and Figure 2 need to be more extensive to be able to follow the rest of Section 5.
  • The LLVM analogy distracts from the paper.
  • The paper claims in section 7.4 that "..DataFusion can be customized for these different environments using the MemoryPool trait to control memory allocations, the DiskManager trait for managing temporary files (if any), and a CacheManager for caching information such as directory contents and per-file metadata.". More technical details on this topic would be helpful.
  • The "Single Core Efficiency" section could benefit from running TPC-H across multiple threads and configuration settings. The authors mention a caveat of restricting duckDB performance for some benchmarks by using single thread.
  8. Overall comments
    I would like to thank the authors for their work. Please find some additional minor comments below:
  • Please move the figure out of the first page, or to the bottom of the first page. It is distracting to read the caption of Figure 1 before the abstract.

  • Please update the axes in Figure 7 to be legible.

  • One of the main topics of the paper is that DataFusion catalyzes the development of new data systems. Evaluation in that direction would help support the claims in the paper further. One related angle could be the ease of developing systems (applications) on top of DataFusion, potentially including the overhead in terms of lines of code or engineering hours in developing a simple system/application with DataFusion and using a different stack or being customly built. Similarly, performance evaluation of systems relying on DataFusion could help in this direction as well.

  • Similarly, the content of the paper would benefit from doing a deep dive into the query engine and a limited set of features based on how they are used by systems developed on DataFusion.

@alamb
Contributor Author

alamb commented Feb 7, 2024

It appears we have about 2 months to complete the final draft

Camera-ready deadline: Thursday, March 28, 2024

Here is a summary of my suggested action items based on the reviewer feedback above

  • Add more examples / better explanation of systems built on DataFusion (we have some good new examples I know of since then -- Arroyo, Comet, and LanceDB come to mind)
  • Please move the figure out of the first page, or to the bottom of the first page. It is distracting to read the caption of Figure 1 before the abstract.
  • "Section 5.1 Engine overview and Figure 2 need to be more extensive to be able to follow the rest of Section 5."
  • "The LLVM analogy distracts from the paper." - I happen to like this analogy (and I think @ozankabak does too), but maybe we can make this section shorter / more concise.
  • Extend section 7.4's description with technical details about MemoryPool, DiskManager, and CacheManager for caching information such as directory contents and per-file metadata.
  • "The authors mention a caveat of restricting duckDB performance for some benchmarks by using single thread." -- We should make what we measured clearer
  • work in "One related angle could be the ease of developing systems (applications) on top of DataFusion, potentially including the overhead in terms of lines of code or engineering hours in developing a simple system/application with DataFusion and using a different stack or being customly built. Similarly, performance evaluation of systems relying on DataFusion could help in this direction as well."

Here are some other notes I have

The main criticism / weakness cited is that DataFusion doesn't demonstrate sufficient technical novelty other than integration of various existing ideas. I think this is a very valid point, and maybe we should emphasize more strongly that the contribution isn't the technical novelty of any individual part, but the overall system.

How to better suit the cloud environment where most OLAP engines are running on nowadays is also not discussed in the paper.

This is a good point that would be good to work in

Similarly, the content of the paper would benefit from doing a deep dive into the query engine and a limited set of features based on how they are used by systems developed on DataFusion.

I agree this would be an interesting point, but given that we are already at the 12 page limit I am not sure how to do so in this particular paper. Maybe these would make good follow on papers or blog posts (@appletreeisyellow and I could potentially write one on how InfluxDB uses PruningPredicates 🤔 )

@sunchao
Member

sunchao commented Feb 7, 2024

To update the draft I'm assuming we can just reuse the same overleaf project? we'd be happy to touch a bit more on the Comet side, and update the sentence 😂

DataFusion is used by several Spark native runtimes, including Blaze [10] and at least one project that is not yet open-source.

@viirya
Member

viirya commented Feb 7, 2024

Yea, as Comet now is open sourced, we can explicitly mention the project (with project link) and more details about it.

@alamb
Contributor Author

alamb commented Feb 8, 2024

To update the draft I'm assuming we can just reuse the same overleaf project? we'd be happy to touch a bit more on the Comet side, and update the sentence 😂

Yes, please, let's use the same overleaf project

Yea, as Comet now is open sourced, we can explicitly mention the project (with project link) and more details about it.

Yes please that would be great -- and it will also address some of the reviewer feedback suggesting more details on use cases

@ozankabak
Contributor

"The LLVM analogy distracts from the paper." - I happen to like this analogy (and I think @ozankabak does too), but maybe we can make this section shorter / more concise.

I think this analogy is very useful -- let's keep it. In my experience it also resonates with technical folks very well. Since this feedback seems like an outlier in terms of reception, I suggest we improve other aspects of the paper.

@alamb
Contributor Author

alamb commented Feb 8, 2024

@JayjeetAtGithub is there any chance you can update your email address to one that works (rather than the influxdata one that does not)?

Also, it would be great if someone could work on cleaning up the bibliography.

@alamb
Contributor Author

alamb commented Feb 8, 2024

Also, maybe we can add some other users like Seafowl (now part of EnterpriseDB), which I think could potentially be described as a postgres analytics accelerator (aka it is to postgres what comet is to spark). Maybe @gruuya can correct me if I got that wrong

@gruuya
Contributor

gruuya commented Feb 8, 2024

which I think could potentially be described as a postgres analytics accelerator (aka it is to postgres what comet is to spark)

Yeah, basically that's what we strive for, thanks!

@JayjeetAtGithub
Contributor

@alamb I updated my affiliation and email to that of UC Santa Cruz, my university.

@alamb
Contributor Author

alamb commented Feb 19, 2024

An update here: I plan to take a pass through the draft the week of March 4 and implement the bulk of any feedback that was not yet implemented. After that week I'll likely take a few proofreading passes, but I don't expect to do any major revisions

I also don't plan to rerun benchmarks again due to lack of time. While the benchmark runs themselves are nicely automated thanks to @JayjeetAtGithub, analyzing the results takes significant time and research.

@alamb
Contributor Author

alamb commented Mar 18, 2024

I also made the labels and series colors consistent in the scalability chart: JayjeetAtGithub/datafusion-duckdb-benchmark#26

@alamb
Contributor Author

alamb commented Mar 19, 2024

work in "One related angle could be the ease of developing systems (applications) on top of DataFusion, potentially including the overhead in terms of lines of code or engineering hours in developing a simple system/application with DataFusion and using a different stack or being customly built. Similarly, performance evaluation of systems relying on DataFusion could help in this direction as well."

Similarly, the content of the paper would benefit from doing a deep dive into the query engine and a limited set of features based on how they are used by systems developed on DataFusion.

I added these (great) ideas to the "future work" section; I think they would make excellent future papers

Screenshot 2024-03-19 at 6 34 30 AM

@alamb
Contributor Author

alamb commented Mar 19, 2024

Update: I plan to complete the final outstanding item from the reviewers (adding some technical details about memory pool and related APIs) tomorrow, and then I will move on to wordsmithing / honing for a few days. I am sure we (at least I) could obsess over the content indefinitely, but I think we need to just "ship it" eventually and we are getting very close

@alamb
Contributor Author

alamb commented Mar 20, 2024

Update: @viirya cleaned up some of the language, and I tweaked the images to make the internal margins even:
Screenshot 2024-03-20 at 6 03 07 AM

Screenshot 2024-03-20 at 6 03 37 AM

I also received a bunch of instructions from the ACM, so changed the title so the "A" wasn't capitalized

Screenshot 2024-03-20 at 6 04 18 AM

Extend section 7.4's description with technical details about MemoryPool, DiskManager, and CacheManager for caching information such as directory contents and per-file metadata.

I reworked this section with more details / references:

Screenshot 2024-03-20 at 6 40 45 AM

@alamb
Contributor Author

alamb commented Mar 20, 2024

I also took a pass through the bibliography to make the style consistent

@alamb
Contributor Author

alamb commented Mar 20, 2024

TLDR -- please make any updates you would like / need by this weekend (Sunday March 24, 2024).

Also, fellow authors, please ensure your names / affiliations are correct on the paper as I will submit that as part of the paper as well

Update:

  1. The current copy of the paper (with all edits) can be downloaded here: DataFusion_Query_Engine___SIGMOD_2024 (9).pdf
  2. I received an email (below) which suggests the deadline is actually April 12.
  3. I still plan to submit a draft / final manuscript on March 28.
  4. I plan to take one last editing pass through the paper this week/weekend to tighten up the language and make it more concise
  5. Starting next Monday, March 25, I plan to just read / update for typos and required formatting for submission on the 28th

Dear Andrew Lamb,

We are writing to you regarding the camera ready submission process. Thank you for your patience as we worked with ACM SIGMOD officers and Sheridan Communications, our publishers, to agree on the process and formatting. We apologize for the previously communicated change in formatting - upon extensive discussion we agreed that camera ready versions of the paper should use the same 2-column format as the initial submissions.

By now you should have received initial email from Sheridan including the information regarding how to submit the camera-ready publication and how to review and edit the paper metadata. You will also be receiving the ACM rightsreview form soon. The deadline for submitting the camera-ready manuscript and the ACM rightsreview form is April 12th. Please let us know if you didn't receive the email from Sheridan or have any other concerns about the process.

We apologize for the confusion and are looking forward to seeing you at SIGMOD 2024!

Danica and Ippokratis.

@alamb
Contributor Author

alamb commented Mar 20, 2024

Here is an email about ACM formatting guidelines in case anyone following along is interested in how this process works


Dear Author,

Please remember to submit "Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine," for publication in the proceedings and ACM DL.


SIGMOD'24 Authors. You are receiving this initial email request to format and submit your final version (paper and promo video) per the information below (on or before EoD April 12th)

  1. IMPORTANT: Only the first designated contact/corresponding author will be sent the ACM rightsreview form email in about 24 hours.

  2. Review https://www.scomminc.com/pp/acmsig/sigmod-pods2024.htm carefully as ACM is imposing all data and author fields must match on the PDF and ACM rightsreview form (stricter compliance to the template has been requested by ACM).
    Kindly coordinate the following with your co-authors to avoid having to repeat the process. Any discrepancies will void a completed form.
    A. Title in Initial Caps;
    B. How each authors' full name with middle initials or names should appear (on both PDF & ACM form) & update the form to include all author names, dept/lab, affiliations, and email addresses;
    C. How your ''dept/labs'' and ''institution/corp affiliations'' should be listed consistently (on both PDF & ACM form);
    D. Any funding or grants to be acknowledged.
    E. Upon completing the rightsreview form, the contact author will receive the ACM copyright block, conference data, ISBN, and DOI to include on the first page with the other required ACM sections per the ACM SIG (sigconf style) template https://www.scomminc.com/pp/acmsig/sigmod.htm

  3. Submit your final version only (pdf, source file, promo video, & optional thumbnail) to the link below on or before April 12th. Material submitted before the deadline will be processed and can be finalized for the ACM DL.


The following is a direct link to submit based on your unique submission ID: (REMOVED)

Thank you,

Sheridan Communications

@alamb
Contributor Author

alamb commented Mar 22, 2024

In order to keep consistent emails I changed the author emails to @apache.org for committers:

Screenshot 2024-03-22 at 4 30 36 PM

@alamb
Contributor Author

alamb commented Mar 23, 2024

I took another pass through the paper. In addition to some wordsmithing and whitespace engineering, I increased the size of the abstract, both so the front page didn't look as empty and so it summarizes the content of the paper (in addition to its conclusion / main point), to help readers decide if the paper is interesting to them

Here is the current text

Apache Arrow DataFusion\cite{DataFusion} is a fast, embeddable, and extensible query engine written in Rust\cite{Rust} that uses Apache Arrow\cite{Arrow} as its memory model. In this paper we describe the technologies on which it is built, and how it fits in long term database implementation trends. We then enumerate the features of a modern OLAP engine, and outline optimizations required for high performance. Next we describe DataFusion's architecture and extension APIs to illustrate the interfaces used in modular query engines to integrate with the systems built on them. Finally, we demonstrate open standards and extensible design do not preclude state-of-the-art performance using a series of experimental comparisons to DuckDB\cite{DuckDB}.

While the individual techniques used in DataFusion have been previously described many times, it differs from other industrial strength engines by providing competitive performance \textit{and} an open architecture that can be customized using more than 10 major extension APIs. This flexibility has led to use in many commercial and open source databases, machine learning pipelines, and other data-intensive systems. We anticipate that the accessibility and versatility of DataFusion, along with its competitive performance, will further the proliferation of high-performance custom data infrastructures tailored to specific needs assembled from modular components\cite{ComposableManifesto, ComposableCodex}.

Here is what it looks like

Screenshot 2024-03-23 at 5 51 20 PM

@alamb
Contributor Author

alamb commented Mar 23, 2024

I made it through Section 6 today, and I plan to start at Section 7 tomorrow for a final read through / polish.

Starting Monday I just plan to do whitespace engineering / proofreading

@alamb
Contributor Author

alamb commented Mar 24, 2024

Ok, I did a final read / wordsmith / whitespace engineering on the last few sections. I will plan to do a proofreading pass or two over the next few days, but don't plan to change anything unless there is some grammatical issue.

The current copy of the paper is here:
DataFusion_Query_Engine___SIGMOD_2024 (1).pdf

(getting very close)

@alamb
Contributor Author

alamb commented Mar 27, 2024

I am on the home stretch -- I plan to do a final proofreading of Sections 8, 9, 10, and 11 and then submit the draft tomorrow

@Dandandan
Contributor

Thank you Andrew! Sorry I couldn't spend more time on the paper.
I'll try to review the draft this evening.

@alamb
Contributor Author

alamb commented Mar 27, 2024

No problem! I think we are all quite busy -- I also want to submit the paper so I can let it go (my OCD tendencies would be to edit it indefinitely, which is not good :) )

@alamb
Contributor Author

alamb commented Mar 28, 2024

Here is the digital rights form that was submitted: 13731_1_1.pdf

Here is the final draft:
DataFusion_Query_Engine___SIGMOD_2024-FINAL.pdf

Here are the source files:
DataFusion Query Engine - SIGMOD 2024-SOURCE.zip

I am working to upload this to the CMT tool -- once I get confirmation they got it and it is accepted I'll (finally!) close this issue. Thanks all

@alamb
Contributor Author

alamb commented Mar 28, 2024

Screenshot 2024-03-28 at 9 30 05 AM

😅

@alamb
Contributor Author

alamb commented Apr 6, 2024

We needed to make a few tweaks on the final manuscript to conform to the publisher's rules

---------- Forwarded message ---------
From: New Form Needed <submissions@scomminc.com>
Date: Fri, Apr 5, 2024 at 9:39 AM
Subject: ACM Proceedings (sigmodpods NF) - Submission Follow Up, ID No: modIP001
To: <alamb@apache.org>
Cc: <yjshen@apache.org>, <dheres@apache.org>, <jayjeetc@ucsc.edu>, <ozankabak@apache.org>, <viirya@apache.org>, <sunchao@apache.org>, <Jyoti.Leeka@microsoft.com>, <venkatesh.emani@microsoft.com>


Dear Author(s),
For the submission of "Apache Arrow DataFusion: a Fast, Embeddable, Modular Analytic Query Engine" to the proceedings and ACM Digital Library.

Please fix the following on your submission and re-submit on or before April 12th

1. A. The completed ACM rightsreview form was voided, per ACM policy and instructions provided all authors names, accents, initials, authorship order, dept/labs, affiliations, titles must match precisely on the PDF and ACM form: https://www.scomminc.com/pp/acmsig/sigmod-pods2024.htm#AO
https://www.scomminc.com/pp/acmsig/sigmod-pods2024.htm#GA 
The following on your PDF and the rightsreview form do not match (see the following to see the differences). The ACM rightsreview form link will be sent to the designed contact author again.

Form -- Liang-Chi Hsieh: Apple;
Chao Sun: Apple

Files -- Chao Sun: Apple;
Liang-Chi Hsieh: Apple

 B. To update the rightsreview form is best done on a computer by the designated contact author, not a mobile device;  C. Update the rightsreview form data accordingly to match your final version;  D. Then at the bottom click the check box, then click ''Save Paper and Author Details'';  E. Then click ''Proceed to eRights Form'';  F. Then complete the form entirely until the designated contact/corresponding author receives a new ACM confirmation email.

2. Authors 6 & 7. Per ACM policy, we can not allow bundled/bunched or grouped authors in a long string: https://www.scomminc.com/pp/acmsig/sigmod-pods2024.htm#GA and [www.scomminc.com/pp/acmsig/ACM-sample-sigconf-section6.pdf](http://www.scomminc.com/pp/acmsig/ACM-sample-sigconf-section6.pdf)

3. To make the ACM even page headers consistent for this year's conference publication (per the instructions: https://www.scomminc.com/pp/acmsig/sigmod.htm#Le), kindly use the following after authors and affiliation info in your .tex file:

\renewcommand{\shortauthors}{Andrew Lamb et al.}
%% No italics

4. Due to a recent change in the ACM guidelines, the $15.00 within the ACM CR block is no longer needed & not included in the ACM code provided, please comment this out on your new version.
%\acmPrice{15.00}

Then review the new version of your submission for any typos and then submit/upload all the files (required pdf, required source, etc) to update your pdf/submission on or before deadline above to the following link:


<REDACTED>

Thank you,

New Form Needed

Sheridan Communications

I made the requested edits (layout of author names, removing the $15 price, and updating the short authors) and submitted a new draft:

Screenshot 2024-04-05 at 7 58 55 PM

Here is the next draft

DataFusion Query Engine - SIGMOD 2024-SOURCE-mk2.zip
DataFusion_Query_Engine___SIGMOD_2024-FINAL-mk2.pdf

(BTW I am sharing this not because I think anyone really cares, but because I thought others might be interested in how this process works)
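
For readers curious what these three changes look like in the source, here is a rough sketch of the relevant part of an `acmart` preamble. It is not copied from the submitted files: the `\shortauthors` and `\acmPrice` lines come from the Sheridan email above, while the separate `\author` blocks and the `\country` values are assumptions made for illustration.

```latex
% Sketch only: author blocks and \country values are illustrative assumptions.

% Authors 6 and 7 each get their own \author / \affiliation block
% instead of being grouped into one long string.
\author{Liang-Chi Hsieh}
\affiliation{%
  \institution{Apple}
  \country{USA}}  % assumption: the country is not stated in this thread

\author{Chao Sun}
\affiliation{%
  \institution{Apple}
  \country{USA}}  % assumption: the country is not stated in this thread

% Consistent even-page header, placed after the author and affiliation info.
\renewcommand{\shortauthors}{Andrew Lamb et al.}

% The price is no longer part of the ACM copyright block, so it is commented out.
%\acmPrice{15.00}
```

Whatever names, order, and affiliations appear in these blocks also have to match the ACM rightsreview form exactly, otherwise the form is voided.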

@alamb
Contributor Author

alamb commented Apr 6, 2024

Screenshot 2024-04-05 at 8 06 07 PM

@alamb
Contributor Author

alamb commented Apr 23, 2024

And apparently I still didn't get it entirely correct:

---------- Forwarded message ---------
From: Tim Pollitt <submissions@scomminc.com>
Date: Tue, Apr 23, 2024 at 8:46 AM
Subject: ACM Proceedings (sigmodpods TP) - Submission Follow Up, ID No: modIP001
To: <alamb@apache.org>
Cc: <yjshen@apache.org>, <dheres@apache.org>, <jayjeetc@ucsc.edu>, <ozankabak@apache.org>, <viirya@apache.org>, <sunchao@apache.org>


Dear Author(s),
For the re-submission of "Apache Arrow DataFusion: a Fast, Embeddable, Modular Analytic Query Engine" to the proceedings and ACM Digital Library.

Please fix the following on your submission and re-submit on or before April 25th

1. Please update the author order on your PDF to match what was listed on your ACM Rightsreview form:

Andrew Lamb, Yijie Shen, Daniël Heres, Jayjeet Chakraborty, Mehmet Ozan Kabak, Liang-Chi Hsieh, & Chao Sun


*Kindly note that per ACM policy, this data must match in both locations.

**Set reminders for the chairs' requested optional presentation video (see original prep instructions link https://www.scomminc.com/pp/acmsig/sigmod.htm).

 Then review your PDF submission for any typos and then submit/upload all the files (required pdf, required source, optional thumbnail etc) to update your pdf/submission on or before deadline provided above to:


<LINK>
Thank you,

Tim Pollitt

Sheridan Communications

@alamb
Contributor Author

alamb commented Apr 23, 2024

"third time's the charm"

Screenshot 2024-04-23 at 11 34 59 AM

DataFusion Query Engine - SIGMOD 2024-FINAL-mk3.zip

DataFusion_Query_Engine___SIGMOD_2024-FINAL-mk3

DataFusion_Query_Engine___SIGMOD_2024-FINAL-mk3.pdf

@alamb
Contributor Author

alamb commented Apr 29, 2024

Needed to tweak the title to have a A rather than a

---------- Forwarded message ---------
From: Tim Pollitt <submissions@scomminc.com>
Date: Mon, Apr 29, 2024 at 8:47 AM
Subject: ACM Proceedings (sigmodpods TP) - Submission Follow Up, ID No: modIP001
To: <alamb@apache.org>
Cc: <yjshen@apache.org>, <dheres@apache.org>, <jayjeetc@ucsc.edu>, <ozankabak@apache.org>, <viirya@apache.org>, <sunchao@apache.org>


Dear Author(s),
For the re-submission of "Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine" to the proceedings and ACM Digital Library.

Please fix the following on your submission and re-submit on or before May 1st

1.   Kindly make the ''A'' after the colon in the title a capital letter ''A''. This is a valid exception to the rule.

*Set reminders for the chairs' requested optional presentation video (see original prep instructions link https://www.scomminc.com/pp/acmsig/sigmod.htm).

 Then review your PDF submission for any typos and then submit/upload all the files (required pdf, required source, optional thumbnail etc) to update your pdf/submission on or before deadline provided above to:


http://www.scomminc.com/acm/submissions/submission.cfm?grid=sigmodpods&eid=5E5C547F61070302&eid2=6702

Thank you,

DataFusion-Thumbnail-mk4 jog
DataFusion Query Engine - SIGMOD 2024-FINAL-mk4.zip
DataFusion_Query_Engine___SIGMOD_2024-FINAL-mk4.pdf

Screenshot 2024-04-29 at 9 43 16 AM

@viirya
Member

viirya commented Apr 29, 2024

Needed to tweak the title to have a A rather than a

I remember we have discussed this A/a issue before in emails. We got the answer now. 😄

Thanks for dealing with the tweak.

@alamb
Contributor Author

alamb commented Apr 30, 2024

Needed to tweak the title to have a A rather than a

I remember we have discussed this A/a issue before in emails. We got the answer now. 😄

Thanks for dealing with the tweak.

No problem -- it seems like this process is still quite manual 😆 -- hopefully we have it all right now 🤞

@alamb
Contributor Author

alamb commented May 1, 2024

I think it is finally accepted 🎉



---------- Forwarded message ---------
From: Tim Pollitt <submissions@scomminc.com>
Date: Wed, May 1, 2024 at 2:44 PM
Subject: ACM Proceedings (sigmodpods TP) - Submission Follow Up, ID No: modIP001
To: <alamb@apache.org>
Cc: <yjshen@apache.org>, <dheres@apache.org>, <jayjeetc@ucsc.edu>, <ozankabak@apache.org>, <viirya@apache.org>, <sunchao@apache.org>


Dear Author(s),
For the re-submission of "Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine" to the proceedings and ACM Digital Library.

See below **** for info related to your final pdf.

In addition, authors are requested by the chairs to submit an optional presentation video.

(1) For more info, see instructions for details, and inquiry link to chairs:   https://www.scomminc.com/pp/acmsig/sigmod-video.htm by May 6th.

(2) You will need the DOI assigned to your submission: 10.1145/3626246.3653368

(3) Video descriptions over the 1,024 ACM character count will be deleted/removed by ACM.

*************

 We received notice of your completed ACM form, so everything appears to be in order with this submission, and will be utilized for the ACM DL publication. No revisions are needed, nor will be accepted. This submission is being moved onto the next stages of production (but kindly note the signed ACM forms still go through a double-check by ACM for various permissions).


Thank you,

Tim Pollitt

Sheridan Communications
                       
