Ideate new approach to formatting table/diagram structure in template #85
Comments
@ShiqiYang2022; @simonepandit; and @Erick11293: tagging you here as for #84. @jc-cisneros and I will handle all of the implementation steps here, but if you have any suggestions/ideas please do let us know in this thread! |
@snairdesai Thanks! I'm looking forward to this discussion. One other desideratum I'd add to your list: So far, I find the easiest & most flexible interface for editing tables to be spreadsheets like Excel or Google Sheets. I think we want whatever solution we settle on here to be close to that level of simplicity & flexibility. My prior continues to be that we may want to use Excel / Google Sheets rather than keeping everything in LaTeX/Overleaf. But I'm open to alternatives provided they can hit the desideratum. |
@ShiqiYang2022 I am adding you to this issue, as I think this would be a good time to revisit. Let me know if you have any thoughts on the above. |
@ShiqiYang2022 @snairdesai @jc-cisneros One other thought here: Whatever we decide here, there may be cases where we still want to output |
@gentzkow: Thanks for the note here! As an update, @ShiqiYang2022, @jc-cisneros and I met today to discuss this. We came up with 4-5 potential options, but want to explore the feasibility of each a bit more. We will post additional updates here shortly. |
Great. Thanks. To record, my preferred idea so far is to use Excel / Google Sheets / or similar to build tables. I'm imagining something like:
It would be great if you guys could make this one of the options you test. Let's also make sure we don't spend a ton of time exploring options before we've had a chance to sync on them. Fine to take a few hours to check feasibility, but I'd like to check in on what you're thinking about at that point. |
Thanks @gentzkow. Based on this preference, our primary proposal (which we will test for feasibility) is as follows:
(1) PI (or, as needed, RA) creates an Excel spreadsheet in the repository (for example, in
(2) RA replaces the random numbers in the "raw" tab with the correct computations produced through code (exactly as you note). We would need to ensure the order of the scalars exactly matches the corresponding placement in the "table tab". What's nice about this is that depending on the
(3) The "table tab" then fills the computed numbers by reference (exactly as you note).
(4) The RA exports the Excel "table tab" sheet to a
We don't yet know if this is fully possible (especially linking the scalars across the sheets and how comprehensive the translations from Excel to
We will prioritize the testing of this proposal above the others we considered -- it seems most likely to deliver what we hope. |
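The "raw" tab / "table tab" idea in steps (2) and (3) can be sketched in plain Python. This is only an illustration of filling a skeleton by reference, not the implementation used in the repository: the `=raw!A1` reference syntax, the sample numbers, and the names `resolve` and `fill_table` are all hypothetical, and a real workbook would be handled by Excel itself.

```python
import re

# Hypothetical "raw" tab: a matrix of scalars written by analysis code.
raw = [
    [0.123, 0.456],
    [1500, 1498],
]

# Hypothetical "table" tab: labels plus references into the raw tab.
# "=raw!A1" means column A, row 1 of the raw tab (spreadsheet-style).
table = [
    ["", "Treatment", "Control"],
    ["Mean outcome", "=raw!A1", "=raw!B1"],
    ["Observations", "=raw!A2", "=raw!B2"],
]

# Assumed reference format: a single column letter followed by a row number.
REF = re.compile(r"^=raw!([A-Z])(\d+)$")


def resolve(cell, raw_tab):
    """Return the raw value a reference points to, or the cell unchanged."""
    match = REF.match(str(cell))
    if match is None:
        return cell
    col = ord(match.group(1)) - ord("A")
    row = int(match.group(2)) - 1
    return raw_tab[row][col]


def fill_table(table_tab, raw_tab):
    """Resolve every reference in the table tab against the raw tab."""
    return [[resolve(cell, raw_tab) for cell in row] for row in table_tab]


filled = fill_table(table, raw)
for row in filled:
    print(row)
```

Because the table tab only holds references, the order of scalars in the raw tab has to stay fixed, which is the constraint noted in step (2).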
@snairdesai Thanks! First, I didn't mean to suggest that this is the only option you should consider. Happy to hear the other options too if at least one of you thinks they may be better. Second, a few clarifications on what I'm thinking, using the numbering above.
(1) I think we would want this to be
(2) I think the data import / updating should be automatic. E.g., if we can have the code output directly in
(4) I'm imagining we export to |
Thanks @gentzkow! We will think through these clarifications further. Here are our initial replies for now:
Definitely -- our plan is still to explore the other options. Some other considered approaches are in the dropdown below:
Additional Proposals
(1) Try to sync editing with
(2) Find an equivalent software to
(3) RA manually creates
(4) Think of ways to reduce the complexity of autofilling with
We do think the approach you outlined is closest to what we need.
Populating this in
I think we were thinking about this process in reverse. In our proposal, the PI would first come up with a proposed skeleton format for a table we want to build in Excel, and the RAs would then work towards computing, and then autofilling, the required values into the Excel sheet. It seems that your proposal from 2. (above) is for the RA to first produce the outputs in
In our proposal, the RA may start with a clearer sense of the intended deliverable. By referencing a skeleton table with the intended format we hope to arrive at, the RA knows exactly what scalars/outputs to produce, and can write code with this in mind (even if we don't need to automate the format itself in code). Of course, there's nothing preventing us from creating a desired skeleton in advance of creating outputs in either approach. Other than this, they both seem like viable procedures.
Here, we just meant that in our proposal, where the PI directly edits a skeleton Excel sheet in the repo, we can easily track (non-formatting) updates to the underlying spreadsheet and any new placeholder scalars which the PI adds or deletes when proposing changes (if it is not stored with
Let us know if our understanding here matches your proposed approach. |
Thanks @snairdesai. Getting closer to convergence.
|
Thanks @gentzkow! Got it, all of your points here make sense.
Yes, we were thinking of creating separate CSV/txt versions along with the Excel sheets (which would not preserve the formatting, but should maintain and track the general structure if we populate placeholders). However, we realize now that by simply tracking the results produced from the code (the matrix of numbers from the
We will keep you posted on progress on this proposal and the additional ones outlined above. For this proposal, we'll start by testing the exports with VBA (including table sizing). |
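To illustrate why tracking the matrix of numbers as plain text is enough for version control, here is a minimal sketch. It assumes the code output is a rectangular matrix of scalars; the helper name `matrix_to_csv` and the fixed six-decimal format are hypothetical choices, not something specified in this thread.

```python
import csv
import io


def matrix_to_csv(matrix):
    """Serialize a matrix of scalars to CSV text with fixed formatting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in matrix:
        # One fixed number format keeps diffs free of float-printing noise
        # (six decimals is an assumption made for this sketch).
        writer.writerow([f"{value:.6f}" for value in row])
    return buffer.getvalue()


# Two hypothetical runs of the analysis code: one scalar changes.
old = matrix_to_csv([[0.123, 0.456], [1500, 1498]])
new = matrix_to_csv([[0.123, 0.460], [1500, 1498]])

# A line-by-line comparison, as a text diff would show it.
changed = [
    index
    for index, (before, after) in enumerate(zip(old.splitlines(), new.splitlines()))
    if before != after
]
print(changed)
```

Only the row containing the changed scalar differs between the two files, so a normal text diff shows exactly which result moved, independent of any Excel formatting.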
Yes! I agree. If we are going to output from code directly to |
TODOs for me:
|
@snairdesai Just a reminder that we should iterate on this often -- I'd like to know what you have in mind before you go too deep into a particular solution. |
EDITED @gentzkow: Thanks for the reminder! Let me provide an update on the (exploratory) work I've done here so far. I've been focusing on your proposed approach above -- and in particular exporting from Excel to PDF and cropping the resulting PDF to only include the table object (from your point below):
I've written a (rough) function in WIP Procedure (follows comment above)
I will post a minimal example which you can test shortly, to see if this method is on the right track. |
Brilliant! This sounds excellent. : ) |
@ShiqiYang2022 thanks! addressing the parentheses issue separately sounds good to me. |
Lingering TODOs and Tasklist:
|
@jc-cisneros @ShiqiYang2022: as I work in parallel through the final action items, I think we are at a good point for the two of you to test this process, and tinker with the skeleton tables to see if any additional functionalities should be added. I don't think we should fully build out every imaginable functionality at this stage, but it would definitely help to iterate on necessities to get to a minimal working version (we should already be quite close). Per #85 (comment), let's not worry about the file path issue here. |
The action items above have been completed. Following testing by @ShiqiYang2022 and @jc-cisneros, I'll work on ensuring this works across OS/chips. |
@ShiqiYang2022 @jc-cisneros Let me know whenever you get a chance to test this out -- thanks for the support! |
@snairdesai flagging that I got the error below:
|
@jc-cisneros I think you'll just need to comment this line out in |
@snairdesai, that indeed fixed it. Got a successful run on my end! Aside from the small inconvenience of manually having to interact with Excel windows to grant permissions, it works smoothly. @snairdesai the idea is that this would be the default practice, right (i.e., using Excel to create table skeletons)? |
Great, thanks @jc-cisneros! That's good to hear. Some quick replies:
Agree that this is a bit annoying (especially since permissions must be granted for each additional file in the process). Users can also grant disk access for Excel directly (see here). While this is annoying and we could potentially find a workaround, this is a built-in security feature on Mac devices which I would recommend we do not try to bypass.
This I'm not sure about - I think the process is straightforward enough once users run through this a couple of times. I suppose it would depend on users' preferences. |
Thanks @snairdesai! I've modified the skeleton and confirm that all the changes are properly reflected in the final output. This properly replaces LyX + tablefill. |
Thanks @jc-cisneros! @ShiqiYang2022 Let us know when you get a chance to run through this a final time and we can proceed to merge (maybe it would be useful to test this on Windows as well as your Mac, since I definitely expect some bugs to arise there). |
@snairdesai Confirming the testing works well on my end! I am attaching the logs below.
(template) SIEPR-C02G50GUML86:template shiqiyang$ python run_all.py
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/utility.py:140: SyntaxWarning: invalid escape sequence '\#'
array = [line for line in array if not re.match('\#',line)]
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:109: SyntaxWarning: invalid escape sequence '\*'
if re.findall('\*', self.source) != re.findall('\*', self.destination):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:109: SyntaxWarning: invalid escape sequence '\*'
if re.findall('\*', self.source) != re.findall('\*', self.destination):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:112: SyntaxWarning: invalid escape sequence '\*'
if re.search('\*', self.source):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:126: SyntaxWarning: invalid escape sequence '\*'
if re.search('\*', self.source):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:156: SyntaxWarning: invalid escape sequence '\*'
regex = regex.split('\*')
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/private/movedirective.py:186: SyntaxWarning: invalid escape sequence '\*'
f = re.sub('\*', w, f, 1)
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/run_program.py:1044: SyntaxWarning: invalid escape sequence '\s'
regex = "end of do-file[\s]*r\([0-9]*\);"
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/tablefill.py:195: SyntaxWarning: invalid escape sequence '\{'
if not is_table and re.search('label\{tab:', doc[i]):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/tablefill.py:222: SyntaxWarning: invalid escape sequence '\{'
if re.search('end\{tabular\}', doc[i], flags = re.IGNORECASE):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/tablefill.py:252: SyntaxWarning: invalid escape sequence '\.'
if re.search('\.lyx', template):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/tablefill.py:254: SyntaxWarning: invalid escape sequence '\.'
elif re.search('\.tex', template):
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/tablefill.py:261: SyntaxWarning: invalid escape sequence '\s'
""".. Fill tables for template using inputs.
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/textfill.py:185: SyntaxWarning: invalid escape sequence '\e'
'\end_layout'
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/textfill.py:188: SyntaxWarning: invalid escape sequence '\e'
'\end_layout\n' \
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/textfill.py:189: SyntaxWarning: invalid escape sequence '\e'
'\end_inset\n' \
/Users/shiqiyang/Documents/github_folders/template/lib/gslab_make/gslab_make/textfill.py:190: SyntaxWarning: invalid escape sequence '\e'
'\end_layout'
{'root': '..', 'config': '/Users/shiqiyang/Documents/github_folders/template/config.yaml', 'lib': '/Users/shiqiyang/Documents/github_folders/template/lib', 'config_user': '/Users/shiqiyang/Documents/github_folders/template/config_user.yaml', 'input_dir': 'input', 'external_dir': 'external', 'output_dir': 'output', 'output_local_dir': 'output_local', 'makelog': 'log/make.log', 'output_statslog': 'log/output_stats.log', 'source_maplog': 'log/source_map.log', 'source_statslog': 'log/source_stats.log', 'versions_log': 'log/versions.log'}
Cleared:
Removed:
One other thing to flag is that, when we run the repository, it is important to run it with no other Excel files open. The program will close all open Excel files, not only the file that produces the table. |
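As a side note on the `SyntaxWarning: invalid escape sequence` lines in the log above: they come from regex patterns written as ordinary string literals. A common fix is to use raw strings. The sketch below reuses the `'\#'` pattern shown in the log; the sample `lines` list is made up for illustration, and this is not a patch to `gslab_make` itself.

```python
import re
import warnings

# Made-up input lines; the pattern mirrors the one in the log.
lines = ["# a comment", "value = 1", "#another"]

# A raw string passes the backslash to the regex engine unchanged,
# so Python never tries to interpret '\#' as a string escape.
kept = [line for line in lines if not re.match(r"\#", line)]

# Compiling source that uses the non-raw literal triggers the warning.
# The category is SyntaxWarning on recent Python versions and
# DeprecationWarning on older ones.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    compile("pattern = '\\#'", "<example>", "exec")
warning_names = [w.category.__name__ for w in caught]

print(kept)
print(warning_names)
```

The warnings do not stop the run, which matches the log: the script continues after printing them.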
Thanks @jc-cisneros and @ShiqiYang2022! I'll update the instructions to notify users to save any work on Excel (and close windows if they wish). After this, I think we are ready to merge (given that these action items have been addressed). Next steps (for future issues):
cc @gentzkow |
Thread continues in the associated PR (#97). |
The purpose of this issue (#85) is to address a request from @gentzkow following ongoing discussions around our approach to formatting and autofilling `.tex` and `.tikz` tables and diagrams in `template` and other lab projects. In order to populate draft tables/diagrams, we currently either:
[…] `kable` or `xtable`) which converts dataframe objects into `.tex` outputs, which are then written to table formats in `Overleaf`. Here, we are directly defining the table skeleton in R.
[…] `gslab-make` Python library and the `gs.tablefill` function to port scalars generated from R code into skeleton files from `LyX`. Here, we are generating the desired table skeleton externally in `LyX`, computing the scalars in R (or another computing language), and populating the skeleton from `LyX` with these scalars.
[…] `LyX` package development.
The purpose of this thread will be a discussion between lab members to propose alternative approaches (with the above benchmarks in mind for comparison). The requirements of any proposals must be:
[…] `template` structure.
At baseline, @jc-cisneros and I would hope for the following:
[…] `Overleaf`).
Given the above, if possible, we were thinking of finding a way to build skeletons on Overleaf in `.tex`/`.tikz` which would enable replicators/lab members to modify skeletons visually (i.e., without needing to touch the raw `.tex`/`.tikz` source code, as is possible when viewing documents in `LyX`), and would enable us to compute relevant scalars in our programming language of choice, and then populate the skeletons with these scalars.