Feature/2023.07 os test refactor test cases #21

Merged
merged 172 commits into from
Sep 1, 2023
Changes from 167 commits
172 commits
56f6463
Update Dockerfile
DaBeIDS Apr 26, 2023
9db2bd4
Update Dockerfile
DaBeIDS Apr 26, 2023
a76b658
Update Dockerfile
DaBeIDS Apr 27, 2023
e75676d
Update entry.sh
DaBeIDS Apr 27, 2023
1bf10e2
Update Dockerfile
DaBeIDS Apr 27, 2023
97cd5b8
Update Dockerfile
DaBeIDS Apr 27, 2023
80e62b9
Update requirements.txt
DaBeIDS Apr 27, 2023
59c1f66
Update requirements.txt
DaBeIDS Apr 28, 2023
829f9a5
Update Dockerfile
DaBeIDS Apr 28, 2023
c350bf1
Update Dockerfile
DaBeIDS Apr 28, 2023
c3b9354
Update Dockerfile
DaBeIDS Apr 28, 2023
cc9de86
Update requirements.txt
DaBeIDS Apr 28, 2023
339b2a8
Update Dockerfile
DaBeIDS Apr 28, 2023
a589176
Update Dockerfile
DaBeIDS Apr 28, 2023
8aab0f7
Update Dockerfile
DaBeIDS Apr 28, 2023
5b1515d
Update Dockerfile
DaBeIDS Apr 28, 2023
19d181e
Update Dockerfile
DaBeIDS Apr 28, 2023
79c15df
Update entry.sh
DaBeIDS Apr 28, 2023
dc95d4e
Add files via upload
DaBeIDS May 2, 2023
7493cc1
Update Dockerfile
DaBeIDS May 2, 2023
a8645f1
Update Dockerfile
DaBeIDS May 2, 2023
dc9858d
Update Dockerfile
DaBeIDS May 2, 2023
67c6e4a
Delete entry.sh
DaBeIDS May 2, 2023
cde4829
Update Dockerfile
DaBeIDS May 2, 2023
07f493c
Update entry.sh
DaBeIDS May 2, 2023
3bd9de6
Update Dockerfile
DaBeIDS May 2, 2023
742c708
Update Dockerfile
DaBeIDS May 2, 2023
069c06c
Update Dockerfile
DaBeIDS May 2, 2023
a03b673
Update entry.sh
DaBeIDS May 2, 2023
142257c
Update Dockerfile
DaBeIDS May 2, 2023
560e052
Update Dockerfile
DaBeIDS May 2, 2023
7a16a9d
Update Dockerfile
DaBeIDS May 2, 2023
ce063d9
Update Dockerfile
DaBeIDS May 2, 2023
53783ae
Update Dockerfile
DaBeIDS May 2, 2023
92dc413
Update Dockerfile
DaBeIDS May 2, 2023
513b274
Update Dockerfile
DaBeIDS May 2, 2023
36ad87b
Update Dockerfile
DaBeIDS May 2, 2023
50df119
Update Dockerfile
DaBeIDS May 2, 2023
51513af
Update Dockerfile
DaBeIDS May 2, 2023
65e8356
Add files via upload
DaBeIDS May 3, 2023
7c5971d
Add files via upload
DaBeIDS May 3, 2023
21c2094
Delete data_extractor/data/TEST directory
DaBeIDS May 3, 2023
f790cca
Add files via upload
DaBeIDS May 3, 2023
7e9170a
Delete data_extractor/data/TEST/interim/rb/work directory
DaBeIDS May 3, 2023
660f517
Add files via upload
DaBeIDS May 3, 2023
74001b7
Delete DOCKER_TEST
DaBeIDS May 3, 2023
06a9046
Add files via upload
DaBeIDS May 3, 2023
1dcd048
Update train_on_pdf.py
DaBeIDS May 4, 2023
315e8a2
Update settings.yaml
DaBeIDS May 4, 2023
c32d6df
Update infer_on_pdf.py
DaBeIDS May 4, 2023
5be52d0
Update settings.yaml
DaBeIDS May 4, 2023
768cd9c
Create requirements.txt
DaBeIDS May 4, 2023
16221b0
Update DOCKER_TEST
DaBeIDS May 4, 2023
f8d1800
Update entry.sh
DaBeIDS May 4, 2023
137b530
Update DOCKER_TEST
DaBeIDS May 4, 2023
16fe7a2
Update DOCKER_TEST
DaBeIDS May 4, 2023
752e217
Delete requirements.txt
DaBeIDS May 4, 2023
3b4badd
Update DOCKER_TEST
DaBeIDS May 4, 2023
21934bf
Update DOCKER_TEST
DaBeIDS May 4, 2023
b620e92
Update DOCKER_TEST
DaBeIDS May 4, 2023
4cea4a3
Update DOCKER_TEST
DaBeIDS May 4, 2023
ecf5ccf
Update DOCKER_TEST
DaBeIDS May 4, 2023
a5c5ae5
Update DOCKER_TEST
DaBeIDS May 4, 2023
605a901
Update DOCKER_TEST
DaBeIDS May 4, 2023
c3fb8cc
Delete entry.sh
DaBeIDS May 5, 2023
afe6215
Add files via upload
DaBeIDS May 5, 2023
0aae09e
Update train_on_pdf.py
DaBeIDS May 8, 2023
46046b3
Update settings.yaml
DaBeIDS May 8, 2023
6a85f4a
Update settings.yaml
DaBeIDS May 8, 2023
b1efce1
Update settings.yaml
DaBeIDS May 8, 2023
927bd5a
Update DOCKER_TEST
DaBeIDS May 8, 2023
e4b49e0
Create requirements.txt
DaBeIDS May 8, 2023
ad62786
Update requirements.txt
DaBeIDS May 8, 2023
930c6aa
Update DOCKER_TEST
DaBeIDS May 8, 2023
117a6c7
Update requirements.txt
DaBeIDS May 8, 2023
6e35891
Update requirements.txt
DaBeIDS May 8, 2023
9aab837
Update infer_on_pdf.py
DaBeIDS May 8, 2023
d98dfed
Update requirements.txt
DaBeIDS May 8, 2023
05070ec
Update requirements.txt
DaBeIDS May 8, 2023
620ac5d
Update DOCKER_TEST
DaBeIDS May 8, 2023
d13e22b
Update DOCKER_TEST
DaBeIDS May 8, 2023
f906e65
Add a default folder for the models
DaBeIDS May 8, 2023
3e286b3
Update DOCKER_TEST
DaBeIDS May 9, 2023
9563cf3
Update requirements.txt
DaBeIDS May 9, 2023
9595fe8
Update requirements.txt
DaBeIDS May 12, 2023
2b18868
Update requirements.txt
DaBeIDS May 12, 2023
8a4bd81
Update requirements.txt
DaBeIDS May 12, 2023
b461dd9
Update requirements.txt
DaBeIDS May 12, 2023
26b6619
Update requirements.txt
DaBeIDS May 12, 2023
a0bbbf9
Update requirements.txt
DaBeIDS May 12, 2023
35381e8
Update requirements.txt
DaBeIDS May 12, 2023
6ded6ad
Update requirements.txt
DaBeIDS May 12, 2023
2f4e74c
Update requirements.txt
DaBeIDS Jun 21, 2023
f329a5a
Add files via upload
DaBeIDS Jun 23, 2023
13201c8
Update Dockerfile
DaBeIDS Jun 23, 2023
b3e4839
Update DOCKER_TEST file for s3 communication
tobias-watzel Jul 4, 2023
f574c70
Update requirements.txt in TEST_SETUP
tobias-watzel Jul 4, 2023
b7c26cb
Adding user-friendly shell /bin/bash
tobias-watzel Jul 4, 2023
2522b2c
Adding VS Code folders for remote access
tobias-watzel Jul 4, 2023
57a734b
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
0128ef2
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
113d9aa
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
ad0ff88
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
e15a4ab
Create s3_settings.yaml
DaBeIDS Jul 5, 2023
196e750
Update s3_settings.yaml
DaBeIDS Jul 5, 2023
f6e580c
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
7aa2b5c
Update Dockerfile
DaBeIDS Jul 5, 2023
ad422b9
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
813a87d
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
a0eeef1
Update extraction_server.py
DaBeIDS Jul 5, 2023
3caa7f9
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
12bf6c3
Update extraction_server.py
DaBeIDS Jul 5, 2023
c8fe7e2
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
2ed59e3
Update extraction_server.py
DaBeIDS Jul 5, 2023
f6f8e74
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
e3ad05b
Update extraction_server.py
DaBeIDS Jul 5, 2023
7398c9d
Update Dockerfile
DaBeIDS Jul 5, 2023
3affd50
Update extraction_server.py
DaBeIDS Jul 5, 2023
e30de25
Update s3_settings.yaml
DaBeIDS Jul 5, 2023
3536d39
Update extraction_server.py
DaBeIDS Jul 5, 2023
774cbda
Update kpi_mapping.py
DaBeIDS Jul 13, 2023
1844353
Update extraction_server.py
DaBeIDS Jul 13, 2023
20b77ee
Update Dockerfile
DaBeIDS Jul 13, 2023
271ffcd
Update extraction_server.py
DaBeIDS Jul 14, 2023
07d7801
Init pytest and creating test cases for upcoming refactoring
tobias-watzel Jul 14, 2023
277105b
Ongoing creation of test cases
tobias-watzel Jul 17, 2023
243555b
Ongoing creation of test cases
tobias-watzel Jul 20, 2023
0be5b87
Ongoing creation and modification of test cases
tobias-watzel Jul 21, 2023
bb95b17
Ongoing creation of test cases
tobias-watzel Jul 21, 2023
9fe96b7
Reset branch to restore old status
tobias-watzel Aug 2, 2023
942f2a8
Ongoing test cases for save_train_info
tobias-watzel Aug 2, 2023
9c677b2
Adding test cases for save_train_info
tobias-watzel Aug 2, 2023
16c0af8
Finished added test cases for save_train_info
tobias-watzel Aug 3, 2023
25b9cc8
Adding test cases for run_router function
tobias-watzel Aug 3, 2023
41f91b6
Backup
tobias-watzel Aug 4, 2023
7894ecf
Finalizing test cases for train_on_pdf
tobias-watzel Aug 7, 2023
2adfcac
Finishing tests for run_router function
tobias-watzel Aug 8, 2023
d2c5032
Adding first test cases for train_on_pdf script
tobias-watzel Aug 8, 2023
ff72c06
Ongoing test creation for train_on_pdf function
tobias-watzel Aug 9, 2023
bf1d721
Ongoing creation of tests
tobias-watzel Aug 10, 2023
ab2aa6e
Ongoing test creation
tobias-watzel Aug 11, 2023
209c359
Ongoing creation of test cases
Aug 17, 2023
14d1c44
Ongoing test creation
tobias-watzel Aug 21, 2023
d79cdec
Finishing test cases for train_on_pdf script
tobias-watzel Aug 25, 2023
a4272d6
Adding missing file in test folder
tobias-watzel Aug 25, 2023
b04860b
Create .gitignore and protect credentials from being leaked
HeatherAck May 17, 2023
7e40138
Update README.md with status
HeatherAck Jun 16, 2023
486d5fe
Update train_on_pdf.py
DaBeIDS Jul 5, 2023
94df7b2
Update inference_server.py
DaBeIDS Jul 20, 2023
2b75604
Update inference_server.py
DaBeIDS Jul 20, 2023
dbc3769
Feature/2023.04 os test (#12)
DaBeIDS Jul 25, 2023
fe01400
Feature/2023.04 os test (#14)
DaBeIDS Jul 25, 2023
b2c78db
Feature/2023.04 os test (#16)
DaBeIDS Aug 4, 2023
7065b4b
Feature/2023.04 os test (#16) (#17)
tobias-watzel Aug 4, 2023
85ed336
Feature/2023.08 os test (#19)
DaBeIDS Aug 8, 2023
a36b4d8
Update infer_on_pdf.py
DaBeIDS Aug 11, 2023
6557a44
Modifying tests for taking the last updates on train_on_pdf.py into a…
tobias-watzel Aug 25, 2023
df2c484
Ongoing modification of test cases
tobias-watzel Aug 25, 2023
3e7f56d
Adapted the tests of the function generate_text_3434
tobias-watzel Aug 28, 2023
02e7273
Adapting tests for save_train_info function
tobias-watzel Aug 28, 2023
2adce18
Onging test adaption for train_on_pdf script...
tobias-watzel Aug 28, 2023
2974930
Modifying test cases for save_train_info function
tobias-watzel Aug 29, 2023
6713b00
Finishing adaptation of test cases for train_on_pdf script
tobias-watzel Aug 29, 2023
d3ca75b
Solving minor issues
tobias-watzel Aug 29, 2023
8435800
Some cosmetics and consistency changes
tobias-watzel Aug 29, 2023
9e0ba65
Minor changes for better readability
tobias-watzel Aug 30, 2023
47331d8
Some cleanup and finishing tests
tobias-watzel Aug 31, 2023
2b09749
Resolving some conflics
tobias-watzel Aug 31, 2023
6b12f91
Resolving conflicts
tobias-watzel Aug 31, 2023
395c914
Merge branch 'develop' into feature/2023.07-os-test-refactor-test-cases
tobias-watzel Aug 31, 2023
1b6a3fe
Some minor fixes
tobias-watzel Aug 31, 2023
53326fa
Fixing last conflicts
tobias-watzel Aug 31, 2023
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Credentials / Secrets
credentials.env
8 changes: 8 additions & 0 deletions .idea/corporate_data_extraction.iml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

272 changes: 272 additions & 0 deletions .idea/inspectionProfiles/Project_Default.xml

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions .idea/misc.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions .idea/modules.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions .idea/vcs.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

45 changes: 45 additions & 0 deletions .idea/workspace.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion README.md
@@ -1,2 +1,2 @@
# corporate_data_extraction
-The code in this repo is undergoing testing. It is not verified as fully functional yet.
+The code in this repo has been verified as working in Linux and Windows environments. Team is actively testing it in an Openshift evironment.
1 change: 1 addition & 0 deletions data_extractor/.gitignore
@@ -0,0 +1 @@
__pycache__
41 changes: 41 additions & 0 deletions data_extractor/code/coordinator/Dockerfile
@@ -0,0 +1,41 @@
FROM ubuntu:20.04
SHELL ["/bin/bash", "-c"]

# no prompt during installation:
ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update
RUN apt-get install -y apt-utils wget python3 python3-pip vim

COPY ./code/coordinator/requirements_coordinator.txt /app/code/requirements.txt
COPY ./code/coordinator/entry_coordinator.sh /app/code/entry.sh
COPY ./code/coordinator/server_coordinator.py /app/code/server_coordinator.py
COPY ./code/infer_on_pdf.py /app/code/infer_on_pdf.py
COPY ./code/train_on_pdf.py /app/code/train_on_pdf.py
COPY ./code/kpi_mapping.csv /app/code/kpi_mapping.csv
COPY ./code/setup_project.py /app/code/setup_project.py
COPY ./code/config_path.py /app/code/config_path.py
COPY ./code/s3_communication.py /app/code/s3_communication.py
COPY ./data /app/data
COPY ./models /app/models

WORKDIR /app/code

RUN pip install -r requirements.txt

RUN chgrp -R 0 /app/code && chmod -R g=u /app/code
RUN chmod -R a+x /app/code

RUN mkdir -p /app/server_logs

RUN chmod -R 777 /app/server_logs
RUN chmod -R 777 /app/data
RUN chmod -R 777 /app/models

# Adding vs code server
RUN mkdir -p /.vscode-server
RUN mkdir -p /.vscode-remote
RUN chmod -R 777 /.vscode-server
RUN chmod -R 777 /.vscode-remote

CMD ./entry.sh
8 changes: 8 additions & 0 deletions data_extractor/code/coordinator/entry_coordinator.sh
@@ -0,0 +1,8 @@
#!/bin/bash

date_time=$(date +'%d_%h_%Y:%H_%M')
log_file_path="/app/server_logs/coordinator_logs_${date_time}.txt"
cd /app/code
python3 server_coordinator.py > $log_file_path 2>&1 &

sleep infinity
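
Once this entry script has started `server_coordinator.py` in the background, the routes it exposes (`/liveness`, `/train`, `/infer`) can be reached over HTTP. The following is a minimal client sketch, not part of this PR. It assumes the server listens on `localhost` at the default port 2000; `BASE_URL`, `build_url` and `call` are hypothetical names introduced only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the coordinator runs on the same host with its default port.
BASE_URL = "http://localhost:2000"


def build_url(route, **params):
    """Build the request URL for a coordinator route with optional query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}/{route}?{query}" if query else f"{BASE_URL}/{route}"


def call(route, **params):
    """Send a GET request and return (status code, response body).

    urlopen raises urllib.error.HTTPError for error responses such as
    the status 500 returned by the coordinator on failure.
    """
    with urlopen(build_url(route, **params)) as response:
        return response.status, response.read().decode()


# Example usage (requires a running coordinator, so it is left commented out):
# call("liveness")
# call("train", project_name="TEST", s3_usage="N")
# call("infer", project_name="TEST", s3_usage="N", mode="both")
```

The query parameters mirror the names read via `request.args.get(...)` in the server code; the project name `TEST` is only a placeholder.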
6 changes: 6 additions & 0 deletions data_extractor/code/coordinator/requirements_coordinator.txt
@@ -0,0 +1,6 @@
requests~=2.31.0
Flask==2.3.2
pandas~=2.0.3
pyyaml~=6.0.1
boto3~=1.28.8
openpyxl~=3.1.2
150 changes: 150 additions & 0 deletions data_extractor/code/coordinator/server_coordinator.py
@@ -0,0 +1,150 @@
import argparse
import pathlib
import json
import os
import traceback

from flask import Flask, Response, request


ROOT = pathlib.Path(__file__).resolve().parent.parent.parent.parent
DATA_FOLDER = ROOT / "data"
MODEL_FOLDER = ROOT / "models"

app = Flask(__name__)


@app.route("/liveness")
def liveness():
    return Response(response={}, status=200)


@app.route("/train")
def train():
    """ This function should start the train_on_pdf.py with given parameters as a web access point.

    :return:
    """
    parser_train = argparse.ArgumentParser(description='End-to-end training')

    parser_train.add_argument('--project_name',
                              type=str,
                              default=None,
                              help='Name of the Project')

    parser_train.add_argument('--s3_usage',
                              type=str,
                              default=None,
                              help='Do you want to use S3? Type either Y or N.')

    # Read arguments from direct python call
    args_train = parser_train.parse_args()
    try:
        project_name = args_train.project_name
        s3_usage = args_train.s3_usage
    except AttributeError:
        pass

    # Read arguments from wget call
    if project_name is None or s3_usage is None:
        try:
            project_name = request.args.get("project_name")
            s3_usage = request.args.get("s3_usage")
        except AttributeError:
            pass

    # Read arguments from payload if given
    if project_name is None or s3_usage is None:
        try:
            args_train = json.loads(request.args['payload'])
            project_name = args_train["project_name"]
            s3_usage = args_train["s3_usage"]
        except Exception:
            msg = "Project name or s3_usage where not given via command or payload. Please recheck your call."
            return Response(msg, status=500)

    cmd = 'python3 train_on_pdf.py' + \
          ' --project_name "' + project_name + '"' + \
          ' --s3_usage "' + s3_usage + '"'
    print("Running command: " + cmd)
    try:
        os.system(cmd)
    except Exception as e:
        msg = "Error during train_on_pdf.py \nException:" + str(repr(e) + traceback.format_exc())
        return Response(msg, status=500)
    msg = "train_on_pdf.py was executed without any error. Check the results please."
    return Response(msg, status=200)


@app.route("/infer")
def infer():
    """ This function should start the infer_on_pdf.py with given parameters (either via cli arguments or via
    payload) as a web access point.

    :return: Response type containing a message and the int for the type of message (200 if ok, 500 if error)
    """
    parser_infer = argparse.ArgumentParser(description='End-to-end inference')

    parser_infer.add_argument('--project_name',
                              type=str,
                              default=None,
                              help='Name of the Project')

    parser_infer.add_argument('--s3_usage',
                              type=str,
                              default=None,
                              help='Do you want to use S3? Type either Y or N.')

    parser_infer.add_argument('--mode',
                              type=str,
                              default='both',
                              help='Inference Mode (RB, ML, both, or none - for just doing postprocessing)')

    args_infer = parser_infer.parse_args()
    project_name = args_infer.project_name
    s3_usage = args_infer.s3_usage
    mode = args_infer.mode

    # Read arguments from wget call
    if project_name is None or s3_usage is None:
        try:
            project_name = request.args.get("project_name")
            s3_usage = request.args.get("s3_usage")
            mode = request.args.get("mode")
        except AttributeError:
            pass

    # Read arguments from payload if given
    if project_name is None or s3_usage is None:
        try:
            args_infer = json.loads(request.args['payload'])
            project_name = args_infer["project_name"]
            s3_usage = args_infer["s3_usage"]
            mode = args_infer["mode"]
        except Exception:
            msg = "Project name, mode or s3_usage where not given via command or payload. Please recheck your call."
            return Response(msg, status=500)

    cmd = 'python3 infer_on_pdf.py' + \
          ' --project_name "' + project_name + '"' + \
          ' --mode "' + mode + '"' + \
          ' --s3_usage "' + s3_usage + '"'
    print("Running command: " + cmd)
    try:
        os.system(cmd)
    except Exception as e:
        msg = "Error during infer_on_pdf.py \nException:" + str(repr(e) + traceback.format_exc())
        return Response(msg, status=500)
    msg = "infer_on_pdf.py was executed without any error. Check the results please."
    return Response(msg, status=200)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='coordinator server')
    parser.add_argument('--port',
                        type=int,
                        default=2000,
                        help='Port to use for the coordinator server')
    args = parser.parse_args()
    port = args.port
    app.run(host="0.0.0.0", port=port)
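
A review note on the `os.system(cmd)` calls in the `/train` and `/infer` routes above: `os.system` returns the exit status of the command rather than raising an exception when the script fails, so the surrounding `try`/`except` is unlikely to catch a failing `train_on_pdf.py` or `infer_on_pdf.py`, and the route would still answer with status 200. In addition, concatenating request values into a command string hands them to the shell. The sketch below shows one possible alternative using `subprocess.run` with an argument list and an explicit exit-code check. It is not the PR's implementation; `build_train_command` and `run_command` are hypothetical helper names.

```python
import subprocess
import sys


def build_train_command(project_name, s3_usage, script="train_on_pdf.py"):
    """Return the command as an argument list, so no shell quoting is needed."""
    return [sys.executable, script,
            "--project_name", project_name,
            "--s3_usage", s3_usage]


def run_command(cmd):
    """Run cmd without a shell and return (ok, message).

    ok is False when the process exits with a non-zero status.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True)
    if completed.returncode != 0:
        msg = (f"Command failed with exit code {completed.returncode}:\n"
               f"{completed.stderr}")
        return False, msg
    return True, "Command finished with exit code 0."


# Hypothetical use inside a route:
# ok, msg = run_command(build_train_command(project_name, s3_usage))
# return Response(msg, status=200 if ok else 500)
```

With this shape the HTTP status reflects whether the script actually succeeded, and the captured stderr could be returned to the caller or written to the server log.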
26 changes: 8 additions & 18 deletions data_extractor/code/esg_data_pipeline/Dockerfile
@@ -16,35 +16,25 @@ RUN apt-key del 7fa2af80
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub

# Added poppler-utils, default-jre installations
RUN apt-get update && apt-get install -y git vim ninja-build poppler-utils default-jre \
RUN apt-get update && apt-get install -y git wget vim ninja-build poppler-utils default-jre \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# Install mmdetection
RUN conda clean --all
RUN git clone --branch v1.2.0 https://github.com/open-mmlab/mmdetection.git /mmdetection
WORKDIR /mmdetection
ENV FORCE_CUDA="1"
RUN pip install cython --no-cache-dir
RUN pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
RUN pip install --no-cache-dir -e .

# Reinstall openmm-mmcv to use the right version, somehow the above installation
# of mmdetection installs mmcv of an updated version which is not compatible for
# mmdetection v1.2
RUN pip install mmcv==0.5.7

COPY ./esg_data_pipeline /app/code/esg_data_pipeline
COPY ./s3_communication.py /app/code/esg_data_pipeline/esg_data_pipeline/s3_communication.py

RUN chgrp -R 0 /app/code/esg_data_pipeline && chmod g=u /app/code/esg_data_pipeline
RUN chmod -R a+x /app/code/esg_data_pipeline
RUN chgrp -R 0 /app/code && chmod g=u /app/code
RUN chmod -R 777 /app/code

WORKDIR /app/code/esg_data_pipeline

RUN pip install -e .

RUN mkdir -p /app/server_logs
RUN chmod -R 777 /app/server_logs

USER 1234
RUN mkdir -p /app/data
RUN chmod -R 777 /app/data

CMD ./entry.sh

2 changes: 1 addition & 1 deletion data_extractor/code/esg_data_pipeline/entry.sh
@@ -8,4 +8,4 @@ python3 extraction_server.py > $log_file_path 2>&1 &
#cd /esg_data_pipeline/notebooks
#jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''

-sleep infinity
+sleep infinity
Original file line number Diff line number Diff line change
@@ -72,7 +72,7 @@ def run(self, extraction_folder, annotation_excels, output_folder):

df_result = pd.DataFrame(examples_list).reset_index(drop=True)
# Drop the unnecessary column.
-df_result.drop(["Index"], axis=1, inplace=True)
+# df_result.drop(["Index"], axis=1, inplace=True)

# Map the KPI to KPI questions
importlib.reload(kpi_mapping)