Allow custom evaluation data in Playground #309

Merged
merged 67 commits into from
Jul 29, 2024
505dd2e
Bump next from 13.5.6 to 14.1.1 in /playground
dependabot[bot] May 10, 2024
80cc39a
Merge branch 'develop' into dependabot/npm_and_yarn/playground/next-1…
maximiliansoelch May 11, 2024
7769e87
use unique keys across artemis instances
dmytropolityka May 12, 2024
88e5ba7
set values from .env and .ini
dmytropolityka May 13, 2024
cfc9f1f
change authorization process
dmytropolityka May 19, 2024
a376fcb
extend database to store artemis url
dmytropolityka May 19, 2024
d356749
fix bugs
dmytropolityka May 19, 2024
1757ea8
Merge branch 'develop' into feature/decouple-artemis-from-athena
dmytropolityka May 19, 2024
7df5183
fix linting issues
dmytropolityka May 19, 2024
6f837a1
add further deployments
dmytropolityka May 19, 2024
749aeae
remove dotenv
dmytropolityka May 19, 2024
cf8bd83
remove dotenv and unused code
dmytropolityka May 19, 2024
9348bf3
adjust deployment units
dmytropolityka May 19, 2024
f014359
make deployments not compulsory
dmytropolityka May 19, 2024
76df5d9
bugs
dmytropolityka May 20, 2024
bc293fa
more logging
dmytropolityka May 20, 2024
61dfd0a
more logging
dmytropolityka May 20, 2024
92b5424
fix bug
dmytropolityka May 20, 2024
2db5fa9
rename unique constraint
dmytropolityka May 20, 2024
c95634f
Update __main__.py
dmytropolityka May 27, 2024
646d51e
Merge branch 'develop' into dependabot/npm_and_yarn/playground/next-1…
maximiliansoelch May 27, 2024
1f552cc
Bump mysql2 from 3.9.7 to 3.9.8 in /playground
dependabot[bot] May 30, 2024
af34b89
rename artemis_url to lms_url
dmytropolityka Jun 2, 2024
4ae0c27
Merge remote-tracking branch 'origin/dependabot/npm_and_yarn/playgrou…
dmytropolityka Jun 2, 2024
e0be309
make playground work for multi-instance setup; add non-graded feedbac…
dmytropolityka Jun 2, 2024
8fb8fd1
Merge remote-tracking branch 'origin/dependabot/npm_and_yarn/playgrou…
dmytropolityka Jun 3, 2024
ea0696e
Merge branch 'develop' into feature/decouple-artemis-from-athena
maximiliansoelch Jun 6, 2024
c74b112
rename artemis_url into lms_url, further occurrences
dmytropolityka Jun 7, 2024
0f45254
rename artemis into lms, further occurrences
dmytropolityka Jun 7, 2024
4324ea6
Merge branch 'develop' into feature/decouple-artemis-from-athena
dmytropolityka Jun 7, 2024
7f1a317
update user and key
dmytropolityka Jun 7, 2024
2cb90bc
Revert "update user and key"
dmytropolityka Jun 9, 2024
d93daf7
remove unnecessary yarn lock file
dmytropolityka Jun 9, 2024
8fad620
Merge remote-tracking branch 'origin/feature/playground-self-learning…
dmytropolityka Jun 9, 2024
ce88507
Merge branch 'feature/decouple-artemis-from-athena' into feature/play…
dmytropolityka Jun 9, 2024
d763ed1
update experiments
dmytropolityka Jun 9, 2024
cd2474b
remove server config prior to deployment
dmytropolityka Jun 9, 2024
c997586
adapt dockerfile
dmytropolityka Jun 21, 2024
6f60c8b
add debug statement
dmytropolityka Jun 24, 2024
0aeab86
modify dockerfile
dmytropolityka Jun 24, 2024
892057e
change signature of suggest_feedback for other modules
dmytropolityka Jul 2, 2024
17e872a
Merge remote-tracking branch 'origin/feature/playground-self-learning…
dmytropolityka Jul 2, 2024
b28265d
add localhost to server configuration
dmytropolityka Jul 5, 2024
b4388f5
add custom data modes
FelixTJDietrich Jul 5, 2024
8a009bc
fix upload
FelixTJDietrich Jul 5, 2024
b11f18d
do the rest of data upload export deletion
FelixTJDietrich Jul 5, 2024
48cb04a
don't require exercises for export and deletion
FelixTJDietrich Jul 5, 2024
fb39403
Merge branch 'feature/playground-self-learning-feedback' into feature…
FelixTJDietrich Jul 5, 2024
21b1249
modules have information whether they support non graded feedback req…
dmytropolityka Jul 7, 2024
8c2bc65
add differentiation whether to include to client code
dmytropolityka Jul 7, 2024
6229a4b
fix mypy error
dmytropolityka Jul 7, 2024
790c07c
Merge branch 'develop' into feature/playground-self-learning-feedback
FelixTJDietrich Jul 15, 2024
62b66cd
Merge branch 'develop' into feature/playground-self-learning-feedback
FelixTJDietrich Jul 15, 2024
653e8e5
Merge branch 'develop' into feature/playground-self-learning-feedback
FelixTJDietrich Jul 16, 2024
d05e1ba
add wip documentation
FelixTJDietrich Jul 16, 2024
f84c7a3
Merge branch 'feature/playground-self-learning-feedback' into feature…
dmytropolityka Jul 17, 2024
ba8d0e9
Update Dockerfile
dmytropolityka Jul 17, 2024
a96758a
Merge branch 'feature/playground-self-learning-feedback' into feature…
dmytropolityka Jul 17, 2024
fb795cd
update default port
dmytropolityka Jul 19, 2024
059bbe8
revert merged dockerfile
dmytropolityka Jul 19, 2024
f2f6708
Merge branch 'feature/playground-self-learning-feedback' into feature…
dmytropolityka Jul 22, 2024
3f32097
fix linting issue
dmytropolityka Jul 22, 2024
770a520
Merge branch 'develop' into feature/import-evaluation-data
FelixTJDietrich Jul 26, 2024
000573f
Update docs
FelixTJDietrich Jul 26, 2024
1af0ed3
Update assessment_module_manager/modules.ini
FelixTJDietrich Jul 26, 2024
c18f263
Update assessment_module_manager/modules.docker.ini
FelixTJDietrich Jul 26, 2024
61b3b9d
Update playground/src/components/selectors/data_mode_select.tsx
FelixTJDietrich Jul 26, 2024
2 changes: 2 additions & 0 deletions README.md
@@ -11,3 +11,5 @@

## What is Athena?
Athena is an advanced system designed to assist educators by providing (semi-)automated assessments for various types of academic exercises. Through its integration with learning management systems (LMS), Athena offers an efficient and innovative way to evaluate students' work in large courses. The system has been expanded from its original focus on textual exercises to now include support for programming exercises and has plans for future support of additional exercise types such as UML modeling and mathematics.

**Documentation:** [ls1intum.github.io/Athena/](https://ls1intum.github.io/Athena/)
3 changes: 2 additions & 1 deletion assessment_module_manager/modules.docker.ini
@@ -45,4 +45,5 @@ url = http://module-modeling-llm:5008
type = modeling
supports_evaluation = false
supports_non_graded_feedback_requests = false
-supports_graded_feedback_requests = true
+supports_graded_feedback_requests = true

3 changes: 2 additions & 1 deletion assessment_module_manager/modules.ini
@@ -45,4 +45,5 @@ url = http://localhost:5008
type = modeling
supports_evaluation = false
supports_non_graded_feedback_requests = false
-supports_graded_feedback_requests = true
+supports_graded_feedback_requests = true
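
The capability flags in this section can be read with Python's standard `configparser`. The following is a minimal sketch, not the assessment module manager's actual loading code: the section name `module_modeling_llm` and the helper `read_module_flags` are assumptions made for illustration, since the hunk does not show the section header.

```python
from configparser import ConfigParser

# Inline copy of the keys shown in the hunk. The section name is an
# assumption; the diff does not include the section header.
SAMPLE_INI = """
[module_modeling_llm]
url = http://localhost:5008
type = modeling
supports_evaluation = false
supports_non_graded_feedback_requests = false
supports_graded_feedback_requests = true
"""

FLAGS = (
    "supports_evaluation",
    "supports_non_graded_feedback_requests",
    "supports_graded_feedback_requests",
)


def read_module_flags(ini_text: str) -> dict[str, dict[str, bool]]:
    """Return the boolean capability flags for every section (hypothetical helper)."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    return {
        section: {
            # A missing flag is treated as False in this sketch.
            flag: parser.getboolean(section, flag, fallback=False)
            for flag in FLAGS
        }
        for section in parser.sections()
    }


flags = read_module_flags(SAMPLE_INI)
print(flags["module_modeling_llm"])
```

Treating a missing flag as `False` is a design choice of this sketch only; the real loader may instead require every flag to be present.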

25 changes: 23 additions & 2 deletions docs/Makefile
@@ -3,6 +3,24 @@

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXAUTOBUILD = sphinx-autobuild
ALLSPHINXLIVEOPTS = $(ALLSPHINXOPTS) -q \
--port 0 \
--host 0.0.0.0 \
--open-browser \
--delay 1 \
--ignore "*.swp" \
--ignore "*.pdf" \
--ignore "*.log" \
--ignore "*.out" \
--ignore "*.toc" \
--ignore "*.aux" \
--ignore "*.idx" \
--ignore "*.ind" \
--ignore "*.ilg" \
--ignore "*.tex" \
--watch source

SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
@@ -13,11 +31,14 @@ help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

livehtml:
-sphinx-autobuild "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+$(SPHINXAUTOBUILD) -b html $(ALLSPHINXLIVEOPTS) $(SOURCEDIR) $(BUILDDIR)
+@echo
+@echo "Build finished. The HTML pages are in $(BUILDDIR)."

.PHONY: help Makefile
# .PHONY: livehtml

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
-@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
7 changes: 7 additions & 0 deletions docs/index.rst
@@ -23,6 +23,13 @@ Athena will use the information it is given and provide the automatic suggestion
   overview/athena
   overview/playground

.. toctree::
   :caption: User Guide
   :includehidden:
   :maxdepth: 2

   user_guide/index

.. toctree::
   :caption: Setup
   :includehidden:
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -1,5 +1,5 @@
Sphinx==6.2.1
-sphinx-rtd-theme==1.2.0
+sphinx-rtd-theme==2.0.0
sphinx-autobuild==2021.3.14
docutils==0.19
sphinxcontrib-bibtex==2.5.0
2 changes: 2 additions & 0 deletions docs/run/docker.rst
@@ -1,3 +1,5 @@
.. _run_docker:

From Docker
===========================================

2 changes: 2 additions & 0 deletions docs/run/local.rst
@@ -1,3 +1,5 @@
.. _run_local:

From the Command Line
===========================================

2 changes: 2 additions & 0 deletions docs/run/playground.rst
@@ -1,3 +1,5 @@
.. _run_playground:

Run the Playground
===========================================

2 changes: 2 additions & 0 deletions docs/setup/install.rst
@@ -1,3 +1,5 @@
.. _setup_install:

Python and Poetry Setup
===========================================

52 changes: 52 additions & 0 deletions docs/user_guide/conduct_experiment.rst
@@ -0,0 +1,52 @@
.. _conduct_experiment_guide:

=============================
Conducting an Experiment
=============================

To conduct an experiment in the Athena Playground, follow these steps:

1. **Define Experiment:**
    - Scroll to the Evaluation Mode section.
    - In "Define Experiments", choose execution modes, exercise types, and manage training and evaluation data.
    - Alternatively, import an experiment configuration using the "Import" button.
    - When done, press "Define Experiment".
    - Export the experiment configuration using the "Export" button for future reference.

    .. figure:: ../images/playground/evaluation_mode/define_experiment.png
        :width: 500px
        :alt: Define Experiment Interface of the Athena Playground

        Evaluation Mode: Define Experiment Interface of the Athena Playground

2. **Configure Modules:**
    - Select and configure the modules you wish to include in your experiment.
    - Ensure each module is set up with appropriate parameters for effective comparison.
    - Import module configurations using the "Import" button, if needed.
    - Export the module configurations using the "Export" button for future reference.

    .. figure:: ../images/playground/evaluation_mode/configure_modules.png
        :width: 500px
        :alt: Configure Modules Interface of the Athena Playground

        Evaluation Mode: Configure Modules Interface of the Athena Playground

3. **Conduct Experiment:**
    - Press "Start Experiment" to begin the experiment.
    - The steps performed include sending submissions, sending feedback for training submissions, generating feedback suggestions, and running automatic evaluations.
    - If training submissions are provided, you will need to manually continue the experiment by pressing "Continue".
    - If automatic evaluation is enabled (for instance, LLM-as-a-judge for text exercises), you will also need to confirm it manually.
    - Export and import the experiment results as needed using the "Export" and "Import" buttons, respectively.

    .. figure:: ../images/playground/evaluation_mode/conduct_experiment_text.png
        :width: 500px
        :alt: Conduct Experiment Interface for a Text Exercise of the Athena Playground

        Evaluation Mode: Conduct Experiment Interface for a Text Exercise of the Athena Playground

4. **Annotate Feedback Suggestions:**
    - Annotate the generated feedback suggestions with "Accept" or "Reject" as a tutor would.

5. **Export Results:**
    - At the end of the experiment, or at any time during it, export the results using the "Export" button.
    - Make sure you have also exported the experiment configuration and the module configurations so that you have a complete record of the experiment.
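
The workflow described in this guide can be summarised as a small planning function. This is only an illustrative sketch and not Playground code: the names `Step`, `PlannedStep`, `plan_experiment`, and `missing_exports` are hypothetical, and the placement of the manual "Continue" before suggestion generation is an assumption, since the guide does not state exactly where the pause occurs.

```python
from dataclasses import dataclass
from enum import Enum


class Step(Enum):
    SEND_SUBMISSIONS = "send submissions"
    SEND_TRAINING_FEEDBACK = "send feedback for training submissions"
    GENERATE_SUGGESTIONS = "generate feedback suggestions"
    RUN_AUTOMATIC_EVALUATION = "run automatic evaluation"


@dataclass(frozen=True)
class PlannedStep:
    step: Step
    needs_confirmation: bool  # True if the user must confirm before this step


def plan_experiment(has_training_data: bool, automatic_evaluation: bool) -> list[PlannedStep]:
    """List the steps of an experiment run in order (hypothetical helper)."""
    plan = [PlannedStep(Step.SEND_SUBMISSIONS, False)]
    if has_training_data:
        plan.append(PlannedStep(Step.SEND_TRAINING_FEEDBACK, False))
    # Assumption: "Continue" gates suggestion generation when training data exists.
    plan.append(PlannedStep(Step.GENERATE_SUGGESTIONS, has_training_data))
    if automatic_evaluation:
        # The guide says automatic evaluation must be confirmed manually.
        plan.append(PlannedStep(Step.RUN_AUTOMATIC_EVALUATION, True))
    return plan


# The three artifacts the guide names as a complete record of an experiment.
REQUIRED_EXPORTS = (
    "experiment configuration",
    "module configurations",
    "experiment results",
)


def missing_exports(exported: set[str]) -> list[str]:
    """Return the required exports that have not been saved yet."""
    return [name for name in REQUIRED_EXPORTS if name not in exported]


plan = plan_experiment(has_training_data=True, automatic_evaluation=True)
for item in plan:
    marker = " (manual confirmation)" if item.needs_confirmation else ""
    print(f"{item.step.value}{marker}")

print(missing_exports({"experiment results"}))
```

With training data and automatic evaluation both enabled, the plan contains all four steps, two of which wait for the user; exporting only the results leaves the two configuration exports outstanding.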