ECAS-DATASET
├─client # All generated data of Client Agents
│ ├─embodied_memory # Generated embodied memory of 100 Client Agents
│ │ ├─ori_data # All the raw data that reflects the memory generation process
│ │ └─database.db # Memory stored in database
│ └─profiles # Generated profiles of 451 Client Agents
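To check what the memory database contains, a minimal sketch like the following may help. It assumes that database.db is a SQLite file (the format is not stated here) and that the script is run from the dataset root; the helper name list_tables is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted by name."""
    # Assumption: database.db is a SQLite file; the dataset layout does not confirm this.
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db_file = Path("client") / "embodied_memory" / "database.db"
    if db_file.exists():
        print(list_tables(db_file))
    else:
        print(f"{db_file} not found; run this from the dataset root")
```

If the file turns out not to be SQLite, sqlite3 raises a DatabaseError on the query, which is a quick way to test the assumption.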
- Create a virtual environment with Anaconda:
conda create -n ecas python=3.11.0
- Then activate the environment (conda activate ecas) and install the requirements for initializing clients and chatting with a client agent:
pip install -r requirements.txt
Three Python scripts, client_init.py, init_method.py and names.py, process candidate data from the D4 dataset: they initialize the clients, remove duplicates, and assign names to the candidates.
- Because the real patient data is private, you need to obtain the real data from the D4 dataset website and put it in the ./utils/D4 folder.
- Client initialization and duplicate removal:
  - Initializes candidate data by generating folders and saving dialogue and profile information from the D4 dataset.
  - Removes duplicate candidate entries and moves non-duplicates to a new folder.
  - Assigns Chinese and English names to the non-duplicate candidates.
  - Copies the D4 raw data to full_portrait.json in each folder in ./client/profiles.
- To get the processed data, you just need to run:
python client_init.py
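The duplicate-removal step described above can be sketched roughly as follows. This is not the code in client_init.py or init_method.py; it is a minimal illustration that assumes each candidate is a folder containing one file to compare, and it treats two candidates as duplicates when those files have identical bytes. The function name, the folder arguments, and the default file name dialogue.json are all hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def move_unique_candidates(src_dir, dst_dir, filename="dialogue.json"):
    """Move candidate folders whose content has not been seen yet to dst_dir.

    Returns the names of the moved folders, in processing order.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    seen = set()
    moved = []
    # Sort by name so the first folder of each duplicate group is the one kept.
    for folder in sorted(p for p in src.iterdir() if p.is_dir()):
        target = folder / filename
        if not target.is_file():
            continue  # skip folders without the expected file
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        if digest in seen:
            continue  # duplicate content: leave the folder in src_dir
        seen.add(digest)
        shutil.move(str(folder), str(dst / folder.name))
        moved.append(folder.name)
    return moved
```

Hashing the raw bytes only catches exact copies; entries that differ in whitespace or key order would need to be normalized before comparison.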
Using our generated embodied memory, you can select the profile and memory of one of the 100 clients to chat with.
- You need to set the api_key in ./utils/utils.py (we use the OpenAI models).
- You can use the --cn option to select a client agent in the /client/embodied_memory/ori_data path for answering. For example (the --cn parameter defaults to test0):
# use default
python question_answer.py
# use parameter
python question_answer.py --cn test0
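For illustration, the --cn option could be parsed along these lines. This is a sketch, not the actual content of question_answer.py: only the flag name, its default test0, and the client/embodied_memory/ori_data path come from the text above. The function names, and the assumption that each client has its own subfolder under ori_data, are hypothetical.

```python
import argparse
from pathlib import Path

ORI_DATA_DIR = Path("client") / "embodied_memory" / "ori_data"


def parse_args(argv=None):
    """Parse command-line arguments; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Chat with a client agent.")
    parser.add_argument(
        "--cn",
        default="test0",
        help="name of the client agent to load (default: test0)",
    )
    return parser.parse_args(argv)


def client_dir(client_name, root=ORI_DATA_DIR):
    """Build the path to one client's raw memory data."""
    # Assumption: each client is stored in a subfolder named after it.
    return root / client_name


# No flag given: the default client is used.
print(parse_args([]).cn)  # test0
# Explicit flag, equivalent to: python question_answer.py --cn test0
print(client_dir(parse_args(["--cn", "test0"]).cn).as_posix())
```

Passing argv explicitly keeps the parser easy to exercise without touching the real command line.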