Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgsync container getting exit due to invalid schema.json #579

Open
hrik01 opened this issue Oct 11, 2024 · 3 comments
Open

pgsync container getting exit due to invalid schema.json #579

hrik01 opened this issue Oct 11, 2024 · 3 comments

Comments

@hrik01
Copy link

hrik01 commented Oct 11, 2024

PGSync version: latest

Postgres version: 16

Elasticsearch/OpenSearch version: 2.17.1

Redis version: 7.4.0

Python version: 3.10.12

Problem Description: i am creating a setup on my machine locally , and i have made the containers of PGSync, opensearch, redis, postgres under the same network and updated the runserver.sh file acc to my hostand host ports but my pgsync container is getting exited again and again showing the logs that schema.json not valid where i have made it validate from different online jason validator sites

Error Message (if any):
wait-for-it.sh: waiting 60 seconds for strapiDB:5432
wait-for-it.sh: strapiDB:5432 is available after 0 seconds
PostgreSQL is up
wait-for-it.sh: waiting 60 seconds for 192.168.1.2:9400
wait-for-it.sh: 192.168.1.2:9400 is available after 0 seconds
OpenSearch is up
wait-for-it.sh: waiting 60 seconds for redis:6379
wait-for-it.sh: redis:6379 is available after 0 seconds
Redis is up
Error: ./schema.json is not valid JSON. Exiting.
wait-for-it.sh: waiting 60 seconds for strapiDB:5432
wait-for-it.sh: strapiDB:5432 is available after 0 seconds
PostgreSQL is up
wait-for-it.sh: waiting 60 seconds for 192.168.1.2:9400
wait-for-it.sh: 192.168.1.2:9400 is available after 0 seconds
OpenSearch is up
wait-for-it.sh: waiting 60 seconds for redis:6379
wait-for-it.sh: redis:6379 is available after 0 seconds
Redis is up
Error: ./schema.json is not valid JSON. Exiting.

I am providing the commands that i used till yet for the setup.

Setup:

Go to the github repository of PGSync and download the files of pgsync for reference.

Run the following commands to setup Strapi, Postgres, PGSync, Opensearch, Redis:

Commands :

Create Network for the container for easy communication
$ sudo docker network create Strapi

Create Strapi container
$ sudo docker run -d
--name strapi
--network Strapi
-p 1337:1337
-e HOST=0.0.0.0
-e PORT=1337
-e APP_KEYS=8a45kNmvx0DsPv5CrdJH595+1Snn444SqEuq3mTQ844=,X8U6WyFa2kC+2P+I79F2QSy82xGG+HwirI5iF6wXYZk=
-e API_TOKEN_SALT=AbsqpmZ171iyl9394ZbYCg==
-e ADMIN_JWT_SECRET=4UXCd43bus3S0vwbEIa4sA==
-e TRANSFER_TOKEN_SALT=suRgGs3Kpn4+3POvtwsIFQ==
-e DATABASE_CLIENT=postgres
-e DATABASE_HOST=strapiDB
-e DATABASE_PORT=5432
-e DATABASE_NAME=npi
-e DATABASE_USERNAME=pgsync
-e DATABASE_PASSWORD=redhat
-e DATABASE_SSL=false
-e DATABASE_FILENAME=.tmp/data.db
-e JWT_SECRET=PJi80mJC2xfK6qRH81dSyg==
-e NODE_ENV=production
-v /home/hriritk/strapi/my-project/package.json:/opt/package.json:rw
-v /home/hriritk/strapi/my-project/config:/opt/app/config:rw
-v /home/hriritk/strapi/my-project/src:/opt/app/src:rw
-v /home/hriritk/strapi/my-project/public/uploads:/opt/app/public/uploads:rw
-v /home/hriritk/strapi/my-project/yarn.lock:/opt/yarn.lock:rw
-w /opt/app
strapi:latest
yarn develop
Create Postgres container
$ docker run -d
--name strapiDB
--network Strapi
--hostname postgres
-p 5432:5432
-v /home/hriritk/strapi/postgres-data:/var/lib/postgresql/data:rw
-e POSTGRES_USER=pgsync
-e POSTGRES_PASSWORD=redhat
-e POSTGRES_DB=npi
Postgres:latest

Create container of Redis
$ sudo docker run -d
--name redis
--network Strapi
redis
redis-server --requirepass redhat

Create container of Opensearch
$ sudo docker run -d
--name opensearch
--network Strapi
-p 9400:9200 -p 9600:9600
-e discovery.type=single-node
-e bootstrap.memory_lock=true
-e DISABLE_INSTALL_DEMO_CONFIG=true
-e DISABLE_SECURITY_PLUGIN=true
opensearchproject/opensearch:latest

Create container of PGSync
Create Dockerfile for the image of PGSync

Base image

FROM python:3.9

Define the working directory

ARG WORKDIR=/code
RUN mkdir -p $WORKDIR
WORKDIR $WORKDIR

Install PGSync from the GitHub repository

RUN pip install git+https://github.com/toluaina/pgsync.git

Copy the wait-for-it script (for waiting on dependencies like PostgreSQL)

COPY ./wait-for-it.sh wait-for-it.sh

COPY ./schema.json schema.json

Make sure the wait-for-it script is executable

RUN chmod +x wait-for-it.sh

Add the runserver script (your custom script to start the pgsync server)

COPY ./runserver.sh runserver.sh

Make sure the runserver script is executable

RUN chmod +x runserver.sh

Expose any required ports (if applicable)

EXPOSE 5000

Command to run the server script (it will wait for PostgreSQL and OpenSearch)

CMD ["./runserver.sh"]

Before building the image of PGSync we need to make 2 scripts for pgsync container that are going to be copied in the container of postgresql

Runserver.sh
$ vim runserver.sh
#!/bin/sh
#set -u

Wait for PostgreSQL to be ready

#./wait-for-it.sh strapiDB:5432 -t 60 --strict -- echo "PostgreSQL is up"

Wait for OpenSearch to be ready

#./wait-for-it.sh 192.168.1.8:9400 -t 60 --strict -- echo "OpenSearch is up"

Wait for Redis to be ready

#./wait-for-it.sh my-project_redis_1:6379 -t 60 --strict -- echo "Redis is up"

Define the configuration file path for pgsync

#CONFIG_FILE="./schema.json"
#./schema.json

Run schema setup or any other initialization (if needed)

Ensure your schema setup is already handled. If not, include commands here.

If schema files like schema.py or schema.json are not relevant, these can be omitted.

Run PGSync with the configuration file

#pgsync --config $CONFIG_FILE --daemon

Optional: Any other bootstrap tasks or custom logic can be added here.

#!/bin/sh

#!/bin/sh
set -u

Wait for PostgreSQL to be ready

./wait-for-it.sh strapiDB:5432 -t 60 --strict -- echo "PostgreSQL is up"

Wait for OpenSearch to be ready

./wait-for-it.sh 192.168.1.8:9400 -t 60 --strict -- echo "OpenSearch is up"

Wait for Redis to be ready

./wait-for-it.sh my-project_redis_1:6379 -t 60 --strict -- echo "Redis is up"

Define the configuration file path for pgsync

CONFIG_FILE="./schema.json"

Check if the schema.json file exists

if [ ! -f "$CONFIG_FILE" ]; then
echo "Error: $CONFIG_FILE not found. Exiting."
exit 1
fi

Validate schema.json is valid JSON format

if ! jq empty "$CONFIG_FILE" > /dev/null 2>&1; then
echo "Error: $CONFIG_FILE is not valid JSON. Exiting."
exit 1
fi

Run PGSync with the configuration file

pgsync --config "$CONFIG_FILE" --daemon

Optional: Any other bootstrap tasks or custom logic can be added here.

wait-for-it.sh
$ vim wait-for-it.sh
#!/usr/bin/env bash

Use this script to test if a given TCP host/port are available

cmdname=$(basename $0)

echoerr() { if [[ $QUIET -ne 1 ]]; then echo "$@" 1>&2; fi }

usage()
{
cat << USAGE >&2
Usage:
$cmdname host:port [-s] [-t timeout] [-- command args]
-h HOST | --host=HOST Host or IP under test
-p PORT | --port=PORT TCP port under test
Alternatively, you specify the host and port as host:port
-s | --strict Only execute subcommand if the test succeeds
-q | --quiet Don't output any status messages
-t TIMEOUT | --timeout=TIMEOUT
Timeout in seconds, zero for no timeout
-- COMMAND ARGS Execute command with args after the test finishes
USAGE
exit 1
}

wait_for()
{
if [[ $TIMEOUT -gt 0 ]]; then
echoerr "$cmdname: waiting $TIMEOUT seconds for $HOST:$PORT"
else
echoerr "$cmdname: waiting for $HOST:$PORT without a timeout"
fi
start_ts=$(date +%s)
while :
do
if [[ $ISBUSY -eq 1 ]]; then
nc -z $HOST $PORT
result=$?
else
(echo > /dev/tcp/$HOST/$PORT) >/dev/null 2>&1
result=$?
fi
if [[ $result -eq 0 ]]; then
end_ts=$(date +%s)
echoerr "$cmdname: $HOST:$PORT is available after $((end_ts - start_ts)) seconds"
break
fi
sleep 1
done
return $result
}

wait_for_wrapper()
{
if [[ $QUIET -eq 1 ]]; then
timeout $BUSYTIMEFLAG $TIMEOUT $0 --quiet --child --host=$HOST --port=$PORT --timeout=$TIMEOUT &
else
timeout $BUSYTIMEFLAG $TIMEOUT $0 --child --host=$HOST --port=$PORT --timeout=$TIMEOUT &
fi
PID=$!
trap "kill -INT -$PID" INT
wait $PID
RESULT=$?
if [[ $RESULT -ne 0 ]]; then
echoerr "$cmdname: timeout occurred after waiting $TIMEOUT seconds for $HOST:$PORT"
fi
return $RESULT
}

process arguments

while [[ $# -gt 0 ]]
do
case "$1" in
: )
hostport=(${1//:/ })
HOST=${hostport[0]}
PORT=${hostport[1]}
shift 1
;;
--child)
CHILD=1
shift 1
;;
-q | --quiet)
QUIET=1
shift 1
;;
-s | --strict)
STRICT=1
shift 1
;;
-h)
HOST="$2"
if [[ $HOST == "" ]]; then break; fi
shift 2
;;
--host=)
HOST="${1#
=}"
shift 1
;;
-p)
PORT="$2"
if [[ $PORT == "" ]]; then break; fi
shift 2
;;
--port=)
PORT="${1#
=}"
shift 1
;;
-t)
TIMEOUT="$2"
if [[ $TIMEOUT == "" ]]; then break; fi
shift 2
;;
--timeout=)
TIMEOUT="${1#
=}"
shift 1
;;
--)
shift
CLI=("$@")
break
;;
--help)
usage
;;
*)
echoerr "Unknown argument: $1"
usage
;;
esac
done

if [[ "$HOST" == "" || "$PORT" == "" ]]; then
echoerr "Error: you need to provide a host and port to test."
usage
fi

TIMEOUT=${TIMEOUT:-15}
STRICT=${STRICT:-0}
CHILD=${CHILD:-0}
QUIET=${QUIET:-0}

check to see if timeout is from busybox?

TIMEOUT_PATH=$(realpath $(which timeout))
if [[ $TIMEOUT_PATH =~ "busybox" ]]; then
ISBUSY=1
BUSYTIMEFLAG="-t"
else
ISBUSY=0
BUSYTIMEFLAG=""
fi

if [[ $CHILD -gt 0 ]]; then
wait_for
RESULT=$?
exit $RESULT
else
if [[ $TIMEOUT -gt 0 ]]; then
wait_for_wrapper
RESULT=$?
else
wait_for
RESULT=$?
fi
fi

if [[ $CLI != "" ]]; then
if [[ $RESULT -ne 0 && $STRICT -eq 1 ]]; then
echoerr "$cmdname: strict mode, refusing to execute subprocess"
exit $RESULT
fi
exec "${CLI[@]}"
else
exit $RESULT
fi

Create schema.json in the same directory that dockerfile is going to copy it in the container
$ vim schema.json
[
{
"database": "npi",
"index": "calendar_holidays_index",
"nodes": {
"table": "calendar_holidays",
"schema": "public",
"columns": [
"id",
"review_email",
"alias",
"title",
"date",
"holiday_type",
"review",
"created_at",
"updated_at",
"published_at",
"created_by_id",
"updated_by_id",
"day",
"keywords",
"calendar_year",
"weight"
],
"indexes": [
{
"name": "calendar_holidays_pkey",
"columns": ["id"],
"type": "btree",
"primary": true
},
{
"name": "calendar_holidays_created_by_id_fk",
"columns": ["created_by_id"],
"type": "btree"
},
{
"name": "calendar_holidays_updated_by_id_fk",
"columns": ["updated_by_id"],
"type": "btree"
}
],
"children": []
}
}
]

Built the image of PGSync
$ sudo docker built -t pgsync-image .

Create the container of pgsync using the image we created
$ sudo docker run -d
--name pgsync
--network Strapi
--sysctl net.ipv4.tcp_keepalive_time=200
--sysctl net.ipv4.tcp_keepalive_intvl=200
--sysctl net.ipv4.tcp_keepalive_probes=5
-e PG_USER=pgsync
-e PG_HOST=strapiDB
-e PG_PORT=5432
-e PG_PASSWORD=redhat
-e LOG_LEVEL=INFO
-e ELASTICSEARCH_PORT=9400
-e ELASTICSEARCH_SCHEME=http
-e ELASTICSEARCH_HOST=opensearch
-e REDIS_HOST=redis
-e REDIS_PORT=6379
-e REDIS_AUTH=redhat
-e OPENSEARCH=true
-p 5000:5000
pgsync-image ./runserver.sh

After creating the containers go into the Postgresql Container and navigate to the Postgres.conf file and make some changes in the file.

$ sudo docker exec -it StrapiDB /bin/bash
$ cd /var/lib/postgres/
$ vim postgres.config

Enable logical decoding in postgres.conf
wal_level = logical

Ensure there is at least one replication slot defined in postgres.conf
max_replication_slots = 1

Add pgdump into the postgres for syncing data

Download the all_data.sql dump file into your machine

Import that dump file to postgres container
$ docker cp /home/hriritk/Downloads/alldata.sql strapiDB:/alldata.sql

Restore the alldata dump to the npi database: After copying the file
$ docker exec -it strapiDB psql -U pgsync -d npi -f /all_data.sql
Move the calender_holidays table from strapi_test to npi: First, dump the calender_holidays table from the strapi_test database:

$ docker exec -it strapiDB pg_dump -U pgsync -d strapi_test -t calender_holidays > calender_holidays.sql

Then, restore it into the npi database:

$ docker exec -i strapiDB psql -U pgsync -d npi < calender_holidays.sql

*** facing an error in creating a pgsync container as it is getting exited again and again due to invalid schema.json. ****



@hrik01
Copy link
Author

hrik01 commented Oct 11, 2024

Please help as this is a very urgent requirement !!!

@hrik01
Copy link
Author

hrik01 commented Oct 14, 2024

updated error :
wait-for-it.sh: waiting 60 seconds for strapiDB:5432
wait-for-it.sh: strapiDB:5432 is available after 0 seconds
PostgreSQL is up
wait-for-it.sh: waiting 120 seconds for 192.168.1.2:9400
wait-for-it.sh: 192.168.1.2:9400 is available after 0 seconds
OpenSearch is up
wait-for-it.sh: waiting 60 seconds for redis:6379
wait-for-it.sh: redis:6379 is available after 0 seconds
Redis is up
2024-10-14 06:14:57.487:WARNING:opensearch: HEAD http://opensearch:9400/calendar_holidays [status:N/A request:0.001s]
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

@toluaina
Copy link
Owner

toluaina commented Nov 6, 2024

sorry about the delay. Just in case this hasn't been resolved.
Can you use the opensearch docker file as a starting point?

It seems to me your setup is not allowing connections to OpenSearch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants