diff --git a/README.md b/README.md
index b2d37c5..9212037 100644
--- a/README.md
+++ b/README.md
@@ -14,13 +14,6 @@ The project provides parallel molecular dynamics simulations using graph neural
 
 The installation and usage of SEVENNet are split into two parts: training (handled by PyTorch) and molecular dynamics (handled by [`LAMMPS`](https://github.com/lammps/lammps)). The model, once trained with PyTorch, is deployed using TorchScript and is later used to run molecular dynamics simulations via LAMMPS.
 
-## Known issues
-
-* The pressure of the parallel version in LAMMPS is not supported yet.
-* When using parallel MD, if the simulation cell is too small (one of cell dimension < cutoff radius), the calculated force is incorrect.
-
-However, the second issue rarely matters since you can not fully utilize a GPU in this condition. In this case, using only a single GPU gives almost the same speed as multiple GPUs.
-Even though, we're looking for the solution.
 
 ## Requirements for Training
 
@@ -198,6 +191,11 @@ mpirun -np {# of MPI rank to use} {path to lammps binary} -in {lammps input scri
 
 If a CUDA-aware OpenMPI is not found (it detects automatically in the code), `e3gnn/parallel` will not utilize GPUs even if they are available. You can check whether `OpenMPI` is found or not from the standard output of the `LAMMPS` simulation. Ideally, one GPU per MPI process is expected. If the available GPUs are fewer than the MPI processes, the simulation may run inefficiently or fail. You can select specific GPUs by setting the `CUDA_VISIBLE_DEVICES` environment variable.
 
+## Future Work
+
+* Implementation of pressure output in parallel MD simulations.
+* Development of support for a tiled communication style (also known as recursive coordinate bisection, RCB) in LAMMPS.
+
 ## Citation
 
 If you use SevenNet, please cite (1) parallel GNN-IP MD simulation by SevenNet or its pre-trained model SevenNet-0, (2) underlying GNN-IP architecture NequIP
diff --git a/pair_e3gnn/comm_brick.cpp b/pair_e3gnn/comm_brick.cpp
index f0773b1..474a16d 100644
--- a/pair_e3gnn/comm_brick.cpp
+++ b/pair_e3gnn/comm_brick.cpp
@@ -1071,6 +1071,7 @@ void CommBrick::forward_comm(PairE3GNNParallel *pair)
     buf_send_ = reinterpret_cast(buf_send);
     buf_recv_ = reinterpret_cast(buf_recv);
   }
+  if (nswap > 6) error->all(FLERR,"PairE3GNNParallel: Cell size is too small. Please use a single GPU or replicate the cell.");
 
   for (iswap = 0; iswap < nswap; iswap++) {
     if(sendproc[iswap] == me) continue;
diff --git a/pair_e3gnn/pair_e3gnn_parallel.cpp b/pair_e3gnn/pair_e3gnn_parallel.cpp
index a5417c1..261e155 100644
--- a/pair_e3gnn/pair_e3gnn_parallel.cpp
+++ b/pair_e3gnn/pair_e3gnn_parallel.cpp
@@ -296,6 +296,7 @@ void PairE3GNNParallel::compute(int eflag, int vflag) {
       if (Rij < cutoff_square) {
        // if given j is not local atom and inside cutoff
        if (tag_to_graph_idx[jtag] == -1) {
+          // j is a ghost atom inside the cutoff, seen for the first time
          tag_to_graph_idx[jtag] = graph_indexer;
          graph_index_to_i[graph_indexer] = j;
          node_type_ghost.push_back(map[jtype]);
@@ -642,7 +643,9 @@ void PairE3GNNParallel::init_style() {
   neighbor->add_request(this, NeighConst::REQ_FULL);
 }
 
-double PairE3GNNParallel::init_one(int i, int j) { return cutoff; }
+double PairE3GNNParallel::init_one(int i, int j) {
+  return cutoff;
+}
 
 void PairE3GNNParallel::comm_preprocess() {
   assert(!comm_preprocess_done);
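
For context, a minimal standalone sketch (not part of the patch) of why `nswap > 6` corresponds to the too-small-cell case described in the removed Known issues entry: LAMMPS performs two ghost-atom swaps per dimension when every sub-domain is wider than the ghost cutoff, and more swaps when a dimension is thinner than the cutoff. The helper `estimate_nswap` and the box, processor-grid, and cutoff values below are hypothetical illustration inputs; the real value is computed by LAMMPS in `CommBrick::setup()`.

// Rough estimate of the number of CommBrick swaps (nswap) for an orthogonal
// box split over a px x py x pz processor grid: two swaps per dimension,
// multiplied by the number of neighbor "shells" needed to cover the ghost
// cutoff. The new guard in CommBrick::forward_comm() aborts when this
// exceeds 6, i.e. when ghosts would have to come from beyond the nearest
// neighboring sub-domain.
#include <cmath>
#include <cstdio>

static int estimate_nswap(const double prd[3], const int procgrid[3],
                          double ghost_cutoff) {
  int nswap = 0;
  for (int dim = 0; dim < 3; ++dim) {
    // width of one sub-domain along this dimension
    double subdomain = prd[dim] / procgrid[dim];
    // number of neighboring sub-domains needed to cover the cutoff
    int need = static_cast<int>(std::ceil(ghost_cutoff / subdomain));
    nswap += 2 * need;  // one swap in each direction per shell
  }
  return nswap;
}

int main() {
  const double prd[3] = {8.0, 8.0, 8.0};  // hypothetical box lengths (Angstrom)
  const int procgrid[3] = {2, 2, 1};      // hypothetical MPI processor grid
  const double cutoff = 5.0;              // hypothetical model cutoff radius

  int nswap = estimate_nswap(prd, procgrid, cutoff);
  std::printf("estimated nswap = %d (%s)\n", nswap,
              nswap > 6 ? "would trigger the new error" : "ok");
  return 0;
}

With these example numbers the x and y sub-domains (4 Angstrom) are thinner than the 5 Angstrom cutoff, so the estimate exceeds 6 and the new check in `CommBrick::forward_comm()` would stop the run instead of silently producing incorrect forces.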