updates
johnathanchiu committed Oct 7, 2024
1 parent 0dd2dee commit 027472b
Showing 2 changed files with 38 additions and 3 deletions.
31 changes: 31 additions & 0 deletions README.md
@@ -0,0 +1,31 @@
# Recursive Segmentation Model

The ideas presented in this repository are based largely on the 1995 paper _Recursive XY cut using bounding boxes of connected components_ (https://ieeexplore.ieee.org/document/602059).

**_Disclaimer_**: _This is an unbenchmarked segmentation model. It works decently well for documents at first glance and will be extended to general images in the near future. I also need to find a better name for this package._

## Getting Started

This package is published on PyPI.

## Examples

See `main.py` for examples of how to draw the images.

## Local Setup

```
pip install -r requirements.txt
```

## Additional Information

This algorithm works particularly well on documents that contain many diagrams and are well spaced. It performs poorly on documents that are purely text-based.
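
To illustrate why spacing matters, here is a minimal sketch of the general recursive XY cut idea on a binary pixel grid. It is not this repository's implementation, and the cited paper works on bounding boxes of connected components rather than raw pixels. The names `xy_cut`, `_gaps`, and `min_gap` are hypothetical. The sketch only cuts where it finds a fully empty row or column band, so tightly packed content yields few or no cuts.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def _gaps(profile: List[bool], min_gap: int) -> List[Tuple[int, int]]:
    """Return (start, end) of interior empty runs at least `min_gap` long."""
    gaps, start = [], None
    for i, filled in enumerate(profile):
        if not filled and start is None:
            start = i
        elif filled and start is not None:
            if start > 0 and i - start >= min_gap:
                gaps.append((start, i))
            start = None
    return gaps


def xy_cut(img: List[List[int]], box: Optional[Box] = None, min_gap: int = 1) -> List[Box]:
    """Recursively split `img` (1 = ink, 0 = background) at the widest empty band."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box

    # Shrink the box to the content it contains.
    rows = [r for r in range(top, bottom) if any(img[r][left:right])]
    cols = [c for c in range(left, right) if any(img[r][c] for r in range(top, bottom))]
    if not rows:
        return []
    top, bottom, left, right = rows[0], rows[-1] + 1, cols[0], cols[-1] + 1

    # Projection profiles: True where a row or column contains ink.
    row_profile = [any(img[r][left:right]) for r in range(top, bottom)]
    col_profile = [any(img[r][c] for r in range(top, bottom)) for c in range(left, right)]

    candidates = [(e - s, "row", s, e) for s, e in _gaps(row_profile, min_gap)]
    candidates += [(e - s, "col", s, e) for s, e in _gaps(col_profile, min_gap)]
    if not candidates:
        return [(top, left, bottom, right)]  # no empty band left: this is a leaf region

    _, axis, s, e = max(candidates)  # widest gap wins
    if axis == "row":
        first, second = (top, left, top + s, right), (top + e, left, bottom, right)
    else:
        first, second = (top, left, bottom, left + s), (top, left + e, bottom, right)
    return xy_cut(img, first, min_gap) + xy_cut(img, second, min_gap)


img = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
print(xy_cut(img))  # [(0, 0, 2, 2), (0, 4, 2, 5), (3, 0, 4, 5)]
```

Raising `min_gap` makes the sketch ignore narrow whitespace bands, which is one simple way to avoid over-splitting.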

At the moment, I am looking to build an ML model that determines where to split chunks in the page. The main idea is to train a seq2seq model that outputs a binary sequence. The input sequence consists of slices of the image, and the output is a binary sequence of the same length in which a 1 marks a split at that slice and a 0 marks no split.
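
As a rough illustration of the target format described above, the sketch below treats each pixel row as one slice and emits one label per slice. The labels here come from a simple whitespace heuristic that stands in for the planned model. The function name `split_labels` and the `min_gap` parameter are hypothetical and not part of this package.

```python
from typing import List


def split_labels(page: List[List[int]], min_gap: int = 2) -> List[int]:
    """Return one label per row slice: 1 marks a split, 0 otherwise.

    `page` is a binary image (1 = ink, 0 = background). A split is placed at
    the middle row of every interior run of empty rows at least `min_gap` tall.
    """
    empty = [not any(row) for row in page]
    labels = [0] * len(page)
    start = None
    # The trailing False is a sentinel that closes a run ending at the last row.
    for i, is_empty in enumerate(empty + [False]):
        if is_empty and start is None:
            start = i
        elif not is_empty and start is not None:
            end = i  # the run covers rows start..end-1
            interior = start > 0 and end < len(page)
            if interior and end - start >= min_gap:
                labels[(start + end - 1) // 2] = 1
            start = None
    return labels


page = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(split_labels(page))  # [0, 0, 0, 1, 0, 0, 0]
```

The output has the same length as the input sequence, which is the property a sequence model with per-slice binary outputs would need to preserve.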

Like any bounding-box segmentation algorithm, the main limitation is the shape of the segments. Edge cases arise when the content of the input image is not laid out in a grid. Take an image that contains an "L"-shaped object: a rectangular bounding box cannot isolate it, because the box also covers the empty corner of the "L" and anything that sits there. If anyone has ideas on how to improve this, please feel free to suggest them!
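
The L-shape limitation can be made concrete with a toy grid. The sketch below is a hypothetical example, not code from this package: it places a small object in the notch of an "L", then checks that the bounding box of the "L" swallows the small object and that no empty row or column exists to cut between them.

```python
from typing import List, Tuple

Cell = Tuple[int, int]           # (row, col)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def bounding_box(cells: List[Cell]) -> Box:
    """Smallest axis-aligned rectangle that covers all cells."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def empty_lines(cells: List[Cell], size: int, axis: int) -> List[int]:
    """Indices of rows (axis=0) or columns (axis=1) with no occupied cell."""
    occupied = {cell[axis] for cell in cells}
    return [i for i in range(size) if i not in occupied]


# "L" on a 5x5 grid: a vertical stroke in column 0 and a horizontal stroke in row 4.
l_shape = [(r, 0) for r in range(5)] + [(4, c) for c in range(1, 5)]
# A separate small object sitting in the notch of the "L".
small = [(1, 3), (1, 4)]

l_box = bounding_box(l_shape)
small_box = bounding_box(small)
print(l_box, small_box)            # (0, 0, 4, 4) (1, 3, 1, 4)
print(contains(l_box, small_box))  # True
print(empty_lines(l_shape + small, 5, axis=0))  # []
print(empty_lines(l_shape + small, 5, axis=1))  # []
```

Because both projection profiles are fully occupied, a purely row-or-column cut has nowhere to separate the two objects.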

## Contributing

Feel free to contribute to this repository through Pull Requests and Issues. Reach out to me if you have any ideas surrounding this that you want to discuss!
10 changes: 7 additions & 3 deletions pyproject.toml
@@ -1,5 +1,5 @@
[project]
-name = "xyimage"
+name = "xy-segmentation"
version = "0.0.1"
description = "Recursive Segmentation Algorithm"
readme = "README.md"
@@ -19,5 +19,9 @@ packages = []
"__init__.py" = ["F401", "F821", "E402"]

[build-system]
-requires = ["setuptools>=64"]
-build-backend = "setuptools.build_meta"
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[project.urls]
+Homepage = "https://github.com/johnathanchiu/recursive-segmentation"
+Issues = "https://github.com/johnathanchiu/recursive-segmentation/issues"
