MarkushGenerator

python3.10 -m venv markushgenerator-env
source markushgenerator-env/bin/activate

PIP_USE_PEP517=0 pip install -e .

sudo apt-get install openjdk-17-jdk
sudo update-alternatives --config 'java'

Download the CDK library (cdk-2.9.jar) and move it to MarkushGenerator/lib/.

wget https://github.com/cdk/cdk/releases/download/cdk-2.9/cdk-2.9.jar -P ./lib/
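After the download, a quick check can confirm that the jar landed in lib/ and that a Java runtime is reachable. The following is a minimal sketch and not part of MarkushGenerator: the function name check_cdk_setup and its parameters are hypothetical, and it assumes the script is run from the repository root.

```python
import shutil
from pathlib import Path


def check_cdk_setup(repo_root: str = ".", jar_name: str = "cdk-2.9.jar") -> list[str]:
    """Return a list of setup problems; an empty list means nothing was flagged.

    Hypothetical helper: it only checks that the jar file exists under
    <repo_root>/lib/ and that a 'java' executable is on PATH. It does not
    verify the Java version or the integrity of the jar.
    """
    problems = []

    jar_path = Path(repo_root) / "lib" / jar_name
    if not jar_path.is_file():
        problems.append(f"missing {jar_path}")

    if shutil.which("java") is None:
        problems.append("no 'java' executable found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_cdk_setup():
        print("setup problem:", problem)
```

If the script prints nothing, both checks passed. Use `java -version` to confirm that the selected alternative is OpenJDK 17.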

Generation

The notebook MarkushGenerator/markushgenerator/draw.ipynb shows how to:

Each generated sample contains:

CXSMILES.
Optimized CXSMILES.
Markush structure image.
OCR cells, containing the position and content of the text written in the images. Some characters, such as explicit carbons and implicit hydrogens, are currently omitted. Atoms with charges are formatted as "atom, charge, number of charges". Superscripts and subscripts are ignored.
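To make the shape of a sample concrete, the four parts listed above can be pictured as a small record. This is an illustrative sketch only: the class names, field names, and the bounding-box representation of an OCR cell's position are assumptions, not the data format MarkushGenerator actually writes.

```python
from dataclasses import dataclass, field


@dataclass
class OCRCell:
    """One piece of text drawn in the image (hypothetical layout)."""
    text: str                                 # content of the text
    bbox: tuple[float, float, float, float]   # assumed (x_min, y_min, x_max, y_max) in pixels


@dataclass
class MarkushSample:
    """Container mirroring the four parts of a generated sample (hypothetical names)."""
    cxsmiles: str
    optimized_cxsmiles: str
    image_path: str
    ocr_cells: list[OCRCell] = field(default_factory=list)

    def ocr_text(self) -> list[str]:
        """Return the text content of all OCR cells, in stored order."""
        return [cell.text for cell in self.ocr_cells]


if __name__ == "__main__":
    # Placeholder values chosen for illustration; not real generator output.
    sample = MarkushSample(
        cxsmiles="*c1ccccc1 |$R1;;;;;;$|",
        optimized_cxsmiles="*c1ccccc1 |$R1;;;;;;$|",
        image_path="sample_0.png",
        ocr_cells=[OCRCell(text="R1", bbox=(10.0, 12.0, 30.0, 28.0))],
    )
    print(sample.ocr_text())
```

Keeping the OCR cells as a list of small records makes it easy to filter or redraw them independently of the CXSMILES strings.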
