Installing seems to work only if the Python version is between 3.6 and 3.13.
I'm using Python 3.13. Is there any way to get this working?
Full error message:
pdf_scraper.py
Traceback (most recent call last):
  File "/Users/mmarusiak/Documents/Personal/Programming/open-grant/scientific-context/scientific_context/pdf_scraper.py", line 1, in <module>
    from pix2text import Pix2Text
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/pix2text/__init__.py", line 7, in <module>
    from .doc_xl_layout import DocXLayoutParser
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/pix2text/doc_xl_layout/__init__.py", line 4, in <module>
    from .doc_xl_layout_parser import DocXLayoutParser
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/pix2text/doc_xl_layout/doc_xl_layout_parser.py", line 20, in <module>
    from ..layout_parser import LayoutParser, ElementType
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/pix2text/layout_parser.py", line 7, in <module>
    from cnstd import LayoutAnalyzer
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/cnstd/__init__.py", line 21, in <module>
    from .ppocr import PPDetector
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/cnstd/ppocr/__init__.py", line 23, in <module>
    from .rapid_detector import RapidDetector
  File "/Users/mmarusiak/Library/Caches/pypoetry/virtualenvs/scientific-context-omVrjgZG-py3.13/lib/python3.13/site-packages/cnstd/ppocr/rapid_detector.py", line 29, in <module>
    from rapidocr_onnxruntime.ch_ppocr_det import TextDetector
ModuleNotFoundError: No module named 'rapidocr_onnxruntime.ch_ppocr_det'
Running just your example script:
from pix2text import Pix2Text
img_fp = "./materials/example.pdf"
p2t = Pix2Text.from_config()
doc = p2t.recognize_pdf(img_fp, page_numbers=[0, 1])
print(doc.text)
If there is no way to import pix2text on Python 3.13 and newer, could you include some information about that in the docs? I'm on macOS and installing via Poetry.
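In case it helps narrow this down, here is a small diagnostic sketch that could be run inside the same Poetry environment. It only reports the interpreter version, the installed versions of the packages named in the traceback, and whether the missing submodule can be located. The distribution names passed to `installed_version` (in particular `rapidocr-onnxruntime`) are my assumption about how the packages are published, and the helper names are made up for illustration.

```python
import importlib.util
import sys
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if `name` can be located without fully importing it.

    For a dotted name, find_spec imports the parent package first, so a
    missing or broken parent is reported as "not available" as well.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    # Distribution names are assumptions based on the traceback above.
    for dist in ("pix2text", "cnstd", "rapidocr-onnxruntime"):
        print(f"{dist}: {installed_version(dist)}")
    # The submodule that the import chain fails on.
    for mod in ("rapidocr_onnxruntime", "rapidocr_onnxruntime.ch_ppocr_det"):
        print(f"{mod} importable: {module_available(mod)}")
```

If the top-level `rapidocr_onnxruntime` package is found but `ch_ppocr_det` is not, that would suggest the installed release has a different module layout than the one cnstd expects, rather than a failed installation.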