Similar repositories to sdtblck/PDFextract:
sdtblck/PDFextract
finetuneanon/gpt-neo_finetune_2.7B
browsermt/students
oscar-corpus/ungoliant
oscar-corpus/goclassy
salesforce/booksum
ayaka14732/tpu-starter
AMontgomerie/question_generator
Vamsi995/Paraphrase-Generator
bitextor/bitextor
simonepri/lm-scorer
bytedance/neurst
webrecorder/warcio
OpenNMT/CTranslate2
Ki6an/fastT5
facebookresearch/SentAugment
google-research/deduplicate-text-datasets
lucidrains/RETRO-pytorch
facebookresearch/flores
zacharywhitley/awesome-ocr
PrithivirajDamodaran/Parrot_Paraphraser
adbar/trafilatura
ottokart/punctuator2
nil0x42/duplicut
UKPLab/EasyNMT
grammarly/gector
microsoft/fastformers
pettarin/forced-alignment-tools
VKCOM/YouTokenToMe
google-research/multilingual-t5
facebookresearch/denoiser
dragnet-org/dragnet
timoschick/pet
fhamborg/news-please
dair-ai/nlp_paper_summaries
minimaxir/aitextgen
snakers4/silero-models
kermitt2/grobid
EasyEngine/easyengine
buriy/python-readability