Machine Learning Engineer (Computer Vision track) - Bangalore, India
Autonomize AI is seeking a highly motivated Machine Learning Engineer focused on Computer
Vision to join our core engineering team. You will play a pivotal role in shaping the future of
data-driven decision-making in healthcare by building state-of-the-art, machine-learning-based
solutions for document understanding.
Autonomize is pioneering the use of AI to turn unstructured, dark data into actionable insights
for healthcare knowledge workers. We strive to simplify the journey from raw data to valuable
insights by leveraging the power of machine learning and computer vision. Our team comprises
serial entrepreneurs, data scientists, and engineers who are disrupting the healthcare industry
by bridging the gap between data and decision-making. This role provides a unique opportunity
to work with a global, remote team of experts who are committed to human-machine
collaboration, industry disruption, and, most importantly, scaling AI.
About the Role
As a Machine Learning Engineer, you will be at the forefront of designing, fine-tuning, and
developing solutions for document understanding, leveraging techniques including, but not limited
to, OCR, vision-to-text multimodal models, and LLMs. Your contributions will directly impact
the decision-making process in healthcare. Common problems you will need to solve include how to
parse key information from scanned documents for downstream applications, how to
improve the quality and resolution of scanned documents, and how to build next-generation
multimodal models that improve document understanding performance.
Here is what we're looking for in our teammate:
Kick-ass technical competencies
● Hands-on coding: Strong Python / shell skills
● Expertise in OCR and computer-vision-based document extraction tasks (3 years of
specialized experience)
● At least 2 years of experience working with OCR libraries like PaddleOCR, Tesseract,
and Adobe PDF Library
● At least 1 year of experience working with transformer-based vision models and
multimodal models like Donut, Pix2Struct, and LayoutLM
● Familiarity with PDF processing tools such as PDFMiner, PyPDF2, and pdfplumber.
● Proficiency in text extraction and structuring tools like Tabula, Camelot, or Textract
● Experience integrating OCR and layout detection into end-to-end document processing
pipelines
● Familiarity with LLMs: Experience using large language models for information extraction
● Experience in using frameworks like TensorFlow and/or PyTorch for model development
● Experience in integrating computer vision and NLP systems into full-stack applications,
preferably using Django or Flask
● Proven track record in scaling solutions on GPU clusters (e.g., with Slurm) is a plus
● Data Technologies: Experience with Kafka, Spark, and MongoDB is a plus
Who you are as a person/leader
● Owner mentality: You take full ownership of your work.
● Curious and Experimental: You love solving complex problems and are always up for a
challenge.
● Team Player: You thrive in a collaborative environment and are committed to the team
and mission.
● Excellent Communication: English proficiency is a must.
Nice-to-have industry competencies
● Experience in Healthcare, Pharma, or Life Sciences
● Familiarity with cloud-based ML platforms like AWS SageMaker and GCP Vertex AI/AutoML
Perks
● Outsized Bet: Opportunity to start at the ground floor in a young, VC-backed company
● Learning: Learn from industry experts and founders
● Startup Perks: Ownership, Autonomy, Mastery
● Standard Benefits: Insurance, paid vacation, and a focus on physical/mental well-being
We are an equal opportunity employer.