Machine Learning Engineer (Computer Vision track) - Bangalore, India

Autonomize AI is seeking a highly motivated Machine Learning Engineer focused on Computer
Vision to join our core engineering team. You will play a pivotal role in shaping the future of
data-driven decision-making in healthcare by building state-of-the-art, machine-learning-based
solutions for document understanding.

Autonomize is pioneering the use of AI to turn unstructured, dark data into actionable insights
for healthcare knowledge workers. We strive to simplify the journey from raw data to valuable
insights by leveraging the power of machine learning and computer vision. Our team comprises
serial entrepreneurs, data scientists, and engineers who are disrupting the healthcare industry
by bridging the gap between data and decision-making. This role provides a unique opportunity
to work with a global, remote team of experts who are committed to human-machine
collaboration, industry disruption, and most importantly, scaling AI.

About the Role

As a Machine Learning Engineer, you will be at the forefront of designing, fine-tuning, and
developing solutions for document understanding, leveraging techniques including, but not
limited to, OCR, vision-to-text multimodal models, and LLMs. Your contributions will directly
impact the decision-making process in healthcare. Common problems you will need to solve
include how to parse key information from scanned documents for downstream applications, how
to improve the quality and resolution of scanned documents, and how to build next-generation
multimodal models that improve document understanding performance.

Here is what we're looking for in our teammate:

Kick-ass technical competencies

● Hands-on coding: Strong Python / shell skills

● Expertise in OCR and computer-vision-based document extraction tasks (3 years of specialized experience)

● At least 2 years of experience working with OCR libraries such as PaddleOCR, Tesseract, and Adobe PDF Library

● At least 1 year of experience working with transformer-based vision models and multimodal models such as Donut, Pix2Struct, and LayoutLM

● Familiarity with PDF processing tools such as PDFMiner, PyPDF2, and pdfplumber

● Proficiency in text extraction and structuring tools like Tabula, Camelot, or Textract

● Experience integrating OCR and layout detection into end-to-end document processing pipelines

● Familiarity with LLMs: Experience using large language models for information extraction

● Experience in using frameworks like TensorFlow and/or PyTorch for model development

● Experience in integrating computer vision and NLP systems into full-stack applications, preferably using Django or Flask

● Proven track record in scaling solutions on GPU clusters (e.g., with Slurm) is a plus

● Data Technologies: Experience with Kafka, Spark, and MongoDB is a plus

Who you are as a person/leader

● Owner mentality: You take full ownership of your work.

● Curious and Experimental: You love solving complex problems and are always up for a challenge.

● Team Player: You thrive in a collaborative environment and are committed to the team and mission.

● Excellent Communication: English proficiency is a must.

Nice-to-have industry competencies

● Experience in Healthcare, Pharma, or Life Sciences

● Familiarity with cloud-based ML platforms such as AWS SageMaker and GCP Vertex AI/AutoML

Perks

● Outsized Bet: Opportunity to start at the ground floor in a young, VC-backed company

● Learning: Learn from industry experts and founders

● Startup Perks: Ownership, Autonomy, Mastery

● Standard Benefits: Insurance, paid vacation, and a focus on physical/mental well-being

We are an equal opportunity employer.