SoftBank Vision Fund

Applied AI Scientist, GenAI and ML Prototyping

C2FO

Noida, IN / Uttar Pradesh, IN

Job Type: Full-Time
Function: Life Sciences R&D/Engineering
Industry: Fintech
Post Date: 05/28/2026
Website: c2fo.com
Company Address: 2020 West 89th Street, Second Floor, Leawood, KS 66206, US

About C2FO

C2FO is the world's on-demand working capital platform, providing fast, flexible and equitable access to low-cost capital to nearly 2 million businesses worldwide.

Job Description

Role Overview

We are looking for an Engineer/Data Scientist to lead the identification and rapid prototyping of AI solutions across our business — spanning both internal operations and customer-facing products.

This role sits at the earliest and most critical stage of our AI delivery lifecycle: Discovery and Proof of Concept. You will partner with Senior and Principal engineers and work directly with department heads and product owners to uncover where AI can create meaningful impact, then design and build working prototypes that demonstrate clear, measurable value. You will own the process from problem framing through to a validated, decision-ready POC — determining whether the right solution is a rule-based system, a traditional machine learning model, or an LLM-based agentic workflow.

Once a prototype is approved, you will work in close collaboration with the rest of the AI Platform Engineering team to translate your work into something that can scale into a production-grade application. You will not co-own productionisation, but you will be a critical partner in making it successful.

This is a role for someone who is energised by ambiguity, moves fast without cutting corners, and knows how to make a compelling case for (or against) a technical approach based on evidence rather than enthusiasm.

Core Responsibilities

Business Discovery: Run structured discovery sessions with department heads and product owners to identify and scope AI opportunities. Define a clear problem statement, including data availability and constraints, before any prototyping begins.
Rapid Prototyping: Build functional POCs using the most appropriate approach for the problem: RAG pipelines, agentic workflows, predictive ML models, or rule-based systems. Prototypes must be credible enough to support a genuine build-or-not decision.
Stakeholder Management: Act as the primary technical point of contact for business stakeholders throughout discovery and POC. Communicate trade-offs around accuracy, cost, and latency in plain terms, and be willing to recommend against building when the evidence calls for it.
Evaluation & Validation: Define success criteria before building begins. Design and run evaluations appropriate to the POC type, and present findings clearly enough for a non-technical sponsor to make a confident go/no-go decision.
Technical Handoff: Produce handoff documentation covering system design, prompt strategies, data requirements, known failure modes, and evaluation benchmarks, giving the AI Engineering team everything needed to take a validated POC into production.

Tech Stack & Technical Requirements

Core Languages & Frameworks

Proficiency in Python as the primary language for data science and ML development (Pandas, NumPy, Scikit-learn)
Familiarity with SQL for data querying and manipulation across modern data warehouses (e.g., BigQuery, Snowflake, PostgreSQL)
(Nice to have) Working knowledge of deep learning frameworks such as PyTorch or TensorFlow for model experimentation

LLM & Generative AI Tooling

Hands-on experience working with large language model APIs, including providers such as OpenAI, Anthropic, or Google
Strong command of prompt engineering techniques, including few-shot prompting, chain-of-thought reasoning, and structured output design
Experience with open-source LLMs (e.g., Mistral, LLaMA) and an understanding of when to apply open vs. proprietary models

Agentic Orchestration & RAG

Practical experience building RAG (Retrieval-Augmented Generation) pipelines, including chunking strategies, embedding models, and retrieval tuning
Familiarity with agentic orchestration frameworks such as LangChain, LangGraph, LlamaIndex, CrewAI, or AutoGen
Experience integrating vector databases (e.g., pgvector, Pinecone, Weaviate, ChromaDB) into search and retrieval workflows
Understanding of tool/function calling patterns for LLM-driven automation

Evaluation & Experimentation

Ability to define and implement "good enough" metrics and evaluation frameworks for POC validation
Experience with LLM evaluation libraries such as RAGAS, TruLens, or DeepEval
Familiarity with experiment tracking tools such as MLflow or Weights & Biases
Comfort with cost and latency profiling of LLM-based systems to inform feasibility decisions

Data & Infrastructure

Comfortable working within cloud environments (AWS, GCP, or Azure) for data access, compute, and API integration
Ability to integrate with REST APIs and third-party data sources during prototyping
Proficiency with standard development tools: Git, Jupyter notebooks, VS Code
Basic familiarity with Docker for packaging and sharing POC environments with engineering teams

Qualifications

Required Experience

4+ years of experience in data science, machine learning, or a closely related field, with a demonstrated track record of delivering end-to-end projects
2+ years of hands-on experience working with large language models or Generative AI solutions in a professional setting
Proven experience taking projects from business problem discovery through to a working prototype or proof of concept
Experience engaging directly with non-technical business stakeholders to gather requirements, set expectations, and communicate results clearly
Strong background in traditional ML approaches (classification, regression, clustering, NLP) alongside modern LLM-based methods

Education

Bachelor's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field
A Master's or PhD is a plus, though equivalent industry experience is equally valued

Soft Skills & Ways of Working

Ability to translate complex technical outputs into clear business value — you are as comfortable in a boardroom as you are in a notebook
Strong stakeholder management skills, including the ability to set realistic expectations around LLM capabilities, limitations, and cost trade-offs
Excellent written communication skills for documenting prompt strategies, data requirements, and POC logic to enable clean technical handoffs
Self-directed with a high tolerance for ambiguity — you are energised by open-ended discovery, not slowed down by it
Structured thinker who can design evaluation criteria and define what "success" looks like before building begins

Nice to Have

Experience with fine-tuning or instruction-tuning LLMs on domain-specific datasets
Familiarity with responsible AI principles, including bias detection, fairness evaluation, and model transparency
Prior experience in a consulting, pre-sales engineering, or business-facing technical role
Knowledge of business process mapping (e.g., BPMN) to support structured discovery sessions

Commitment to Diversity and Inclusion. As an Equal Opportunity Employer, we not only value diversity and equality, but we also empower our team members to bring their authentic selves to work every day. Our goal is to create a workplace that reflects the communities we serve and our global, multicultural clients. We recognize the power of inclusion, emphasizing that each team member was chosen for their unique ability to contribute to the overall success of our mission.

We do not discriminate based on race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment decisions are based on qualifications, merit, and business needs.