Data Engineer · AI Specialist · 6+ Years

Data pipelines, AI agents & analytics that ship to production.

I’m Pranay — a Data Engineer, AI Automation Specialist, and Analytics Expert with 6+ years building intelligent data systems for startups, scale-ups, and enterprises. From ETL pipelines and cloud data warehouses to LLM-powered agents and BI dashboards — built to last, not to demo.

110+ Projects delivered
100% Job Success · Top Rated
Portrait of Pranay Patodi

What I do

Engagements built around your data problem.

From a one-off pipeline to a full analytics stack — here’s where teams hire me.

AI & Generative AI Development

Custom AI built on GPT-4, Claude, and Gemini. RAG pipelines, multi-agent systems, document intelligence, and LLM-integrated backends — production grade, not demos.

OpenAI · Claude · LangChain · LangGraph · CrewAI · RAG

Data Engineering & Pipelines

End-to-end ETL/ELT in Python and SQL, orchestrated with Airflow, Prefect, or Dagster. Warehouse design on Snowflake, BigQuery, Redshift, and Databricks.

dbt · Airflow · Snowflake · BigQuery · Databricks

Web Scraping & Automation

Large-scale scrapers in Scrapy, Playwright, and Selenium. API integrations and workflow automation in n8n and Make — bots that quietly do the boring work.

Scrapy · Playwright · n8n · Make · Selenium

Data Analytics & Science

KPI modeling, cohort and churn analysis, A/B testing, segmentation, and predictive models in Pandas / scikit-learn / XGBoost — tied back to a real business question.

Pandas · scikit-learn · XGBoost · A/B testing

BI Dashboards & Reporting

Executive-ready dashboards in Tableau, Power BI, Looker Studio, Metabase, Plotly, and Dash. Built for clarity and speed of insight, not chart count.

Tableau · Power BI · Looker · Metabase

Backend & API Development

Production APIs and internal tools in FastAPI, Django, or Node. Multi-tenant SaaS backends, webhook plumbing, and clean REST/GraphQL contracts.

FastAPI · Django · Node · REST · GraphQL

Tech stack

The toolbox.

I’m language- and vendor-agnostic, but here’s what I reach for most often.

Languages

  • Python
  • SQL
  • JavaScript / TypeScript
  • R
  • C / C++

Data Engineering

  • Apache Airflow
  • PySpark
  • dbt
  • AWS Glue
  • Kafka

Warehouses & DBs

  • Snowflake
  • Redshift
  • BigQuery
  • PostgreSQL
  • MySQL
  • MongoDB
  • DynamoDB

Cloud & Infra

  • AWS (EC2, S3, RDS, Lambda)
  • GCP
  • Docker
  • Terraform

BI & Visualization

  • Tableau
  • Power BI
  • Lightdash
  • Looker
  • Metabase

App Development

  • Django
  • FastAPI
  • React
  • Next.js
  • REST & GraphQL

ML & AI

  • scikit-learn
  • TensorFlow
  • Keras
  • spaCy
  • LangChain
  • OpenAI / Anthropic APIs

Tools

  • Git
  • GitHub Actions
  • Linear / Jira
  • Notion
  • Slack

Selected work

A few things I’ve built recently.

Mix of client engagements, research projects, and tools I’ve shipped end-to-end.

AI·BI
Upwork · ★ 5.0

AI-Powered BI Platform for Cannabis Operations

Full-stack analytics platform with AI-assisted insights, natural-language querying, and operational reporting tailored for multi-state cannabis operators.

LLM · BI · Python · SQL
CLIN
Upwork · ★ 5.0

AI Clinical Intelligence & Documentation Platform

Healthcare platform that generates AI-assisted documentation, monitors patient data, and surfaces care insights for clinicians.

OpenAI · Healthcare · RAG · FastAPI
LEAD
Upwork · ★ 5.0

AI Lead Qualification & Auto Follow-Up

End-to-end lead intake, scoring, and automated multi-channel follow-up. Cut manual qualification effort to a fraction of its previous level.

LangChain · n8n · CRM · Automation
ROI
Upwork · ★ 5.0 · $1.3K

Executive Healthcare Burnout-ROI Dashboards

Retool / Looker Studio dashboards for an executive audience — quantifying clinician burnout impact and intervention ROI in a POC engagement.

Looker Studio · Retool · Healthcare
DBT
Upwork · 74 hrs

RevdUp Data Models

Designed and implemented dbt models powering analytics across the RevdUp product, including governed metrics and downstream BI feeds.

dbt · SQL · Snowflake
MAT
Upwork · ★ 5.0 · $4.2K

Material Database Build-out

Long-running engagement designing and maintaining a domain-specific materials database with import pipelines, schema evolution, and QA tooling.

Python · PostgreSQL · ETL
Past project

Health & Calorie Chatbot

Retrieval-based assistant with a food recommendation engine, automated meal logging, and entity extraction over user input patterns.

Python · spaCy · NLP
ETL
Past project

6B-Record COVID-19 ETL

Big-data pipelines in PySpark and SQL processing 6 billion EHR records, surfacing patterns for downstream statistical and ML models.

PySpark · SQL · AWS · Healthcare
Past project

Amazon Scraper & Repricer

Web app for dropshippers to source products via Amazon MWS, apply pricing rules, and re-list at scale. Cut listing time by ~25%.

Python · Django · Selenium

See all projects on Upwork

About

Hi, I’m Pranay.

I’m a data engineer, AI automation specialist, and analytics expert with 6+ years building intelligent data systems for startups, scale-ups, and enterprises. I work with teams to solve complex data and automation challenges — from architecting cloud data warehouses to deploying LLM-powered agents into production workflows.

I have a Master’s in Computer Science from The George Washington University and a Bachelor’s in Computer Science Engineering from RGPV, India. On Upwork I’m Top Rated with 100% Job Success, 110+ projects delivered and 3,050+ hours logged.

If you need a pipeline that doesn’t silently break, an AI agent that actually works in production, or a dashboard your execs will open more than once — let’s talk.

  • 2021 — Present
    Senior Data Engineer
    Climate LLC — automating data generation pipelines that feed agronomy models for growers
  • 2013 — Present
    Principal Software Engineer (Long-term)
    MMF Infotech Technologies — data, ML, and product engineering across 79K+ hours
  • 2021 — 2024
    Research Assistant & Data Engineer
    The George Washington University — 6B-record ETL on EHR / COVID data
  • 2020
    Data Analyst Intern
    V-ETS, LLC (Washington, DC) — AWS pipelines for healthcare analytics
View Upwork profile · LinkedIn

How we’ll work

Simple, transparent, no theatre.

Most engagements follow this rhythm. Anything else, we figure out together.

Discover

30-min call to understand your data, your goal, and your constraints. No charge, no obligation.

Scope

I write up a short proposal: what you’ll get, when, and what it costs. Fixed fee or hourly.

Build

Iterative delivery in a shared repo. You see progress weekly — no black-box six-week silences.

Hand-off

Documentation, tests, walkthroughs. Your team should be able to own and extend the work.

Let’s talk

Have a project in mind? Tell me about it.

Quick intro, scope of work, deadline if you have one. I’ll reply within a day with whether I’m the right fit and what working together could look like.