
Email LinkedIn Portfolio Scholar ORCID


🧭 Research Vision

"AI systems must be aligned not only behaviourally but geometrically — with transparent, auditable internal representations that can be verified, not just tested."

I build differential-geometric and information-geometric frameworks to answer a question that keeps me up at night: What actually changes inside a model when we align it?

My research treats transformer hidden-state sequences as discrete curves on a Riemannian belief manifold equipped with the Fisher-Rao metric. The torsion tensor — the antisymmetric component of the cross-layer covariance — captures rotational mismatch invisible to attention patterns or activation norms. I call the resulting suppression pockets "brake layers": geometrically localised, alignment-specific, and falsifiable with existing causal patching tools.
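The antisymmetric-covariance construction can be sketched in a few lines of NumPy. This is an illustrative toy (the array shapes, names, and random inputs are assumptions, not the GRAFT implementation):

```python
import numpy as np

def antisymmetric_torsion(h_l, h_next):
    """Antisymmetric component of the cross-layer covariance between
    hidden states at layer l and layer l+1.

    h_l, h_next: (tokens, hidden_dim) activation matrices.
    """
    # Centre each layer's activations over the token axis.
    a = h_l - h_l.mean(axis=0)
    b = h_next - h_next.mean(axis=0)
    cov = a.T @ b / (len(a) - 1)   # cross-layer covariance, (dim, dim)
    return 0.5 * (cov - cov.T)     # keep only the rotational (antisymmetric) part

# Toy hidden states standing in for two adjacent transformer layers.
rng = np.random.default_rng(0)
h0, h1 = rng.normal(size=(32, 8)), rng.normal(size=(32, 8))
S = antisymmetric_torsion(h0, h1)
assert np.allclose(S, -S.T)        # antisymmetry holds by construction
```

The symmetric part of the covariance carries scale and stretch; discarding it isolates the rotational mismatch described above.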

Four active research threads:

  • πŸ”¬ GRAFT β€” Geometric alignment Audit
  • 🧬 MENTIS β€” Multi-scale latent torsion across NeurIPS-scale benchmarks
  • 🌱 AgriTalk β€” Calibrated NLU for agricultural robotics (PhD programme)
  • 🌐 nDNA / Semantic Helix β€” Cultural epistemic inheritance in merged LLMs
┌──────────────────────────────────────┐
│  GRAFT · Key Results                 │
├──────────────────────────────────────┤
│  T2 vs CKA discriminability:         │
│  ██████████████████  8×              │
│                                      │
│  Normative torsion amplification:    │
│  ██████████████████████  20–46×      │
│                                      │
│  Safe-prompt paradox p-value:        │
│  p < 10⁻³³  (OLMo · n=20,439)        │
│                                      │
│  AUC T2  : 0.89 [0.85–0.93]          │
│  AUC CKA : 0.61                      │
│                                      │
│  DPO subspace: 2–3 dims              │
│  RLHF subspace: 4–5 dims             │
│                                      │
│  "Alignment writes geometry."        │
└──────────────────────────────────────┘

🔬 Active Research Projects

1 · GRAFT — Geometric Representations of Alignment's Fingerprint in Transformer Belief Trajectories

Mechanistic Interpretability | graft-belief-geometry

GRAFT is a post-hoc, gradient-free mechanistic audit toolkit that characterises preference alignment via three torsion probes (𝒯, T1, T2) and an ERA depth profiler — requiring only forward passes through publicly available checkpoints.

| Result | Value |
|---|---|
| T2 concept discriminability | CV = 0.64 vs CKA = 0.08 → 8× better |
| T2 classification AUC | 0.89 [0.85, 0.93] vs CKA 0.61 |
| Normative torsion amplification | 20–46× larger than factual concepts |
| Alignment depth address | ℓ★ ∈ {14, 20, 29–30} — architecture-specific, falsifiable patching targets |
| Safe-prompt paradox | Safe prompts drive larger Δτ than unsafe (p < 10⁻³³ OLMo; cross-dataset replicated) |
| Low-rank alignment signature | DPO operates in a 2–3-dim subspace vs RLHF's 4–5 |
| Benchmark | LITMUS · 20,439 prompts · 7 value axioms · 3 null-baseline controls |
| Models | OLMo-2-7B · Llama-3-8B · Mistral-7B · Qwen-2.5-7B (IT→PA pairs) |
📌 Three pre-registered falsifiable hypotheses — all confirmed

H1 (Concept selectivity): Alignment torsion Δ_f is larger for normative concepts than for factual ones; T2 spectral anisotropy is the dominant mechanistic signature (CV > 0.50; AUC > 0.85). ✅ Confirmed — CV = 0.64, AUC = 0.89, normative torsion 20–46× larger than factual; all 3 null-baseline controls hold

H2 (Depth address): Alignment concentrates at a depth ℓ★ determined by architecture family — providing a surgical patching target. ✅ Confirmed — ℓ★ reproducible per architecture family, compatible with ROME/ACDC patching

H3 (Safe-prompt paradox): Safe prompts produce larger alignment torsion Δτ than unsafe ones. ✅ Confirmed — p < 10⁻³³ (OLMo), p < 10⁻⁴ (Llama); cross-dataset replication on SafetyBench + WildGuard (n=150 each)
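The two discriminability statistics behind H1, the coefficient of variation and a rank-based AUC, are standard quantities. A NumPy-only sketch on toy scores (illustrative inputs, not LITMUS data or the GRAFT code):

```python
import numpy as np

def coefficient_of_variation(x):
    """CV = sample std / mean of per-concept probe scores;
    a higher CV means the probe separates concepts more sharply."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

def auc_from_scores(pos, neg):
    """Rank-based AUC: P(random positive score > random negative score),
    i.e. the Mann-Whitney U statistic normalised by n_pos * n_neg."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

# Perfectly separated score sets give AUC = 1.0; chance level is 0.5.
assert auc_from_scores([2.0, 3.0], [0.0, 1.0]) == 1.0
```

The rank formulation makes clear why AUC is threshold-free: it depends only on the ordering of the two score sets.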


2 · MENTIS — What Belief Changes Under Alignment?

NeurIPS 2026 (in preparation) | Measuring Multi-Scale Latent Torsion in Language Models. Team: Partha Pratim Saha · Samarth Raina · Mayur Parvatikar · Amit Dhanda · Vinija Jain · Aman Chadha · Amitava Das

MENTIS delivers a NeurIPS-scale empirical study of belief geometry across the full LITMUS benchmark, introducing 8 new torsion metrics and rigorous thermodynamic analysis across DPO, RLHF, and SFT checkpoints.

MENTIS · Headline Metrics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  DPO torsion suppression (Mistral, Layer 27)
  ████████████████████████  44.4%   Cohen's d = 0.741 ***
  Bonferroni-corrected p-value        7.7 × 10⁻¹³

  H1 normative amplification          1500×  (vs factual concepts)
  Thermodynamic gap (normative/factual)  10×
  Entropy–torsion bridge (Mistral)    ρ = −0.387   p = 5.43 × 10⁻³⁰

  DTW–Torsion lower bound
  DC(w) ≥ 0.875 · |Σ‖Sᴵᵀ‖_F − Σ‖Sᴾᴬ‖_F|

  17 geometric metrics · 3×2 SFT/DPO model pairs · 500 unsafe prompts
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
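The entropy–torsion bridge is a Spearman rank correlation. A minimal, tie-free sketch of that statistic (illustrative only, not the MENTIS analysis code; for tied data use an average-rank variant):

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks.
    Assumes no ties, which is fine for continuous geometric metrics."""
    rx = np.argsort(np.argsort(x)).astype(float)  # rank of each element
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

# Any strictly decreasing relationship gives rho = -1,
# matching the sign of the reported entropy-torsion bridge.
assert abs(spearman_rho([1, 2, 3, 4], [10, 8, 5, 1]) + 1.0) < 1e-12
```

Because only ranks enter, the statistic is invariant to any monotone rescaling of either metric, which is why it suits quantities on very different scales such as entropy and torsion.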

3 · AgriTalk — Calibrated NLU for Agricultural Robotics

PhD Research Programme · GreenFieldData Competition | AgriTalk

AgriTalk proposes calibrated natural-language control interfaces for agricultural spray robots. It rests on three pillars missing from existing approaches: formal safety guarantees, mechanistic explainability via BVF attribution, and streaming grounding under sensor dropout.

| Contribution | Core Guarantee | Target Venue |
|---|---|---|
| C1 Conformal NLU (RAPS) | P(y ∈ C(x)) ≥ 95% under seasonal shifts, HITL ≤ 25% | EMNLP/ACL 2027 |
| C2 BVF Attribution | Kendall τ(IG, BVF) > 0.5 on safety-critical intents | ACL 2029 |
| C3 Temporal Streaming Architecture | Grounding recall maintained at 10–50% sensor dropout | VLDB 2028 |
| C4 Conformal Trust Evaluation | BVF explanations achieve superior trust calibration vs CoT | FAccT 2029 |

5-layer safety stack: Input Sanitiser → Staleness Verifier → Conformal Predictor (RAPS) → Attribution Sufficiency Gate → Non-bypassable HITL for ABORT / EMERGENCY_STOP
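The C1 coverage guarantee is the standard split-conformal property. A deliberately simplified sketch of plain split-conformal thresholding (not the full regularised RAPS procedure; function names and the toy calibration set are assumptions):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.05):
    """Split-conformal quantile. With calibration nonconformity scores
    s_i = 1 - p_model(true label), prediction sets built from this
    threshold cover the true label with probability >= 1 - alpha."""
    n = len(cal_scores)
    # Finite-sample-corrected quantile level (n+1 rather than n).
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_scores, q, method="higher")

def prediction_set(probs, qhat):
    """All intent labels whose nonconformity 1 - p is within threshold."""
    return [k for k, p in enumerate(probs) if 1 - p <= qhat]

# Toy calibration scores 0.01 .. 0.99; a confident prediction yields
# a singleton set, an uncertain one a larger set that can trigger HITL.
qhat = conformal_threshold(np.arange(1, 100) / 100, alpha=0.05)
confident = prediction_set([0.98, 0.01, 0.01], qhat)
```

Larger prediction sets signal lower confidence, which is exactly where the non-bypassable HITL layer in the stack above would take over.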


4 · Semantic Helix / nDNA — Cross-Cultural AI

Active | Epistemic inheritance in merged LLMs via Fisher-Rao geometry

nDNA unifies fine-tuning, alignment, distillation, and merging as measurable deformations of depth-wise semantic flow. Cultural nDNA is measured via spectral curvature deviation Δκ_ℓ and thermodynamic length divergence ΔL_ℓ across 8 cultural axes: African · Latin American · South Asian · East Asian · Arabic · Indigenous · European · Pacific Islander.
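On the probability simplex, the Fisher-Rao geodesic distance underlying this geometry has a closed form: twice the Bhattacharyya angle between the two distributions. A minimal sketch (the pipeline itself operates on model output distributions; the toy vectors here are illustrative):

```python
import numpy as np

def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between categorical distributions:
    d(p, q) = 2 * arccos( sum_i sqrt(p_i * q_i) )."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    # Bhattacharyya coefficient, clipped to guard against rounding > 1.
    bc = np.clip(np.sqrt(p * q).sum(), 0.0, 1.0)
    return 2.0 * np.arccos(bc)

# Identical distributions are at distance 0; disjoint-support
# distributions sit at the maximal distance pi.
assert fisher_rao_distance([0.5, 0.5], [0.5, 0.5]) < 1e-9
```

Thermodynamic length generalises this: it accumulates such infinitesimal Fisher-Rao distances along the depth-wise trajectory rather than comparing two endpoints.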


📄 Publications

| Year | Venue | Title | Role |
|---|---|---|---|
| 2026 | MechInterp | GRAFT: Geometric Representations of Alignment's Fingerprint in Transformer Belief Trajectories | First author |
| 2026 | NeurIPS 2026 (prep) | MENTIS: What Belief Changes Under Alignment? Multi-Scale Latent Torsion in LLMs | First author · P.P. Saha, S. Raina, M. Parvatikar, A. Dhanda, V. Jain, A. Chadha, A. Das |
| 2025 | Preprint | SPINAL: Scaling-law and Preference Integration in Neural Alignment Layers | Co-author |
| 2025 | NeurIPS 2025 Workshop | Prompting Away Stereotypes? Evaluating Bias in Text-to-Image Models for Occupations | Co-author · arXiv |
| 2025 | Journal | Enhancing Human Empathy in Conversations Using Transformer-Based Models | Top contributor · DOI |
| 2024 | Springer Nature ICOMP'24 | Collaborative Federated Learning Cloud-Based System | First author · Paper |

📊 GitHub Activity

GitHub Stats Top Languages

Contribution Graph


πŸ› οΈ Technical Arsenal

🔬 Geometric & Mathematical ML

Fisher-Rao Metric
Cartan Torsion Tensor
Riemannian Manifolds
Frenet-Serret Framework
Persistent Homology
Spectral Methods (T2)
DTW Analysis
Information Geometry
Thermodynamic Length
Holonomy Defect

🤖 LLMs & Alignment

Mechanistic Interpretability
DPO / RLHF / SFT Probing
AI Safety & Alignment
Torsion-based Auditing
Belief State Geometry
Representation Engineering
Conformal Prediction (RAPS)
XAI / Explainable AI
AgentAI / Multi-Agent Systems

βš™οΈ Stack

PyTorch · Transformers (HF)
OLMo · Llama · Mistral
Qwen · DeepSeek · Zephyr
LangChain · LlamaIndex
NumPy · SciPy · Plotly
Docker · Azure · AWS
LaTeX · MetaFlow
dtaidistance · scikit-learn

πŸ† Recognition & Fellowships

| Award | Details |
|---|---|
| 🛡️ BlueDot Impact Scholar | AGI Strategy + Technical AI Safety (2025–2026) — catastrophic risk, power-seeking, geometric alignment evaluation |
| 🔬 LASR Labs | Progressed through initial selection — mechanistic interpretability research programme |
| 📋 NeurIPS 2025 Reviewer | MTI-LLM Workshop |
| 🖥️ 5× Google Colab Pro A100/H100 | 300 GPU units each — Neuromatch Academy AI Safety grant |
| ☁️ AWS AI & ML Scholar | Udacity, 2025 |
| 🎓 Armenian LLM Summer School 2025 | 90% scholarship |
| 🏆 SPAR Demo Day 2025 | AI Safety & Alignment demonstration — Neuromatch / AI Safety cohort |
| 🌐 Duke ML Summer School 2025 | Competitive selection |
| 🌐 Cohere Summer School 2025 | Competitive selection |
| 🏙️ University of Chicago DSI 2024 | AI-Science Research Program · Eric & Wendy Schmidt Postdoctoral Fellowship |
| 🎓 MLx Generative AI Fellowship | Oxford ML Summer School 2024 & 2025 — competitive scholarship award |
| 🌍 Athens NLP Summer School 2024 | Competitive international selection — NLP & large language models |
| 🧠 diiP Summer School 2024 | Paris — Deep Learning & Interpretability in Practice · competitive selection |
| 🤖 Neuromatch Academy | Deep Learning Summer School — competitive global selection |
| 🗽 NYU AI Summer School 2022 | New York University — competitive selection |
| 🤖 AI4 IMPACT Scholar | AI Singapore 2021 — selected AI practitioner programme |
| 💡 Google Developers Program | Google 2019 — Google Developer Expert community selection |
| 🎓 Udacity Bertelsmann Tech Scholarship | Google-sponsored — competitive global selection |

πŸ—‚οΈ Key Repositories

| Repository | Description | Status |
|---|---|---|
| graft-belief-geometry | GRAFT: post-hoc geometric audit of alignment — MechInterp | 🟢 Active |
| AgriTalk | Calibrated NLU for agricultural spray robots — conformal prediction + BVF attribution | 🟢 Active |
| torsional-belief-vector-field | TBVF / MENTIS — Riemannian torsion framework for alignment auditing | 🟢 Active |
| AutoResearchClaw | Autonomous AI research pipeline: idea → full conference-ready paper | 🟢 Contributed |
| pps121.github.io | Full academic portfolio — research, publications, CV | 🟢 Live |

💼 12 Years Across Industry & Academia

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🎓  Lecturer in CS          Nalhati Govt. Polytechnic College      2021 – present
                             BITS Pilani M.Tech · GPA 9.08/10 · Top 5%
                             50+ students supervised · 4 active research projects

📚  Teaching Assistant      BITS Pilani (M.Tech Programme)         2021 – 2023
                             NLP Applications [Winter 2023] · Deep Learning [Fall 2021]
                             Deep Reinforcement Learning [Spring 2021]
                             Honorarium: USD 2,513.11 across 3 courses

🏭  Lead Data Scientist     Wipro Limited (Bangalore)              2021
                             Conversational AI · IBM Watson · 0.3M users

🔬  Senior Data Scientist   BirlaSoft / Johnson & Johnson R&D      2017 – 2019
                             Medical search engine · SciBERT/SpaCy · 0.1M+ users

🛡️  Project Engineer        IIT Kanpur                             2016 – 2017
                             Threat intelligence system · cybersecurity research

🧬  Senior Systems Engg.    Infosys Technologies (Chennai)         2011 – 2015
                             DNA alignment algorithms · Multiple Myeloma genomics
                             Top 10 cancer-driving genes · 3 research papers
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📬 Let's Collaborate

I am actively seeking fully funded PhD positions starting in 2026 and am open to:

  • 🀝 Research collaborations in mechanistic interpretability, geometric ML, or AI safety
  • πŸŽ“ PhD discussions with faculty at world-class AI safety & interpretability groups
  • πŸ”­ Partnerships validating geometric torsion findings against circuit-level causal analysis
  • 🌱 Applied work connecting geometric alignment theory to deployment-time safety monitoring

If you work at Anthropic · Redwood Research · ARC Evals · Oxford FHI · Cambridge LTL · MIT · CMU · or any world-class AI safety group — I would love to talk.

Email Me Full Portfolio

"Geometry is not decoration. It is the language in which alignment speaks."
