Large-scale Text Summarization

Summarize the Web

Worked on large language models (LLMs) for large-scale text summarization, powering products used by billions of people every day.
The project launched at Apple WWDC 2024.


Open-domain Question Answering

Answering all your Questions

Worked on developing ML/NLP models that serve the most relevant answer to your question and ensure that Siri's answers are based on the most authoritative sources.


Natural Language Understanding (NLU) for Question Answering

Developed NLU models for Question-Answering in Siri

Developed NLU models for answering knowledge-seeking questions posed to Siri.
Specifically, worked on multi-task neural models for NLU.


SmartCompose

Worked on Neural Language Generation for Microsoft SmartCompose

Joint work with Chris Quirk, Peter Bailey and others at Microsoft AI Research

Developed and shipped a text-generation feature that automatically completes emails in Microsoft Outlook based on what the user has typed so far and the context of the email.

Specifically, developed neural language models for reranking and text generation, using prior context and additional signals from emails.


Search Query Entity Tagger for LinkedIn Search

Developed a CRF-based query tagger for LinkedIn Search

Before this project, LinkedIn Search used a Hidden Markov Model (HMM) based query tagger.

I developed a vital component of the Search Query Understanding pipeline that uses Conditional Random Fields (CRFs) to extract LinkedIn ecosystem entities from a search query. I implemented a CRF library for the LinkedIn Search query tagger to detect entities such as Name, Company, Title, Location, Skill, and Geo-location. To get this tagger into production, I designed and built an end-to-end pipeline that generates the training dataset from SERP click-through chains, extracts features, trains the CRF model, and evaluates it.

These tags are used by downstream components in the Query Understanding pipeline to provide the most relevant search results to users.
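
To make the feature-extraction and labeling steps concrete, here is a minimal sketch in Python. It is not the LinkedIn implementation: the feature names, the BIO label scheme, the helper functions (`token_features`, `query_features`, `spans_to_bio`), and the example query are all assumptions chosen for illustration. The per-token feature dictionaries follow the general shape that CRF toolkits such as sklearn-crfsuite accept as input.

```python
from typing import Dict, List, Tuple


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i (hypothetical feature set)."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "is_digit": tok.isdigit(),
        "suffix3": tok[-3:].lower(),
    }
    # Neighboring tokens give the CRF local context around each position.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of query
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of query
    return feats


def query_features(query: str) -> List[Dict[str, object]]:
    """Whitespace-tokenize a query and extract features for every token."""
    tokens = query.split()
    return [token_features(tokens, i) for i in range(len(tokens))]


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, tag) entity spans into BIO labels."""
    labels = ["O"] * len(tokens)
    for start, end, tag in spans:
        labels[start] = f"B-{tag}"
        for j in range(start + 1, end):
            labels[j] = f"I-{tag}"
    return labels


# Illustrative query with hand-assigned entity spans (assumed, not real data).
tokens = "software engineer google new york".split()
labels = spans_to_bio(tokens, [(0, 2, "TITLE"), (2, 3, "COMPANY"), (3, 5, "LOCATION")])
features = query_features("software engineer google new york")
```

Pairs of `features` and `labels` like these form one training sequence. In a pipeline of the kind described above, the entity spans would come from click-through data rather than being assigned by hand.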


Detecting Knowledge worth ingesting for Bing Knowledge Graph

Developed an NLP/ML framework for Bing's Knowledge Graph that helps selectively ingest knowledge from the web

Joint work with Silviu Cucerzan at Microsoft AI Research

Worked on creating an NLP/ML framework for detecting whether information extracted from crowdsourced knowledge platforms such as Wikipedia and Reddit is worth ingesting at any given moment. This project was a crucial component of Satori, Bing's knowledge graph: it checked every piece of knowledge being ingested and prevented misinformation, ephemeral content, and vandalism from entering the graph. The component filters hundreds of millions of knowledge deltas in production, helping Satori ingest knowledge selectively and grow continuously.


Machine Learned Ranking for Bing and Office365

Developed and shipped ML rankers for Bing and search in Office 365 products

You can learn more about one of these projects here: https://blog.linkedin.com/2017/september/250/adding-linkedin_s-profile-card-on-office-365-offers-a-simple-way


CMU Never-Ending Language Learner (NELL)

Worked on a component of the NELL project

A natural language processing framework for detecting glosses in large web corpora such as Wikipedia and ClueWeb. The core of the framework consists of filters, transformations, parsers, feature extractors, samplers, and modelers, combined in an easy-to-use, extensible design. The detected glosses enrich NELL's knowledge.
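
As a rough illustration of how such composable stages can fit together, here is a minimal Python sketch. It is not the NELL code: the `Pipeline` class, the stage functions, the definitional-sentence pattern, and the example sentences are all hypothetical, and parsers, samplers, and modelers are omitted for brevity.

```python
import re
from typing import Callable, Dict, Iterable, List

# Each stage is a plain function, so new stages can be plugged in freely.
Transform = Callable[[str], str]
Filter = Callable[[str], bool]
Extractor = Callable[[str], Dict[str, float]]


class Pipeline:
    """Apply transforms, then filters, then feature extractors to sentences."""

    def __init__(self, transforms: List[Transform], filters: List[Filter],
                 extractors: List[Extractor]) -> None:
        self.transforms = transforms
        self.filters = filters
        self.extractors = extractors

    def run(self, sentences: Iterable[str]) -> List[Dict[str, object]]:
        rows: List[Dict[str, object]] = []
        for sentence in sentences:
            for transform in self.transforms:
                sentence = transform(sentence)
            # Keep a sentence only if every filter accepts it.
            if not all(keep(sentence) for keep in self.filters):
                continue
            feats: Dict[str, float] = {}
            for extract in self.extractors:
                feats.update(extract(sentence))
            rows.append({"text": sentence, "features": feats})
        return rows


def normalize_whitespace(s: str) -> str:
    return " ".join(s.split())


def looks_definitional(s: str) -> bool:
    # Assumed heuristic: glosses often follow an "X is a/an/the Y" pattern.
    return re.search(r"\b(is|are) (a|an|the)\b", s) is not None


def length_features(s: str) -> Dict[str, float]:
    return {"num_tokens": float(len(s.split()))}


pipeline = Pipeline([normalize_whitespace], [looks_definitional], [length_features])
rows = pipeline.run([
    "A  gloss is a brief definition of a term.",
    "Click here to subscribe!",
])
```

Here `rows` holds only the first sentence, normalized and paired with its features; a modeler stage would then score such candidates as glosses or non-glosses.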

Prof. William Cohen advised this work. You can read more about the project at: http://rtw.ml.cmu.edu/rtw/


One Laptop per Child

Open-source contributor to the Sugar desktop environment for OLPC laptops

The Sugar desktop environment is developed for the One Laptop per Child project in collaboration with Sugar Labs. My goal was to develop Sugar Activities that make learning fun on XO laptops.

As part of this effort, I contributed to:
  • Wikipedia Hindi - Wikipedia in Hindi for Sugar.
  • DevelopWeb - an Activity for web development in which children build web sites with HTML, JavaScript, and other web technologies. Children can quickly learn to develop web pages step by step through examples provided for each HTML component.
  • Oopsy - a Sugar Activity that lets children write, compile, and run C/C++ programs to learn, explore, and have fun!
  • Project Bhagmalpur - worked with Anish Mangal, Dr. Sameer Verma, and Gonzalo Odiard to deploy an XSCE school server in Bhagmalpur, India.
Developed Wikipedia Hindi for offline access on XO laptops through the XSCE school server.
Download: https://activities.sugarlabs.org/en-US/sugar/addon/4632
A child using an OLPC laptop in Bhagmalpur, India
Children using Wikipedia Hindi in Bhagmalpur